web scraping - NameError: name 'hxs' is not defined when using Scrapy

September 15, 2014

I have launched the Scrapy shell and pointed it at Wikipedia:

scrapy shell http://en.wikipedia.org/wiki/main_page

I am confident this step is correct, judging by the verbose nature of Scrapy's response.

Next, I'd like to see what happens when I write

hxs.select('/html').extract()

At this point, I get the error:

NameError: name 'hxs' is not defined

What is the problem? I know Scrapy is installed fine and has accepted the URL as its destination, so why is there an issue with the hxs command?

I suspect you are using a version of Scrapy that doesn't have hxs in the shell anymore.

Use sel instead (deprecated after 0.24, see below):

$ scrapy shell http://en.wikipedia.org/wiki/main_page
>>> sel.xpath('//title/text()').extract()[0]
u'Wikipedia, the free encyclopedia'

Or, as of Scrapy 1.0, you should use the Selector object of the response, through its .xpath() and .css() convenience methods:

$ scrapy shell http://en.wikipedia.org/wiki/main_page
>>> response.xpath('//title/text()').extract()[0]
u'Wikipedia, the free encyclopedia'
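
If it is unclear which of these shortcut names (hxs, sel, response) your shell session actually defines, a small check like the one below may help. This is only a sketch: available_shortcuts is a hypothetical helper, not part of Scrapy, and it assumes you pass it the shell's namespace (for example globals() typed at the shell prompt).

```python
def available_shortcuts(namespace, candidates=("response", "sel", "hxs")):
    """Return the candidate shortcut names that are defined in `namespace`."""
    return [name for name in candidates if name in namespace]


# Inside the Scrapy shell you would call: available_shortcuts(globals())
# The dictionary below is a stand-in for illustration, not real Scrapy objects.
example_namespace = {"response": object(), "sel": object()}
print(available_shortcuts(example_namespace))  # ['response', 'sel']
```

A name missing from the returned list is one that would raise NameError if typed at the prompt.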

FYI, a quote from "Using selectors" in the Scrapy documentation:

... after the shell loads, you’ll have the response available as response shell variable, and its attached selector in response.selector attribute.
...
Querying responses using XPath and CSS is so common that responses include two convenience shortcuts: response.xpath() and response.css():

>>> response.xpath('//title/text()')
[<Selector (text) xpath=//title/text()>]
>>> response.css('title::text')
[<Selector (text) xpath=//title/text()>]
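
For code that has to run against both the older hxs.select(...) style and the newer .xpath(...) style, one option is a small duck-typed wrapper. The sketch below is an assumption-laden illustration: extract_all and the two stub classes are hypothetical names made up here, and the stubs only imitate the method names used above (select, xpath, extract). They are not Scrapy classes.

```python
def extract_all(selector_like, xpath):
    """Run `xpath` via .xpath() if present, else via the older .select()."""
    query = getattr(selector_like, "xpath", None) or getattr(selector_like, "select", None)
    if query is None:
        raise TypeError("object has neither .xpath() nor .select()")
    return query(xpath).extract()


class StubResult:
    """Stand-in for a selector result list; only .extract() is imitated."""
    def __init__(self, values):
        self._values = values

    def extract(self):
        return list(self._values)


class StubOldSelector:
    """Imitates the old interface: only .select() exists."""
    def select(self, xpath):
        return StubResult(["old:" + xpath])


class StubNewSelector:
    """Imitates the newer interface: only .xpath() exists."""
    def xpath(self, xpath):
        return StubResult(["new:" + xpath])


print(extract_all(StubOldSelector(), "//title/text()"))  # ['old://title/text()']
print(extract_all(StubNewSelector(), "//title/text()"))  # ['new://title/text()']
```

In a real session you would pass response (or sel, or hxs) in place of the stubs.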
