web scraping - NameError: name 'hxs' is not defined when using Scrapy
I have launched the Scrapy shell and pointed it at Wikipedia:
scrapy shell http://en.wikipedia.org/wiki/main_page
I am confident this step is correct, judging by the verbose nature of Scrapy's response.
Next, I'd like to see what happens when I write
hxs.select('/html').extract()
At this point, I get the error:
NameError: name 'hxs' is not defined
What is the problem? I know Scrapy is installed fine and has accepted the URL as a destination, so why is there an issue with the hxs command?
I suspect you are using a version of Scrapy that doesn't have hxs in the shell anymore.
Use sel instead (deprecated after 0.24, see below):
$ scrapy shell http://en.wikipedia.org/wiki/main_page
>>> sel.xpath('//title/text()').extract()[0]
u'Wikipedia, the free encyclopedia'
Or, as of Scrapy 1.0, you should use the Selector object of the response, with its .xpath() and .css() convenience methods:
$ scrapy shell http://en.wikipedia.org/wiki/main_page
>>> response.xpath('//title/text()').extract()[0]
u'Wikipedia, the free encyclopedia'
FYI, here is a quote from "Using selectors" in the Scrapy documentation:
... after the shell loads, you’ll have the response available as the response shell variable, and its attached selector in the response.selector attribute.
...
Querying responses using XPath and CSS is so common that responses include two convenience shortcuts: response.xpath() and response.css():
>>> response.xpath('//title/text()')
[<Selector (text) xpath=//title/text()>]
>>> response.css('title::text')
[<Selector (text) xpath=//title/text()>]
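If you are not sure which of these names your Scrapy version exposes, a small duck-typed helper can probe for them instead of hitting the NameError. This is only a sketch: find_selector, FakeLegacySelector and FakeResponse are hypothetical names made up for illustration and are not part of Scrapy. The two fake classes stand in for real shell objects so the snippet runs without Scrapy installed.

```python
# Sketch only: find a usable query method among the names a Scrapy shell
# may expose. Order follows the answer above: response, then sel, then hxs.
PREFERRED_NAMES = ("response", "sel", "hxs")


def find_selector(namespace, names=PREFERRED_NAMES):
    """Return (name, bound method) for the first object that can run XPath.

    Prefers .xpath() and falls back to the legacy .select().
    """
    for name in names:
        obj = namespace.get(name)
        for attr in ("xpath", "select"):
            method = getattr(obj, attr, None)
            if callable(method):
                return name, method
    raise NameError("none of %s can run an XPath query" % ", ".join(names))


class FakeLegacySelector:
    """Hypothetical stand-in for an old hxs object (only has .select())."""

    def select(self, query):
        return ["legacy:" + query]


class FakeResponse:
    """Hypothetical stand-in for a newer response object (has .xpath())."""

    def xpath(self, query):
        return ["modern:" + query]


if __name__ == "__main__":
    # Old-style namespace: response has no .xpath(), so hxs.select() is used.
    name, query = find_selector({"response": object(), "hxs": FakeLegacySelector()})
    print(name, query("//title/text()"))

    # Newer namespace: response.xpath() wins over sel.
    name, query = find_selector({"sel": FakeLegacySelector(), "response": FakeResponse()})
    print(name, query("//title/text()"))
```

Inside a real shell session you would presumably call it as find_selector(globals()), on the assumption that the shell variables are visible there. I have not checked that against every Scrapy version.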