Web scraping - NameError: name 'hxs' is not defined when using Scrapy
I have launched the Scrapy shell and pinged Wikipedia:
scrapy shell http://en.wikipedia.org/wiki/main_page
I am confident this step is correct, judging by the verbose nature of Scrapy's response.
Next, I'd like to see what happens when I write
hxs.select('/html').extract()
At this point, I get the error:
NameError: name 'hxs' is not defined
What is the problem? I know Scrapy is installed fine and has accepted the URL destination, so why is there an issue with the hxs command?
I suspect you are using a version of Scrapy that doesn't have hxs on the shell anymore.
Use sel instead (deprecated after 0.24, see below):
$ scrapy shell http://en.wikipedia.org/wiki/main_page
>>> sel.xpath('//title/text()').extract()[0]
u'Wikipedia, the free encyclopedia'
Or, as of Scrapy 1.0, you should use the Selector object of the response, with its .xpath and .css convenience methods:
$ scrapy shell http://en.wikipedia.org/wiki/main_page
>>> response.xpath('//title/text()').extract()[0]
u'Wikipedia, the free encyclopedia'
FYI, a quote from Using selectors in the Scrapy documentation:
... After the shell loads, you’ll have the response available as the response shell variable, and its attached selector in the response.selector attribute.
...
Querying responses using XPath and CSS is so common that responses include two convenience shortcuts: response.xpath() and response.css():
>>> response.xpath('//title/text()')
[<Selector (text) xpath=//title/text()>]
>>> response.css('title::text')
[<Selector (text) xpath=//title/text()>]
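If the same snippet has to survive several Scrapy versions, one option is to try the shortcuts in order of preference: response.xpath() first, then sel.xpath(), then the legacy hxs.select(). The following is only a rough sketch of that idea, not anything Scrapy ships. run_xpath, FakeResult, FakeResponse and FakeLegacyHxs are names made up for illustration, and the fake classes exist only so the example runs without Scrapy installed.

```python
def run_xpath(shell_vars, query):
    """Run an XPath query with whichever shell shortcut is available.

    shell_vars is a dict of names defined in the shell session
    (hypothetical helper, not part of Scrapy).
    """
    # Newer Scrapy: the response object itself exposes .xpath()
    response = shell_vars.get("response")
    if hasattr(response, "xpath"):
        return response.xpath(query).extract()

    # Intermediate versions: the `sel` shortcut
    sel = shell_vars.get("sel")
    if sel is not None:
        return sel.xpath(query).extract()

    # Oldest versions: the `hxs` shortcut with .select()
    hxs = shell_vars.get("hxs")
    if hxs is not None:
        return hxs.select(query).extract()

    raise NameError("no selector shortcut found (tried response, sel, hxs)")


# Stand-in objects (assumptions for the demo, not real Scrapy classes).
class FakeResult:
    def __init__(self, values):
        self._values = values

    def extract(self):
        return list(self._values)


class FakeResponse:
    """Mimics an object that offers .xpath(), like response or sel."""

    def xpath(self, query):
        return FakeResult(["Example title"])


class FakeLegacyHxs:
    """Mimics the old hxs object, which only offers .select()."""

    def select(self, query):
        return FakeResult(["Example title"])


print(run_xpath({"response": FakeResponse()}, "//title/text()"))
# prints ['Example title']
```

Inside a real scrapy shell session you would presumably pass the shell's own namespace, e.g. run_xpath(globals(), '//title/text()'), instead of the fake objects.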