

Scrapy on Debian 6

Debian 6 ships Scrapy 0.8 as an apt package. Here is a quick howto to get this spider framework working on Debian.

Scrapy 0.8

1. sudo apt-get install python-scrapy
2. cd
3. mkdir mydir
4. cd mydir
5. scrapy-ctl startproject anime
6. export SCRAPY_SETTINGS_MODULE=anime.settings
7. export PYTHONPATH=/home/YOURHOMEHERE/mydir
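
The two exports are the part that is easiest to get wrong, so here is a minimal sketch of a sanity check for them. The function name check_scrapy_env is hypothetical and not part of Scrapy. It only looks at the two variables and at the directories on disk, under the assumption that the first component of SCRAPY_SETTINGS_MODULE (anime here) is a directory directly below one of the PYTHONPATH entries.

```python
import os


def check_scrapy_env(environ=None):
    """Return a list of problems; an empty list means both variables look sane."""
    env = os.environ if environ is None else environ
    problems = []

    settings_module = env.get("SCRAPY_SETTINGS_MODULE", "")
    if not settings_module:
        problems.append("SCRAPY_SETTINGS_MODULE is not set")

    python_path = env.get("PYTHONPATH", "")
    if not python_path:
        problems.append("PYTHONPATH is not set")
    elif settings_module:
        # "anime.settings" -> "anime": the project package created by startproject.
        package = settings_module.split(".")[0]
        found = any(
            os.path.isdir(os.path.join(entry, package))
            for entry in python_path.split(os.pathsep)
            if entry
        )
        if not found:
            problems.append("no %r directory under any PYTHONPATH entry" % package)
    return problems


if __name__ == "__main__":
    for problem in check_scrapy_env():
        print(problem)
```

Run it in the same shell where you typed the exports. If it prints nothing, both variables are set and the project directory was found.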

If you already have a bot, then thanks to points 6 and 7 you can run your spider by simply typing:

scrapy-ctl crawl example.com
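
If you would rather not export the variables in every new shell, the same crawl can be started from a small Python wrapper. This is only a sketch: build_scrapy_env and crawl are hypothetical helper names, and the wrapper assumes scrapy-ctl is on your PATH, as it is after the apt install.

```python
import os
import subprocess


def build_scrapy_env(project_parent, settings_module, base_env=None):
    """Return a copy of the environment with both Scrapy variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["SCRAPY_SETTINGS_MODULE"] = settings_module
    existing = env.get("PYTHONPATH", "")
    # Put the directory that contains the project first, keep older entries.
    env["PYTHONPATH"] = project_parent + (os.pathsep + existing if existing else "")
    return env


def crawl(domain, project_parent, settings_module):
    """Run scrapy-ctl crawl for one domain and return its exit status."""
    env = build_scrapy_env(project_parent, settings_module)
    return subprocess.call(["scrapy-ctl", "crawl", domain], env=env)
```

With the project from the steps above, the call would be crawl("example.com", "/home/YOURHOMEHERE/mydir", "anime.settings").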

If you don’t have a bot yet, you can follow the tutorial section of the Scrapy 0.8 documentation or this awesome howto by Pravin Paratey to write your own. Remember to use the scrapy-ctl command instead of the .py version, and to make sure SCRAPY_SETTINGS_MODULE and PYTHONPATH cover the project that holds your spiders.


awesome howto by Pravin Paratey

To list your available (and correctly configured) spiders, just type:

scrapy-ctl list

If a bot doesn’t appear in that list, you have a problem with point 6 or 7, or a misconfigured spider. In my case, I had forgotten the SPIDER part at the bottom of my spider and I was using domain instead of domain_name in my script. See Pravin’s howto to write correct Scrapy 0.8 code.
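
To make those two mistakes concrete, here is a rough sketch that checks a spider file for them without importing Scrapy. The sample spider is held in a string and only parsed, never run. Its layout follows the two names mentioned above (domain_name and a module-level SPIDER), while the import line, the class name ExampleSpider and the parse body are assumptions you should compare with the 0.8 tutorial. The helper find_spider_mistakes is hypothetical, not a Scrapy tool.

```python
import ast

# Sample spider source in the Scrapy 0.8 layout (assumed, see the tutorial).
GOOD_SPIDER = '''
from scrapy.spider import BaseSpider

class ExampleSpider(BaseSpider):
    domain_name = "example.com"
    start_urls = ["http://example.com/"]

    def parse(self, response):
        return []

SPIDER = ExampleSpider()
'''


def _assigned_names(statements):
    """Collect plain names assigned directly in a list of statements."""
    names = set()
    for node in statements:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.add(target.id)
    return names


def find_spider_mistakes(source):
    """Return a list describing which of the two mistakes appear in source."""
    tree = ast.parse(source)
    problems = []
    if "SPIDER" not in _assigned_names(tree.body):
        problems.append("no module-level SPIDER assignment")
    for node in tree.body:
        if isinstance(node, ast.ClassDef):
            attrs = _assigned_names(node.body)
            if "domain" in attrs and "domain_name" not in attrs:
                problems.append("class %s sets domain instead of domain_name" % node.name)
    return problems
```

Feed it the text of your own spider file, for example find_spider_mistakes(open("myspider.py").read()), where myspider.py stands for whatever your file is called. An empty list means neither of the two mistakes was found. It says nothing about other configuration problems.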

https://web.archive.org/web/20130104000000*/http://doc.scrapy.org/en/0.8/intro/tutorial.html


https://web.archive.org/web/20130104000000*/http://pravin.insanitybegins.com/posts/writing-a-spider-in-10-mins-using-scrapy