EzDev.org

pyspider

A Powerful Spider (Web Crawler) System in Python.


Can Scrapy be replaced by pyspider?

I've been using the Scrapy web-scraping framework pretty extensively, but I recently discovered another framework/system called pyspider, which, according to its GitHub page, is fresh, actively developed and popular.

pyspider's home page lists several things being supported out-of-the-box:

  • Powerful WebUI with script editor, task monitor, project manager and result viewer

  • Javascript pages supported!

  • Task priority, retry, periodical and recrawl by age or marks in index page (like update time)

  • Distributed architecture

These are things that Scrapy itself doesn't provide, but they are possible with the help of portia (for the Web UI), scrapyjs (for JavaScript pages) and scrapyd (for deploying and distributing through an API).
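To make the scheduling bullet concrete (task priority, retry, recrawl by age), here is a minimal toy sketch in plain Python. It is not pyspider's or Scrapy's actual code; `ToyScheduler` and all of its method and parameter names are hypothetical and only illustrate the behaviour being compared.

```python
import heapq
import itertools


class ToyScheduler:
    """Hypothetical in-memory scheduler; not pyspider's implementation."""

    def __init__(self, max_retries=3):
        self.max_retries = max_retries
        self._queue = []               # heap of (-priority, seq, url)
        self._seq = itertools.count()  # tie-breaker: FIFO within one priority
        self._last_crawled = {}        # url -> time of last successful crawl
        self._failures = {}            # url -> consecutive failure count

    def submit(self, url, now, priority=0, age=0):
        """Queue url unless it was crawled less than `age` seconds ago."""
        last = self._last_crawled.get(url)
        if last is not None and now - last < age:
            return False  # still fresh, skip the recrawl
        heapq.heappush(self._queue, (-priority, next(self._seq), url))
        return True

    def next_task(self):
        """Pop the highest-priority URL, or None if the queue is empty."""
        if not self._queue:
            return None
        neg_priority, _, url = heapq.heappop(self._queue)
        return url, -neg_priority

    def report(self, url, now, ok, priority=0):
        """Record the outcome; re-queue a failed URL up to max_retries times."""
        if ok:
            self._last_crawled[url] = now
            self._failures.pop(url, None)
            return "done"
        self._failures[url] = self._failures.get(url, 0) + 1
        if self._failures[url] <= self.max_retries:
            heapq.heappush(self._queue, (-priority, next(self._seq), url))
            return "retry"
        return "gave up"
```

With a helper like this, an index page could be submitted with a small `age` so it is revisited often, while detail pages get a large `age` and a lower priority. The question is whether pyspider gives this kind of behaviour out of the box where Scrapy needs extra tooling.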

Is it true that pyspider alone can replace all of these tools? In other words, is pyspider a direct alternative to Scrapy? If not, then which use cases does it cover?

I hope I'm not crossing the "too broad" or "opinion-based" line.


Source: (StackOverflow)