Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-crawler
A collection of awesome web crawlers and spiders in different languages
https://github.com/asciimoo/awesome-crawler
Python
- Scrapy - A fast high-level screen scraping and web crawling framework.
- django-dynamic-scraper - Creating Scrapy scrapers via the Django admin interface.
- scrapy-cluster - Uses Redis and Kafka to create a distributed, on-demand scraping cluster.
- distribute_crawler - Uses Scrapy, Redis, MongoDB, and Graphite to create a distributed spider.
- pyspider - A powerful spider system.
- Demiurge - PyQuery-based scraping micro-framework.
- Scrapely - A pure-Python HTML screen-scraping library.
- Scrapy-Redis - Redis-based components for Scrapy.
- cola - A distributed crawling framework.
C#
- SimpleCrawler - A simple spider based on multithreading and regular expressions.
- ccrawler - Built with C# 3.5. It includes a simple web content categorizer extension that can separate web pages according to their content.
Java
- websphinx - Website-Specific Processors for HTML information extraction.