# Awesome

A list of some interesting repositories and tools.

https://github.com/lin-zone/awesome

## Scrapy Distributed

* [crawlab](https://github.com/crawlab-team/crawlab) - Distributed crawler management platform based on Golang
* [Gerapy](https://github.com/Gerapy/Gerapy) - Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vue.js
* [scrapydweb](https://github.com/my8100/scrapydweb) - ScrapydWeb: Web app for Scrapyd cluster management
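
Gerapy and scrapydweb are front ends for Scrapyd, so the work they automate ultimately goes through Scrapyd's JSON API. Below is a minimal sketch of scheduling a run on a local Scrapyd instance, assuming a project has already been deployed; the host, project name and spider name are placeholders:

```python
import requests

# Ask a Scrapyd server (default port 6800) to run a spider from a deployed project.
# "myproject" and "example" are placeholder names.
resp = requests.post(
    "http://localhost:6800/schedule.json",
    data={"project": "myproject", "spider": "example"},
)
print(resp.json())  # e.g. {"status": "ok", "jobid": "..."}
```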

## Scrapy Middleware

* [scrapy-mongodb](https://github.com/sebdah/scrapy-mongodb) - MongoDB pipeline for Scrapy
* [scrapy-splitvariants](https://github.com/scrapy-plugins/scrapy-splitvariants) - Scrapy spider middleware to split an item into multiple items using a multi-valued key
* [scrapy-proxies](https://github.com/aivarsk/scrapy-proxies) - Random proxy middleware for Scrapy
* [scrapy-fake-useragent](https://github.com/alecxe/scrapy-fake-useragent) - Random User-Agent middleware based on fake-useragent
* [scrapy-selenium](https://github.com/clemfromspace/scrapy-selenium) - Scrapy middleware to handle JavaScript pages using Selenium
* [scrapy-crawlera](https://github.com/scrapy-plugins/scrapy-crawlera) - Crawlera middleware for Scrapy
* [crawlera](https://scrapinghub.com/crawlera) - The World's Smartest Proxy Network
* [scrapy-deltafetch](https://github.com/scrapy-plugins/scrapy-deltafetch) - Scrapy spider middleware to ignore requests to pages containing items seen in previous crawls
* [scrapy-random-useragent](https://github.com/cnu/scrapy-random-useragent) - Scrapy Middleware to set a random User-Agent for every Request.
* [scrapy-crawl-once](https://github.com/TeamHG-Memex/scrapy-crawl-once) - Scrapy middleware that allows crawling only new content
* [scrapy-magicfields](https://github.com/scrapy-plugins/scrapy-magicfields) - Scrapy middleware to add extra fields to items, like timestamp, response fields, spider attributes etc.
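
These plugins are all wired in the same way: through the middleware dictionaries in a project's `settings.py`. A minimal sketch assuming scrapy-fake-useragent and scrapy-deltafetch are installed; the class paths, priorities and option names follow each project's README and may differ between versions:

```python
# settings.py -- enable third-party middlewares by class path and priority.

DOWNLOADER_MIDDLEWARES = {
    # Disable Scrapy's built-in user-agent middleware so the plugin can rotate agents.
    "scrapy.downloadermiddlewares.useragent.UserAgentMiddleware": None,
    "scrapy_fake_useragent.middleware.RandomUserAgentMiddleware": 400,
}

SPIDER_MIDDLEWARES = {
    # Skip requests for pages whose items were already seen in a previous crawl.
    "scrapy_deltafetch.DeltaFetch": 100,
}

DELTAFETCH_ENABLED = True
```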

## Command Line

* [cleo](https://github.com/sdispater/cleo) - Cleo allows you to create beautiful and testable command-line interfaces.
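
A minimal sketch of a cleo command, based on the docstring-style command definition used in older cleo releases (newer releases changed the API, so check the project's README for the current form):

```python
from cleo import Application, Command


class GreetCommand(Command):
    """
    Greets someone

    greet
        {name? : Who do you want to greet?}
        {--y|yell : If set, yell the greeting in uppercase}
    """

    def handle(self):
        # Arguments and options are declared in the docstring above.
        name = self.argument("name")
        text = "Hello {}".format(name) if name else "Hello"
        if self.option("yell"):
            text = text.upper()
        self.line(text)


app = Application()
app.add(GreetCommand())

if __name__ == "__main__":
    app.run()
```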

## HTML Parser

* [scrapely](https://github.com/scrapy/scrapely) - A pure-Python HTML screen-scraping library
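
scrapely learns an extraction template from an annotated example page instead of hand-written CSS/XPath selectors. A minimal sketch following the usage in its README; the URLs and field values below are placeholders for a real training page:

```python
from scrapely import Scraper

scraper = Scraper()

# Train on one example page by listing values that appear on it.
scraper.train("http://example.com/products/1",          # placeholder URL
              {"name": "Example product", "price": "9.99"})

# Scrape a similar page; scrapely applies the learned template.
print(scraper.scrape("http://example.com/products/2"))  # placeholder URL
```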

## Crawler

* [Douyin-Bot](https://github.com/wangshub/Douyin-Bot) - A Python Douyin (TikTok) bot; on how to find pretty girls on Douyin
* [SinaSpider](https://github.com/LiuXingMing/SinaSpider) - Sina Weibo crawler (Scrapy, Redis)
* [ECommerceCrawlers](https://github.com/DropsDevopsOrg/ECommerceCrawlers) - Hands-on data crawlers for a variety of websites and e-commerce platforms
* [examples-of-web-crawlers](https://github.com/shengqiangzhang/examples-of-web-crawlers) - Interesting, beginner-friendly Python crawler examples, mainly targeting Taobao, Tmall, WeChat, Douban, QQ and other sites
* [PythonCrawler](https://github.com/yhangf/PythonCrawler) - A collection of crawler projects written in Python
* [amemv-crawler](https://github.com/loadchange/amemv-crawler) - Download videos from specified Douyin accounts; a Douyin crawler
* [course-crawler](https://github.com/Foair/course-crawler) - MOOC course downloader for China University MOOC, XuetangX, NetEase Cloud Classroom, CNMOOC and iCourse
* [zhihu_crawler](https://github.com/SmileXie/zhihu_crawler) - Crawler of zhihu.com
* [awesome-spider](https://github.com/facert/awesome-spider) - A collection of web crawlers
* [python-spider](https://github.com/Jack-Cherish/python-spider) - Python 3 web crawlers in practice
* [Anti-Anti-Spider](https://github.com/luyishisi/Anti-Anti-Spider) - Techniques for dealing with anti-crawling measures
* [FunpySpiderSearchEngine](https://github.com/mtianyan/FunpySpiderSearchEngine) - Search engine combining Scrapy 1.6.0 for crawling, Elasticsearch 6.8.0 and Django 2.2
* [spider163](https://github.com/chengyumeng/spider163) - Scrapes hot comments from NetEase Cloud Music
* [Python-Spider](https://github.com/lb2281075105/Python-Spider) - Python crawlers
* [ScrapyProject](https://github.com/cuanboy/ScrapyProject) - A collection of hands-on Scrapy projects

## Tools

* [qrcode](https://github.com/sylnsfar/qrcode) - Artistic QR code generator written in Python
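
A minimal usage sketch, assuming the generator is installed as the MyQR package described in the project's README (the encoded text and output file name are placeholders):

```python
from MyQR import myqr

# Generate a QR code for the given text; run() also accepts a background picture
# and colorized/contrast/brightness options for the artistic variants.
version, level, qr_name = myqr.run(
    words="https://github.com/lin-zone/awesome",
    save_name="awesome_qr.png",  # placeholder output file name
)
print(qr_name)  # path of the generated image
```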

## Utils

* [queuelib](https://github.com/scrapy/queuelib) - Collection of persistent (disk-based) queues
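
A minimal sketch of queuelib's disk-backed FIFO queue, following its README; the queue file name is a placeholder and items are raw bytes:

```python
from queuelib import FifoDiskQueue

# Push and pop raw bytes through a queue persisted on disk.
queue = FifoDiskQueue("queuefile")  # placeholder path for the on-disk queue
queue.push(b"first")
queue.push(b"second")
print(queue.pop())  # b"first"
queue.close()
```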