Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/gnemoug/sina_reptile
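Collects basic profile information for 10 million Sina Weibo users, plus the 50 most recent posts from each crawled user. Written in Python, crawls with multiple processes, and stores the data in MongoDB.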
Last synced: 4 months ago
- Host: GitHub
- URL: https://github.com/gnemoug/sina_reptile
- Owner: gnemoug
- Created: 2012-09-10T01:01:02.000Z (over 12 years ago)
- Default Branch: master
- Last Pushed: 2013-03-22T11:44:03.000Z (almost 12 years ago)
- Last Synced: 2024-08-01T16:44:38.406Z (7 months ago)
- Language: Python
- Homepage:
- Size: 178 KB
- Stars: 473
- Watchers: 60
- Forks: 284
- Open Issues: 3
Metadata Files:
- Readme: README.md
README
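This is a crawler for Sina Weibo, developed in Python. It fixes bugs in the Weibo SDK, stores its data in MongoDB, and runs the crawl across multiple processes. It collects basic profile information for 10 million Sina Weibo users, plus the 50 most recent posts from each crawled user.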
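
The multi-process crawl described above might be sketched as follows. This is a minimal illustration, not the repository's actual code: `fetch_user`, `crawl`, and `MAX_POSTS` are hypothetical names, the fetch function returns fake data instead of calling the Weibo API, and a plain dict stands in for the MongoDB collection.

```python
from multiprocessing import Pool

MAX_POSTS = 50  # the README mentions the 50 most recent posts per user


def fetch_user(user_id):
    """Hypothetical stand-in for a Weibo API call; builds fake data."""
    posts = [{"user_id": user_id, "text": f"post {i}"} for i in range(60)]
    return {
        "_id": user_id,
        "profile": {"name": f"user{user_id}"},
        "posts": posts[:MAX_POSTS],  # keep at most MAX_POSTS posts
    }


def crawl(user_ids, processes=4):
    """Fetch users in parallel worker processes, yielding one document each."""
    with Pool(processes=processes) as pool:
        # imap_unordered hands back each result as soon as a worker finishes
        for doc in pool.imap_unordered(fetch_user, user_ids):
            yield doc


if __name__ == "__main__":
    store = {}  # assumption: in-memory stand-in for a MongoDB collection
    for doc in crawl(range(1, 9)):
        # With pymongo this could instead be:
        # collection.replace_one({"_id": doc["_id"]}, doc, upsert=True)
        store[doc["_id"]] = doc
    print(len(store), "users stored")
```

Keying each document by user ID means that re-crawling a user overwrites the earlier record instead of creating a duplicate.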