Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/C4o/ChineseDarkWebCrawler
Chinese dark web crawler
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/C4o/ChineseDarkWebCrawler
- Owner: C4o
- License: gpl-3.0
- Created: 2018-11-16T15:49:12.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2018-11-16T17:16:31.000Z (about 6 years ago)
- Last Synced: 2024-07-31T12:07:50.381Z (4 months ago)
- Language: HTML
- Size: 3.87 MB
- Stars: 265
- Watchers: 13
- Forks: 74
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- favorite-link - Chinese dark web crawler.
- awesome-hacking-lists - C4o/ChineseDarkWebCrawler - Chinese dark web crawler (HTML)
README
# Chinese Dark Web Crawler
## Requirements
```
python2.7
selenium
Tor Browser
geckodriver.exe
```
## Usage
* No unusual libraries are required; install anything that is missing with pip.
#### Page crawling
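
Before running the scripts, it can help to see which packages still need installing. The following is a minimal sketch, not part of the repository: `missing_modules` is a hypothetical helper name, and it assumes that trying to import each module is an acceptable way to check for it.

```python
import importlib


def missing_modules(module_names):
    """Return the module names that fail to import (hypothetical helper)."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            # The module is not installed; report it so it can be pip-installed.
            missing.append(name)
    return missing
```

For example, `missing_modules(["selenium"])` returns an empty list once selenium is installed, and `["selenium"]` otherwise.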
```
darkweb.py: crawls and saves pages, and saves image IDs
Example: python darkweb.py keyword pagenum
keyword must be one of: 'sex','data','service','material','virtual_source','teach','cvv','other','basic','private'
pagenum is the number of pages; any value works
```
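
The two arguments can be checked before any crawling starts. The sketch below is illustrative only and is not taken from `darkweb.py`: `parse_crawl_args` is a hypothetical helper, and rejecting page counts below 1 is an assumption, since the README only says any value works.

```python
# Keywords as listed in the README above.
ALLOWED_KEYWORDS = (
    "sex", "data", "service", "material", "virtual_source",
    "teach", "cvv", "other", "basic", "private",
)


def parse_crawl_args(argv):
    """Validate `keyword pagenum` and return them as (str, int)."""
    if len(argv) != 2:
        raise ValueError("usage: python darkweb.py keyword pagenum")
    keyword, pagenum_text = argv
    if keyword not in ALLOWED_KEYWORDS:
        raise ValueError("unknown keyword: %s" % keyword)
    pagenum = int(pagenum_text)  # raises ValueError for non-numeric input
    if pagenum < 1:
        # Assumption: a page count below 1 is treated as a mistake.
        raise ValueError("pagenum must be at least 1")
    return keyword, pagenum
```

A script would typically call it as `parse_crawl_args(sys.argv[1:])` after importing `sys`.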
#### Image crawling
```
get_darkweb_pic_auto.py: periodically crawls images using the saved image IDs
python get_darkweb_pic_auto.py
Set the time interval yourself
```
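
A timed crawl with a configurable interval can be sketched as a simple loop. This does not show how `get_darkweb_pic_auto.py` schedules its work; `run_periodically` and its parameters are hypothetical names chosen for illustration.

```python
import time


def run_periodically(task, interval_seconds, max_runs, sleep=time.sleep):
    """Call task() max_runs times, pausing interval_seconds between calls."""
    results = []
    for run_index in range(max_runs):
        results.append(task())
        if run_index < max_runs - 1:
            sleep(interval_seconds)  # no pause after the final run
    return results
```

Passing `sleep` as a parameter keeps the loop easy to test without waiting; in normal use the default `time.sleep` applies.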
#### Front-end display
```
Start nginx using nginx.conf
python manage.py runserver 127.0.0.1 port
Edit the configuration file so that it points to the matching port number
```
## Preview
![1](1.png "1")![2](2.png "2")
## PS
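If you want to use the front-end display, swap in a better-looking front end. I have not had time to improve it, so it is quite ugly; feel free to update it for me while you are at it.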