Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/KingBridgeSS/XZSpider
Last synced: 5 days ago
- Host: GitHub
- URL: https://github.com/KingBridgeSS/XZSpider
- Owner: KingBridgeSS
- Created: 2023-07-29T14:33:33.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-09-24T10:12:26.000Z (about 1 year ago)
- Last Synced: 2024-08-02T15:34:51.499Z (3 months ago)
- Language: Python
- Size: 6.84 KB
- Stars: 6
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# XZSpider
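A crawler for [先知社区](https://xz.aliyun.com/) (the Xianzhi community). It supports asynchronous downloading, updates the list of article URLs, and saves articles locally as offline Markdown files.

# Usage
1. Install dependencies: `pip3 install -r requirements.txt`
2. Update URLs
`python3 update_list.py`
After it runs, the program saves the URLs from the previous crawl to `previous_list.json` and all article URLs from the current crawl to `list.json`, then computes the set difference and saves it to `diff_list.json` for downloading.
3. Download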
```
Usage: python3 download.py -d
Options:
-d Specify the save path for downloaded files (if not specified, download to ./downloads)
-h Show this help message
```

# TODOs
- [x] Progress bar
- [x] Asynchronous download (thread pool)
- [ ] Asynchronous URL updates
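
The URL update in step 2 of the usage instructions can be illustrated with a minimal sketch. It is not the code in `update_list.py`: the helper name `update_lists`, the `workdir` parameter, and the assumption that each JSON file holds a flat array of URL strings are all hypothetical, and only the three file names come from the README.

```python
import json
from pathlib import Path


def update_lists(workdir: Path, crawled_urls: list[str]) -> list[str]:
    """Rotate the URL lists and return the URLs that are new in this crawl.

    Hypothetical helper: assumes each JSON file is a flat array of URL strings.
    """
    list_path = workdir / "list.json"
    previous_path = workdir / "previous_list.json"
    diff_path = workdir / "diff_list.json"

    # URLs recorded by the last run; an empty list on the very first run.
    if list_path.exists():
        previous = json.loads(list_path.read_text(encoding="utf-8"))
    else:
        previous = []
    previous_path.write_text(json.dumps(previous, indent=2), encoding="utf-8")

    # Record every URL found in the current crawl.
    list_path.write_text(json.dumps(crawled_urls, indent=2), encoding="utf-8")

    # Set difference (current minus previous), preserving crawl order.
    seen = set(previous)
    diff = [url for url in crawled_urls if url not in seen]
    diff_path.write_text(json.dumps(diff, indent=2), encoding="utf-8")
    return diff
```

Under these assumptions, calling `update_lists` twice with a growing list of URLs returns only the newly added URLs on the second call, which is the set a downloader would then fetch. A list comprehension over a `set` lookup is used instead of a plain `set` subtraction so that the crawl order is preserved in `diff_list.json`.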