Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/elliotxx/readnewspaper
Automatically fetches electronic editions of newspapers for convenient daily reading.
crawler lxml newspaper pypdf2 python requests
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/elliotxx/readnewspaper
- Owner: elliotxx
- License: gpl-2.0
- Created: 2018-03-20T03:50:29.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2019-11-11T07:55:35.000Z (about 5 years ago)
- Last Synced: 2024-11-06T03:47:05.650Z (3 months ago)
- Topics: crawler, lxml, newspaper, pypdf2, python, requests
- Language: Python
- Size: 17.6 KB
- Stars: 4
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
## readNewspaper
Automatically fetches electronic editions of newspapers for convenient daily reading.

## Usage
```shell
python BandaoNewspaper.py
```

## Currently supported newspapers
* 《半岛都市报》 (Bandao Metropolis Daily)
Newspaper home page: http://bddsb.bandao.cn/

## Features
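* [x] Automatically merge PDFs
* [x] Proxy IP pool
* [ ] Automatically fetch the proxy IP pool when the script runs
* [ ] Select the newspaper for a given date via a command-line argument
* [ ] Check on a daily schedule for a new issue; if there is one, generate the PDF and send an email notification

## Dependencies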
* PyPDF2
* requests
* lxml

## References
* Installing PyPDF2 on Windows and merging the PDF files in a folder into a single PDF
http://blog.csdn.net/andy_blogs/article/details/78041679
* Python SMTP: sending email with attachments
https://blog.csdn.net/zm2714/article/details/7993732
* Problems sending email attachments with python+smtp
https://segmentfault.com/q/1010000009102883
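
The email notification is still an unchecked item in this README, and the last two references cover sending attachments over SMTP. Below is a minimal sketch of how that step might look using only the Python standard library. It is not code from this repository: the function names `build_newspaper_email` and `send_email`, their parameters, and the choice of an implicit-TLS SMTP server are all assumptions made for illustration.

```python
import smtplib
from email.message import EmailMessage
from pathlib import Path


def build_newspaper_email(pdf_path, sender, recipient, issue_date):
    """Build an email carrying the merged newspaper PDF as an attachment.

    Hypothetical helper: the names and message wording are illustrative only.
    """
    pdf_path = Path(pdf_path)
    msg = EmailMessage()
    msg["Subject"] = f"Newspaper for {issue_date}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(f"The {issue_date} issue is attached as {pdf_path.name}.")
    # Attach the raw PDF bytes with the application/pdf content type.
    msg.add_attachment(
        pdf_path.read_bytes(),
        maintype="application",
        subtype="pdf",
        filename=pdf_path.name,
    )
    return msg


def send_email(msg, host, port, user, password):
    """Send the message over SMTP, assuming the server accepts implicit TLS."""
    with smtplib.SMTP_SSL(host, port) as smtp:
        smtp.login(user, password)
        smtp.send_message(msg)
```

Keeping message construction separate from sending means the attachment logic can be checked locally without contacting a mail server. The host, port, and credentials passed to `send_email` depend entirely on the mail provider in use.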