https://github.com/turboway/glidedsky
glidedsky walkthrough notes
- Host: GitHub
- URL: https://github.com/turboway/glidedsky
- Owner: TurboWay
- Created: 2020-09-07T08:53:50.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2021-01-14T02:19:19.000Z (over 4 years ago)
- Last Synced: 2025-03-26T19:45:38.533Z (about 1 month ago)
- Language: Python
- Homepage: http://www.glidedsky.com/
- Size: 668 KB
- Stars: 28
- Watchers: 3
- Forks: 12
- Open Issues: 0
Metadata Files:
- Readme: README.md
# glidedsky
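glidedsky walkthrough notes

[GlidedSky (镀金的天空)](http://www.glidedsky.com/) is an internet skills certification site. Its goal is that solving a challenge means you actually have the skills to solve similar problems.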

## note
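- Crawling is I/O-bound, so multithreaded concurrency improves throughput. The best concurrency level depends on the hardware of the machine running the crawler, though, and more threads are not always better.
- Network requests sometimes fail, so retries are necessary. Without a framework, a decorator is a good way to implement them.
- With proxy IPs, network errors are very likely to cause missed pages, and retrying alone is not enough. Saving the results first and being prepared to re-crawl the missing pages is the safer strategy.
- Image recognition never reaches a 100% success rate, so crawling several times is necessary. Taking the most frequent result for each number is a good approach.

## list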
| Code | Description |
| ------------ | ------------ |
| [crawler-basic-1.py](https://github.com/TurboWay/glidedsky/blob/master/crawler-basic-1.py) | Crawler - Basic 1 |
| [crawler-basic-2.py](https://github.com/TurboWay/glidedsky/blob/master/crawler-basic-2.py) | Crawler - Basic 2 |
| [crawler-captcha-1.py](https://github.com/TurboWay/glidedsky/blob/master/crawler-captcha-1.py) | Crawler - Captcha 1 |
| [crawler-captcha-2.py](https://github.com/TurboWay/glidedsky/blob/master/crawler-captcha-2.py) | Crawler - Captcha 2 (site service error; the challenge could not be viewed for the time being) |
| [crawler-css-puzzle-1.py](https://github.com/TurboWay/glidedsky/blob/master/crawler-css-puzzle-1.py) | Crawler - CSS anti-scraping |
| [crawler-font-puzzle-1.py](https://github.com/TurboWay/glidedsky/blob/master/crawler-font-puzzle-1.py) | Crawler - Font anti-scraping 1 |
| [crawler-font-puzzle-2.py](https://github.com/TurboWay/glidedsky/blob/master/crawler-font-puzzle-2.py) | Crawler - Font anti-scraping 2 |
| [crawler-ip-block-1.py](https://github.com/TurboWay/glidedsky/blob/master/crawler-ip-block-1.py) | Crawler - IP blocking 1 |
| [crawler-ip-block-2.py](https://github.com/TurboWay/glidedsky/blob/master/crawler-ip-block-2.py) | Crawler - IP blocking 2 |
| [crawler-javascript-obfuscation-1.py](https://github.com/TurboWay/glidedsky/blob/master/crawler-javascript-obfuscation-1.py) | Crawler - JS encryption 1 |
| [crawler-sprite-image-1.py](https://github.com/TurboWay/glidedsky/blob/master/crawler-sprite-image-1.py) | Crawler - Sprite image 1 |
| [crawler-sprite-image-2.py](https://github.com/TurboWay/glidedsky/blob/master/crawler-sprite-image-2.py) | Crawler - Sprite image 2 |

## refer
>Slider captcha: see https://github.com/ybsdegit/captcha_qq
>
>Image recognition model training: https://github.com/TurboWay/antman
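
The advice in the note section (retry with a decorator, run requests in a thread pool, record failures for a later re-crawl, and take the most frequent recognition result) can be illustrated with a minimal sketch. The names `retry`, `crawl_all`, and `majority_vote` are hypothetical and are not taken from the repository's scripts, and `fake_fetch` is a stand-in for a real HTTP request, so treat this as one possible shape under those assumptions rather than the author's implementation.

```python
import functools
import time
from collections import Counter
from concurrent.futures import ThreadPoolExecutor, as_completed


def retry(times=3, delay=0.0, exceptions=(Exception,)):
    """Decorator (hypothetical helper): try the call up to `times` times."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, times + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == times:
                        raise  # out of attempts: let the caller record the failure
                    time.sleep(delay)
        return wrapper
    return decorator


def crawl_all(fetch, pages, max_workers=8):
    """Run `fetch(page)` in a thread pool and return (results, failed_pages)."""
    results, failed = {}, []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch, page): page for page in pages}
        for future in as_completed(futures):
            page = futures[future]
            try:
                results[page] = future.result()
            except Exception:
                failed.append(page)  # keep the page for a later re-crawl pass
    return results, failed


def majority_vote(readings):
    """Return the most frequent value among repeated recognition results."""
    return Counter(readings).most_common(1)[0][0]


if __name__ == "__main__":
    calls = Counter()

    @retry(times=3)
    def fake_fetch(page):
        # Stand-in for a real request: page 2 fails once, page 3 always fails.
        calls[page] += 1
        if page == 3 or (page == 2 and calls[page] == 1):
            raise ConnectionError(f"page {page} failed")
        return page * 10

    results, failed = crawl_all(fake_fetch, [1, 2, 3], max_workers=2)
    print("results:", results)
    print("to re-crawl:", failed)
    print("voted digit:", majority_vote([7, 7, 1, 7, 9]))
```

In a real run, `results` would be written to disk or a database before the re-crawl pass, and `max_workers` would be tuned to the machine rather than set as high as possible.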