https://github.com/rogary/zhihutopic
- Host: GitHub
- URL: https://github.com/rogary/zhihutopic
- Owner: Rogary
- Created: 2015-11-25T13:58:51.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2018-12-02T07:32:04.000Z (over 7 years ago)
- Last Synced: 2025-03-12T21:32:03.409Z (over 1 year ago)
- Language: Python
- Size: 8.79 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
README
# ZhihuTopic
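> This was my first time tinkering with Python, and it took me several days,
> but I finally managed to crawl Zhihu's topics.
> The total came out far short of the 200,000-plus topics that more experienced people have reported, which bothers me,
> but I went back over the logic and it should be correct.
> Here is the approach.
## Approach
> Traverse Zhihu's topic tree, starting from the [root topic](http://www.zhihu.com/topic/19776749/organize/entire).
> A single thread was too slow, so I switched to multiple threads, starting from the 6 first-level topics.
> If a parent node and its child node are the same, skip it.
The crawl produced 40,000+ topics in total. Corrections and criticism are welcome.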
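
The traversal described above can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: `crawl_topic_tree` and `fetch_children` are hypothetical names, the sample tree is made up, and a real crawler would replace `fetch_children` with a request to Zhihu plus page parsing. The shared `seen` set is also an assumption, added so that a topic reachable from two parents is only visited once.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def crawl_topic_tree(root_id, fetch_children, max_workers=6):
    """Collect every topic id reachable from root_id.

    fetch_children(topic_id) is a hypothetical callable that returns the
    child topic ids of a topic.
    """
    seen = {root_id}
    lock = threading.Lock()

    def walk(start_id):
        # Depth-first walk of one first-level subtree.
        stack = [start_id]
        while stack:
            parent = stack.pop()
            for child in fetch_children(parent):
                if child == parent:
                    continue  # parent and child are the same node: skip
                with lock:
                    if child in seen:
                        continue  # already collected by this or another thread
                    seen.add(child)
                stack.append(child)

    # De-duplicate the first-level topics while keeping their order.
    first_level = list(dict.fromkeys(
        c for c in fetch_children(root_id) if c != root_id
    ))
    seen.update(first_level)

    # One task per first-level topic, run on a small thread pool.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        list(pool.map(walk, first_level))  # list() surfaces worker exceptions
    return seen


# Made-up in-memory tree standing in for Zhihu's topic pages.
SAMPLE_TREE = {
    "root": ["a", "b"],
    "a": ["a", "c"],  # "a" lists itself as a child
    "b": ["c", "d"],  # "c" is shared with "a"
    "c": [],
    "d": [],
}

topics = crawl_topic_tree("root", lambda t: SAMPLE_TREE.get(t, []))
print(len(topics))  # 5
```

Checking and updating `seen` under one lock keeps two threads from both claiming the same topic. If the real crawl deduplicates differently, the total count could differ as well.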