{"id":23051132,"url":"https://github.com/duguce/jdspider","last_synced_at":"2025-08-15T03:31:53.737Z","repository":{"id":218365293,"uuid":"745929950","full_name":"Duguce/JdSpider","owner":"Duguce","description":"🍀 A simple crawler for JD.com review and Q\u0026A content","archived":false,"fork":false,"pushed_at":"2024-04-03T02:25:17.000Z","size":7607,"stargazers_count":5,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-04-03T03:31:11.062Z","etag":null,"topics":["comments","jdspider","jingdong","qa","spider"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Duguce.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2024-01-20T15:26:15.000Z","updated_at":"2024-03-26T17:00:11.000Z","dependencies_parsed_at":"2024-03-11T16:03:22.528Z","dependency_job_id":"e0b99c2e-6cd3-4091-afbc-b9f7ba59a3e9","html_url":"https://github.com/Duguce/JdSpider","commit_stats":null,"previous_names":["duguce/jdspider"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Duguce%2FJdSpider","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Duguce%2FJdSpider/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Duguce%2FJdSpider/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Duguce%2FJdSpider/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Duguce","download_url":"https://codeload.github.com/Duguce/JdSpider/tar.gz/refs/heads
/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":229890093,"owners_count":18140042,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["comments","jdspider","jingdong","qa","spider"],"created_at":"2024-12-15T23:44:33.233Z","updated_at":"2024-12-15T23:44:34.015Z","avatar_url":"https://github.com/Duguce.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# JdSpider\n\nA crawler for JD.com review and Q\u0026A data: given a product ID, it scrapes the product's review content from the JD platform (currently only the first 100 pages of reviews for a single product) and the question data from the product Q\u0026A module.\n\n## Project Structure\n\n```\n│  com_spider.py               # JD review crawler module\n│  config.py                   # Configuration file\n│  main.py                     # Main entry point (batch-crawls reviews and Q\u0026A content)\n│  qa_spider.py                # Q\u0026A crawler module\n│  README.md                   # Project documentation\n│  requirements.txt            # Dependency list\n│  search_spider.py            # Search crawler module (batch-collects product IDs by keyword)\n│\n├─drivers                      # Driver directory\n│      chromedriver.exe        # Chrome browser driver\n│      geckodriver.exe         # Firefox browser driver\n│\n├─ids_collection               # Product ID storage directory\n├─output                       # Storage directory for crawled reviews and Q\u0026A content\n\n```\n\n## Quick Start\n\n1. Prepare the environment\n\n   - `conda create -n jd_spider python=3.8.18`\n\n   - `conda activate jd_spider`\n\n   - `pip install -r requirements.txt`\n\n2. Configure\n\n   - Rename [example_config.py](./example_config.py) to `config.py`\n   - Fill in the relevant settings in `config.py`\n\n3. 
Run\n\n   - Run [search_spider.py](./search_spider.py) to collect product IDs\n   - Run [main.py](./main.py) to collect product reviews and Q\u0026A content\n\n## To-Do\n\n- [x] Batch-scrape product IDs from the JD home page by keyword;\n- [x] Batch-scraping logic for product reviews and Q\u0026A data;\n- [x] Resumable crawling (resume from the last checkpoint) in the `main.py` module;\n- [ ] Automate the slider CAPTCHA at login in the `search_spider.py` module.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fduguce%2Fjdspider","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fduguce%2Fjdspider","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fduguce%2Fjdspider/lists"}