{"id":21078375,"url":"https://github.com/lki/wescraper","last_synced_at":"2026-02-27T22:41:59.408Z","repository":{"id":83188525,"uuid":"59003776","full_name":"LKI/wescraper","owner":"LKI","description":"Searches WeChat official account articles using Scrapy and Sogou search","archived":false,"fork":false,"pushed_at":"2017-03-25T04:37:47.000Z","size":44,"stargazers_count":46,"open_issues_count":1,"forks_count":27,"subscribers_count":10,"default_branch":"gh-pages","last_synced_at":"2025-09-09T06:39:41.992Z","etag":null,"topics":["scrapy","sogou","wechat"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LKI.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-05-17T08:21:04.000Z","updated_at":"2025-07-17T09:15:02.000Z","dependencies_parsed_at":null,"dependency_job_id":"ad52fd9f-7a9d-47be-9996-0411b6fd7688","html_url":"https://github.com/LKI/wescraper","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/LKI/wescraper","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LKI%2Fwescraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LKI%2Fwescraper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LKI%2Fwescraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LKI%2Fwescraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/LKI","download_url":"https://codeload.github.com/LKI/wescraper/tar.gz/refs/heads/gh-pages","sbom_url":"
https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LKI%2Fwescraper/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29917933,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-27T19:37:42.220Z","status":"ssl_error","status_checked_at":"2026-02-27T19:37:41.463Z","response_time":57,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["scrapy","sogou","wechat"],"created_at":"2024-11-19T19:40:16.834Z","updated_at":"2026-02-27T22:41:59.392Z","avatar_url":"https://github.com/LKI.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# WeScraper (WEchat SCRAPER)\n\nThis tool uses Python 2.7 and [scrapy][scrapy] to search for WeChat official account articles.\n\n# Usage\n\n## Querying from the command line\n\nInstall Scrapy, then query directly.\n\n```\npip install scrapy\npython wescraper/scraper.py account liriansu miawu \u003e we.json # look up official accounts matching liriansu and miawu\npython wescraper/scraper.py key-day liriansu miawu \u003e we.json # look up articles matching liriansu and miawu (within the past day)\n```\n\n## Querying through the web server\n\nInstall Scrapy and Tornado, then query through a local server:\n\n```\npip install scrapy tornado\npython wescraper/server.py\n```\n\nOnce the server is running, you can fetch a list of official account articles from `http://localhost/account/foo/bar/baz...`.\n\nAlternatively, search official account articles by keyword through `http://localhost/key-year/foo/bar/baz...`.\n\n## Calling from Python code\n\nSee the [scraper.py][scraper-py] source.\n\n# Details\n\n* Some configurable parameters are in [config.py][config-py].\n\n* An account query takes the first entry in the result list by default.\n\n* This tool may get banned; for workarounds see [Scrapy: Avoiding getting banned][anti]\n(in general, switching IP addresses solves the problem).\n\n* [cookie.py][cookie-py] maintains a cookie pool: each request uses a cookie picked at random from the n cookies, and a banned cookie is swapped for another.\n\n* You are welcome to build on this code; remember to run the unit tests: `python wescraper/test/test.py`\n\n* This tool relies entirely on [Sogou WeChat Search][sogou] to fetch articles, so scraping may fail if the Sogou WeChat Search interface changes.\n\n* [Python rocks!][dive-into-python] :wink:\n\n# Copyright / Disclaimer\n\nCopyright of the code belongs to the original GitHub author, @LKI.\nCommercial use is strictly prohibited; reposting and forking are otherwise unrestricted.\n\n[scrapy]: https://github.com/scrapy/scrapy\n[scraper-py]: /wescraper/scraper.py\n[config-py]: /wescraper/config.py\n[anti]: http://doc.scrapy.org/en/latest/topics/practices.html#avoiding-getting-banned\n[cookie-py]: /wescraper/cookie.py\n[sogou]:  http://weixin.sogou.com/\n[dive-into-python]: http://www.diveintopython.net/\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flki%2Fwescraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flki%2Fwescraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flki%2Fwescraper/lists"}