{"id":19084373,"url":"https://github.com/coghost/crawlers","last_synced_at":"2025-07-10T19:03:06.936Z","repository":{"id":73322221,"uuid":"108359546","full_name":"coghost/crawlers","owner":"coghost","description":"crawlers in one","archived":false,"fork":false,"pushed_at":"2020-04-02T07:09:24.000Z","size":272,"stargazers_count":1,"open_issues_count":0,"forks_count":3,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-02-22T06:41:23.471Z","etag":null,"topics":["crawler","python3","staticimg","weibo"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/coghost.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-10-26T03:59:18.000Z","updated_at":"2020-04-02T07:09:27.000Z","dependencies_parsed_at":null,"dependency_job_id":"ef9eaa1f-5651-4a48-95ef-0f9441c8326a","html_url":"https://github.com/coghost/crawlers","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/coghost/crawlers","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coghost%2Fcrawlers","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coghost%2Fcrawlers/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coghost%2Fcrawlers/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coghost%2Fcrawlers/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/coghost","download_url":"https://codeload.github.com/co
ghost/crawlers/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coghost%2Fcrawlers/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":264637925,"owners_count":23642077,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crawler","python3","staticimg","weibo"],"created_at":"2024-11-09T02:51:05.732Z","updated_at":"2025-07-10T19:03:06.911Z","avatar_url":"https://github.com/coghost.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\ntitle: python crawlers\n---\n\n\n\n# crawlers\n\n\u003e crawlers in one `python 3.6`\n\n\n\n## Thanks to\n\n- [chenjiandongx](https://github.com/chenjiandongx/awesome-spider)\n- [Crawler attack and defense](https://www.zhuyingda.com/blog/article.html?id=17\u0026origin=segment)\n- [fuck-login](https://github.com/xchaoinfo/fuck-login)\n\n\n## DONE\n\n- [x] Static image download\n  - [44style](http://44.style/)\n  - [mmjpg](http://www.mmjpg.com)\n  - ...\n- [x] Google crx extension crawling\n  - [chromecj](http://chromecj.com/)\n  - [cnplugins](http://www.cnplugins.com)\n- [x] luoo site music\n- [x] one reading\n- [x] [sdifen weekly](http://www.sdifen.com/)\n- [x] [Jobbole Python resources](http://hao.jobbole.com/?catid=144)\n- [x] Movie search\n  - [x] [电影天堂 (Movie Heaven)](http://www.dytt8.net/)\n  - [x] [66ys](http://66ys.cc/)\n- [x] Dongao accounting question bank\n- [x] Proxies\n\n\n\n\n## docker machines\n\n### mongo\n\n```sh\ndocker run --name luoo_mg \\\n  -v \u003cYOU_BASE_DIR\u003e/Luoo/db/data:/data/db \\\n  -p \u003cYOU_PORT\u003e:27017 \\\n  -d mongo:latest --smallfiles\n```\n\n### redis\n\n\u003e Note: create the data directory and the redis.conf file before starting the container.\n\n- 
docker\n\n  ```sh\n  docker run \\\n    --name=crawl_redis \\\n    -tid \\\n    -p \u003cYOU_PORT\u003e:6379 \\\n    -v \u003cYOU_BASE_DIR\u003e/Luoo/redis/data:/data \\\n    -v \u003cYOU_BASE_DIR\u003e/Luoo/redis/redis.conf:/usr/local/etc/redis/redis.conf \\\n    redis redis-server /usr/local/etc/redis/redis.conf\n  ```\n\n\n\n- `redis.conf`\n\n  ```sh\n  port 6379\n  timeout 300\n  loglevel verbose\n  save 900 1\n  save 300 10\n  save 60 10000\n  rdbcompression yes\n  appendonly yes\n  appendfsync everysec\n  requirepass 123456\n  ```\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcoghost%2Fcrawlers","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcoghost%2Fcrawlers","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcoghost%2Fcrawlers/lists"}