{"id":13815276,"url":"https://github.com/pjialin/pyproxy-async","last_synced_at":"2025-03-17T15:12:57.712Z","repository":{"id":36692904,"uuid":"195766006","full_name":"pjialin/pyproxy-async","owner":"pjialin","description":"A proxy pool built on Python Asyncio + Redis","archived":false,"fork":false,"pushed_at":"2024-04-18T15:36:38.000Z","size":68,"stargazers_count":164,"open_issues_count":4,"forks_count":38,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-02T13:08:40.152Z","etag":null,"topics":["async","proxy","proxy-pool","redis"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pjialin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-07-08T08:09:38.000Z","updated_at":"2025-02-14T02:42:57.000Z","dependencies_parsed_at":"2024-04-18T16:54:21.282Z","dependency_job_id":"223f31d9-7dc9-4aa8-a969-53b8995e6397","html_url":"https://github.com/pjialin/pyproxy-async","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pjialin%2Fpyproxy-async","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pjialin%2Fpyproxy-async/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pjialin%2Fpyproxy-async/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pjialin%2Fpyproxy-async/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pjialin","downloa
d_url":"https://codeload.github.com/pjialin/pyproxy-async/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244056425,"owners_count":20390719,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["async","proxy","proxy-pool","redis"],"created_at":"2024-08-04T04:03:14.676Z","updated_at":"2025-03-17T15:12:57.694Z","avatar_url":"https://github.com/pjialin.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# PyProxy-Async\nA simple proxy pool maintenance and proxy IP scraping service built on Python coroutines + Redis\n\n## Flowchart\n![Flowchart](https://doc.pjialin.com/stuff/B6LkuGaXEMtItuPtT69CFQ.png)\n\n## Features\n* HTTP/HTTPS checking\n* Coroutine-based implementation\n* Web API support\n\n## TODO\n- [x] Docker Support\n- [x] Custom check rule Support \n- [x] TSDB Support (Prometheus has been supported)\n- [ ] More api Support\n\n## Usage\n**Requires Python 3.6+**\n1. Clone the repository\n```\ngit clone https://github.com/pjialin/pyproxy-async\ncd pyproxy-async\n```\n2. Install dependencies\n```\npip install -r requirements.txt \n```\n\n3. Fill in the configuration file\n```\ncp config.toml.example config.toml\n```\n\n4. Start the service\n```\npython main.py\n```\n### Using Docker\n1. Pull the image\n```\ndocker pull pjialin/pyproxy-async:latest\n```\n\n2. Download the configuration file\n```\ncurl -o config.toml https://raw.githubusercontent.com/pjialin/pyproxy-async/master/config.toml.example\n```\n\n3. 
Start the container\n```\ndocker run -d -v $(pwd)/config.toml:/code/config.toml -v pyproxy-data:/code/data --name pyproxy pjialin/pyproxy-async:latest\n```\n\n## Web Api\nOnce the service is running, visit `127.0.0.1:8080/get_ip` (the port is set in the configuration file) to get a random IP, for example\n```\n# curl http://127.0.0.1:8080/get_ip  \n{\"ip\":\"213.6.45.18\",\"port\":\"39252\",\"http\":\"http://213.6.45.18:39252\"}\n\n# The https and rule filter parameters are supported, for example\ncurl \"http://127.0.0.1:8080/get_ip?https=1\u0026rule=google\"\n```\n\n### Loading an existing IP list from a file or URL\n**From a file**\n1. Name the file `*.ip.txt`, for example `new.ip.txt`, and place it in the project root. Each line has the form `host:port`, for example \n```\n127.0.0.1:80\n127.0.0.1:8080\n```\n\n2. Load it into the IP pool\n```\npython load.py [file_name]  # loads all *.ip.txt files by default\n```\n**From a URL**\n```\npython load.py url # e.g. python load.py https://ser.com/ip  any text is supported; the program matches addresses with a regular expression\n```\n\n### Adding a scraping service\nAdding a new IP scraping service is simple: define the pages to scrape and the corresponding parser, and the framework loads it automatically and runs the scrape.\nAn example file, [site.py.example](https://github.com/pjialin/pyproxy-async/blob/master/src/sites/site.py.example), is provided in the `src/sites` directory for reference \n\n### Cluster support\nMultiple instances can currently share a single Redis address to check and scrape IPs as a cluster\n\n## License\n[Apache License 2.0](https://github.com/pjialin/pyproxy-async/blob/master/LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpjialin%2Fpyproxy-async","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpjialin%2Fpyproxy-async","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpjialin%2Fpyproxy-async/lists"}