{"id":27111135,"url":"https://github.com/cocoakekeyu/autoproxy","last_synced_at":"2025-10-25T23:43:52.441Z","repository":{"id":69338727,"uuid":"68268712","full_name":"cocoakekeyu/autoproxy","owner":"cocoakekeyu","description":"An automatic proxy middleware for Scrapy spiders","archived":false,"fork":false,"pushed_at":"2017-07-16T04:26:39.000Z","size":11,"stargazers_count":148,"open_issues_count":3,"forks_count":30,"subscribers_count":12,"default_branch":"master","last_synced_at":"2025-04-07T00:55:54.970Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cocoakekeyu.png","metadata":{"files":{"readme":"README.MD","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2016-09-15T05:24:16.000Z","updated_at":"2024-02-16T16:51:54.000Z","dependencies_parsed_at":"2023-04-24T05:08:14.650Z","dependency_job_id":null,"html_url":"https://github.com/cocoakekeyu/autoproxy","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/cocoakekeyu/autoproxy","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cocoakekeyu%2Fautoproxy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cocoakekeyu%2Fautoproxy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cocoakekeyu%2Fautoproxy/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cocoakekeyu%2Fautoproxy/manifests","owner_url":"https://repos
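.ecosyste.ms/api/v1/hosts/GitHub/owners/cocoakekeyu","download_url":"https://codeload.github.com/cocoakekeyu/autoproxy/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cocoakekeyu%2Fautoproxy/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273727786,"owners_count":25157133,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-05T02:00:09.113Z","response_time":402,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-04-07T00:56:45.881Z","updated_at":"2025-10-25T23:43:52.434Z","avatar_url":"https://github.com/cocoakekeyu.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# AutoProxyMiddleware\n\n## Introduction\nAn automatic proxy middleware for Scrapy spiders. It fetches and switches proxies automatically, and the fetching and switching rules can be customized.\n\n## Usage\nPlace the middleware module in your project and add the middleware to the project settings file. For example:\n```python\nDOWNLOADER_MIDDLEWARES = {\n    'projectname.autoproxy.AutoProxyMiddleware': 543,\n}\n```\n\n## Configuration\nThe proxy middleware can be configured with the `AUTO_PROXY` setting in the project settings file. For example:\n```python\nAUTO_PROXY = {\n\t'test_urls':[('http://upaiyun.com','online'),('http://huaban.com', '33010602001878')],\n\t'ban_code':[500,502,503,504],\n}\n```\n**All available options**\n- `'enable'`: A boolean indicating whether the middleware is enabled. Defaults to `True`.\n- `'test_urls'`: A list of 2-tuples, each a URL plus a signature string (a specific value that can be found in the returned page content), used to test proxy connections. Defaults to `[('http://www.w3school.com.cn', '06004630'), ]`.\n- `'test_proxy_timeout'`: An integer greater than 0; the connection timeout used when testing a proxy. Defaults to `5`.\n- `'download_timeout'`: An integer greater than 0; same as Scrapy's `download_timeout`, and set when this middleware is enabled. Defaults to `60`.\n- `'test_threadnums'`: An integer greater than 0; the number of threads started to test proxies. Defaults to `20`.\n- `'ban_code'`: A list of HTTP status codes indicating that the proxy has been banned. When a response status code is in this list, the proxy is switched automatically. Defaults to `[503,]`.\n- `'ban_re'`: A regular expression string. If the page returned when a proxy is banned contains content matching the regex, the proxy is switched; if empty, this check is disabled. Defaults to `r''`.\n- `'proxy_least'`: An integer greater than 0. If the number of usable proxies in the pool falls below it, new proxies are fetched automatically. Defaults to `3`.\n- `'init_valid_proxys'`: An integer greater than 0; the number of usable proxies to wait for when the spider initializes. A large value slows initialization; saved proxies can also be tested while the spider is running. Defaults to `1`.\n- `'invalid_limit'`: An integer greater than 0. Each proxy is counted every time it successfully downloads a page. If a proxy suddenly cannot connect or is rejected by the site, it is normally invalidated; but if the number of pages crawled through it exceeds this value, it is not invalidated for the time being: the middleware switches to another proxy and reduces that proxy's page count. Defaults to `200`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcocoakekeyu%2Fautoproxy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcocoakekeyu%2Fautoproxy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcocoakekeyu%2Fautoproxy/lists"}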