{"id":13585950,"url":"https://github.com/Jiramew/spoon","last_synced_at":"2025-04-07T14:32:52.436Z","repository":{"id":45443472,"uuid":"97892852","full_name":"Jiramew/spoon","owner":"Jiramew","description":"🥄 A package for building  specific Proxy Pool for different Sites.","archived":false,"fork":false,"pushed_at":"2023-05-22T21:33:39.000Z","size":87,"stargazers_count":174,"open_issues_count":6,"forks_count":23,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-03-13T02:47:01.138Z","etag":null,"topics":["crawler","distributed","ip","proxies","proxy","proxy-provider","proxypool","python","redis","spider","spoon"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Jiramew.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-07-21T01:30:26.000Z","updated_at":"2024-09-15T05:44:52.000Z","dependencies_parsed_at":"2024-11-06T04:32:09.860Z","dependency_job_id":"1b39aff9-6614-4591-8ed0-f85b9b5cb6fb","html_url":"https://github.com/Jiramew/spoon","commit_stats":{"total_commits":34,"total_committers":3,"mean_commits":"11.333333333333334","dds":0.08823529411764708,"last_synced_commit":"61251ce7524a4b2d0d222e7c76b9933ee2ade3c7"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Jiramew%2Fspoon","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Jiramew%2Fspoon/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Jir
amew%2Fspoon/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Jiramew%2Fspoon/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Jiramew","download_url":"https://codeload.github.com/Jiramew/spoon/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247670232,"owners_count":20976532,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crawler","distributed","ip","proxies","proxy","proxy-provider","proxypool","python","redis","spider","spoon"],"created_at":"2024-08-01T15:05:14.481Z","updated_at":"2025-04-07T14:32:52.116Z","avatar_url":"https://github.com/Jiramew.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Spoon - A package for building specific Proxy Pool for different Sites.\nSpoon is a library for building a distributed proxy pool for each site you specify.      \nIt runs on Python 3 only.\n\n## Install\nSimply run `pip install spoonproxy`, or clone the repo and add it to your PYTHONPATH.\n    \n## Run\n\n### Spoon-server\nMake sure Redis is running. The default configuration is \"host:localhost, port:6379\"; you can also change the Redis connection settings.      
\nAs in `example.py` under `spoon_server/example`, you can assign several different proxy providers.\n```python\nfrom spoon_server.proxy.fetcher import Fetcher\nfrom spoon_server.main.proxy_pipe import ProxyPipe\nfrom spoon_server.proxy.kuai_provider import KuaiProvider\nfrom spoon_server.proxy.xici_provider import XiciProvider\nfrom spoon_server.database.redis_config import RedisConfig\nfrom spoon_server.main.checker import CheckerBaidu\n\ndef main_run():\n    redis = RedisConfig(\"127.0.0.1\", 21009)\n    p1 = ProxyPipe(url_prefix=\"https://www.baidu.com\",\n                   fetcher=Fetcher(use_default=False),\n                   database=redis,\n                   checker=CheckerBaidu()).set_fetcher([KuaiProvider()]).add_fetcher([XiciProvider()])\n    p1.start()\n\n\nif __name__ == '__main__':\n    main_run()\n```\n\nWith a custom checker, you can validate the response more precisely.\n```python\nimport re\n\n\nclass CheckerBaidu(Checker):\n    def checker_func(self, html=None):\n        # Decode raw bytes so the regex can match text\n        if isinstance(html, bytes):\n            html = html.decode('utf-8')\n        # The proxy passes only if the Baidu homepage slogan appears in the page\n        if re.search(r\".*百度一下，你就知道.*\", html):\n            return True\n        else:\n            return False\n```\n\nAs shown in `spoon_server/example/example_multi.py`, you can use multiprocessing to run several queues that fetch and validate proxies.       \nYou can also assign different providers to different URLs.      \nThe default proxy providers are listed below; you can also write your own.             
\n\u003ctable class=\"table table-bordered table-striped\"\u003e\n    \u003cthead\u003e\n    \u003ctr\u003e\n        \u003cth style=\"width: 100px;\"\u003ename\u003c/th\u003e\n        \u003cth style=\"width: 100px;\"\u003edescription\u003c/th\u003e\n    \u003c/tr\u003e\n    \u003c/thead\u003e\n    \u003ctbody\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eWebProvider\u003c/td\u003e\n          \u003ctd\u003eGet proxy from http api\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eFileProvider\u003c/td\u003e\n          \u003ctd\u003eGet proxy from file\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eGouProvider\u003c/td\u003e\n          \u003ctd\u003ehttp://www.goubanjia.com\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eKuaiProvider\u003c/td\u003e\n          \u003ctd\u003ehttp://www.kuaidaili.com\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eSixProvider\u003c/td\u003e\n          \u003ctd\u003ehttp://m.66ip.cn\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eUsProvider\u003c/td\u003e\n          \u003ctd\u003ehttps://www.us-proxy.org\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eWuyouProvider\u003c/td\u003e\n          \u003ctd\u003ehttp://www.data5u.com\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eXiciProvider\u003c/td\u003e\n          \u003ctd\u003ehttp://www.xicidaili.com\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eIP181Provider\u003c/td\u003e\n          \u003ctd\u003ehttp://www.ip181.com\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eXunProvider\u003c/td\u003e\n          \u003ctd\u003ehttp://www.xdaili.cn\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          
\u003ctd\u003ePlpProvider\u003c/td\u003e\n          \u003ctd\u003ehttps://list.proxylistplus.com\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eIP3366Provider\u003c/td\u003e\n          \u003ctd\u003ehttp://www.ip3366.net\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eBusyProvider\u003c/td\u003e\n          \u003ctd\u003ehttps://proxy.coderbusy.com\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eNianProvider\u003c/td\u003e\n          \u003ctd\u003ehttp://www.nianshao.me\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003ePdbProvider\u003c/td\u003e\n          \u003ctd\u003ehttp://proxydb.net\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eZdayeProvider\u003c/td\u003e\n          \u003ctd\u003ehttp://ip.zdaye.com\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eYaoProvider\u003c/td\u003e\n          \u003ctd\u003ehttp://www.httpsdaili.com/\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eFeilongProvider\u003c/td\u003e\n          \u003ctd\u003ehttp://www.feilongip.com/\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eIP31Provider\u003c/td\u003e\n          \u003ctd\u003ehttps://31f.cn/http-proxy/\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eXiaohexiaProvider\u003c/td\u003e\n          \u003ctd\u003ehttp://www.xiaohexia.cn/\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eCoolProvider\u003c/td\u003e\n          \u003ctd\u003ehttps://www.cool-proxy.net/\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eNNtimeProvider\u003c/td\u003e\n          \u003ctd\u003ehttp://nntime.com/\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n 
         \u003ctd\u003eListendeProvider\u003c/td\u003e\n          \u003ctd\u003ehttps://www.proxy-listen.de/\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eIhuanProvider\u003c/td\u003e\n          \u003ctd\u003ehttps://ip.ihuan.me/\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eIphaiProvider\u003c/td\u003e\n          \u003ctd\u003ehttp://www.iphai.com/\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eMimvpProvider(@NeedCaptcha)\u003c/td\u003e\n          \u003ctd\u003ehttps://proxy.mimvp.com/\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eGPProvider(@NeedProxy if you're in China)\u003c/td\u003e\n          \u003ctd\u003ehttp://www.gatherproxy.com\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eFPLProvider(@NeedProxy if you're in China)\u003c/td\u003e\n          \u003ctd\u003ehttps://free-proxy-list.net\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eSSLProvider(@NeedProxy if you're in China)\u003c/td\u003e\n          \u003ctd\u003ehttps://www.sslproxies.org\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eNordProvider(@NeedProxy if you're in China)\u003c/td\u003e\n          \u003ctd\u003ehttps://nordvpn.com\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003ePremProvider(@NeedProxy if you're in China)\u003c/td\u003e\n          \u003ctd\u003ehttps://premproxy.com\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003eYouProvider(@Deprecated)\u003c/td\u003e\n          \u003ctd\u003ehttp://www.youdaili.net\u003c/td\u003e\n        \u003c/tr\u003e\n    \u003c/tbody\u003e\n\u003c/table\u003e\n\n### Spoon-web\nA simple Django web API demo. You can use any web server and write your own API.           
\nSimply run `python manager.py runserver **.**.**.**:*****`      \nThe demo API endpoints are:\n\u003ctable class=\"table table-bordered table-striped\"\u003e\n    \u003cthead\u003e\n    \u003ctr\u003e\n        \u003cth style=\"width: 100px;\"\u003ename\u003c/th\u003e\n        \u003cth style=\"width: 100px;\"\u003edescription\u003c/th\u003e\n    \u003c/tr\u003e\n    \u003c/thead\u003e\n    \u003ctbody\u003e\n        \u003ctr\u003e\n          \u003ctd\u003ehttp://127.0.0.1:21010/api/v1/get_keys\u003c/td\u003e\n          \u003ctd\u003eGet all keys from Redis\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003ehttp://127.0.0.1:21010/api/v1/fetchone_from?target=www.google.com\u0026filter=65\u003c/td\u003e\n          \u003ctd\u003eGet one usable proxy. \u003cbr\u003etarget: the specific url\u003cbr\u003e filter: number of successful revalidations\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003ehttp://127.0.0.1:21010/api/v1/fetchall_from?target=www.google.com\u0026filter=65\u003c/td\u003e\n          \u003ctd\u003eGet all usable proxies.\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003ehttp://127.0.0.1:21010/api/v1/fetch_hundred_recent?target=www.baidu.com\u0026filter=5\u003c/td\u003e\n          \u003ctd\u003eGet recently added proxies with a full score. \u003cbr\u003etarget: the specific url\u003cbr\u003e filter: time in seconds\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003ehttp://127.0.0.1:21010/api/v1/fetch_stale?num=100\u003c/td\u003e\n          \u003ctd\u003eGet recent proxies without checking them. \u003cbr\u003enum: the specific number of proxies you want\u003c/td\u003e\n        \u003c/tr\u003e\n        \u003ctr\u003e\n          \u003ctd\u003ehttp://127.0.0.1:21010/api/v1/fetch_recent?target=www.baidu.com\u003c/td\u003e\n          \u003ctd\u003eGet recent proxies that were successfully validated. 
\u003cbr\u003etarget: the specific url\u003c/td\u003e\n        \u003c/tr\u003e\n    \u003c/tbody\u003e\n\u003c/table\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FJiramew%2Fspoon","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FJiramew%2Fspoon","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FJiramew%2Fspoon/lists"}