# Integrating Oxylabs' Residential Proxies with AIOHTTP

[![Oxylabs promo code](https://raw.githubusercontent.com/oxylabs/product-integrations/refs/heads/master/Affiliate-Universal-1090x275.png)](https://oxylabs.go2cloud.org/aff_c?offer_id=7&aff_id=877&url_id=112)

[![](https://dcbadge.vercel.app/api/server/eWsVUJrnG5)](https://discord.gg/GbxmdGhZjq)

[<img src="https://img.shields.io/static/v1?label=&message=Python&color=brightgreen" />](https://github.com/topics/python) 
[<img src="https://img.shields.io/static/v1?label=&message=Web%20Scraping&color=important" />](https://github.com/topics/web-scraping) 
[<img src="https://img.shields.io/static/v1?label=&message=Residential%20Proxy&color=blueviolet" />](https://github.com/topics/residential-proxy) 
[<img src="https://img.shields.io/static/v1?label=&message=Aiohttp&color=blue" />](https://github.com/topics/aiohttp) 
[<img src="https://img.shields.io/static/v1?label=&message=Asyncio&color=yellow" />](https://github.com/topics/asyncio)

## Requirements for the Integration

For the integration to work, you'll need the `aiohttp` library, Python 3.6 or higher, and Residential Proxies. If you don't have `aiohttp` yet, you can install it with `pip`:

```bash
pip install aiohttp
```

You can get Residential Proxies here: https://oxy.yt/arWH

## Proxy Authentication

There are two ways to authenticate proxies with `aiohttp`. The first is to pass the credentials separately from the proxy URL using `aiohttp.BasicAuth`:

```python
import aiohttp

USER = "user"
PASSWORD = "pass"
END_POINT = "pr.oxylabs.io:7777"

async def fetch():
    async with aiohttp.ClientSession() as session:
        proxy_auth = aiohttp.BasicAuth(USER, PASSWORD)
        async with session.get(
            "https://ip.oxylabs.io/location",
            proxy=f"http://{END_POINT}",
            proxy_auth=proxy_auth,
        ) as resp:
            print(await resp.text())
```

The second is to embed the authentication credentials in the proxy URL itself:

```python
import aiohttp

USER = "user"
PASSWORD = "pass"
END_POINT = "pr.oxylabs.io:7777"

async def fetch():
    async with aiohttp.ClientSession() as session:
        async with session.get(
            "https://ip.oxylabs.io/location",
            proxy=f"http://{USER}:{PASSWORD}@{END_POINT}",
        ) as resp:
            print(await resp.text())
```

To use your own proxies, replace the `user` and `pass` values with your Oxylabs account credentials.

## Testing Proxies

To see whether the proxy is working, try visiting https://ip.oxylabs.io/location. If everything is working correctly, it will return the IP address of the proxy you're currently using.

## Sample Project: Extracting Data From Multiple Pages

To better understand how residential proxies can be used for asynchronous data extraction, we wrote a sample project that scrapes product listing data and saves the output to a CSV file. Proxy rotation lets us send multiple requests at once with a much lower risk of CAPTCHAs or IP blocks, which makes the scraping process fast and efficient: you can extract data from thousands of products in a matter of seconds.

```python
import asyncio
import time
import sys
import os

import aiohttp
import pandas as pd
from bs4 import BeautifulSoup

USER = "user"
PASSWORD = "pass"
END_POINT = "pr.oxylabs.io:7777"

# Generate a list of URLs to scrape.
url_list = [
    f"https://books.toscrape.com/catalogue/category/books_1/page-{page_num}.html"
    for page_num in range(1, 51)
]


async def parse_data(text, results_list):
    soup = BeautifulSoup(text, "lxml")
    for product_data in soup.select("ol.row > li > article.product_pod"):
        data = {
            "title": product_data.select_one("h3 > a")["title"],
            # Strip the leading "../.." from the relative product URL.
            "url": product_data.select_one("h3 > a").get("href")[5:],
            "product_price": product_data.select_one("p.price_color").text,
            "stars": product_data.select_one("p")["class"][1],
        }
        results_list.append(data)  # Fill results_list by reference.
        print(f"Extracted data for a book: {data['title']}")


async def fetch(session, sem, url, results_list):
    async with sem:
        async with session.get(
            url,
            proxy=f"http://{USER}:{PASSWORD}@{END_POINT}",
        ) as response:
            await parse_data(await response.text(), results_list)


async def create_jobs(results_list):
    sem = asyncio.Semaphore(4)
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(
            *[fetch(session, sem, url, results_list) for url in url_list]
        )


if __name__ == "__main__":
    results = []
    start = time.perf_counter()

    # A different EventLoopPolicy must be set on Windows.
    # This helps to avoid the "Event loop is closed" error.
    if sys.platform.startswith("win") and sys.version_info.minor >= 8:
        asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())

    try:
        asyncio.run(create_jobs(results))
    except Exception as e:
        print(e)
        print("We broke, but there might still be some results")

    print(
        f"\nTotal of {len(results)} products from {len(url_list)} pages "
        f"gathered in {time.perf_counter() - start:.2f} seconds.",
    )
    df = pd.DataFrame(results)
    df["url"] = df["url"].map(
        lambda x: "".join(["https://books.toscrape.com/catalogue", x])
    )
    filename = "scraped-books.csv"
    df.to_csv(filename, encoding="utf-8-sig", index=False)
    print(f"\nExtracted data can be found at {os.path.join(os.getcwd(), filename)}")
```

If you want to run the project's script yourself, you'll need to install some additional packages. To do that, simply download the `requirements.txt` file and use `pip`:

```bash
pip install -r requirements.txt
```

If you're having any trouble integrating proxies with `aiohttp` and this guide didn't help you, feel free to contact Oxylabs customer support at support@oxylabs.io.
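
The authentication examples embed credentials directly in the proxy URL string. If a password contains URL-reserved characters such as `@`, `:`, or `/`, that string becomes ambiguous. A minimal sketch of percent-encoding the credentials first, using only the standard library (the helper name `build_proxy_url` is ours, not part of this repo):

```python
from urllib.parse import quote

def build_proxy_url(user: str, password: str, endpoint: str = "pr.oxylabs.io:7777") -> str:
    """Build an HTTP proxy URL with percent-encoded credentials.

    Encoding keeps characters like "@" or ":" inside a password from
    being misread as URL structure.
    """
    return f"http://{quote(user, safe='')}:{quote(password, safe='')}@{endpoint}"

print(build_proxy_url("user", "p@ss:w0rd"))
# http://user:p%40ss%3Aw0rd@pr.oxylabs.io:7777
```

The resulting string can be passed to `session.get(..., proxy=...)` exactly like the hard-coded f-strings in the examples above.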
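
The sample project caps concurrency with `asyncio.Semaphore(4)` so that no more than four requests are in flight at once. A network-free sketch of the same `gather` + semaphore pattern (the `fake_fetch` coroutine is a stand-in for the real proxied request, not code from this repo):

```python
import asyncio

async def fake_fetch(sem: asyncio.Semaphore, url: str, results: list) -> None:
    # At most 4 coroutines may hold the semaphore at a time, so no more
    # than 4 "requests" ever run simultaneously.
    async with sem:
        await asyncio.sleep(0.01)  # stand-in for the real network call
        results.append(f"parsed:{url}")

async def main() -> list:
    sem = asyncio.Semaphore(4)
    results: list = []
    urls = [f"page-{i}" for i in range(1, 11)]
    # gather() schedules all ten coroutines at once; the semaphore throttles them.
    await asyncio.gather(*(fake_fetch(sem, u, results) for u in urls))
    return results

results = asyncio.run(main())
print(len(results))  # 10
```

Raising the semaphore limit trades politeness for speed; with rotating residential proxies each request typically exits from a different IP, which is what makes the higher concurrency in the sample project workable.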