{"id":20710010,"url":"https://github.com/oxylabs/rotating-proxies-with-python","last_synced_at":"2025-04-23T04:51:58.927Z","repository":{"id":43120740,"uuid":"382149272","full_name":"oxylabs/Rotating-Proxies-With-Python","owner":"oxylabs","description":"Learn about how to rotate proxies by using Python.","archived":false,"fork":false,"pushed_at":"2025-02-11T12:28:56.000Z","size":221,"stargazers_count":36,"open_issues_count":0,"forks_count":4,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-03-29T22:21:08.525Z","etag":null,"topics":["json-database-python","proxies","proxy","proxy-list","proxy-list-github","proxy-rotator","python","python-image-scraper","python-web-crawler","rotating-proxy","scraper-python","scraping","socks5-proxy","socks5-proxy-list","socks5-server","web-proxies","web-proxy","web-scraping"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/oxylabs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-07-01T20:29:58.000Z","updated_at":"2025-02-11T12:29:00.000Z","dependencies_parsed_at":"2025-01-07T13:29:46.794Z","dependency_job_id":"a47b517f-482a-4957-a834-47a771508cdf","html_url":"https://github.com/oxylabs/Rotating-Proxies-With-Python","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oxylabs%2FRotating-Proxies-With-Python","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oxylabs%2FRotating-Proxies-Wi
th-Python/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oxylabs%2FRotating-Proxies-With-Python/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oxylabs%2FRotating-Proxies-With-Python/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/oxylabs","download_url":"https://codeload.github.com/oxylabs/Rotating-Proxies-With-Python/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250372947,"owners_count":21419722,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["json-database-python","proxies","proxy","proxy-list","proxy-list-github","proxy-rotator","python","python-image-scraper","python-web-crawler","rotating-proxy","scraper-python","scraping","socks5-proxy","socks5-proxy-list","socks5-server","web-proxies","web-proxy","web-scraping"],"created_at":"2024-11-17T02:09:32.453Z","updated_at":"2025-04-23T04:51:58.920Z","avatar_url":"https://github.com/oxylabs.png","language":"Python","readme":"# Rotating Proxies With Python\n[\u003cimg src=\"https://img.shields.io/static/v1?label=\u0026message=Python\u0026color=brightgreen\" /\u003e](https://github.com/topics/python) [\u003cimg src=\"https://img.shields.io/static/v1?label=\u0026message=Web%20Scraping\u0026color=important\" /\u003e](https://github.com/topics/web-scraping) [\u003cimg src=\"https://img.shields.io/static/v1?label=\u0026message=Rotating%20Proxies\u0026color=blueviolet\" /\u003e](https://github.com/topics/rotating-proxies)\n\n[![Oxylabs promo 
code](https://raw.githubusercontent.com/oxylabs/product-integrations/refs/heads/master/Affiliate-Universal-1090x275.png)](https://oxylabs.go2cloud.org/aff_c?offer_id=7\u0026aff_id=877\u0026url_id=112)\n\n[![](https://dcbadge.vercel.app/api/server/eWsVUJrnG5)](https://discord.gg/GbxmdGhZjq)\n\n## Table of Contents\n\n- [Finding Your Current IP Address](#finding-your-current-ip-address)\n- [Using A Single Proxy](#using-a-single-proxy)\n- [Rotating Multiple Proxies](#rotating-multiple-proxies)\n- [Rotating Multiple Proxies Using Async](#rotating-multiple-proxies-using-async)\n\n## Prerequisites\n\nThis article uses the Python `requests` module. To keep it separate from your system packages, you can install it with `virtualenv`, a tool for creating isolated Python environments.\n\nStart by creating a virtual environment in your project folder by running:\n```bash\n$ virtualenv venv\n```\nThis installs Python, pip and common libraries in your project folder.\n\nNext, invoke the `source` command to activate the environment:\n```bash\n$ source venv/bin/activate\n```\n\nLastly, install the `requests` module in the current virtual environment:\n```bash\n$ pip install requests\n```\n\nAlternatively, you can install the dependencies from the included [requirements.txt](requirements.txt) file by running:\n\n```bash\n$ pip install -r requirements.txt\n```\n\nCongratulations, you have successfully installed the `requests` module. Now, it's time to find out your current IP address!\n\n## Finding Your Current IP Address\n\nCreate a file with the `.py` extension with the following contents (or just copy [no_proxy.py](src/no_proxy.py)):\n\n```python\nimport requests\n\nresponse = requests.get('https://ip.oxylabs.io/location')\nprint(response.text)\n```\n\nNow, run it from a terminal:\n\n```bash\n$ python no_proxy.py\n\n128.90.50.100\n```\nThe output of this script will show your current IP address, which uniquely identifies you on the network. 
Instead of exposing it directly when requesting pages, we will use a proxy server.\n\nLet's start by using a single proxy.\n\n## Using A Single Proxy \n\nYour first step is to [find a free proxy server](https://www.google.com/search?q=free+proxy+server+list).\n\n**Important Note**: free proxies are unreliable and slow, and they can collect data about the pages you access. If you're looking for a reliable paid option, we highly recommend using [oxylabs.io](https://oxy.yt/GrVD).\n\nTo use a proxy, you will need its:\n* scheme (e.g. `http`)\n* IP address (e.g. `2.56.215.247`)\n* port (e.g. `3128`)\n* username and password that are used to connect to the proxy (optional)\n\nOnce you have these details, combine them in the following format:\n```\nSCHEME://USERNAME:PASSWORD@YOUR_PROXY_IP:YOUR_PROXY_PORT\n```\n\nHere are a few examples of the proxy formats you may encounter:\n```text\nhttp://2.56.215.247:3128\nhttps://2.56.215.247:8091\nhttps://my-user:aegi1Ohz@2.56.215.247:8044\n```\n\nOnce you have the proxy information, assign it to a constant.\n\n```python\nPROXY = 'http://2.56.215.247:3128'\n```\n\nNext, define a timeout in seconds. It is always a good idea to avoid waiting indefinitely for a response that may never arrive (due to network issues, server issues or problems with the proxy server).\n```python\nTIMEOUT_IN_SECONDS = 10\n```\n\nThe `requests` module [needs to know](https://docs.python-requests.org/en/master/user/advanced/#proxies) when to actually use the proxy.\nFor that, consider the website you are attempting to access. 
Does it use http or https?\nSince we're trying to access **https**://ip.oxylabs.io/location, we can define this configuration as follows:\n```python\nscheme_proxy_map = {\n    'https': PROXY,\n}\n```\n\n**Note**: you can specify multiple protocols, and even define specific domains for which a different proxy will be used:\n\n```python\nscheme_proxy_map = {\n    'http': PROXY1,\n    'https': PROXY2,\n    'https://example.org': PROXY3,\n}\n```\n\nFinally, we make the request by calling `requests.get` and passing all the variables we defined earlier. We also catch the proxy and timeout exceptions and print the error when a network issue occurs.\n\n```python\nimport requests\nfrom requests.exceptions import ProxyError, ReadTimeout, ConnectTimeout\n\ntry:\n    response = requests.get('https://ip.oxylabs.io/location', proxies=scheme_proxy_map, timeout=TIMEOUT_IN_SECONDS)\nexcept (ProxyError, ReadTimeout, ConnectTimeout) as error:\n    print('Unable to connect to the proxy: ', error)\nelse:\n    print(response.text)\n```\n\nThe output of this script should show you the IP address of your proxy:\n\n```bash\n$ python single_proxy.py\n\n2.56.215.247\n```\n\nYou are now hidden behind a proxy when making your requests through the Python script.\nYou can find the complete code in the file [single_proxy.py](src/single_proxy.py).\n\nNow we're ready to rotate through a list of proxies, instead of using a single one!\n\n## Rotating Multiple Proxies\n\nIf you're using unreliable proxies, it helps to save a number of them in a CSV file and run a loop to determine which of them are still available.\n\nFor that purpose, first create a file `proxies.csv` with the following content:\n```text\nhttp://2.56.215.247:3128\nhttps://88.198.24.108:8080\nhttp://50.206.25.108:80\nhttp://68.188.59.198:80\n... 
any other proxy servers, each of them on a separate line\n```\n\nThen, create a Python file, import the `csv` module, and define both the filename and how long you are willing to wait for a single proxy to respond:\n\n```python\nimport csv\n\nTIMEOUT_IN_SECONDS = 10\nCSV_FILENAME = 'proxies.csv'\n```\n\nNext, write the code that opens the CSV file, reads the proxy servers line by line into a `csv_row` variable, and builds the `scheme_proxy_map` configuration needed by the `requests` module.\n\n```python\nwith open(CSV_FILENAME) as open_file:\n    reader = csv.reader(open_file)\n    for csv_row in reader:\n        scheme_proxy_map = {\n            'https': csv_row[0],\n        }\n```\n\nFinally, we reuse the request code from the previous section (along with its `requests` imports) to access the website via each proxy. This time, proxies that fail are skipped silently:\n\n```python\nwith open(CSV_FILENAME) as open_file:\n    reader = csv.reader(open_file)\n    for csv_row in reader:\n        scheme_proxy_map = {\n            'https': csv_row[0],\n        }\n\n        # Access the website via proxy\n        try:\n            response = requests.get('https://ip.oxylabs.io/location', proxies=scheme_proxy_map, timeout=TIMEOUT_IN_SECONDS)\n        except (ProxyError, ReadTimeout, ConnectTimeout) as error:\n            pass\n        else:\n            print(response.text)\n```\n\n**Note**: if you are only interested in scraping the content using *any* working proxy from the list, add a `break` after `print` to stop going through the proxies in the CSV file as soon as one of them works:\n\n```python\n        try:\n            response = requests.get('https://ip.oxylabs.io/location', proxies=scheme_proxy_map, timeout=TIMEOUT_IN_SECONDS)\n        except (ProxyError, ReadTimeout, ConnectTimeout) as error:\n            pass\n        else:\n            print(response.text)\n            break # notice the break here\n```\n\nThe complete code is available in [rotating_multiple_proxies.py](src/rotating_multiple_proxies.py).\n\nThe only thing that is preventing us from reaching our full potential is speed.\nIt's time to tackle 
that in the next section!\n\n## Rotating Multiple Proxies Using Async\n\nChecking all the proxies in the list one by one may be an option for some, but it has one significant downside - this approach is slow. This is because we are using a synchronous approach: we tackle requests one at a time and only move to the next once the previous one is completed, so every unresponsive proxy can hold up the loop until its timeout expires. \n\nA better option would be to make requests and wait for responses in a non-blocking way - this would speed up the script significantly.\n\nTo do that, we use the `aiohttp` module. You can install it using the following CLI command: \n\n```bash\n$ pip install aiohttp\n```\n\nThen, create a Python file that imports `asyncio`, `csv` and `aiohttp`, and in which you define:\n* the CSV filename that contains the proxy list\n* the URL that you wish to use to check the proxies\n* how long you are willing to wait for each proxy - the timeout setting\n\n```python\nCSV_FILENAME = 'proxies.csv'\nURL_TO_CHECK = 'https://ip.oxylabs.io/location'\nTIMEOUT_IN_SECONDS = 10\n```\n\nNext, we define an async function and run it using the `asyncio` module.\nIt accepts two parameters: \n* the URL it needs to request\n* the proxy to use to access it\n\nWe then print the response. 
If the script receives an error when attempting to access the URL via the proxy, it prints the error as well.\n\n```python\n\nasync def check_proxy(url, proxy):\n    try:\n        session_timeout = aiohttp.ClientTimeout(total=None,\n                                                sock_connect=TIMEOUT_IN_SECONDS,\n                                                sock_read=TIMEOUT_IN_SECONDS)\n        async with aiohttp.ClientSession(timeout=session_timeout) as session:\n            async with session.get(url, proxy=proxy, timeout=TIMEOUT_IN_SECONDS) as resp:\n                print(await resp.text())\n    except Exception as error:\n        # you can comment out this line to only see valid proxies printed out in the command line\n        print('Proxy responded with an error: ', error)\n        return\n```\n\nThen, we define a `main` function that reads the CSV file and creates an asynchronous task to check the proxy for every single record in the file. \n\n```python\n\nasync def main():\n    tasks = []\n    with open(CSV_FILENAME) as open_file:\n        reader = csv.reader(open_file)\n        for csv_row in reader:\n            task = asyncio.create_task(check_proxy(URL_TO_CHECK, csv_row[0]))\n            tasks.append(task)\n\n    await asyncio.gather(*tasks)\n```\n\nFinally, we run the `main` function and wait until all the async tasks complete:\n```python\nasyncio.run(main())\n```\n\nThe complete code is available in [rotating_multiple_proxies_async.py](src/rotating_multiple_proxies_async.py).\n\nBecause all the proxies are now checked concurrently, the total run time is roughly that of the slowest single check rather than the sum of all of them.\n\n## We are open to contributions!\n\nBe sure to play around with it and create a pull request with any improvements you may find.\nAlso, check this [Best rotating proxy service](https://medium.com/@oxylabs.io/10-best-rotating-proxy-services-for-2024-853d840af1a4) list.\n\nHappy 
coding!\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foxylabs%2Frotating-proxies-with-python","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Foxylabs%2Frotating-proxies-with-python","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foxylabs%2Frotating-proxies-with-python/lists"}