# Using a Proxy with Python Requests

[![Promo](https://github.com/luminati-io/Rotating-Residential-Proxies/blob/main/50%25%20off%20promo.png)](https://brightdata.com/proxy-types/residential-proxies)

In this guide, you will learn how to use proxies with Python requests, particularly for [web scraping](https://brightdata.com/blog/how-tos/what-is-web-scraping), to [bypass website restrictions](https://brightdata.com/blog/proxy-101/how-to-bypass-an-ip-ban) by changing your
IP and location:

- [Using a Proxy with a Python Request](#using-a-proxy-with-a-python-request)
- [Installing Packages](#installing-packages)
- [Components of Proxy IP Address](#components-of-proxy-ip-address)
- [Setting Proxies Directly in Requests](#setting-proxies-directly-in-requests)
- [Setting Proxies via Environment Variables](#setting-proxies-via-environment-variables)
- [Rotating Proxies Using a Custom Method and an Array of Proxies](#rotating-proxies-using-a-custom-method-and-an-array-of-proxies)
- [Using the Bright Data Proxy Service with Python](#using-the-bright-data-proxy-service-with-python)
- [Conclusion](#conclusion)

## Using a Proxy with a Python Request

The sections below walk through installing the required packages, constructing a proxy URL, and passing proxies to the `requests` library in three different ways.

## Installing Packages

Use `pip install` to install the following Python packages to send requests to the web page and collect the links:

- `requests`: sends HTTP requests to the website where you want to scrape the data.
- `beautifulsoup4`: parses HTML and XML documents to extract all the links.

## Components of Proxy IP Address

The three primary components of a proxy server are:

1.  **Protocol** is typically either HTTP or HTTPS.
2.  **Address** can be an IP address or a DNS hostname.
3.  **Port number** is anywhere between 1 and 65535, e.g. `2000`.

Thus, a full proxy URL would look like this: `https://192.167.0.1:2000` or
`https://proxyprovider.com:2000`.

## Setting Proxies Directly in Requests

This guide covers three ways to set proxies in requests. The first approach sets them directly in the `requests` module.

Do as follows:

1. Import the Requests and Beautiful Soup packages in your Python script.
2. Create a dictionary called `proxies` that contains proxy server information.
3. In the `proxies` dictionary, define both the HTTP and HTTPS connections to the proxy URL.
4. Define a Python variable with the URL of the web page you want to scrape the data from. This example uses `https://example.com/`.
5. Send a GET request to the web page using the `requests.get()` method with two arguments: the URL of the website and the proxies dictionary. The response is stored in the `response` variable.
6. Pass `response.content` and `html.parser` as arguments to the `BeautifulSoup()` method to collect links.
7. Use the `find_all()` method with `a` as an argument to find all the links on the web page.
8. Extract the `href` attribute of each link using the `get()` method.

Here is the complete source code:

```python
# Import packages.
import requests
from bs4 import BeautifulSoup

# Define proxies to use.
proxies = {
    'http': 'http://proxyprovider.com:2000',
    'https': 'https://proxyprovider.com:2000',
}

# Define a link to the web page.
url = "https://example.com/"

# Send a GET request to the website.
response = requests.get(url, proxies=proxies)

# Use BeautifulSoup to parse the HTML content of the website.
soup = BeautifulSoup(response.content, "html.parser")

# Find all the links on the website.
links = soup.find_all("a")

# Print all the links.
for link in links:
    print(link.get("href"))
```

Here is the output from running the script above:

![Scraped links](https://github.com/luminati-io/Proxy-with-python-requests/blob/main/link-to-webpage-2-1024x653.png)

## Setting Proxies via Environment Variables

To use the same proxy for all requests, it's best to set environment variables in the terminal window:

```sh
export HTTP_PROXY='http://proxyprovider.com:2000'
export HTTPS_PROXY='https://proxyprovider.com:2000'
```

You can now remove the proxies definition from the script:

```python
# Import packages.
import requests
from bs4 import BeautifulSoup

# Define a link to the web page.
url = "https://example.com/"

# Send a GET request to the website.
response = requests.get(url)

# Use BeautifulSoup to parse the HTML content of the website.
soup = BeautifulSoup(response.content, "html.parser")

# Find all the links on the website.
links = soup.find_all("a")

# Print all the links.
for link in links:
    print(link.get("href"))
```

## Rotating Proxies Using a Custom Method and an Array of Proxies

[![Promo](https://github.com/luminati-io/LinkedIn-Scraper/blob/main/Proxies%20and%20scrapers%20GitHub%20bonus%20banner.png)](https://brightdata.com/proxy-types/residential-proxies)

Rotating proxies helps work around the restrictions that websites impose when they receive a large number of requests from the same IP address.

Do as follows:

1. Import the following Python libraries: Requests, Beautiful Soup, and Random.
2. Create a list of proxies to use during the rotation process. Use the `http://proxyserver.com:port` format:

```python
# List of proxies
proxies = [
    "http://proxyprovider1.com:2010", "http://proxyprovider1.com:2020",
    "http://proxyprovider1.com:2030", "http://proxyprovider2.com:2040",
    "http://proxyprovider2.com:2050", "http://proxyprovider2.com:2060",
    "http://proxyprovider3.com:2070", "http://proxyprovider3.com:2080",
    "http://proxyprovider3.com:2090"
]
```

3. Create a custom method called `get_proxy()`. It randomly selects a proxy from the list using the `random.choice()` method and returns the selected proxy in dictionary format (with both HTTP and HTTPS keys). You’ll use this method whenever you send a new request:

```python
# Custom method to rotate proxies
def get_proxy():
    # Choose a random proxy from the list
    proxy = random.choice(proxies)
    # Return a dictionary with the proxy for both http and https protocols
    return {'http': proxy, 'https': proxy}
```

4. Create a loop that sends a certain number of GET requests using the rotated proxies. In each request, the `get()` method uses a randomly chosen proxy returned by `get_proxy()`.

5. Collect the links from the HTML content of the web page using the Beautiful Soup package, as explained previously.

6. Catch and print any exceptions that occur during the request process.

Here is the complete source code for this example:

```python
# Import packages
import requests
from bs4 import BeautifulSoup
import random

# List of proxies
proxies = [
    "http://proxyprovider1.com:2010", "http://proxyprovider1.com:2020",
    "http://proxyprovider1.com:2030", "http://proxyprovider2.com:2040",
    "http://proxyprovider2.com:2050", "http://proxyprovider2.com:2060",
    "http://proxyprovider3.com:2070", "http://proxyprovider3.com:2080",
    "http://proxyprovider3.com:2090"
]

# Custom method to rotate proxies
def get_proxy():
    # Choose a random proxy from the list
    proxy = random.choice(proxies)
    # Return a dictionary with the proxy for both http and https protocols
    return {'http': proxy, 'https': proxy}


# Set the URL to scrape
url = 'https://brightdata.com/'

# Send requests using rotated proxies
for i in range(10):
    try:
        # Send a GET request with a randomly chosen proxy
        response = requests.get(url, proxies=get_proxy())

        # Use BeautifulSoup to parse the HTML content of the website.
        soup = BeautifulSoup(response.content, "html.parser")

        # Find all the links on the website.
        links = soup.find_all("a")

        # Print all the links.
        for link in links:
            print(link.get("href"))
    except requests.exceptions.RequestException as e:
        # Handle any exceptions that may occur during the request
        print(e)
```

## Using the Bright Data Proxy Service with Python

Bright Data has a large network of more than 72 million [residential proxy IPs](https://brightdata.com/proxy-types/residential-proxies) and more than 770,000 [datacenter proxies](https://brightdata.com/proxy-types/datacenter-proxies).

You can [integrate Bright Data’s datacenter proxies](https://brightdata.com/integration) into your Python requests. Once you have an account with Bright Data, follow these steps to create your first proxy:

1. Click **View proxy product** on the welcome page to view the different types of proxies offered by Bright Data:

![Bright Data proxy types](https://github.com/luminati-io/Proxy-with-python-requests/blob/main/bright-data-proxy-types-1024x464.png)

2. Select **Datacenter Proxies** to create a new proxy, and on the subsequent page, add your details and save it:

![Datacenter proxies configuration](https://github.com/luminati-io/Proxy-with-python-requests/blob/main/datacenter-proxies-837x1024.png)

3. Once your proxy is created, the dashboard will show you parameters such as the host, the port, the username, and the password to use in your scripts:

![Datacenter proxy parameters](https://github.com/luminati-io/Proxy-with-python-requests/blob/main/datacenter-proxy-parameters-928x1024.png)

4. Copy these parameters into your script and build the proxy URL in the following format: `username-session-<session-id>:password@host:port`.

> **Note:**\
> The `session-id` is a random number generated with Python’s `random` module.

Here is the code that uses a proxy from Bright Data in a Python request:

```python
import requests
from bs4 import BeautifulSoup
import random

# Define parameters provided by Bright Data
host = 'brd.superproxy.io'
port = 33335
username = 'username'
password = 'password'
session_id = random.randint(0, 100000)

# Format your proxy URL
proxy_url = ('http://{}-session-{}:{}@{}:{}'.format(username, session_id,
                                                    password, host, port))

# Define your proxies in a dictionary
proxies = {'http': proxy_url, 'https': proxy_url}

# Send a GET request to the website
url = "https://example.com/"
response = requests.get(url, proxies=proxies)

# Use BeautifulSoup to parse the HTML content of the website
soup = BeautifulSoup(response.content, "html.parser")

# Find all the links on the website
links = soup.find_all("a")

# Print all the links
for link in links:
    print(link.get("href"))
```

Running this code will make a successful request using Bright Data’s [proxy service](https://brightdata.com/proxy-types).

## Conclusion

With Bright Data’s web platform, you can get reliable proxies for your project that cover any country or city in the world.
Try Bright Data's [proxy services](https://brightdata.com/proxy-types) for free now!
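
---

As a final tip, the `get_proxy()` rotation method from above can be extended so that proxies that raise a `requests.exceptions.RequestException` are temporarily skipped on subsequent requests. Below is a minimal sketch of that idea; the `ProxyRotator` class is hypothetical (not part of any library), and the proxy hosts are placeholders like those in the rotation example:

```python
import random

# Hypothetical helper extending the get_proxy() idea: rotate randomly,
# but temporarily skip proxies that have been marked as failed.
class ProxyRotator:
    def __init__(self, proxy_urls):
        self.pool = list(proxy_urls)
        self.failed = set()

    def get_proxy(self):
        # Prefer proxies that have not failed yet; if every proxy has
        # failed, reset the failure set and start over.
        candidates = [p for p in self.pool if p not in self.failed]
        if not candidates:
            self.failed.clear()
            candidates = self.pool
        proxy = random.choice(candidates)
        # Same dictionary shape that requests expects for its proxies= argument.
        return {'http': proxy, 'https': proxy}

    def mark_failed(self, proxy_dict):
        # Call this from the except block after a RequestException.
        self.failed.add(proxy_dict['http'])

# Placeholder proxy hosts, as in the rotation example above.
rotator = ProxyRotator([
    "http://proxyprovider1.com:2010",
    "http://proxyprovider2.com:2040",
    "http://proxyprovider3.com:2070",
])
```

In the scraping loop, you would call `rotator.get_proxy()` where the original example calls `get_proxy()`, and call `rotator.mark_failed(...)` inside the `except requests.exceptions.RequestException` block.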