{"id":28998638,"url":"https://github.com/luminati-io/manage-failed-python-requests","last_synced_at":"2025-10-28T19:32:05.112Z","repository":{"id":278193195,"uuid":"930840820","full_name":"luminati-io/manage-failed-python-requests","owner":"luminati-io","description":"Handle failed HTTP requests in Python using retry strategies with HTTPAdapter, Tenacity, and custom logic to improve web scraping reliability.","archived":false,"fork":false,"pushed_at":"2025-02-18T13:22:41.000Z","size":23,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-03-22T07:02:02.095Z","etag":null,"topics":["headless-browser","http","python","requests","scraping-browser","status-codes","tenacity","web-scraping","web-unblocker"],"latest_commit_sha":null,"homepage":"https://brightdata.com/blog/web-data/retry-failed-requests-python","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/luminati-io.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-11T09:52:09.000Z","updated_at":"2025-02-18T13:22:45.000Z","dependencies_parsed_at":"2025-02-18T14:35:23.019Z","dependency_job_id":null,"html_url":"https://github.com/luminati-io/manage-failed-python-requests","commit_stats":null,"previous_names":["luminati-io/manage-failed-python-requests"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/luminati-io/manage-failed-python-requests","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/luminati-io%2Fmanage-failed-python-requests","tags_u
rl":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/luminati-io%2Fmanage-failed-python-requests/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/luminati-io%2Fmanage-failed-python-requests/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/luminati-io%2Fmanage-failed-python-requests/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/luminati-io","download_url":"https://codeload.github.com/luminati-io/manage-failed-python-requests/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/luminati-io%2Fmanage-failed-python-requests/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261823776,"owners_count":23215150,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["headless-browser","http","python","requests","scraping-browser","status-codes","tenacity","web-scraping","web-unblocker"],"created_at":"2025-06-25T07:09:21.977Z","updated_at":"2025-10-28T19:32:05.023Z","avatar_url":"https://github.com/luminati-io.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# Managing Failed Requests in Python\n\n[![Promo](https://github.com/luminati-io/LinkedIn-Scraper/raw/main/Proxies%20and%20scrapers%20GitHub%20bonus%20banner.png)](https://brightdata.com/) \n\nThis guide explains how to handle failed HTTP requests in Python with retry strategies and custom logic.\n\n- [What Are Status Codes?](#what-are-status-codes)\n- [Retry Strategies](#retry-strategies)\n- 
[HTTPAdapter](#httpadapter)\n- [Tenacity](#tenacity)\n- [Building a Custom Retry Mechanism](#building-a-custom-retry-mechanism)\n- [Conclusion](#conclusion)\n\n## What Are Status Codes?\n\nStatus codes are standardized three-digit numbers used in various protocols to indicate the result of a request. According to [Mozilla](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status), HTTP status codes can be broken down into the following categories:\n\n- **100-199**: Informational responses\n- **200-299**: Successful responses\n- **300-399**: Redirection messages\n- **400-499**: Client error messages\n- **500-599**: Server error messages\n\nWhen developing client-side applications like web scrapers, it's crucial to pay attention to status codes in the 400 and 500 ranges. Codes in the 400s typically indicate client-side errors, such as authentication failures, rate limiting, timeouts, or the well-known _404: Not Found error_. Meanwhile, status codes in the 500s signal server-side issues that may require retries or alternative handling strategies.\n\nHere is a list of common error codes (taken from Mozilla’s [official documentation](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status#client_error_responses)) you will encounter when performing web scraping:\n\n| **Status Code** | **Meaning** | **Description** |\n| --- | --- | --- |\n| 400 | Bad Request | Check your request format |\n| [401](https://brightdata.com/faqs/proxy-errors/error-401-how-to-avoid) | Unauthorized | Check your API key |\n| [403](https://brightdata.com/faqs/proxy-errors/403-status-error-how-to-avoid) | Forbidden | You cannot access this data |\n| 404 | Not Found | Site/Endpoint doesn’t exist |\n| [408](https://brightdata.com/faqs/proxy-errors/error-408-how-to-avoid) | Request Timeout | Request timed out, try again |\n| [429](https://brightdata.com/faqs/proxy-errors/429-error-how-to-avoid) | Too Many Requests | Slow down your requests |\n| 500 | Internal Server Error | Generic server error, 
retry request |\n| 501 | Not Implemented | Server doesn’t support this yet |\n| [502](https://brightdata.com/faqs/proxy-errors/502-error-how-to-avoid) | Bad Gateway | Failed response from an upstream server |\n| [503](https://brightdata.com/faqs/proxy-errors/503-error-how-to-avoid) | Service Unavailable | Server is temporarily down, retry later |\n| [504](https://brightdata.com/faqs/proxy-errors/504-error-how-to-avoid) | Gateway Timeout | Timed out waiting for an upstream server |\n\n## Retry Strategies\n\nWhen implementing a retry mechanism in Python, you can leverage pre-built libraries like `HTTPAdapter` and `Tenacity`. Alternatively, you may choose to develop custom retry logic based on your specific needs.\n\nA well-designed retry strategy should include both a retry limit and a backoff mechanism. The retry limit prevents infinite loops, ensuring that failed requests don’t continue indefinitely. A backoff strategy, which gradually increases the delay between retries, helps prevent excessive requests that could lead to being blocked or overloading the server.\n\n- **Retry Limits**: It’s essential to define a retry limit. After a specified number of attempts (X), the scraper should stop retrying to avoid infinite loops.  \n- **Backoff Algorithm**: A gradual increase in wait time between retries helps prevent overwhelming the server. Start with a small delay, such as 0.3 seconds, then incrementally increase it to 0.6 seconds, 1.2 seconds, and so forth.\n\n## HTTPAdapter\n\nWith `HTTPAdapter`, we need to configure three things: `total`, `backoff_factor`, and `status_forcelist`. `allowed_methods` isn’t a requirement per se, but it helps define our retry conditions and thus makes our code safer. 
In the code below, we use [httpbin](https://httpbin.org/) to automatically force an error and trigger the retry logic.\n\n```python\nimport logging\nimport requests\nfrom requests.adapters import HTTPAdapter\nfrom urllib3.util.retry import Retry\n\n# Configure logging\nlogging.basicConfig(level=logging.INFO, format=\"%(asctime)s - %(levelname)s - %(message)s\")\nlogger = logging.getLogger(__name__)\n\n# Create a session\nsession = requests.Session()\n\n# Configure retry settings\nretry = Retry(\n    total=3,  # Maximum retries\n    backoff_factor=0.3,  # Base factor for the exponential backoff between retries\n    status_forcelist=(429, 500, 502, 503, 504),  # Status codes to trigger a retry\n    allowed_methods={\"GET\", \"POST\"}  # Allow retries for GET and POST\n)\n\n# Mount the adapter with our custom settings\nadapter = HTTPAdapter(max_retries=retry)\nsession.mount(\"http://\", adapter)\nsession.mount(\"https://\", adapter)\n\n# Function to make a request and test retry logic\ndef make_request(url, method=\"GET\"):\n    response = None  # Stays None if the request raises before a response is returned\n    try:\n        logger.info(f\"Making a {method} request to {url} with retry logic...\")\n        \n        if method == \"GET\":\n            response = session.get(url)\n        elif method == \"POST\":\n            response = session.post(url)\n        else:\n            logger.error(\"Unsupported HTTP method: %s\", method)\n            return\n        \n        response.raise_for_status()\n        logger.info(\"✅ Request successful: %s\", response.status_code)\n    \n    except requests.exceptions.RequestException as e:\n        logger.error(\"❌ Request failed after retries: %s\", e)\n        logger.info(\"Retries attempted: %d\", len(response.history) if response is not None else 0)\n\n# Test Cases\nmake_request(\"https://httpbin.org/status/200\")  # ✅ Should succeed without retries\nmake_request(\"https://httpbin.org/status/500\")  # ❌ Should retry 3 times and fail\nmake_request(\"https://httpbin.org/status/404\")  # ❌ Should fail immediately (no 
retries)\nmake_request(\"https://httpbin.org/status/500\", method=\"POST\")  # ❌ Should retry 3 times and fail\n```\n\nOnce you’ve created a `Session` object, do this:\n\n- Create a `Retry` object and define:\n    - `total`: The maximum number of retries for a request.\n    - `backoff_factor`: The base delay used to compute the wait between retries. The wait grows exponentially as the retries increase.\n    - `status_forcelist`: A list of bad status codes. Any codes in this list will automatically trigger a retry.\n- Create an `HTTPAdapter` object with our `retry` variable: `adapter = HTTPAdapter(max_retries=retry)`.\n- Once you’ve created the `adapter`, mount it for both the `http://` and `https://` URL prefixes using `session.mount()`.\n\nWhen you run this code, each request that returns a status code in `status_forcelist` is retried up to three times (`total=3`), and you get output similar to the following.\n\n```\n2024-06-10 12:00:00 - INFO - Making a GET request to https://httpbin.org/status/200 with retry logic...\n2024-06-10 12:00:00 - INFO - ✅ Request successful: 200\n\n2024-06-10 12:00:01 - INFO - Making a GET request to https://httpbin.org/status/500 with retry logic...\n2024-06-10 12:00:02 - ERROR - ❌ Request failed after retries: 500 Server Error: INTERNAL SERVER ERROR for url: ...\n2024-06-10 12:00:02 - INFO - Retries attempted: 3\n\n2024-06-10 12:00:03 - INFO - Making a GET request to https://httpbin.org/status/404 with retry logic...\n2024-06-10 12:00:03 - ERROR - ❌ Request failed after retries: 404 Client Error: NOT FOUND for url: ...\n2024-06-10 12:00:03 - INFO - Retries attempted: 0\n\n2024-06-10 12:00:04 - INFO - Making a POST request to https://httpbin.org/status/500 with retry logic...\n2024-06-10 12:00:05 - ERROR - ❌ Request failed after retries: 500 Server Error: INTERNAL SERVER ERROR for url: ...\n2024-06-10 12:00:05 - INFO - Retries attempted: 3\n```\n\n## Tenacity\n\nYou can also use [`Tenacity`](https://tenacity.readthedocs.io/en/latest/), a popular open source retry library for Python. 
It’s not limited to HTTP, but it gives you an expressive way to implement retries.\n\nStart by installing `Tenacity`:\n\n```bash\npip install tenacity\n```\n\nOnce installed, create a _decorator_ and use it to wrap a requests function. With the `@retry` decorator, add the `stop`, `wait`, and `retry` arguments.\n\n```python\nimport logging\nimport requests\nfrom tenacity import retry, stop_after_attempt, wait_exponential, retry_if_exception_type, retry_if_result, RetryError\n\n# Configure logging\nlogging.basicConfig(level=logging.INFO, format=\"%(asctime)s - %(levelname)s - %(message)s\")\nlogger = logging.getLogger(__name__)\n\n# Define a retry strategy\n@retry(\n    stop=stop_after_attempt(3),  # Stop after 3 attempts in total\n    wait=wait_exponential(multiplier=0.3),  # Exponential backoff\n    retry=(\n        retry_if_exception_type(requests.exceptions.RequestException) |  # Retry on request failures\n        retry_if_result(lambda r: r.status_code in {500, 502, 503, 504})  # Retry on specific HTTP status codes\n    ),\n)\ndef make_request(url):\n    logger.info(\"Making a request with retry logic to %s...\", url)\n    response = requests.get(url)\n    response.raise_for_status()\n    logger.info(\"✅ Request successful: %s\", response.status_code)\n    return response\n\n# Attempt to make the request\ntry:\n    make_request(\"https://httpbin.org/status/500\")  # Test with a failing status code\nexcept RetryError as e:\n    logger.error(\"❌ Request failed after all retries: %s\", e)\n```\n\nThe logic and settings here are very similar to the first example with `HTTPAdapter`:\n\n- `stop=stop_after_attempt(3)`: This tells `tenacity` to give up after 3 attempts in total (the initial call plus two retries).\n- `wait=wait_exponential(multiplier=0.3)` uses the same wait that we used before. 
It also backs off exponentially, just like before.\n- `retry=retry_if_exception_type(requests.exceptions.RequestException)` tells `tenacity` to retry every time a `RequestException` occurs.\n- `make_request()` makes a request to our error endpoint. It inherits its retry behavior from the decorator applied above it.\n\nWhen you run this code, you get a similar output:\n\n```\n2024-06-10 12:00:00 - INFO - Making a request with retry logic to https://httpbin.org/status/500...\n2024-06-10 12:00:01 - WARNING - Retrying after 0.3 seconds...\n2024-06-10 12:00:01 - INFO - Making a request with retry logic to https://httpbin.org/status/500...\n2024-06-10 12:00:02 - WARNING - Retrying after 0.6 seconds...\n2024-06-10 12:00:02 - INFO - Making a request with retry logic to https://httpbin.org/status/500...\n2024-06-10 12:00:03 - ERROR - ❌ Request failed after all retries: RetryError[...]\n```\n\n## Building a Custom Retry Mechanism\n\nYou can also create a custom retry mechanism, which is often the best approach when working with specialized code. With a relatively small amount of code, you can achieve the same functionality provided by existing libraries while tailoring it to your specific needs.\n\nThe code below demonstrates how to import `sleep` for the exponential backoff, set the configuration (`TOTAL_RETRIES`, `INITIAL_BACKOFF`, and `BAD_CODES`), and use a `while` loop to hold the retry logic. 
While you still have tries left and haven’t succeeded, the loop attempts the request.\n\n```python\nimport logging\nimport requests\nfrom time import sleep\n\n# Configure logging\nlogging.basicConfig(level=logging.INFO, format=\"%(asctime)s - %(levelname)s - %(message)s\")\nlogger = logging.getLogger(__name__)\n\n# Create a session\nsession = requests.Session()\n\n# Define retry settings\nTOTAL_RETRIES = 3\nINITIAL_BACKOFF = 0.3\nBAD_CODES = {429, 500, 502, 503, 504}\n\ndef make_request(url):\n    current_tries = 0\n    backoff = INITIAL_BACKOFF\n    success = False\n\n    while current_tries \u003c TOTAL_RETRIES and not success:\n        try:\n            logger.info(\"Making a request with retry logic to %s...\", url)\n            response = session.get(url)\n            \n            if response.status_code in BAD_CODES:\n                raise requests.exceptions.HTTPError(f\"Received {response.status_code}, triggering retry\")\n            \n            response.raise_for_status()\n            logger.info(\"✅ Request successful: %s\", response.status_code)\n            success = True\n            return response\n\n        except requests.exceptions.RequestException as e:\n            logger.error(\"❌ Request failed: %s, retries left: %d\", e, TOTAL_RETRIES - current_tries - 1)\n            if current_tries \u003c TOTAL_RETRIES - 1:\n                logger.info(\"⏳ Retrying in %.1f seconds...\", backoff)\n                sleep(backoff)\n                backoff *= 2  # Exponential backoff\n            current_tries += 1\n\n    logger.error(\"🚨 Request failed after all retries.\")\n    return None\n\n# Test Cases\nmake_request(\"https://httpbin.org/status/500\")  # ❌ Should fail after 3 attempts\nmake_request(\"https://httpbin.org/status/200\")  # ✅ Should succeed without retries\n```\n\nThe actual logic here is handled by a simple `while` loop.\n\n- If `response.status_code` is in the `BAD_CODES` set, the script raises an exception.\n- If a request fails, the 
script:\n    - Logs an error message to the console.\n    - `sleep(backoff)` waits before sending the next request.\n    - `backoff *= 2` doubles `backoff` for the next try.\n    - Increments `current_tries` so it doesn’t stay in the loop indefinitely.\n\nHere’s the output from the custom retry code.\n\n```\n2024-06-10 12:00:00 - INFO - Making a request with retry logic to https://httpbin.org/status/500...\n2024-06-10 12:00:01 - ERROR - ❌ Request failed: Received 500, triggering retry, retries left: 2\n2024-06-10 12:00:01 - INFO - ⏳ Retrying in 0.3 seconds...\n2024-06-10 12:00:02 - INFO - Making a request with retry logic to https://httpbin.org/status/500...\n2024-06-10 12:00:03 - ERROR - ❌ Request failed: Received 500, triggering retry, retries left: 1\n2024-06-10 12:00:03 - INFO - ⏳ Retrying in 0.6 seconds...\n2024-06-10 12:00:04 - INFO - Making a request with retry logic to https://httpbin.org/status/500...\n2024-06-10 12:00:05 - ERROR - ❌ Request failed: Received 500, triggering retry, retries left: 0\n2024-06-10 12:00:05 - ERROR - 🚨 Request failed after all retries.\n```\n\n## Conclusion\n\nTo avoid all kinds of failed requests, we’ve developed products like the [Web Unlocker API](https://brightdata.com/products/web-unlocker) and [Scraping Browser](https://brightdata.com/products/scraping-browser). These tools automatically handle anti-bot measures, CAPTCHA challenges, and IP blocks, ensuring seamless and efficient web scraping for even the most challenging websites.\n\nSign up now and start your free trial today.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fluminati-io%2Fmanage-failed-python-requests","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fluminati-io%2Fmanage-failed-python-requests","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fluminati-io%2Fmanage-failed-python-requests/lists"}