{"id":50454194,"url":"https://github.com/mizcausevic-dev/rate-limit-shield","last_synced_at":"2026-06-01T01:05:45.122Z","repository":{"id":356619142,"uuid":"1233085597","full_name":"mizcausevic-dev/rate-limit-shield","owner":"mizcausevic-dev","description":"Production-grade rate limiting, circuit breaking, and retry shaping for LLM APIs. Token bucket + breaker + jittered backoff with HTTP 429 / Retry-After awareness.","archived":false,"fork":false,"pushed_at":"2026-05-08T21:33:57.000Z","size":12,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-08T23:31:05.997Z","etag":null,"topics":["anthropic","circuit-breaker","llm","llmops","openai","python","rate-limiting","reliability","retry-policy","sre"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mizcausevic-dev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-08T15:19:43.000Z","updated_at":"2026-05-08T21:34:00.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/mizcausevic-dev/rate-limit-shield","commit_stats":null,"previous_names":["mizcausevic-dev/rate-limit-shield"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/mizcausevic-dev/rate-limit-shield","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcausevic-dev%2Frate-limit-shield","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcausevic-dev%2Frate-limit-shield/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcausevic-dev%2Frate-limit-shield/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcausevic-dev%2Frate-limit-shield/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mizcausevic-dev","download_url":"https://codeload.github.com/mizcausevic-dev/rate-limit-shield/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcausevic-dev%2Frate-limit-shield/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33755379,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-31T02:00:06.040Z","response_time":95,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anthropic","circuit-breaker","llm","llmops","openai","python","rate-limiting","reliability","retry-policy","sre"],"created_at":"2026-06-01T01:05:45.049Z","updated_at":"2026-06-01T01:05:45.111Z","avatar_url":"https://github.com/mizcausevic-dev.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# rate-limit-shield 🛡️\n\n\u003e Production-grade rate limiting, circuit breaking, and retry shaping for LLM APIs.\n\u003e Token bucket + breaker + jittered backoff with HTTP 429 / `Retry-After` awareness.\n\n[![CI](https://github.com/mizcausevic-dev/rate-limit-shield/actions/workflows/ci.yml/badge.svg)](https://github.com/mizcausevic-dev/rate-limit-shield/actions/workflows/ci.yml)\n![Python](https://img.shields.io/badge/python-3.10%2B-blue)\n![License](https://img.shields.io/badge/license-MIT-green)\n![Status](https://img.shields.io/badge/status-alpha-orange)\n\n---\n\n## Why\n\nEvery team building on LLM APIs hand-rolls the same three primitives: a token bucket,\na circuit breaker, and an exponential-backoff retry. They get it 80% right, ship it,\nand discover the failure modes in production at 3 AM.\n\n**rate-limit-shield ships the 100% version once.**\n\n## What\n\nThree composable primitives + an LLM-aware facade:\n\n| Component | Purpose |\n|---|---|\n| `TokenBucket` | Thread-safe, continuous-refill rate limiter |\n| `CircuitBreaker` | Closed -\u003e Open -\u003e Half-Open state machine |\n| `RetryPolicy` | Exponential backoff with full jitter |\n| `Shield` | Composes all three with a single `.call()` API |\n| `LLMShield` | Per-model isolation, parses `Retry-After`, designed for OpenAI / Anthropic / Bedrock |\n\nZero runtime dependencies in core. Stdlib-only.\n\n## Architecture\n\n![Architecture](docs/architecture.svg)\n\n## Circuit breaker state machine\n\nThree states, four transitions - the breaker fails fast when an upstream vendor\ndegrades, then probes for recovery:\n\n![Circuit breaker FSM](docs/circuit-breaker-fsm.svg)\n\n## Install\n\n```bash\npip install rate-limit-shield\n```\n\nOr from source:\n\n```bash\ngit clone https://github.com/mizcausevic-dev/rate-limit-shield\ncd rate-limit-shield\npip install -e \".[dev]\"\npytest\n```\n\n## Quickstart\n\n### Per-model LLM protection\n\n```python\nfrom rate_limit_shield import LLMShield\n\nls = LLMShield(rpm_limits={\n    \"gpt-4\":       60,    # 60 req/min\n    \"claude-opus\": 30,\n    \"gpt-3.5\":     200,\n})\n\nshield = ls.shield_for(\"gpt-4\")\nresult = shield.call(openai_client.chat.completions.create,\n                     model=\"gpt-4\",\n                     messages=[...])\n```\n\n### Custom composition\n\n```python\nfrom rate_limit_shield import Shield, TokenBucket, CircuitBreaker, RetryPolicy\n\nshield = Shield(\n    bucket  = TokenBucket(capacity=100, refill_rate=10),     # 100 burst, 10/s sustained\n    breaker = CircuitBreaker(failure_threshold=5, recovery_timeout=30),\n    retry   = RetryPolicy(max_attempts=3, base_delay=1.0, jitter=True),\n)\n\nresponse = shield.call(http_client.get, \"https://api.example.com/v1/chat\")\n```\n\n### Honor `Retry-After`\n\n```python\nfrom rate_limit_shield import parse_retry_after\nimport time\n\ntry:\n    resp = shield.call(do_request)\nexcept SomeHTTPError as e:\n    wait = parse_retry_after(e.response.headers)\n    if wait:\n        time.sleep(wait)\n```\n\n## Buyer\n\n- **SRE / Platform Reliability** - eliminates the hand-rolled retry sprawl across services\n- **MLOps / Platform** - pairs natively with model routers and per-model quotas\n- **Cost / FinOps** - circuit breakers cap blast radius when a vendor degrades\n\n## Pairs With\n\n- [`agent-router`](https://github.com/mizcausevic-dev/agent-router) - router decides which model to hit; shield protects each model's quota\n- [`agent-canary`](https://github.com/mizcausevic-dev/agent-canary) - per-version rate limits during progressive rollout\n- [`agentobserve`](https://github.com/mizcausevic-dev/agentobserve) - emit shield state into your observability stack\n\n## Roadmap\n\n- [ ] Async API (`AsyncShield`, `AsyncTokenBucket`)\n- [ ] Redis-backed distributed bucket for multi-pod deployments\n- [ ] Prometheus metrics adapter\n- [ ] Bulkhead isolation primitive\n- [ ] HTTP-Date format for `Retry-After`\n- [ ] PyPI release\n\n## Doctrine\n\n\u003e *\"Premature optimization is the root of all evil. But premature retry storms are the root of all incidents.\"*\n\nThree rules:\n\n1. **Cap the blast radius.** Breakers stop you from DoSing your own vendor.\n2. **Jitter or die.** Synchronized retries are how a 500 becomes a 5,000.\n3. **Honor `Retry-After`.** The vendor told you when to come back. Listen.\n\n## License\n\nMIT - see [LICENSE](./LICENSE).\n\n---\n\nBuilt by [Mirza Causevic](https://github.com/mizcausevic-dev) - Part of the\n[mizcausevic-dev](https://github.com/mizcausevic-dev) AI platform engineering portfolio.\n\n---\n\n**Connect:** [LinkedIn](https://www.linkedin.com/in/mirzacausevic/) · [Kinetic Gain](https://kineticgain.com) · [Medium](https://medium.com/@mizcausevic/) · [Skills](https://mizcausevic.com/skills/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmizcausevic-dev%2Frate-limit-shield","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmizcausevic-dev%2Frate-limit-shield","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmizcausevic-dev%2Frate-limit-shield/lists"}