{"id":50454193,"url":"https://github.com/mizcausevic-dev/agent-canary","last_synced_at":"2026-06-01T01:05:44.697Z","repository":{"id":356571893,"uuid":"1233130531","full_name":"mizcausevic-dev/agent-canary","owner":"mizcausevic-dev","description":"Progressive rollout, shadow mode, and auto-rollback for AI agents. Sticky-percent routing with promote/rollback gates driven by real metrics. Platform engineering reliability for the agent era.","archived":false,"fork":false,"pushed_at":"2026-05-08T16:48:06.000Z","size":14,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-08T18:36:40.849Z","etag":null,"topics":["ai-agents","canary","deployment","feature-flags","platform-engineering","progressive-rollout","python","reliability","shadow-deployment","sre"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mizcausevic-dev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-08T16:14:41.000Z","updated_at":"2026-05-08T16:48:10.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/mizcausevic-dev/agent-canary","commit_stats":null,"previous_names":["mizcausevic-dev/agent-canary"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/mizcausevic-dev/agent-canary","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcaus
evic-dev%2Fagent-canary","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcausevic-dev%2Fagent-canary/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcausevic-dev%2Fagent-canary/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcausevic-dev%2Fagent-canary/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mizcausevic-dev","download_url":"https://codeload.github.com/mizcausevic-dev/agent-canary/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcausevic-dev%2Fagent-canary/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33755379,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-31T02:00:06.040Z","response_time":95,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agents","canary","deployment","feature-flags","platform-engineering","progressive-rollout","python","reliability","shadow-deployment","sre"],"created_at":"2026-06-01T01:05:44.621Z","updated_at":"2026-06-01T01:05:44.692Z","avatar_url":"https://github.com/mizcausevic-dev.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# agent-canary 🚦\n\n\u003e Progressive rollout, shadow mode, and auto-rollback for AI agents.\n\u003e Sticky-percent routing 
with promote/rollback gates driven by real metrics.\n\n[![CI](https://github.com/mizcausevic-dev/agent-canary/actions/workflows/ci.yml/badge.svg)](https://github.com/mizcausevic-dev/agent-canary/actions/workflows/ci.yml)\n![Python](https://img.shields.io/badge/python-3.10%2B-blue)\n![License](https://img.shields.io/badge/license-MIT-green)\n![Status](https://img.shields.io/badge/status-alpha-orange)\n\n---\n\n## Why\n\nEvery team rolling a new agent or model version into production lives in fear\nof the same thing: cutting over 100% of traffic and finding out at 3 AM that\nsomething subtle broke. The fix is universally agreed on - **progressive\nrollout** - and universally hand-rolled, badly.\n\n**agent-canary ships the staged rollout / shadow / auto-rollback you keep\nmeaning to build.**\n\n- Sticky % routing: a user assigned to canary STAYS on canary\n- Shadow mode: mirror traffic to v1.1 at zero user impact\n- Stage gates: 1% -\u003e 5% -\u003e 25% -\u003e 50% -\u003e 100% with success-rate + latency thresholds\n- Auto-rollback: canary materially worse than stable? Done. 
Zero %.\n\n## What\n\nFive primitives, zero runtime dependencies:\n\n| Component | Purpose |\n|---|---|\n| `CanaryRouter` | Sticky-key % routing via consistent hashing (MD5) |\n| `VersionMetrics` | Thread-safe rolling-window success rate + latency percentiles |\n| `Stage` / `Rollout` | Staged FSM with min duration / min samples / success / p95 gates |\n| `ShadowDeployment` | Mirror calls to a candidate fn in a background thread, swallow shadow errors |\n| `AgentCanary` | Facade tying decision + rollout + metrics + auto-decisions |\n\n## Architecture\n\n```\n                  +---------------------+\n                  |   AgentCanary       |\n                  |  (single facade)    |\n                  +----------+----------+\n                             |\n            +----------------+----------------+\n            |                |                |\n            v                v                v\n    +-------------+  +--------------+  +---------------+\n    |CanaryRouter |  |  Rollout     |  |VersionMetrics |\n    |(sticky %)   |  |  (FSM gates) |  |(per-version)  |\n    +------+------+  +------+-------+  +-------+-------+\n           |                |                  |\n           v                v                  v\n    decide(key) -\u003e  can_promote(metrics)?  
record(ok, ms)\n    \"stable\" or     PROMOTE / HOLD /       success rate,\n    \"canary\"        ROLLBACK               p50/p95/p99\n```\n\n## Install\n\n```bash\npip install agent-canary\n```\n\nOr from source:\n\n```bash\ngit clone https://github.com/mizcausevic-dev/agent-canary\ncd agent-canary\npip install -e \".[dev]\"\npytest\n```\n\n## Quickstart\n\n### Progressive rollout with auto-decisions\n\n```python\nimport time\n\nfrom agent_canary import AgentCanary, AutoAction, Rollout\n\ncanary = AgentCanary(\n    stable_version=\"agent-v1.0.0\",\n    canary_version=\"agent-v1.1.0\",\n    rollout=Rollout.standard(),  # 1% -\u003e 5% -\u003e 25% -\u003e 50% -\u003e 100%\n)\n\n# In your request handler (call_agent is your own dispatch function):\ndef handle(user_id: str, prompt: str):\n    version = canary.route(sticky_key=user_id)\n    start = time.perf_counter()\n    try:\n        result = call_agent(version, prompt)\n        canary.record(version, success=True,\n                     latency_ms=(time.perf_counter()-start)*1000)\n        return result\n    except Exception:\n        canary.record(version, success=False,\n                     latency_ms=(time.perf_counter()-start)*1000)\n        raise\n\n# In a periodic background task (every minute or so):\ndef evaluate():\n    action = canary.auto_decide()\n    if action != AutoAction.HOLD:\n        print(f\"Applying: {action.value}\")\n    canary.apply(action)\n```\n\n### Shadow mode (zero user impact)\n\n```python\nimport logging\n\nfrom agent_canary import ShadowDeployment\n\nlog = logging.getLogger(__name__)\n\ndef diff_compare(stable_result, shadow_result):\n    if stable_result != shadow_result:\n        log.info(\"divergence\", extra={\"stable\": stable_result, \"shadow\": shadow_result})\n\nshadowed = ShadowDeployment(\n    stable_fn=stable_agent.invoke,\n    shadow_fn=canary_agent.invoke,\n    comparator=diff_compare,\n)\n\n# Stable result is what user sees. 
Canary runs in the background.\nresult = shadowed.call(prompt)\n```\n\n### Custom rollout stages\n\n```python\nfrom agent_canary import Rollout, Stage\n\naggressive = Rollout(stages=[\n    Stage(percent=0.05, min_duration_seconds=300,  min_samples=200, success_threshold=0.99),\n    Stage(percent=0.50, min_duration_seconds=600,  min_samples=500, success_threshold=0.99, max_p95_ms=400),\n    Stage(percent=1.00, min_duration_seconds=0,    min_samples=0,   success_threshold=0.99),\n])\n```\n\n## Buyer\n\n- **Platform Engineering** - drop-in canary infrastructure for agent fleets\n- **SRE** - blast-radius control for model and prompt deployments\n- **ML Platform / MLOps** - works for ANY versioned dispatchable: prompt, model, full agent\n\n## Pairs With\n\n- [`agent-router`](https://github.com/mizcausevic-dev/agent-router) - decides WHICH version exists; agent-canary decides WHO sees which\n- [`rate-limit-shield`](https://github.com/mizcausevic-dev/rate-limit-shield) - per-version quotas during canary\n- [`identity-mesh`](https://github.com/mizcausevic-dev/identity-mesh) - identity-based canary cohorts (e.g. only research-* agents)\n- [`agentobserve`](https://github.com/mizcausevic-dev/agentobserve) - emit `canary.status()` snapshots into your observability stack\n\n## Roadmap\n\n- [ ] Persistent state backend (Redis) for multi-pod deployments\n- [ ] Cohort-based routing (identity, region, tier)\n- [ ] Statistical significance gates (CUPED, sequential testing)\n- [ ] Prometheus / OpenTelemetry exporter\n- [ ] PyPI release\n\n## Doctrine\n\n\u003e *\"Two truths in production: every deploy is a canary you didn't notice,\n\u003e and the only safe rollout is one you can roll back.\"*\n\nThree rules:\n\n1. **Sticky routing.** A user assigned to canary STAYS on canary - flapping is worse than slow rollouts.\n2. **Shadow before rollout.** Mirror traffic at zero user impact. Find the breakages before you cut over.\n3. 
**Auto-rollback wins.** Don't trust humans to wake up at 3 AM. Trust the gate.\n\n## License\n\nMIT - see [LICENSE](./LICENSE).\n\n---\n\nBuilt by [Mirza Causevic](https://github.com/mizcausevic-dev) - Part of the\n[mizcausevic-dev](https://github.com/mizcausevic-dev) AI platform engineering portfolio.\n\n---\n\n**Connect:** [LinkedIn](https://www.linkedin.com/in/mirzacausevic/) · [Kinetic Gain](https://kineticgain.com) · [Medium](https://medium.com/@mizcausevic/) · [Skills](https://mizcausevic.com/skills/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmizcausevic-dev%2Fagent-canary","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmizcausevic-dev%2Fagent-canary","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmizcausevic-dev%2Fagent-canary/lists"}