{"id":32791309,"url":"https://github.com/boshu2/12-factor-agentops","last_synced_at":"2025-11-05T13:01:20.533Z","repository":{"id":322491465,"uuid":"1089708404","full_name":"boshu2/12-factor-agentops","owner":"boshu2","description":" DevOps + SRE principles for operating LLM applications reliably at scale. Complementary to 12-Factor Agents for building","archived":false,"fork":false,"pushed_at":"2025-11-04T18:02:08.000Z","size":146,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-11-04T20:10:15.095Z","etag":null,"topics":["12-factor","agent-orchestration","agentops","agents","ai-agents","ai-agents-framework","ai-operations","argocd","context-engineering","devops","flux","gitops","infrastructure-as-code","kubernetes","kyverno","llm","openshift","platform-engineering","production-operations","sre"],"latest_commit_sha":null,"homepage":"","language":"Makefile","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/boshu2.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-11-04T17:49:10.000Z","updated_at":"2025-11-04T18:02:13.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/boshu2/12-factor-agentops","commit_stats":null,"previous_names":["boshu2/12-factor-agentops"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/boshu2/12-factor-agentops","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/boshu2%2F12-factor-agentops","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/boshu2%2F12-factor-agentops/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/boshu2%2F12-factor-agentops/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/boshu2%2F12-factor-agentops/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/boshu2","download_url":"https://codeload.github.com/boshu2/12-factor-agentops/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/boshu2%2F12-factor-agentops/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":282823603,"owners_count":26733133,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-11-05T02:00:05.946Z","response_time":58,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["12-factor","agent-orchestration","agentops","agents","ai-agents","ai-agents-framework","ai-operations","argocd","context-engineering","devops","flux","gitops","infrastructure-as-code","kubernetes","kyverno","llm","openshift","platform-engineering","production-operations","sre"],"created_at":"2025-11-05T13:00:21.585Z","updated_at":"2025-11-05T13:01:20.525Z","avatar_url":"https://github.com/boshu2.png","language":"Makefile","readme":"# 12-Factor AgentOps\n\n\u003cdiv align=\"center\"\u003e\n\n**Operational patterns from the intersection: infrastructure FOR AI + AI FOR infrastructure**\n\n\u003ca href=\"https://www.apache.org/licenses/LICENSE-2.0\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/Code-Apache%202.0-blue.svg\" alt=\"Code License: Apache 2.0\"\u003e\u003c/a\u003e\n\u003ca href=\"https://creativecommons.org/licenses/by-sa/4.0/\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/Content-CC%20BY--SA%204.0-lightgrey.svg\" alt=\"Content License: CC BY-SA 4.0\"\u003e\u003c/a\u003e\n\u003cimg src=\"https://img.shields.io/badge/Status-Alpha-orange.svg\" alt=\"Status: Alpha\"\u003e\n\n\u003c/div\u003e\n\n---\n\n\u003e [!IMPORTANT]\n\u003e **Status: Alpha** - Patterns proven at production scale in federal infrastructure. Now validating generalization across domains.\n\u003e\n\u003e **Looking for Context Engineering?** See [12-Factor Agents - Factor 3](https://github.com/humanlayer/12-factor-agents/blob/main/content/factor-03-own-your-context-window.md) by [@dexhorthy](https://github.com/dexhorthy)\n\n---\n\n## The Intersection\n\n**I build GPU/HPC platforms that enable AI workloads.**\n\n**I use AI agents to automate infrastructure operations.**\n\n**I operate both at production scale in federal, security-hardened environments.**\n\nThis framework documents operational patterns from both sides of the AI equation.\n\n---\n\n## The Problem\n\nEveryone's building AI agents. Nobody's figured out how to operate them reliably.\n\n- **Week 1:** \"This is amazing!\"\n- **Week 4:** Errors piling up\n- **Week 8:** Back to manual work\n\nSound familiar? **It's 2015 microservices chaos all over again.**\n\nWe know how to build reliable infrastructure. We know how to build reliable software.\n\n**But operating AI agents in production? We're still figuring that out.**\n\n---\n\n## What This Is\n\nPlatform engineer with 10+ years climbing the IT stack—systems, networking, storage, security, platforms, automation.\n\n**Current work:**\n- Building GPU/HPC infrastructure for AI inference/training workloads (20+ production clusters)\n- Using AI agents to automate platform operations (GitOps validation, runbooks, policy)\n- Operating in mission-critical, multi-tenant, federal environments\n\n**12-Factor AgentOps = Meta-patterns extracted from real production work at this intersection.**\n\n---\n\n## Why This Perspective Matters\n\nMost people have **ONE** of these:\n- Infrastructure ops (no AI exposure)\n- AI/ML engineering (no infrastructure ops)\n- AI agent users (no production operations)\n\n**This framework comes from having ALL THREE:**\n1. Building platforms **FOR** AI workloads\n2. Using AI **TO BUILD** platforms\n3. Operating both at **PRODUCTION SCALE** in **HIGH-STAKES** environments\n\n---\n\n## The Approach\n\n```\nProduction Operations → Extract Patterns → Document → Validate → Refine\n         ↓                     ↓              ↓          ↓          ↓\n   (What works?)        (Why works?)    (Share it)  (Test it)  (Improve it)\n```\n\n1. **Document patterns** proven at production scale\n2. **Extract meta-patterns** that generalize across contexts\n3. **Share early**, validate with community\n4. **Refine** based on diverse implementations\n\n**Not theory. Production.**\n\n---\n\n## The Invitation\n\nIf you're at a similar intersection:\n- Operating AI/ML infrastructure at scale\n- Using AI agents for DevOps/SRE work\n- Building platforms in constrained environments\n\n**Try these patterns. Share what works in your context. Help prove whether operational discipline transfers.**\n\n---\n\n## Framework: The Factors\n\nThe 12 factors are being published as they're validated for generalization.\n\n### Coming Soon\n\n| Factor | Focus | Status |\n|--------|-------|--------|\n| **I: Git as Knowledge OS** | Commits = memory, branches = isolation, logs = audit trail | Documenting |\n| **II: Context Engineering** | JIT loading, 40% rule, progressive disclosure | Documenting |\n| **III: Small Specialized Agents** | Single responsibility, composable workflows | Documenting |\n| **IV: Validation Gates** | Test before deploy, fail fast, rollback easy | Planned |\n| **V: Observability** | Metrics, logs, traces for agent operations | Planned |\n| **VI: Session Continuity** | Pause/resume, state preservation, recovery | Planned |\n\n\u003e [!TIP]\n\u003e **Subscribe to releases** to get notified when factors are published\n\n---\n\n## Background\n\n**Platform Engineer**\n- 10+ years: Systems → Networks → Security → Platforms → Automation → AI\n- 20+ production Kubernetes clusters in federal/DoD environments\n- GPU/HPC infrastructure for AI inference/training\n- AI-assisted infrastructure operations (GitOps, observability, compliance)\n\n**Unfair advantage:** Deep ops + automation + AI fluency + cultural translation\n\n---\n\n## Contributing\n\nEarly-stage documentation of production patterns.\n\n**Want to help?**\n- ✅ Implement patterns in your context\n- ✅ Share results (successes AND failures)\n- ✅ Suggest adaptations for your domain\n- ✅ Challenge assumptions constructively\n\nSee [CLAUDE.md](CLAUDE.md) for AgentOps principles and contribution guidelines.\n\n---\n\n## Attribution \u0026 Inspiration\n\nThis framework builds on foundational work from:\n\n### [12-Factor Apps](https://12factor.net) (Heroku)\nThe original methodology for building software-as-a-service apps. Established principles for:\n- Configuration management\n- Dependency isolation\n- Stateless processes\n- Environment parity\n\n**Their insight:** Operational discipline makes applications reliable and portable.\n\n### [12-Factor Agents](https://github.com/humanlayer/12-factor-agents) (Dex Horthy, HumanLayer)\nFramework for building reliable LLM applications. Pioneered:\n- Context engineering principles\n- Human-in-the-loop patterns\n- Agent reliability practices\n- Production-grade AI systems\n\n**Their insight:** AI agents need the same rigor as traditional software.\n\n### This Project's Focus\n\n**12-Factor AgentOps** extends these foundations to **operations**:\n- Not just building reliable agents (12-Factor Agents covers this)\n- Not just building reliable apps (12-Factor Apps covers this)\n- **Operating AI agents and infrastructure at production scale**\n\nWe document patterns from the intersection: infrastructure FOR AI + AI FOR infrastructure.\n\n---\n\n## Related Work\n\n**If you're building AI agents, read these first:**\n- [12-Factor Agents](https://github.com/humanlayer/12-factor-agents) by [@dexhorthy](https://github.com/dexhorthy) - Building reliable LLM applications\n- [Building Effective Agents](https://www.anthropic.com/engineering/building-effective-agents) by Anthropic - Agent design patterns\n- [The Outer Loop](https://theouterloop.substack.com) by Dex Horthy - AI agent development insights\n\n**If you're operating infrastructure, you know these:**\n- [12-Factor Apps](https://12factor.net) - SaaS application methodology\n- [Site Reliability Engineering](https://sre.google/books/) - Google's SRE practices\n- [DevOps Handbook](https://itrevolution.com/product/the-devops-handbook-second-edition/) - DevOps principles\n\n**This framework sits at the intersection.**\n\n---\n\n## License\n\nCode: [Apache 2.0 License](LICENSE) (permissive, use freely)\n\nDocumentation: [CC BY-SA 4.0 License](LICENSE) (share alike, attribute)\n\nFull license text: [LICENSE](LICENSE)\n\n---\n\n\u003cdiv align=\"center\"\u003e\n\n**Let's make AI agents as reliable as the infrastructure they run on.**\n\n*Patterns proven at production scale in federal infrastructure. Validating generalization across domains.*\n\n\u003c/div\u003e\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fboshu2%2F12-factor-agentops","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fboshu2%2F12-factor-agentops","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fboshu2%2F12-factor-agentops/lists"}