{"id":50944243,"url":"https://github.com/luckeyfaraday/athena-loops","last_synced_at":"2026-06-17T18:07:13.528Z","repository":{"id":365504986,"uuid":"1272270670","full_name":"luckeyfaraday/athena-loops","owner":"luckeyfaraday","description":"Backend-agnostic AI agent orchestration loop in Python — the orchestrator→worker→reviewer pattern as a deterministic harness. Drive any LLM backend (Claude, Codex, opencode, aider); ships as an MCP server + CLI.","archived":false,"fork":false,"pushed_at":"2026-06-17T16:03:35.000Z","size":160,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-17T17:28:15.125Z","etag":null,"topics":["agent-loop","agent-orchestration","agentic-workflows","ai-agents","aider","anthropic","claude","claude-code","codex","coding-agents","llm","mcp","mcp-server","multi-agent","opencode","orchestrator-worker","python"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/luckeyfaraday.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-06-17T12:57:55.000Z","updated_at":"2026-06-17T16:03:39.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/luckeyfaraday/athena-loops","commit_stats":null,"previous_names":["luckeyfaraday/athena-loops"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/luckeyfaraday/athena-loops","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/luckeyfaraday%2Fathena-loops","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/luckeyfaraday%2Fathena-loops/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/luckeyfaraday%2Fathena-loops/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/luckeyfaraday%2Fathena-loops/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/luckeyfaraday","download_url":"https://codeload.github.com/luckeyfaraday/athena-loops/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/luckeyfaraday%2Fathena-loops/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34459765,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-17T02:00:05.408Z","response_time":127,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent-loop","agent-orchestration","agentic-workflows","ai-agents","aider","anthropic","claude","claude-code","codex","coding-agents","llm","mcp","mcp-server","multi-agent","opencode","orchestrator-worker","python"],"created_at":"2026-06-17T18:07:08.046Z","updated_at":"2026-06-17T18:07:13.521Z","avatar_url":"https://github.com/luckeyfaraday.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# agentloop — backend-agnostic AI agent orchestration loop for Python\n\n**agentloop** is a lightweight Python framework for multi-agent orchestration. It\nimplements the **orchestrator → worker → reviewer pattern** (the AI agent\norchestration loop) as a deterministic harness with a closed feedback loop: a goal\nis decomposed into subtasks, fanned out to worker subagents, aggregated, and run\nthrough a review gate that loops until the work meets its success criteria. One\nloop drives any LLM backend — Anthropic Claude, Claude Code, Codex, opencode, or\naider — through a single `Agent` interface, and it ships as both an **MCP server**\nand a plain CLI so any coding agent can call it.\n\nThe design principle: **the loop is a harness (deterministic code), not a skill.**\nA prompt can *describe* \"decompose, review, loop until done\" but can't *guarantee*\nit. So the control flow lives in code, and the model-facing judgement (how to\ndecompose, the review rubric) lives in swappable prompts. One harness drives any\nbackend through a single `Agent` interface.\n\n\u003e ![Peter Steinberger: \"You should be designing loops that prompt your agents.\"](docs/peterloops.png)\n\u003e\n\u003e The tweet that started it all — [Peter Steinberger (@steipete)](https://twitter.com/steipete):\n\u003e *\"You shouldn't be prompting coding agents anymore. You should be designing **loops**\n\u003e that prompt your agents.\"* agentloop is that idea as a reusable harness.\n\n```\n            ┌──────────── harness (this package) ────────────┐\ngoal ─▶ decompose ─▶ fan-out to subagents ─▶ aggregate ─▶ review gate ─▶ done?\n            ▲                                                          │ no\n            └──────────────── feedback: refine plan ◀──────────────────┘\n```\n\n## Quick start\n\n```bash\npython3 -m examples.run_demo        # zero-dependency MockAgent\npython3 -m pytest                   # 6 tests, no deps\n```\n\n```python\nfrom agentloop import Orchestrator, Budget\nfrom agentloop.adapters import MockAgent\n\norch = Orchestrator(MockAgent(), budget=Budget(max_iterations=4))\nresult = orch.run(\n    goal=\"Write a briefing on the orchestrator-worker pattern.\",\n    success_criteria=\"Covers decomposition, execution, review, and the feedback loop.\",\n)\nprint(result.completed, result.iterations, result.stop_reason)\nprint(result.final_output)\n```\n\n## Use a real model\n\n```bash\npip install -e \".[claude]\"\nexport ANTHROPIC_API_KEY=sk-...\npython3 -m examples.run_demo --claude\n```\n\n## Plug into any coding agent\n\nThe loop is pluggable in two directions, both thin wrappers over the `Agent` seam:\n\n**Inward — coding agents *are* the workers.** `CliAgent` runs each role\n(decomposer / subagent / reviewer) through a headless coding-agent CLI, so the\nworkers get that agent's tools, file access, and repo context:\n\n```python\nfrom agentloop import Orchestrator\nfrom agentloop.adapters import CliAgent\n\n# Point the worker at a repo and let it actually edit files headlessly:\nagent = CliAgent.claude_code(cwd=\"/path/to/repo\", skip_permissions=True)\norch = Orchestrator(agent)                     # or .codex() / .opencode() / .aider()\nresult = orch.run(goal=\"Add a /health endpoint + test\", success_criteria=\"test passes\")\n```\n\nKnobs for autonomous coding workers:\n- `cwd=...` — run the worker inside a specific repo (works for every preset).\n- `skip_permissions=True` — let the worker use tools without prompting\n  (`--dangerously-skip-permissions` / `--dangerously-bypass-approvals-and-sandbox` /\n  `--yes-always`). Needed for headless coding, but it bypasses all safety prompts —\n  point it at a worktree or throwaway branch, not your main checkout.\n- `timeout=...` — seconds to cap **each** worker CLI call. **Default is `None`\n  (no cap)**: a real coding worker is slow and unpredictable, so a short per-call\n  timeout just kills it mid-task and throws the work away. Bound the *run* instead\n  with `Budget(max_seconds=...)`, which is checked between iterations.\n- `verify_commands=[...]` — run real commands after each worker iteration and feed\n  failures back into the next loop. Use this for deterministic checks like\n  `python3 -m pytest`, `npm test`, or `npx playwright test`. Any failing verifier\n  blocks completion even if the reviewer would otherwise accept the work.\n\nFor big builds, keep subgoals small (the worker has to finish one in a single\ncall) and give the loop room with `max_iterations`; one cold worker can't build\neverything in one shot.\n\nAuth piggybacks on the CLI's own login, so a Claude.ai / ChatGPT **subscription\nOAuth** session works with no API key.\n\n**Isolated runs.** The safe default for `skip_permissions` is to run inside a\nthrowaway git worktree on its own branch — your main checkout is never touched:\n\n```python\nfrom agentloop import Orchestrator, worktree\nfrom agentloop.adapters import CliAgent\n\nwith worktree(\"/path/to/repo\") as wt:               # new branch + checkout\n    agent = CliAgent.claude_code(cwd=wt.path, skip_permissions=True)\n    Orchestrator(agent).run(goal=\"...\", success_criteria=\"...\")\n    print(wt.changed_files())   # what the run touched\n    wt.commit(\"agentloop run\")  # optional: persist on the branch\n```\n\nCleanup mirrors the harness's own worktrees: `cleanup=\"auto\"` (default) **keeps**\nthe worktree iff the run changed something (so you can inspect/merge the branch)\nand removes it if it left nothing; `\"always\"` / `\"never\"` force the choice.\n\n**Partial work is never lost.** When you orchestrate against a repo, the loop\n**commits the worktree after every iteration** (`agentloop: iteration N`). So if\na later iteration fails or a budget guard stops the run, each completed\niteration's work is preserved as a checkpoint commit on the branch — recoverable,\nnot discarded. The result's `worktree.checkpoints` lists them. Workers are also\ntold to **inspect the working directory and continue** from prior work rather\nthan restart, so a retry builds on what's already there instead of clobbering it.\n\nThe example runner isolates automatically when given a repo:\n\n```bash\npython3 -m examples.run_with_cli_agent claude /path/to/repo\n```\n\n```bash\npython3 -m examples.run_with_cli_agent claude   # codex | opencode | aider\n```\n\nCustom CLI? It's just a command template (`{prompt}`, `{system}`, `{combined}`;\nno prompt placeholder ⇒ text is piped on stdin):\n\n```python\nCliAgent([\"my-agent\", \"--system\", \"{system}\", \"--ask\", \"{prompt}\"])\n```\n\nPresets are starting points — CLI flags vary by version; confirm yours and tweak\n`agentloop/adapters/cli.py`. A non-zero exit or timeout becomes a FAILED task\n(with retries), not a silent wrong answer.\n\n**Outward — a coding agent *calls* the loop.** An MCP server exposes the loop as a\ntool, so any MCP-aware agent (Claude Code, Cursor, Codex, opencode, Cline, Windsurf)\ncan invoke it. The caller need not be the worker — pick `backend` independently.\n\n```bash\npip install -e \".[mcp]\"                # installs the `mcp` SDK\npython3 -m agentloop.mcp_server        # stdio transport\n```\n\nTools:\n- `orchestrate(goal, success_criteria, backend, cwd, max_iterations, skip_permissions, isolate, model, verify_commands, verify_timeout, detach)`\n  — runs the loop; returns either the result or `{ status: \"needs_input\", questions[], token }`.\n  With `detach=true` it returns `{ status: \"running\", run_id, … }` immediately (see below).\n- `orchestrate_resume(token, answers, detach)` — continues a run that asked for input\n- `orchestrate_status(run_id)` — light status of a detached run (phase, iteration, running)\n- `orchestrate_tail(run_id, cursor, limit)` — the events a detached run produced since `cursor`\n- `orchestrate_result(run_id, wait, timeout)` — the final result of a detached run\n- `orchestrate_list()` — every detached run this server has started, with status\n- `list_backends()` — the worker engines this server can drive\n- `doctor(cwd?)` — non-invasive diagnostics for backend CLI availability, target\n  directory access, and timeout interpretation\n\nA completed result is structured:\n`{ completed, iterations, stop_reason, final_output, summary, history[], worktree? }`\n— `summary` is a one-line human-readable digest for the calling agent to show.\nWhile it runs, `orchestrate` streams a `notifications/progress` update when it\nstarts and after every iteration (e.g. `iteration 2/4: 3/3 subgoals ok, gates\npass, goal incomplete`), so the caller sees live status instead of a bare\nspinner.\nWhen `cwd` is given it runs in an isolated worktree (see above) by default.\n\n**Live visibility — don't go blind until it's done.** A blocking `orchestrate`\ncall hides everything until the whole loop returns. Pass `detach=true` to run it\non a background thread and get a `run_id` back immediately:\n\n```\norchestrate(goal=…, backend=\"claude_code\", cwd=\"/repo\", detach=true)\n  -\u003e { status: \"running\", run_id, run_dir, events_path, tail_command }\norchestrate_tail(run_id, cursor)      -\u003e { events[], cursor, running, more }\norchestrate_status(run_id)            -\u003e { phase, iteration, running, … }\norchestrate_result(run_id, wait=true) -\u003e the final result, once done\n```\n\nThe calling agent starts the run, then **interleaves polling `orchestrate_tail`\nwith talking to you** — so you both see the loop work in real time: each subgoal\nas it's planned, every worker's start/finish and an output preview, the\nverification results, and the reviewer's verdict. Pass the returned `cursor` back\nto `orchestrate_tail` to get only what's new.\n\nEvery event is also appended to a durable JSONL log you can **`tail -f` from any\nterminal**, independent of the agent:\n\n```\n\u003ccwd\u003e/.agentloop/runs/\u003crun_id\u003e/\n  events.jsonl     # one event per line — tail -f this\n  status.json      # latest phase / iteration / running\n  result.json      # the final result (written once, at the end)\n  workers/         # full stdout of each worker call: iter\u003cN\u003e_\u003csubgoal\u003e.out\n```\n\nWorker output previews ride inline in the event stream; each worker's *full*\noutput is written to `workers/iter\u003cN\u003e_\u003csubgoal\u003e.out` and referenced by\n`data.output_path`. Detached runs also sidestep the MCP request-timeout problem\nentirely: the tool returns at once, so there's no long-blocking call to time out.\n\nTimeouts have three different layers. The MCP host/client has its own request\ntimeout; if it reports `-32001 Request timed out`, that often means the host\nstopped waiting before `orchestrate` returned, not that a backend failed to\nspawn. Configure long-running MCP hosts to wait at least `600000` ms. Separately,\n`orchestrate(timeout=...)` is a hard cap for each CLI worker subprocess, while\n`max_seconds` is a cooperative loop budget checked between phases/iterations and\ndoes not interrupt one blocked subprocess. Have agents call `doctor()` before\nguessing about missing CLIs or broken backend spawning.\n\n**Plug into Claude Code.** Point `PYTHONPATH` at the repo so the server resolves\nthe package no matter where Claude Code launches it (no install needed):\n\n```bash\nclaude mcp add athena-loops --scope user \\\n  -e PYTHONPATH=/abs/path/to/athena-loops \\\n  -- python3 -m agentloop.mcp_server\n\nclaude mcp list                        # -\u003e athena-loops: ✔ Connected\n```\n\nClaude Code's CLI registration does not expose a per-server request-timeout flag;\nuse `doctor()` to distinguish host timeout symptoms from backend availability.\n\n`--scope user` makes it available in every project; use `local` for just this\none, or `project` to write a shared `.mcp.json`. If you `pip install -e .`\ninstead, the `agentloop-mcp` console script is on PATH and the `-e PYTHONPATH=…`\nis unnecessary: `claude mcp add athena-loops -- agentloop-mcp`.\n\n**Plug into Cursor / Cline / Windsurf** (`.mcp.json` / `mcp.json`):\n\n```json\n{\n  \"mcpServers\": {\n    \"athena-loops\": {\n      \"command\": \"python3\",\n      \"args\": [\"-m\", \"agentloop.mcp_server\"],\n      \"env\": { \"PYTHONPATH\": \"/abs/path/to/athena-loops\" },\n      \"timeout\": 600000\n    }\n  }\n}\n```\n\nThen (restart the session first) ask the host agent to \"use agentloop to\norchestrate: \u003cgoal\u003e\", choosing a `backend` (`claude_code`, `codex`, `mock`, …)\nfor the workers. With `backend=claude_code` the workers spawn nested `claude`\nsub-sessions.\n\nFor agents that *don't* speak MCP but can run a shell, there's a plain CLI over\nthe same contract:\n\n```bash\nagentloop run --goal \"Add a /health endpoint + test\" --criteria \"test passes\" \\\n  --backend claude_code --cwd . --skip-permissions --json \\\n  --verify \"python3 -m pytest\" --verify \"npx playwright test\"\nagentloop backends\n```\n\n`--json` prints the full result on stdout; `--progress` streams one NDJSON line\nper iteration on stderr; `--goal -` / `--goal-file` read long prompts. `--verify`\nis repeatable and runs each command after every iteration in the same repo/worktree\nas the worker. Exit code is `0` if completed, `1` if a budget guard stopped it,\n`2` on error — so scripts and agents can branch on the outcome.\n\n## How the user gives input (intake \u0026 clarification)\n\nBefore any planning, the loop runs an **intake** phase: it can propose success\ncriteria (if you didn't give any) and ask the clarifying questions it needs —\nthe diagram's \"App Follow-up Questions\". *Where* the human answers is a second\nswappable seam, `Interaction`, mirroring the `Agent` seam:\n\n| Surface | Interaction | UX |\n|---|---|---|\n| Python / headless (default) | `AutoInteraction` | never blocks; proceeds with best judgment |\n| Interactive terminal | `ConsoleInteraction` | prompts the human via `input()` |\n| MCP / scripted CLI | `SuspendInteraction` | returns `needs_input` + a resume token instead of blocking |\n\n**Terminal wizard** — omit `--goal` and it asks; criteria and clarifying\nquestions are prompted inline:\n\n```bash\nagentloop run                         # Goal\u003e … then proposes criteria + asks questions\nagentloop run --goal \"…\" --non-interactive   # never prompt; use defaults\n```\n\n**Inside another agent (MCP)** — `orchestrate(...)` returns\n`{ status: \"needs_input\", questions: [...], token }` when it needs answers; the\nhost agent collects them from the user and calls\n`orchestrate_resume(token, answers)` to continue. Same flow on the CLI for tools:\n\n```bash\nagentloop run --goal \"build an API\" --ask        # prints questions + token, exits 3\nagentloop run --resume \u003ctoken\u003e --answer \"FastAPI\" --answer \"Postgres\"\n```\n\n**Python** — pass your own:\n\n```python\nfrom agentloop import Orchestrator, ConsoleInteraction\nOrchestrator(agent, interaction=ConsoleInteraction()).run(goal=\"…\")  # criteria optional\n```\n\n## The seam (where to put what)\n\n| Layer | Lives in | What it owns |\n|-------|----------|--------------|\n| **Harness** | `orchestrator.py`, `scheduler.py`, `types.py` | the loop, fan-out, aggregation, review gate, termination guards, failure capture |\n| **Agent seam** | `agent.py` + `adapters/` | one `Agent.run(request) -\u003e response` per backend (Mock, Claude, …) |\n| **Skills** | `roles.py` | the prompts *inside* each box: decomposer, subagent, reviewer rubric |\n\nTo support a new backend, implement one method:\n\n```python\nfrom agentloop.agent import Agent, AgentRequest, AgentResponse\n\nclass MyAgent(Agent):\n    def run(self, request: AgentRequest) -\u003e AgentResponse:\n        text = call_your_model(system=request.system, prompt=request.prompt)\n        return AgentResponse(text=text)\n```\n\nThe three roles (Orchestrator, Subagent, Reviewer) are the *same* `Agent`\ninvoked with different system prompts — not separate classes.\n\n## Two gaps in the original diagram, handled here\n\n- **Termination guards** — `Budget` caps iterations, wall-clock time, and total\n  agent calls so the NO-branch can't spin forever.\n- **Subagent failure handling** — a subagent that raises becomes a `FAILED`\n  `TaskResult` (with retries), visible to the reviewer and the feedback step,\n  instead of crashing the run or being silently dropped.\n\n## Layout\n\n```\nagentloop/\n  orchestrator.py   # the loop (deterministic harness) — emits live events\n  scheduler.py      # parallel/sequential subagent execution + retries\n  roles.py          # role prompts — the tunable \"skills\"\n  agent.py          # Agent interface + robust JSON extraction\n  types.py          # Budget, Subgoal, TaskResult, ReviewResult, LoopState, LoopEvent, ...\n  runs.py           # detached background runs + the tail-able event log\n  mcp_server.py     # MCP tools: orchestrate(detach), orchestrate_tail/status/result\n  adapters/\n    mock.py         # deterministic, dependency-free (demo + tests)\n    claude.py       # Anthropic SDK backend\nexamples/run_demo.py\ntests/test_orchestrator.py\n```\n\n## FAQ\n\n**What is agentloop?** A backend-agnostic Python framework that implements the AI\nagent orchestration loop — the orchestrator–worker–reviewer pattern with a closed\nfeedback loop — as deterministic harness code rather than a prompt.\n\n**What is the orchestrator–worker–reviewer pattern?** An LLM agent decomposes a\ngoal into subtasks (orchestrator), parallel worker subagents execute them, and a\nreviewer gates the aggregated result against success criteria, looping with\nrefined plans until done or a budget guard stops it.\n\n**Which LLM backends does agentloop support?** Any model behind a single\n`Agent.run()` method. Built-in adapters cover a dependency-free MockAgent, the\nAnthropic Claude SDK, and headless coding-agent CLIs — Claude Code, Codex,\nopencode, and aider — via `CliAgent`.\n\n**How do I orchestrate multiple coding agents from Claude Code, Cursor, or Cline?**\nRun agentloop as an MCP server (`python3 -m agentloop.mcp_server`) and call its\n`orchestrate` tool, or use the plain `agentloop run` CLI from any agent that has a\nshell.\n\n**Does agentloop need an API key?** No — when you drive it through a coding-agent\nCLI it piggybacks on that CLI's own login, so a Claude.ai or ChatGPT subscription\nOAuth session works without an `ANTHROPIC_API_KEY`.\n\n**How does agentloop avoid infinite agent loops?** A `Budget` caps iterations,\nwall-clock time, and total agent calls, and a failing subagent becomes a `FAILED`\ntask result (with retries) instead of crashing or silently vanishing.\n\n**Is this like Peter Steinberger Loops / the agent loop technique?** Yes — it's the\nsame family of idea popularized by Peter Steinberger's writing on running coding\nagents in a loop (\"Peter Steinberger Loops\"). agentloop turns that pattern into a\nreusable, backend-agnostic harness with an explicit review gate and budget guards,\nrather than a one-off shell script.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fluckeyfaraday%2Fathena-loops","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fluckeyfaraday%2Fathena-loops","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fluckeyfaraday%2Fathena-loops/lists"}