{"id":48393167,"url":"https://github.com/tim-osterhus/millrace","last_synced_at":"2026-04-26T03:01:12.974Z","repository":{"id":348033272,"uuid":"1196214186","full_name":"tim-osterhus/millrace","owner":"tim-osterhus","description":"A governed, autonomous runtime for agent work too long, too stateful, or too recovery-sensitive to survive a single session.","archived":false,"fork":false,"pushed_at":"2026-04-19T13:27:30.000Z","size":3287,"stargazers_count":2,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-19T15:38:18.351Z","etag":null,"topics":["agentic-ai-development","agentic-coding","agentic-engineering","agentic-framework","agentic-workflow","autonomous-agents","harness-engineering"],"latest_commit_sha":null,"homepage":"https://millrace.ai","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tim-osterhus.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-30T13:35:35.000Z","updated_at":"2026-04-19T13:27:22.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/tim-osterhus/millrace","commit_stats":null,"previous_names":["tim-osterhus/millrace"],"tags_count":34,"template":false,"template_full_name":null,"purl":"pkg:github/tim-osterhus/millrace","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tim-osterhus%2Fmillrace","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tim-osterhus%2Fmillrace/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tim-osterhus%2Fmillrace/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tim-osterhus%2Fmillrace/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tim-osterhus","download_url":"https://codeload.github.com/tim-osterhus/millrace/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tim-osterhus%2Fmillrace/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32284333,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-25T18:29:39.964Z","status":"online","status_checked_at":"2026-04-26T02:00:05.962Z","response_time":129,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agentic-ai-development","agentic-coding","agentic-engineering","agentic-framework","agentic-workflow","autonomous-agents","harness-engineering"],"created_at":"2026-04-06T01:04:08.484Z","updated_at":"2026-04-26T03:01:12.968Z","avatar_url":"https://github.com/tim-osterhus.png","language":"Python","readme":"# Millrace\n\n[![PyPI](https://img.shields.io/pypi/v/millrace-ai.svg)](https://pypi.org/project/millrace-ai/)\n[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)\n[![License](https://img.shields.io/github/license/tim-osterhus/millrace.svg)](LICENSE)\n\n\u003cp align=\"center\"\u003e\n  \u003cimg\n    src=\"docs/assets/images/millrace-icon-signal-transparent-glow.png\"\n    alt=\"Millrace signal mark\"\n    width=\"260\"\n  /\u003e\n\u003c/p\u003e\n\n## Other agents win sprints. Millrace wins marathons.\n\nRaw agent harnesses - Codex, Claude Code, Gemini - are built for sprints. Give\nthem a tight spec in a greenfield repo, let them rip, ship the result. They are\nvery good at this.\n\nMillrace starts where those runs end. When the work spans sessions,\naccumulates state, needs to survive a crash without losing context, or has to\ngate on real acceptance criteria rather than \"the agent said it was done\",\nthat is the problem Millrace was built for.\n\nMillrace is a governed runtime for long-running agent work. The harness still\ndoes the local stage work. Millrace owns the queue, the compiled plan, runtime\nstate, recovery paths, and closure behavior around that work.\n\n\u003e If you are an agent and your harness supports repo-local skills, for example\n\u003e Codex or Claude Code, install\n\u003e `docs/skills/millrace-ops-agent-manual/SKILL.md` to become a Millrace\n\u003e Operator before deciding whether work should enter Millrace. It teaches when\n\u003e to use Millrace, when not to, and how to run, monitor, and intervene safely.\n\n### Agent Install Examples\n\n- Codex: if your environment supports repo-local skills, load\n  `docs/skills/millrace-ops-agent-manual/` as the local skill package. The\n  entry file is `docs/skills/millrace-ops-agent-manual/SKILL.md`.\n- Claude Code: use the same package path or entry file as the local project\n  skill, then follow the autonomy handshake inside the skill before deciding\n  whether work should enter Millrace.\n- Other harnesses: if the harness accepts a `SKILL.md`, point it at\n  `docs/skills/millrace-ops-agent-manual/SKILL.md`; if it supports directory\n  packages, prefer `docs/skills/millrace-ops-agent-manual/`.\n\n## Runtime Lifecycle\n\nNeed the full implementation-accurate flow instead of the landing-page view?\nSee [the standalone lifecycle chart](docs/runtime/millrace-runtime-lifecycle-diagram.md).\n\n```mermaid\nflowchart TD\n    A[\"Bootstrap workspace and compile the frozen plan\"] --\u003e B{\"Deterministic tick loop\"}\n    B --\u003e C[\"Process control inputs:\u003cbr/\u003emailbox commands, watcher intake, reconciliation\"]\n    C --\u003e D{\"Scheduler claim decision\"}\n    D -- planning incident or spec --\u003e E[\"Planning loop:\u003cbr/\u003einterpret specs and incidents,\u003cbr/\u003egovern remediation, emit executable work\"]\n    D -- execution task --\u003e F[\"Execution loop:\u003cbr/\u003ebuild, verify, repair, recover, update\"]\n    D -- nothing claimable --\u003e G{\"Completion behavior eligible?\"}\n    G -- yes --\u003e H[\"Arbiter closure pass\"]\n    G -- no --\u003e I[\"Idle until the next tick\"]\n    E --\u003e J[\"Runtime applies results,\u003cbr/\u003epersists state, and routes the next action\"]\n    F --\u003e J\n    H --\u003e J\n    J --\u003e B\n    I --\u003e B\n```\n\nMillrace does not try to replace raw harness reasoning with a thicker prompt.\nIt wraps long-horizon work in a real runtime:\n\n- compile happens at startup and again only on explicit config reload\n- planning and execution are claim domains inside one deterministic scheduler,\n  not two concurrent lanes\n- stage results are routed by the runtime, not by direct stage-to-stage\n  handoffs\n- Arbiter activates only when the scheduler finds no lineage work left and\n  closure behavior is actually ready\n\nThe shipped core already includes separate planning and execution loops, typed\nterminal results, compiler-governed completion behavior, and persisted run\nartifacts for post-run inspection.\n\n## Early Proof\n\nMillrace already has a useful public benchmark, and the right read is not\n\"Millrace already beats raw Codex on absolute final quality.\" The useful read\nis that framework-driven orchestration is already competitive on hard,\nlong-horizon work while being much more efficient.\n\nOn the first substantive public A/B benchmark, both systems were aimed at the\nsame target: a parity-first modern Fabric port of Aura Cascade, a ten-year-old\nMinecraft mod. The stronger direct-agent condition, raw Codex on `gpt-5.4`\n`xhigh`, finished at `95 / 100`. Millrace, running as a staged daemon workflow\non routed `gpt-5.3-codex` `high` / `xhigh`, finished at `94 / 100`.\n\n| Metric | Raw Codex | Millrace |\n|------|------:|------:|\n| Final score | `95 / 100` | `94 / 100` |\n| Total tokens | `1,071,700,018` | `241,046,303` |\n| Wall-clock span | `72h 23m 20.320s` | `28h 02m 36.972s` |\n| Active runtime | `18h 04m 07.914s` | `12h 36m 15.515s` |\n\nThat means raw Codex used about `4.45x` Millrace's total tokens, took about\n`2.58x` the wall-clock span, and still used about `1.43x` Millrace's active\nruntime.\n\nThat wall-clock gap is not pure model speed. The raw Codex run needed repeated\nmanual continuation prompts whenever the operator was away from the keyboard,\nwhile Millrace kept progressing through a staged runtime. Even after accounting\nfor that, the active-runtime gap still favors Millrace.\n\nRead the full public evidence pack here:\n\n- [codex-vs-millrace-mc-mod](https://github.com/tim-osterhus/codex-vs-millrace-mc-mod)\n\n## How Millrace Fits With Raw Harnesses\n\nMillrace is not a replacement for Codex, Claude Code, Aider, or similar raw\nagent harnesses. It is the runtime layer you put around them when the work is\ntoo long-running, stateful, or recovery-sensitive to trust to a single session.\n\nThink of the split this way:\n\n- the raw harness reasons locally, edits code, and emits a stage result\n- Millrace decides which stage runs next and what contract that stage receives\n- Millrace persists queue state, runtime snapshots, artifacts, and recovery\n  context after each handoff\n- the operator or ops agent decides when work enters the runtime and how the\n  workspace is configured\n\nIf a direct Codex or Claude Code session is enough, use the direct session.\nMillrace matters when the work has crossed out of sprint territory.\n\n## When To Use Millrace\n\nUse Millrace when:\n\n- the work will outlast a single agent session\n- you want explicit stage gates instead of \"done enough\" chat conclusions\n- recovery and resumability matter\n- you need durable state, queue artifacts, and run history under\n  `\u003cworkspace\u003e/millrace-agents/`\n- completion has to clear a real closure pass rather than informal optimism\n- an operator or ops agent is intentionally managing intake and runtime control\n\nDo not use Millrace when:\n\n- the task is small, bounded, and cleanly handled in one direct session\n- the work is exploratory and governance would add more overhead than value\n- single-session throughput matters more than persistence and recovery\n- nobody is available to manage runtime configuration, intake, and workspace\n  hygiene\n\n## 60-Second Proof\n\nInstall:\n\n```bash\npip install millrace-ai\n```\n\nThen point Millrace at a workspace:\n\n```bash\nexport WORKSPACE=/absolute/path/to/your/workspace\n\nmillrace compile validate --workspace \"$WORKSPACE\"\nmillrace run once --workspace \"$WORKSPACE\"\nmillrace status --workspace \"$WORKSPACE\"\n```\n\nThat flow proves five things quickly:\n\n- Millrace can bootstrap its workspace contract under `millrace-agents/`\n- the selected mode compiles into one persisted `compiled_plan.json` before execution\n- that compiled plan carries node bindings, intake entries, recovery policies, closure-target activation, and post-stage routing\n- the shipped `default_codex` mode freezes closure behavior directly into that single compiled artifact\n- the runtime can execute a deterministic tick and report persisted status\n\nCanonical shipped modes today:\n\n- `default_codex`\n- `default_pi`\n\nCompatibility alias:\n\n- `standard_plain -\u003e default_codex`\n\n## Read By Journey\n\nNeed the single dense system explainer first?\nStart with `docs/millrace-technical-overview.md`.\n\n### Start Here\n\n- `docs/runtime/README.md`\n- `docs/skills/millrace-ops-agent-manual/SKILL.md` if you are operating\n  Millrace as an agent\n\n### Run It\n\n- `docs/runtime/millrace-cli-reference.md`\n- `docs/runtime/millrace-runtime-architecture.md`\n\n### Understand It\n\n- `docs/runtime/millrace-compiler-and-frozen-plans.md`\n- `docs/runtime/millrace-modes-and-loops.md`\n- `docs/runtime/millrace-arbiter-and-completion-behavior.md`\n- `docs/runtime/millrace-runner-architecture.md`\n\n### Extend It\n\n- `docs/runtime/millrace-entrypoint-mapping.md`\n- `docs/runtime/millrace-loop-authoring.md`\n- `docs/skills/millrace-loop-authoring/SKILL.md`\n- `docs/source-package-map.md`\n\n## Status\n\nMillrace ships as a maintained pre-1.0 runtime line. If you depend on exact\nbehavior, pin to a patch version and verify against the current CLI and docs\nrather than assuming every newer build is identical.\n\n## License\n\nSee `LICENSE`.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftim-osterhus%2Fmillrace","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftim-osterhus%2Fmillrace","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftim-osterhus%2Fmillrace/lists"}