{"id":50532768,"url":"https://github.com/faremeter/interchange-demo-dispatch","last_synced_at":"2026-06-03T15:01:29.500Z","repository":{"id":359179565,"uuid":"1244890383","full_name":"faremeter/interchange-demo-dispatch","owner":"faremeter","description":"Proof-of-concept deterministic TypeScript orchestrator for multi-agent code generation. Built by a prose dispatch running on its own spec.","archived":false,"fork":false,"pushed_at":"2026-05-20T21:39:39.000Z","size":467,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-20T23:51:36.308Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/faremeter.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-20T17:45:32.000Z","updated_at":"2026-05-20T21:39:44.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/faremeter/interchange-demo-dispatch","commit_stats":null,"previous_names":["faremeter/interchange-demo-dispatch"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/faremeter/interchange-demo-dispatch","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/faremeter%2Finterchange-demo-dispatch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/faremeter%2Finterchange-demo-dispatch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/faremeter%2Finterchange-demo-dispatch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/faremeter%2Finterchange-demo-dispatch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/faremeter","download_url":"https://codeload.github.com/faremeter/interchange-demo-dispatch/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/faremeter%2Finterchange-demo-dispatch/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33870026,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-03T02:00:06.370Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-06-03T15:01:28.298Z","updated_at":"2026-06-03T15:01:29.493Z","avatar_url":"https://github.com/faremeter.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# interchange-demo-dispatch\n\nA proof-of-concept orchestrator that coordinates multiple AI coding agents\nworking in parallel on the same codebase, with deterministic state, typed\nhandoffs, and automatic verification at every step.\n\n## What problem is this solving?\n\nMost AI coding tools today are a single agent in a chat window, doing one\ntask at a time. Building real software with AI assistance needs more than\nthat: many tasks, running in parallel, handing typed work products to each\nother, with quality checks at the boundaries — basically, the work pattern\na small engineering team uses.\n\nYou can try to coordinate that through prose instructions (\"now ask the\nplanner to..., then have the critic review..., then commit...\"). It works,\nbut it's fragile: the same prompt produces different decisions on different\ndays; failure modes are silent; state lives in the model's head rather\nthan in a file you can inspect.\n\n`interchange-demo-dispatch` is a different bet. It puts the orchestrator in plain\nTypeScript code, gives every agent a typed tool interface, and persists\nevery state transition to disk as a YAML document validated against an\n[arktype](https://arktype.io/) schema. Agents are still LLMs; the part\nthat decides who goes next, what they see, and what counts as \"done\" is\ndeterministic.\n\n## What does it demonstrate?\n\nThe orchestrator drives a dispatch end-to-end against a fixture\nproject. The full pipeline:\n\n1. **Plans** a multi-task change against the target by spawning a\n   planner agent that reads a spec and emits a validated DAG.\n2. **Provisions** a git worktree for each level of the DAG, isolated by a\n   path-escape middleware so agents cannot read or write outside their\n   sandbox.\n3. **Runs implementer agents in parallel** at each level. Their tool\n   surface is filesystem + a single terminal tool (`submitOutput`) — no\n   git access, no network beyond the inference call.\n4. **Applies a per-deviation policy** (the \"Karen\" pure function) to\n   anything an agent reports as a deviation from its plan. Moderate\n   deviations consult a \"greybeard\" agent for technical judgment; major\n   ones escalate to the operator via a file the orchestrator polls.\n5. **Commits each task** in topological order at the level fan-in, with\n   shared-file attribution.\n6. **Critiques the level** via a per-task critic and a level-gate critic.\n   Blocking findings trigger a bounded amendment loop (three rounds\n   silent, four-plus requires operator confirmation).\n7. **Verifies the final build** against a baseline captured before the\n   run started, in one of three modes chosen by the planner based on\n   the spec: `baseline-equality` (output must match; refactors /\n   migrations), `no-new-failures` (output may differ but no new parsed\n   failures; bug fixes), or `skip-comparison` (baseline kept as a\n   diagnostic record but not used as a gate; additive specs that add\n   new tests / modules / binaries). When new failures do appear, an\n   attribution agent maps them to responsible tasks, fix agents repair\n   them, the affected commits are rebuilt, and critique re-runs — until\n   either the build is clean or escalation triggers.\n8. **Normalizes any of seven enumerated interruption points** on\n   resume (mid-task crash, mid-rebuild, mid-Phase-5 fix loop, etc.)\n   so a network blip or a kill-9 does not lose persisted state.\n   Forward-path re-entry from `planning` / `gating-plan` is wired;\n   re-entry from later statuses is a tracked follow-up — the resume\n   pass still consolidates on-disk state but `runDispatch` halts\n   before re-running the forward path.\n\nThe smoke test routes every inference call through the\n`@intx/inference-testing` harness — `setupHarness()` returns a\n`deps` bundle that the smoke test passes into `runDispatch` so model\ncalls go through `harness.deps.fetch` instead of `globalThis.fetch`.\nPer-turn responses are scripted with `harness.scenario.replyOnce`,\nwhich builds a complete OpenAI SSE stream for the tool calls the\ntest wants the agent to issue. CI burns no inference budget and the\ntest asserts structural properties of the resulting on-disk state,\ngit history, and persisted run document — see\n[Testing](#testing) for the harness model in detail.\n\n## A note on how this codebase was built\n\nThe orchestrator design in `spec.md` is a self-referential exercise:\nthis repository was constructed by a prose-based version of the same\norchestrator, running an 18-task DAG against this very spec. The\nresult is a working code version of the prose skill that built it.\nNotes from that run live in `dispatch/interchange-demo-dispatch-poc/` (gitignored,\npresent in the working copy for inspection).\n\nThe dispatch surfaced two real bugs in 5b (a branch-naming collision\nand a `levelBoundaries` off-by-one) that the smoke test would have\nhit; both were fixed upstream before the final commit. It also\nsurfaced two open issues in `runDispatch`'s rebuild semantics that\nthe smoke test works around for now; those are documented as\nfollow-ups.\n\n## How to try it\n\n```sh\nbun install\nbun test ./examples/smoke-test.ts\n```\n\nThe fixture target lives in `examples/fixtures/sample-target/`. The\nsmoke spec is `examples/hello-world-spec.md`. The smoke test routes\nevery inference call through the `@intx/inference-testing`\ndeterministic harness, so no model provider is contacted; it asserts\nthe structural Definition-of-Success criteria the harness can\nexercise (DoS 1-4 in the two-level end-to-end test: planner DAG\nshape, per-level worktrees, per-task + gate critique, fan-in commits)\nplus DoS 5 (Phase 5 verification against a captured baseline, using\nthe orchestrator's `buildGateRunner` shell-execution boundary) and\nDoS 6 (resume from a persisted `planning` state, asserting the\n`resume` hook fires exactly once and the second `runDispatch` does\nnot re-run `initRun`).\n\nThe full repository test suite:\n\n```sh\nbun run lint\nbun run build\nbun run test\n```\n\nA real-inference run against a live opencode-go endpoint goes through\nthe `interchange-demo-dispatch` CLI in the target repository. Declare\na `provider` block in `dispatch-config.yaml` and export the bearer\ncredential:\n\n```sh\nOPENCODE_API_KEY=... interchange-demo-dispatch [run-name] [--skip-baseline] [--verbose]\n```\n\n`--verbose` opts into live streaming of the model's reasoning to\nstderr; without it you get a one-line-per-turn summary. To wipe a\nrun from disk after aborting, `interchange-demo-dispatch clean \u003crun-name\u003e`.\n\nSee [Configuration](#configuration) for the `dispatch-config.yaml`\nschema, [CLI](#cli) for the full verb / flag surface, and [Streaming\noutput](#streaming-output) for the trace format.\n\n## CLI\n\n`interchange-demo-dispatch` exposes three verbs (see `src/cli.ts`):\n\n```\ninterchange-demo-dispatch [run-name] [--skip-baseline] [--verbose|-v]\n    Run a dispatch against ./spec.md and ./dispatch-config.yaml in\n    the current working directory. State lands under\n    `\u003ccwd\u003e/dispatch/\u003crun-name\u003e/`. `run-name` defaults to a\n    timestamp-derived identifier when omitted.\n\ninterchange-demo-dispatch teardown \u003crun-name\u003e\n    Remove every per-level worktree associated with the named run.\n    Does not delete the dispatch directory or its contents — the\n    operator-inspectable `report.md` and `run-state.yaml` survive.\n\ninterchange-demo-dispatch clean \u003crun-name\u003e\ninterchange-demo-dispatch clean --all\n    Wipe a run from disk in full: removes every per-level worktree,\n    deletes every `dispatch/\u003crun-name\u003e/...` branch, and removes the\n    `dispatch/\u003crun-name\u003e/` directory itself. Use after aborting a run\n    when you want a clean slate. Idempotent and tolerant of partial\n    state (corrupt run-state.yaml, dangling worktrees, stale branches).\n    `--all` wipes every run in the current working directory's\n    `dispatch/` and removes the `dispatch/` root if it ends up empty.\n```\n\n`--skip-baseline` hard-overrides baseline capture for greenfield\nbootstraps where the build gate does not yet exist; Phase 5\nshort-circuits in that mode regardless of the planner's\n`verificationMode` choice.\n\n`--verbose` (or `-v`) switches the default stderr trace from one\nsummary line per turn to streaming the model's thinking and terminal\ntext line-by-line as they arrive (with 🧠 and 💬 markers\nrespectively) so the operator can watch reasoning appear in real\ntime. Tool calls and errors render identically in both modes. See\n[Streaming output](#streaming-output) below for the on-the-wire\nshape and the programmatic `trace` option.\n\n## Streaming output\n\nEvery agent the orchestrator spawns drains its inference event\nstream — `inference.error` payloads always reach stderr, and when\n`RunDispatchOptions.trace` is wired the same drain forwards\nhuman-readable lines for thinking, tool calls, and terminal text.\nThe CLI sets a stderr trace sink by default, so an operator running\nthe binary sees live progress without any setup:\n\n```\n[planner] → read_file(path=\"package.json\")\n[planner] thinking: I'll start by reading package.json to see what\n  scripts and dependencies are already declared, then…\n[planner] → proposeTask(idHint=\"install-arktype\", level=1, …)\n[implementer 1a-install-arktype] → write_file(path=\"package.json\", …)\n[implementer 1a-install-arktype] → run_shell(command=\"bun install\")\n[critic 1a-install-arktype round-1] → recordVerdict(status=\"pass\", …)\n[gate-critic level-1 round-1] → recordGateVerdict(status=\"pass\", …)\n```\n\nstdout stays reserved for the report path the CLI prints at end.\nOperators who want silence can pipe stderr to `/dev/null`; library\ncallers wire their own sink or omit it entirely (in which case only\nthe `inference.error → stderr` behaviour fires).\n\nThe `AgentTrace` type accepts either a bare `(line: string) =\u003e void`\nor `{ write, verbose: true }` for the streaming mode the\n`--verbose` flag wires up. See `src/agent-trace.ts` for the formatter\nand `drainAgentStream` for the per-event handling.\n\n## Configuration\n\n`dispatch-config.yaml` lives in the target repository's root and\ncarries three blocks:\n\n```yaml\nbuildGate:\n  - bun run lint\n  - bun run build\n  - bun run test\n\nmodelConfig:\n  planner: kimi-k2.6\n  implementer: kimi-k2.6\n  critic: kimi-k2.6\n  gateCritic: kimi-k2.6\n  greybeard: kimi-k2.6\n  attribution: kimi-k2.6\n  fixAgent: kimi-k2.6\n\nprovider:\n  baseURL: https://opencode.ai/zen/go/v1\n  adapter: openai\n```\n\n- `buildGate` — required, non-empty. The ordered shell commands the\n  orchestrator captures as the baseline, inherits as each task's\n  default `verifyCommands`, and re-runs in Phase 5.\n- `modelConfig` — required. Per-role model string, threaded straight\n  through to the inference call. Use the model identifier the endpoint\n  expects (opencode-go accepts bare names like `kimi-k2.6`; some\n  proxies require a vendor prefix).\n- `provider` — optional. When present, both `baseURL` and `adapter`\n  are required. `adapter` selects the inference HTTP API style:\n  `\"openai\"` for OpenAI-compatible endpoints (including opencode-go),\n  `\"anthropic\"` for the Anthropic API. The bearer credential comes\n  from the `OPENCODE_API_KEY` env var; the CLI fails loudly when the\n  block is declared but the env var is unset.\n\nThe planner additionally decides a **Phase 5 verification mode** as\npart of finalizing the plan, persisted in `run-state.yaml` as\n`Run.verificationMode`. Three values:\n\n- `baseline-equality` — final build output must match the baseline\n  byte-for-byte (modulo path / timestamp normalization). Pick for\n  refactors / renames / migrations.\n- `no-new-failures` — final output may differ but no new parsed\n  failures may appear. Pick for bug fixes against a baseline with\n  known-failing tests.\n- `skip-comparison` — baseline captured for diagnostic record only;\n  Phase 5 skips the equality check. Pick for additive specs (new\n  modules, new CLI binaries, new tests).\n\nThe planner's system prompt teaches the choice from the spec's\nverbs (`add` / `create` / `implement` → likely additive; `fix` /\n`repair` → likely no-new-failures; `refactor` / `rename` /\n`migrate` → likely baseline-equality). The CLI's `--skip-baseline`\nflag is the operator override — when set, no baseline is captured\nat all and Phase 5 is a no-op regardless of mode.\n\n## Testing\n\n`runDispatch` accepts an optional `deps: Dependencies` option in\n`RunDispatchOptions` — the inference-layer dependency bundle (fetch,\nclock, etc.) threaded straight through to every spawned agent. That\nsingle seam is the supported way to drive a deterministic test.\n\nThe canonical example is `examples/smoke-test.ts`:\n\n```ts\nimport { setupHarness, wire } from \"@intx/inference-testing\";\n\nconst harness = setupHarness();\n\n// Pre-register every expected inference turn. Each call enqueues a\n// one-shot OpenAI SSE response carrying the tool calls the test\n// wants the next-fetched agent to issue.\nharness.scenario.replyOnce(\"openai\", {\n  toolCalls: [\n    { callId: \"...\", name: \"proposeTask\",   argsJSON: \"...\" },\n    { callId: \"...\", name: \"finalizePlan\",  argsJSON: \"{}\" },\n  ],\n  predicate: (req) =\u003e req.method === \"POST\"\n    \u0026\u0026 req.url.endsWith(\"/chat/completions\"),\n});\n\nconst dispatchPromise = runDispatch(spec, {\n  provider: {\n    baseURL: \"https://opencode-go.test/v1\",\n    apiKey: \"smoke-test-key\",\n    adapter: \"openai\",\n  },\n  deps: harness.deps,\n});\n\n// Service the scheduled SSE chunks against the parked fetches. The\n// harness asserts quiescence at the end — every parked fetch must\n// have matched a registered scenario.\nawait harness.run();\nconst finalRun = await dispatchPromise;\n```\n\nThe harness is the **only** supported test seam for the inference\nboundary. The orchestrator does not expose per-role agent factory\noverrides; every agent role (planner, implementer, critic,\ngate-critic, fix agent, attribution, greybeard) talks to the same\nfetch instance, and the harness routes parked requests to registered\nmatchers in observation order.\n\n`RunDispatchOptions` does still expose two non-inference seams for\noperators retargeting the shell boundary:\n\n- `buildGateRunner` — used by `verifyAgainstBaseline(...)` to run the\n  configured build gate.\n- `taskVerifier` — used to run per-task verification commands.\n\nThese are independent of the inference path and do not require the\nharness.\n\n## Where the code lives\n\n```\nsrc/\n  agents/          Per-role agent factories (planner, implementer, critic,\n                   gate-critic, greybeard) — each is an @intx/agent runtime\n                   wired to a posix tool surface and exactly one terminal\n                   tool.\n  orchestrator/    The orchestrator's main loop and its constituent\n                   stages: initRun, plan, runLevel, commitLevel, gate,\n                   verifyAgainstBaseline (Phase 5), resume.\n    phase5/        The attribution + fix + rebuild + re-critique engine\n                   that drives the Phase 5 verification loop.\n    resume/        Seven independent case handlers, one per interruption\n                   point from spec.md §632-§677.\n  state/           Persisted Run document — arktype schemas, atomic YAML\n                   writes, single source of truth for the orchestrator.\n  cli.ts           interchange-demo-dispatch binary (verbs: default = run;\n                   teardown; clean).\n  agent-trace.ts   AgentTrace contract + drainAgentStream — the shared\n                   helper every spawn site uses to drain an agent's\n                   inference event stream and forward formatted lines\n                   to the operator-supplied trace sink.\n  dag-validate.ts  Pure DAG validation used by both the planner agent\n                   and resume.\n  karen.ts         Deterministic policy: per-deviation severity → action.\n                   No I/O.\n  path-escape.ts   Filesystem middleware that prevents tool calls from\n                   reading or writing outside the agent's configured root.\n  skill-loader.ts  Bundles AGENTS.md, CONVENTIONS.md, README.md, and\n                   skills/*/SKILL.md from the target repo into a single\n                   seed blob for the planner and critics.\n  terminal-tool.ts Helper that turns an @intx/agent tool call into a\n                   Promise the orchestrator can await.\n  json-schema-fixup.ts\n                   Stamps `type: \"string\"` onto enum-only JSON Schema\n                   nodes so Moonshot-flavored validators (opencode-go's\n                   kimi-k2.6) accept the arktype-emitted tool surface.\nexamples/          Smoke spec, fixture target, harness-driven smoke\n                   test (`smoke-test.ts`).\ntests/fixtures/    Per-module test fixtures.\nspec.md            The brief that drove the build.\n```\n\n## What this is not\n\n- **Not a finished product.** It is a proof-of-concept. The smoke spec\n  is a single demonstration of plumbing that works end-to-end; it is\n  not a general-purpose tool for production multi-agent workloads.\n- **No mutation testing.** The dispatch skill's `validate-fix` extension\n  is deliberately not implemented — the brief did not require it.\n- **No web UI.** CLI only. The brief is explicit on this.\n- **Two HTTP adapters.** `provider.adapter` in `dispatch-config.yaml`\n  selects `\"openai\"` (OpenAI-compatible endpoints, including\n  opencode-go) or `\"anthropic\"` (the Anthropic API). Per-role model\n  selection is configured in `dispatch-config.yaml`'s `modelConfig`\n  block; there is no model-routing layer beyond the adapter + the\n  per-role model string.\n- **Limited resume coverage.** The resume pass classifies on-disk\n  state into seven interruption cases and normalizes each. Forward-\n  path re-entry from `planning` / `gating-plan` is wired; re-entry\n  from later statuses (`executing`, `verifying`, `fixing-verification`,\n  `consolidating`) throws with a clear error and the operator-facing\n  workaround. The `clean` verb exists precisely so aborting + re-running\n  is a one-command workflow until the rest of resume is wired.\n\n## Architecture sketch\n\n```\n                          spec.md\n                             |\n                             v\n             +--------- runDispatch ---------+\n             |                               |\n             |   1. initRun                  |\n             |      (config + baseline +     |\n             |       integration branch)     |\n             |                               |\n             |   2. plan                     |\n             |      (spawn planner agent;    |\n             |       materialize DAG)        |\n             |                               |\n             |   3. for each level N:        |\n             |        runLevel  ----\u003e        |   per-level worktree;\n             |          fan implementers     |   parallel implementer\n             |          + Karen + greybeard  |   agents; submitOutput;\n             |          + operator escape    |   path-escape middleware\n             |          hatch                |\n             |                               |\n             |        commitLevel  ---\u003e      |   topological commits with\n             |          shared-file          |   shared-file attribution;\n             |          attribution          |   level boundary recorded\n             |                               |\n             |        gate  ---\u003e             |   per-task critic, level\n             |          critic + amendment   |   gate critic, bounded\n             |          loop (3/4+ caps)     |   amendment loop\n             |                               |\n             |   4. verifyAgainstBaseline    |\n             |      (Phase 5: normalize +    |\n             |       attribution agent +     |\n             |       fix phase + rebuild +   |\n             |       re-critique loop)       |\n             |                               |\n             |   5. writeFinalReport         |\n             +-------------------------------+\n\nState document persisted at every transition.\nResume picks up from any of seven interruption points.\n```\n\n## License\n\nLGPL-2.1-only.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffaremeter%2Finterchange-demo-dispatch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffaremeter%2Finterchange-demo-dispatch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffaremeter%2Finterchange-demo-dispatch/lists"}