{"id":50547252,"url":"https://github.com/simonrowland/goal-flight","last_synced_at":"2026-06-04T00:00:41.929Z","repository":{"id":358131857,"uuid":"1240163717","full_name":"simonrowland/goal-flight","owner":"simonrowland","description":"Claude/Cursor plugin and scripts: long-running controller pattern to chunk and delegate code work to cli-agents, with embedded adversarial review and RAG corpus","archived":false,"fork":false,"pushed_at":"2026-06-01T08:17:23.000Z","size":1958,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-01T09:26:10.959Z","etag":null,"topics":["agentic","ai","claude","codingagent","devtools","llm","multiagent","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/simonrowland.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-05-15T20:46:25.000Z","updated_at":"2026-05-29T15:39:21.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/simonrowland/goal-flight","commit_stats":null,"previous_names":["simonrowland/goal-flight"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/simonrowland/goal-flight","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonrowland%2Fgoal-flight","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonro
wland%2Fgoal-flight/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonrowland%2Fgoal-flight/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonrowland%2Fgoal-flight/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/simonrowland","download_url":"https://codeload.github.com/simonrowland/goal-flight/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonrowland%2Fgoal-flight/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33884734,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-03T02:00:06.370Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agentic","ai","claude","codingagent","devtools","llm","multiagent","python"],"created_at":"2026-06-04T00:00:32.861Z","updated_at":"2026-06-04T00:00:41.921Z","avatar_url":"https://github.com/simonrowland.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# goal-flight\n\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n![Python](https://img.shields.io/badge/python-3.10%2B-blue)\n![Last 
commit](https://img.shields.io/github/last-commit/simonrowland/goal-flight)\n![Stars](https://img.shields.io/github/stars/simonrowland/goal-flight)\n\ngoal-flight is a multi-agent orchestrator, which delegates coding /goal and parallel-review work to additional agent sessions. It lets you hand a frontier model a large software task to break down into closed chunks, and keep moving after the first context window would normally fall apart. It turns the work into durable project files: a plan, queue, environment caveats, worker status, review evidence, and resume notes that survive compaction, restarts, and overnight runs. Multi-hour runs can land as a clean stack of one-commit-per-chunk on main, with integrated self-reviews leveraging Gstack.\n\n**Orchestrator hosts.** [Claude Code](https://claude.ai/code) is the reference orchestrator. Goal Flight also ships orchestrator ports for [Codex](https://github.com/openai/codex), [Cursor](https://cursor.com), and [OpenCode](https://opencode.ai) — same `SKILL.md`, file-backed queue, and dispatch machinery, with host-specific install wrappers below. Workers include codex, cursor, grok, claude-cli, and other ACP or bash-tail adapters.\n\n**What the orchestrator is for**: high-level management, not execution. The orchestrator holds enough context about your project's goal, scenery (constraints, architecture, prior decisions, failure modes), and intent to exercise discretion and recommend the next move — then dispatches actual work to workers to run in an iterative code-review goal loop. This workflow allows lightly-supervised coding: you check in, ratify suggested moves, redirect when needed, and trust the orchestrator to keep the project anchored across compactions and unattended hours. 
The dispatch / review / handoff machinery below is what frees the orchestrator to do that job.\n\n[Features](#features) • [Quickstart](#quickstart) • [Architecture](docs/architecture.md) • [Commands](#commands)\n\n```bash\n# Claude Code (reference orchestrator):\ngit clone https://github.com/simonrowland/goal-flight.git ~/.claude/skills/goal-flight\n\n# Codex / Cursor / OpenCode — clone once, then one command per host (global + project):\ngit clone https://github.com/simonrowland/goal-flight.git ~/.goal-flight \u0026\u0026 cd ~/.goal-flight\n./install.sh cursor /path/to/your/project\n./install.sh opencode /path/to/your/project\n./install.sh codex\n```\n\nRestart the host, then run doctor: `python3 scripts/goalflight_doctor.py --project-root /path/to/your/project`.\n\n## Windows\n\nNative Windows support is read/plan control-plane only. Doctor, status, action\nrouting, capacity reads, and ledger reads work. Full worker dispatch requires\nWSL; live worker dispatch and file-backed review jobs refuse on native Windows\nwith a WSL next step:\n\n```powershell\nwsl --install\n```\n\nUse `%USERPROFILE%` / `$env:USERPROFILE`, not `~`, for native Windows paths:\n\n```powershell\ngit clone https://github.com/simonrowland/goal-flight.git \"$env:USERPROFILE\\.goal-flight\"\ncd \"$env:USERPROFILE\\.goal-flight\"\n.\\bin\\goalflight.ps1 core doctor read\npy -3 .\\scripts\\goalflight_doctor.py --project-root C:\\path\\to\\project --text\n```\n\nDoctor JSON includes a `wsl` field. `wsl.exe` alone is not enough: `wsl -l -q`\nmust list at least one installed distro. If install is declined during init, a\nproject-local stamp suppresses repeat prompts and native Windows continues as a\nnon-feature-complete control plane with degraded tracked-pid cleanup.\n\nNative Windows first action:\n\n```powershell\npy -3 .\\tests\\run_python.py\n```\n\nThat native runner executes the Python suite: Windows-activated tests fire, and\nPOSIX-only tests skip with a visible WSL reminder. 
The bash suite is POSIX/WSL\nonly; run `./tests/run.sh` from inside WSL for that layer.\n\nKeep two installs when you dispatch from WSL: one native Windows checkout for\nread/plan, one WSL checkout under the Linux home directory for dispatch. See\n[docs/hosts/windows.md](docs/hosts/windows.md) for the WSL baseline, manual\nreal-Windows acceptance gate, capability matrix, launcher details, CRLF caveat,\nand two-install procedure.\n\nSame flags via `setup.sh`: `--cursor-install`, `--opencode-install`, and `--codex-install` (each implies `--apply --yes`). Dry-run, link-to-Claude, and agents-standard paths are in [docs/hosts/cursor.md](docs/hosts/cursor.md) and [docs/hosts/opencode.md](docs/hosts/opencode.md).\n\nAfter source `SKILL.md`, `commands/`, `protocols/`, `templates/`, or `adapters/`\nchanges, copied host installs must be resynced from the source repo with\n`./install.sh \u003chost\u003e` unless the host skill is symlinked to the source. The\ndoctor's `installed_skill_drift` probe hashes only the installed `SKILL.md`\nfile (per host); a hash divergence WARN there means the wrapper is stale,\nwhich usually correlates with the other directories being stale too. Run\nthe resync command in the probe's `resync_command` field; text mode prints\n`installed_skill_md_hash` WARNs.\n\n## Features\n\n- Multi-hour unattended runs with light supervision\n- Verification-first dispatch (live files only)\n- Parallel cross-agent reviews (Claude + another model via gstack)\n- Two-axis routing (iteration pattern × comms shape)\n- Provider-aware rate-pressure walkback\n- Procedural runtime state + doctor checks\n- Self-delegation `/fork` pattern (opt-in)\n\n## What it gets you\n\n- **Multi-hour unattended runs.** Check in periodically or respond to decision notifications. 
The orchestrator's context primarily holds architecture, plan, and metadata (queue state, recent commits, in-flight dispatch headers); real work happens in subagent context windows.\n- **Verification-first dispatch.** Wrappers point at files for the agent to investigate, not pre-pasted \"facts\" that go stale on the timescale of minutes. Frontier models trust orchestrator-text uncritically; pointers force them to re-verify against live disk and surface drift.\n- **Parallel cross-agent reviews at milestone cadence.** Two independent reviewers (Claude + codex) address bugs and completion before pestering you. Via [gstack](https://github.com/garrytan/gstack)'s `/review` skill when installed.\n- **/goal native.** The orchestrator picks an **iteration pattern** per chunk: one-shot for most chunks, a goal-mode loop for chunks that need plan/act/test/review-to-convergence, or controller-direct for trivial work.\n- **Token Management.** Throw tokens at your problem for better code and less babysitting, but divide the usage and rate-limits between multiple agent vendors, task-by-task.\n- **Fancy Monitoring.** ACP for structured events from claude-cli, cursor, codex, grok. Uses bash-tail as fallback and to support generic agents. Orchestrator handles worker notifications, and can escalate messages to the user.\n- **Rate-limit walkback.** `goalflight_rate_pressure.py` watches the dispatch ledger for provider-level rate-limit signatures and surfaces a STATUS marker + recommended fallback when pressure crosses threshold. It tracks how many workers are on your machine to be mindful of capacity limits.\n- **Procedural runtime state.** Capacity, dispatch ledgers, compact status, log watching, doctor checks, ACP runs, rate-pressure detection, and file-backed review jobs live under `scripts/goalflight_*.py`. The doctor checks for cli-agent updates, so you can run the worker update command.\n\n## How it differs from the alternatives\n\n- vs. 
**running one host orchestrator naively** — the orchestrator doesn't itself do the work. It dispatches and verifies, which means it stays small and runs longer before compaction.\n- vs. **cloud agent swarms or editor agents** — runs on your machine, in your orchestrator host, with your existing local tools and adapters. It brings the benefits of a Claw team to your desktop.\n- vs. **writing prompts manually** — make the plan, not the code. The skill asks a frontier model to decompose your plan into chunks, flagging what can run in parallel and what can have a /goal pattern. Every `/goal` reviews to convergence by default; the 7-category adversarial self-review runs inside the loop until reviews pass, so the orchestrator never sees a non-converged result.\n\n## Quickstart\n\n```bash\n# In your project repo, in an orchestrator session (Claude Code wrapper shown):\n/goal-flight init \u003ctopic\u003e         # audit repo, scaffold AGENTS.md/SKILL.md + docs-private/,\n                                  # build optional RAG corpus, register codex-trust,\n                                  # probe box capacity + ACP-worker availability →\n                                  # docs-private/env-caveats.md\n/goal-flight decompose-plan       # break the plan into /goal chunks (SCOPE / CHECKLIST\n                                  # / ACCEPTANCE / FORBIDDEN), parallel reviewer pass\n/goal-flight execute              # per-chunk dispatch loop (ACP when available, else\n                                  # Bash-\u0026-tail-file), embedded self-review,\n                                  # milestone reviewer examples every K commits\n/goal-flight doctor               # validate wrapper/package health, companion tools,\n                                  # codex trust, context-mode, gstack, autoreview, ACP +\n                                  # surface model currency + rate-pressure\n/goal-flight update               # pull latest goal-flight from origin + run\n                           
       # each worker CLI's self-update (codex /\n                                  # grok / cursor-agent / claude / -cli-acp)\n/goal-flight resume               # rebuild RESUME-NOTES from current git state\n                                  # (use when picking up across sessions)\n```\n\n\u003e **Working signal, not rigid gates**: the skill pins a `goal-\u003ctopic\u003e-\u003cdate\u003e.md` file at init for compaction-survival, but it's an anchor — not a contract. `decompose-plan` proceeds on whatever signal exists (the goal-statement when present, or the plan source, architecture doc, and in-session conversation), surfacing any inferred assumptions as inline-office-hours backlog items the user can validate during the run. Show up with \"here's my architecture doc plus ten minutes of context-setting chat\" and the skill takes it from there. The premises file accumulates validated answers as the run progresses. **DRAFT goal-statement is fine** — `decompose-plan` proceeds anyway; sharpen any time by editing `docs-private/goal-\u003ctopic\u003e-\u003cdate\u003e.md` directly.\n\n`/goal-flight` with no args prints `SKILL.md` — the full pattern reference.\n\n## Sub-commands\n\n| Command | What it does |\n|---------|--------------|\n| `/goal-flight init \u003ctopic\u003e` | Tool check, repo audit, scaffold, codex-trust registration |\n| `/goal-flight decompose-plan [\u003cplan\u003e]` | Break a plan into `/goal` chunks with parallel reviewer pass |\n| `/goal-flight ask-questions [\u003cscope\u003e]` | Anticipatory subagents; surface clarifying questions |\n| `/goal-flight execute [--parallel \u003cN\u003e]` | Per-chunk loop; sequential default, parallel-safe opt-in |\n| `/goal-flight doctor` | Read-only health check for plugin/package/runtime readiness, model currency, rate-pressure |\n| `/goal-flight update` | Pull latest goal-flight from origin + run each worker CLI's self-update |\n| `/goal-flight build-corpus [\u003cflags\u003e]` | Extend / rebuild the optional RAG 
corpus |\n| `/goal-flight resume` | Rebuild RESUME-NOTES from current git state |\n| `/goal-flight goal \u003cSLUG\u003e` | Append one goal to the queue |\n| `/goal-flight register-codex [\u003cpath\u003e]` | Register a project as codex-trusted |\n| `/goal-flight validate-dispatch [\u003cslug\u003e]` | Render a chunk's dispatch wrapper without dispatching |\n| `/goal-flight validate-queue [\u003cpath\u003e]` | Schema-check the goal-queue |\n\nPlus an opt-in self-delegation pattern via `/fork` — orchestrator writes a marker contract; forked session detects via env var and follows the contract; orchestrator monitors compact status. See `protocols/self-delegation.md`; fork instructions are not always-loaded.\n\nDetailed operating procedures are split into load-on-demand files under\n`protocols/`. The always-loaded `SKILL.md` is intentionally small.\n\n## Dispatch routing (two orthogonal axes)\n\n**Iteration pattern** — how many turns the chunk needs:\n\n| Pattern | When | Cost |\n|---|---|---|\n| **`controller-direct`** | Trivially small (single-file, \u003c ~30 LoC), OR orchestrator already has the session-loaded context a fresh subagent would have to re-discover | Inline; no subagent |\n| **one-shot subagent** | Default for most chunks. Frontier model picks the executor target based on chunk shape | One subagent dispatch per chunk |\n| **goal-mode loop** | Multi-step refactor, code migration, prototype implementation, converge code to ground-truth — anything that benefits from plan/act/test/review-to-convergence | Multi-hour autonomous session (codex `/goal` natively, or orchestrator-driven iteration loop) |\n\n**Comms shape** (orthogonal axis) — how the orchestrator observes the worker. Goal-flight uses the [Agent Client Protocol](https://agentclientprotocol.com) wherever the worker has an adapter (codex / cursor / claude / grok all do today); bash-tail with a `tail -f`-style marker-grep watcher is the cold-storage fallback. 
ACP composes with `goal-mode` for any worker; `goal-mode + bash-tail` composes only with codex `/goal` today (codex emits a Final-response marker the watcher detects; other workers' headless modes don't).\n\nThe orchestrator picks executor + comms per chunk based on chunk shape, available adapters, and the rate-pressure walkback's recent observations. The shipped routing defaults lean toward sub-billed workers (codex / cursor / grok) for code-writing — calibrated against the maintainer's current vendor plans, not a project-wide prescription. Adjust to your environment by editing the routing table in `SKILL.md` \"Worker Routing\"; the walkback adapts dynamically when any one provider gets pressured.\n\n## Multi-node fleet (1.0)\n\nFor remote workers over SSH, bootstrap a fleet store and use `goalflight_fleet.py`\ndispatch / watch / reconcile. OpenCode, Cursor, Codex, and Claude ACP workers\ncan run on registered nodes while the orchestrator stays local. See\n[docs/fleet.md](docs/fleet.md) for the operator guide; live smoke:\n`GOALFLIGHT_LIVE_SSH=1 ./tests/manual/test_fleet_live_smoke.sh`.\n\nUnified CLI: `bin/goalflight \u003cdomain\u003e \u003cresource\u003e \u003cverb\u003e` (action router over\n`config/actions/`).\n\n## When NOT to use this\n\n- **One-off scripts or quick fixes.** Overhead unjustified for \u003c8 chunks or \u003c2000 LoC delta. Pre-flight gates auto-skip the RAG corpus for projects this small.\n- **Pair-programming sessions.** Designed for unattended runs. 
If you're steering every turn, the wrapper overhead just slows you down.\n- **Sensitive operations the orchestrator shouldn't autonomously trigger.** Production deploys, prod data writes, credential rotations — human-in-the-loop is the point.\n- **Projects without test signal.** Self-review and milestone-review depend on tests / grep invariants / verification commands existing.\n\n## Companion tools (strongly recommended)\n\n- **[gstack](https://github.com/garrytan/gstack)** — Garry Tan's skill pack provides `/review`, `/office-hours`, `/plan-eng-review`, `/cso`, `/investigate` for both Claude Code and codex. Goal-flight invokes `/review` as the **default independent reviewer** for chunk-level pre-commit review (`protocols/chunk-review.md`) and for milestone reviews (`protocols/milestone-review.md`, gstack + concern-diverse sweep); `/office-hours` covers fuzzy-goal interrogation at init. **Optional** — without gstack, goal-flight falls back to local prompts at `prompts/gstack-claude-review.md` + `prompts/gstack-codex-challenge.md` (and embedded executor self-review still catches most issues). With gstack installed, you get consistent severity-ranking framing across both review lenses, which is meaningfully higher quality on long runs.\n- **autoreview** — Complementary diff-local pre-commit pass (`protocols/chunk-review.md`, `./scripts/autoreview.sh`). Runs in parallel with gstack at chunk level when the orchestrator chooses; does **not** replace gstack as the default review path. Catches diff-local issues (API footguns, missing tests on touched paths, regression invariants) that a structural reviewer may not prioritize. Requires upstream autoreview (typically the Cursor autoreview skill or `AUTOREVIEW_HELPER`); doctor reports WARN when absent.\n- **[context-mode](https://github.com/simonrowland/context-mode)** — MCP plugin that offloads large command outputs (diffs, integration test runs, codex tail files, large greps) to an FTS5 sandbox queried by pattern. 
The multiplier that makes 12-hour unattended runs feasible — without it, tool-output fills the orchestrator's context fast and you hit compaction early.\n\n## Maintainer test tiers\n\nDefault `./tests/run.sh` stays hermetic and cheap. Set\n`GOALFLIGHT_AUTOREVIEW=1` to include `tests/bash/test-autoreview-smoke.sh`,\nwhich runs `scripts/autoreview.sh --engine claude` against a known-good fixture\ncommit through `scripts/autoreview_claude_acp`. Each invocation consumes one\nClaude ACP-sub-billed autoreview pass.\nSet `GOALFLIGHT_ACP_LIVE=1` to include the real codex-acp dispatch smoke.\n\n## Adapting\n\nThis skill ships tuned for high-accuracy scientific programming but the patterns generalize. Workflow: clone the repo, open it in your orchestrator host (Claude Code wrapper today), and ask the host to adapt the skill for a domain project with a north star, verification command, invariants, and any domain-specific self-review categories. A strong subagent can read the whole thing, propose a diff, and apply it in one pass.\n\nMain tuning knobs:\n\n- **North star + asking discipline + token bias** — `SKILL.md` hard-conventions section.\n- **Self-review categories** — `prompts/executor-self-review.md` lines 14–35. Seven abstract categories; add domain-specific ones (e.g. SCHEMA GAP for ETL, A11Y GAP for frontend).\n- **Review cadence K** — `commands/execute.md` step 4 (\"Every K commits, default K=5\"). 
Change the K literal or pass `--review-every \u003cK\u003e` per run.\n- **RAG corpus slice mix + word budgets** — `templates/rag-corpus-schema.md.tpl`.\n- **`/goal` mode prompt shape** — `templates/codex-goal-prompt.md.tpl` (Objective / Workspace / Rules / Acceptance / Test gates / Final response schema).\n\n## License\n\nMIT — see [LICENSE](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsimonrowland%2Fgoal-flight","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsimonrowland%2Fgoal-flight","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsimonrowland%2Fgoal-flight/lists"}