{"id":50351088,"url":"https://github.com/srid/agency","last_synced_at":"2026-05-29T21:01:20.149Z","repository":{"id":349048505,"uuid":"1200855116","full_name":"srid/agency","owner":"srid","description":"My near-autonomous workflow for coding agents","archived":false,"fork":false,"pushed_at":"2026-05-29T13:25:56.000Z","size":566,"stargazers_count":19,"open_issues_count":29,"forks_count":2,"subscribers_count":3,"default_branch":"master","last_synced_at":"2026-05-29T14:18:08.247Z","etag":null,"topics":["ai","skills"],"latest_commit_sha":null,"homepage":"https://agency.srid.ca","language":"Astro","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/srid.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-03T22:52:40.000Z","updated_at":"2026-05-29T09:34:11.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/srid/agency","commit_stats":null,"previous_names":["srid/agency"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/srid/agency","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srid%2Fagency","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srid%2Fagency/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srid%2Fagency/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srid%2Fagency/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/srid","download_url":"https://codeload.github.com/srid/agency/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srid%2Fagency/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33670211,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-29T02:00:06.066Z","response_time":107,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","skills"],"created_at":"2026-05-29T21:01:19.254Z","updated_at":"2026-05-29T21:01:20.133Z","avatar_url":"https://github.com/srid.png","language":"Astro","funding_links":[],"categories":[],"sub_categories":[],"readme":"# agency\n\nAgency[^agency] is a near-autonomous workflow for coding agents, packaged as an [APM](https://github.com/microsoft/apm) package. Two skills: **`talk`** for design and code exploration (read-only), **`do`** for shipping end-to-end (research → implement → structural review → CI → done).\n\n\u003e [!IMPORTANT]\n\u003e Agency has mainly been tested with Claude Code \u0026 Codex; opencode is supported but less battle-tested. YMMV with other agents.\n\n## How the loop works\n\n`talk` and `do` are separate entry points, not steps in a single flow. Pick based on whether the spec is already clear:\n\n- **Spec is clear** → `/do \u003cthing\u003e` directly. Concrete change, scope obvious, you know what \"done\" looks like.\n- **Spec isn't obvious** → `/talk` first. Discuss the approach, read code, sketch an interface, let `hickey` and `lowy` chew on the sketch. Once you both converge on a plan, hand it to `/do`.\n\n`/do` does not include a design-exploration phase. Its \"research\" step is implementation-level — what files to touch, what APIs to use — not \"what should we build\". Pasting a thin plan (or a one-shot outline from another model) straight into `/do` produces a thin implementation; the fix isn't to make `/do` smarter, it's to do the design work in `/talk` first.\n\n### Feedback is the bottleneck\n\nThe autonomous loop is only as good as the feedback signal it gets. If the agent can't tell whether its change actually worked, no amount of model capability papers over that gap. End-to-end tests are a prerequisite, not a nice-to-have — unit tests and type-checks alone aren't enough. What \"e2e\" means depends on the surface:\n\n- **Frontend** → screenshot evidence in PRs ([example](https://github.com/juspay/kolu/pull/791#issuecomment-4352641175)). First-class evidence as a workflow step is tracked in [#106](https://github.com/srid/agency/issues/106).\n- **Nix-based infra** → NixOS VM tests. Honest tradeoff: a VM isn't a live environment, mocking is often required, and reaching real fidelity for non-trivial infra takes effort — but it's still the closest thing to an executable spec the agent can drive.\n- **Backend / library** → fast, deterministic e2e suites the agent can run in a tight loop. Slow or flaky suites destroy the loop; a 30-second deterministic run beats a 10-minute thorough one.\n\n`/do` enforces this end of the bargain: any change that introduces new behavior (services, endpoints, configuration, env vars, network connectivity, persistence) writes its test before the implementation, and the `test` step fails the run if no test actually exercises the changed paths. Bug fixes get the same test-first treatment; refactors and pure docs are exempt only when there's no observable behavior to verify.\n\n### State, within and across PRs\n\n**Within a PR**, `/do` writes per-step lifecycle, status, and timing to `.do-results.json` at the repo root. The `do-stop-guard` Stop hook reads this so the agent can't bail mid-workflow — if a run is still `working`, stops are blocked until it reaches `done` or is explicitly marked `failed`.\n\n**Across PRs**, there is no built-in memory, by design. Scope each PR small enough to land end-to-end in a day or two; branches that linger longer are a smell. When a piece of work genuinely doesn't fit in one PR, have `/talk` produce a GitHub issue with explicit phases, then run `/do` against each phase as its own PR — the issue is the cross-PR memory. See [juspay/kolu#514](https://github.com/juspay/kolu/issues/514) for the shape.\n\n### Structural reviews catch what tests don't\n\nType-checkers, tests, and CI catch correctness; they don't catch design. An LLM-generated diff can pass every automated gate and still complect two roles into one construct, or draw a module boundary along the wrong axis of change.\n\n`/do` closes that gap with two structural-review passes that run **post-implement on the concrete diff** as parallel sub-agents:\n\n- **`hickey`** — accidental complexity, after Rich Hickey's [*Simple Made Easy*](https://www.infoq.com/presentations/Simple-Made-Easy/).\n- **`lowy`** — volatility-based decomposition, after Juval Lowy's [*Righting Software*](https://rightingsoftware.org/) (building on [Parnas 1972](https://www.win.tue.nl/~wstomv/edu/2ip30/references/criteria_for_modularization.pdf)).\n\nEvery finding lands as its own commit in the same PR — there is no Defer disposition, no follow-up issue, no \"out of scope\" exit. The PR's scope expands to absorb each finding, even when the fix grows the diff substantially; the alternative is shipping the complected version and trusting a \"broader refactor\" follow-up that statistically never happens. Reviewers default to whole-module scope, not just the diff lines — recurring patterns in the same file are in scope even when the trigger pointed only at one symptom. PR history reads as a progression from primary implementation through each structural refinement; the full findings ledger ships as a PR comment. Both default to Sonnet on Claude Code to keep review cheap enough to run on every task; on harnesses that don't honor the `model:` skill extension (opencode, Codex), reviews use the active model. Both are also auto-invoked from `/talk` against design sketches, so the same lenses shape the spec before you ship it.\n\nRead [**Hickey/Lowy on kolu.dev**](https://kolu.dev/blog/hickey-lowy/) for the full framing — what each lens looks for and why the pair catches what tests miss. Both can be extended with project-specific patterns via `.agency/hickey.md` / `.agency/lowy.md` (see [Project config](#project-config)).\n\n## Quickstart\n\nPaste this into your AI agent (Claude Code, Codex, opencode) at the root of the repo you want to set up:\n\n```\nSet up this repo to use srid/agency by following the instructions at\nhttps://github.com/srid/agency/blob/master/docs/agency-setup.md\n```\n\nThe setup instructions are repository documentation, not an installed skill. The agent will:\n\n- Run `apm` via `uvx` (no install needed; falls back to `nix shell nixpkgs#uv -c uvx` if you have Nix but not `uvx`)\n- Create or extend `apm.yml` and run `apm install` (plus `apm compile -t codex,opencode` when those hosts are declared, since they need a project-root `AGENTS.md`)\n- Offer relevant companion skill packs — [`juspay/skills`](https://github.com/juspay/skills) (Nix and language-specific) and [`anthropics/skills/skills/frontend-design`](https://github.com/anthropics/skills/tree/main/skills/frontend-design) (UI work) — based on what's in the repo, and add the ones you confirm to `apm.yml`\n- Migrate any pre-existing `AGENTS.md` / `CLAUDE.md` content into `.apm/instructions/` so `apm compile` doesn't overwrite hand-written instructions ([#132](https://github.com/srid/agency/issues/132))\n- Draft `.agency/do.md` from your project's existing scripts\n\nReview the staged changes before committing. Pasting the same prompt again later acts as an **update** — it detects the existing install, refreshes `srid/agency` to the latest commit on its pinned ref (via `apm deps update srid/agency`), and regenerates the host folders.\n\n## What's included\n\n### Primary skills\n\n- **`do`** — Full pipeline: research → implement → structural review (`hickey`, `lowy`) → quality gate (`code-police`) → CI → evidence (opt-in) → ship. Skip specific steps by mentioning them in the prompt, or pass **`--minimal`** to skip docs / structural review / police / evidence wholesale on trivially-scoped diffs (one-line fixes, typos, config tweaks).\n- **`talk`** — Conversation-and-research mode. Discuss ideas, explore approaches, read code, inspect upstream sources in temporary scratch space when needed — read-only by default. Auto-runs `hickey` + `lowy` on design sketches. Pass **`--html`** to write the response as a self-contained HTML artifact in `$PWD` instead of replying in chat — pair with a runner that can render and select-to-comment on the artifact (e.g. [juspay/kolu#922](https://github.com/juspay/kolu/pull/922)) for tight comment-driven iteration on the same file. When the topic involves UI work, the artifact embeds *rendered* HTML/CSS prototypes of the proposed components so you can react to the visual itself, not a prose description of it.\n- **`ralph`** — Iterative measurement-driven improvement loop. Measure, profile, mutate, re-measure, commit. Works for performance, bundle size, complexity — anything quantifiable.\n\n### Supporting skills\n\n- **`hickey`** — Structural simplicity evaluation, shipped as a sub-agent (`@agent-hickey`) so it can run in parallel with `lowy`.\n- **`lowy`** — Volatility-based decomposition review, shipped as a sub-agent (`@agent-lowy`).\n- **`code-police`** — Three-pass quality gate (rule checklist, fact-check for logic errors, elegance). Rules and fact-check run as parallel sub-agents on fresh contexts so the implementer's main context — which just wrote the diff and is biased to rationalize it — can't launder violations through. The elegance pass runs sequentially after them and delegates to Claude Code's `/simplify` when available, otherwise runs an iterative refinement loop. Defaults to Sonnet on Claude Code to keep the gate cheap enough to run on every diff.\n- **`fact-check`** — Standalone correctness audit: silent error swallowing, unjustified fallbacks, wishful thinking, logic errors. Prosecutor posture, no self-dismissals.\n- **`elegance`** — Iterative elegance pass: understand, research, apply, verify. 3 iterations by default, each building on the last.\n- **`forge-pr`** — PR titles and descriptions devs actually want to read. Narrative paragraphs for the why; lists/tables/diagrams when the content is genuinely structured. GitHub today; Bitbucket support tracked in [#10](https://github.com/srid/agency/issues/10).\n\n### Hooks \u0026 instructions\n\n- **`do-stop-guard`** — Reads `.do-results.json` to keep the agent from stopping mid-`do` workflow.\n- **`apm-sources`** — Tells agents that `.claude/` is generated — edit `.apm/` sources instead.\n\n## Project config\n\nEach agency skill reads its project-specific configuration from a single file named after itself, under a top-level `.agency/` directory:\n\n| File | Read by | Contains |\n|------|---------|----------|\n| `.agency/do.md` | `/do` | `## Check command` / `## Format command` / `## Test command` / `## CI command` / `## Documentation` (required for those steps to run) and an optional `## PR evidence` section that opts into the evidence step |\n| `.agency/code-police.md` | `code-police` | extra quality rules layered on top of the built-in checklist |\n| `.agency/hickey.md` | `hickey` | extra complecting/fragmentation patterns extending the Layer 4 catalog |\n| `.agency/lowy.md` | `lowy` | project-declared areas of volatility used by the review pass |\n\nAll four files are **plain Markdown** — no frontmatter, no `applyTo:`, no APM ceremony — and **opt-in** (the consuming skill silently skips its extension behavior when the file is missing). Each skill reads only its own file; nothing crosses, so a project can adopt skills à la carte without touching the others.\n\nContent is free-form: inline prose, a pointer to another file (`See ./code-police-rules.md`), or a script reference all work. Example `.agency/do.md`:\n\n```markdown\n# /do config\n\n## Check command\njust check\n\n## Format command\njust fmt\n\n## Test command\njust test\n\n## CI command\njust ci\n\n## Documentation\nKeep README.md in sync with user-facing changes.\n\n## PR evidence\nFor every PR that touches the UI:\n\n1. Use the `chrome-devtools` MCP to launch `npm run dev` and navigate to the affected route.\n2. Capture a screenshot of the new state and upload it via `gh api` to the repo's release-asset endpoint.\n3. Embed the resulting URL inline in the PR comment under `## Evidence`.\n```\n\nAgency does not prescribe any specific tool or format — `chrome-devtools` MCP, `hyperfine`, `asciinema`, custom scripts all work.\n\nSee [Kolu's `.agency/`](https://github.com/juspay/kolu/tree/master/.agency) for a worked example.\n\n## Examples\n\n- **[Kolu](https://github.com/juspay/kolu)** — Terminal multiplexer that uses agency for its autonomous development workflow. See its [`apm.yml`](https://github.com/juspay/kolu/blob/master/apm.yml) and [`.apm/`](https://github.com/juspay/kolu/tree/master/.apm) for how project-specific instructions layer on top of agency's generic workflow.\n\n## Development\n\n```bash\njust apm       # install/regenerate\njust apm-audit # security audit\njust apm-sync  # verify nothing drifted\n```\n\n## Resources\n\n- [Video walkthrough: adding a /code-police rule](https://youtu.be/IFp0bb2D0ZE?si=1ISdAYeFw5LTaMW1\u0026t=426)\n\n[^agency]: _\"as the term ‘pure intent’ refers to an intimate connection betwixt the near-purity of the sincerity of naiveté and the pristine-purity of that actual innocence which is inherent to living life as a flesh-and-blood body only (i.e., sans identity in toto/ the entire affective faculty) then the benedictive/ liberative impetus, or **agency** as such, *stems from and/or flows from that which is totally other than ‘me’/ completely outside of ‘me’* (this factor is very important as *it is vital that such impetus, such **agency**, be not of ‘me’ or ‘my’ doings*) and literally invisible to ‘me’ … namely: that flesh-and-blood body only being thus apperceptively conscious (i.e., apperceptively sentient).\"_ — [Pure Intent](https://actualfreedom.com.au/library/topics/pureintent.htm)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsrid%2Fagency","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsrid%2Fagency","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsrid%2Fagency/lists"}