{"id":50999013,"url":"https://github.com/workos/case","last_synced_at":"2026-06-20T12:34:22.384Z","repository":{"id":345077295,"uuid":"1177008280","full_name":"workos/case","owner":"workos","description":null,"archived":false,"fork":false,"pushed_at":"2026-05-19T00:10:30.000Z","size":1772,"stargazers_count":22,"open_issues_count":2,"forks_count":4,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-19T00:51:24.117Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/workos.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-03-09T15:48:16.000Z","updated_at":"2026-05-18T22:38:26.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/workos/case","commit_stats":null,"previous_names":["workos/case"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/workos/case","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/workos%2Fcase","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/workos%2Fcase/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/workos%2Fcase/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/workos%2Fcase/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/workos","download_url":"https://codeload.github.com/workos/case/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/workos%2Fcase/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34570538,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-20T02:00:06.407Z","response_time":98,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-06-20T12:34:20.976Z","updated_at":"2026-06-20T12:34:22.376Z","avatar_url":"https://github.com/workos.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Case\n\n\u003cimg width=\"500\" height=\"500\" alt=\"Case\" src=\"docs/case-logo.svg\" /\u003e\n\nCase is the reliability layer for agent-authored pull requests.\n\nIts job is narrow: turn a clearly scoped task into a reviewed PR with evidence, and make the next run better when this one fails. Case is not a generic agent platform, a dashboard product, or a place to accumulate every possible workflow idea. Humans steer. Agents execute. The harness keeps the work reviewable.\n\n## Why It Exists\n\nAgents are useful when the surrounding system makes good work easier than bad work. Case provides that surrounding system:\n\n- A shared map of target repos, commands, architecture notes, and conventions.\n- A task format that separates human intent from machine-updated state.\n- A small multi-agent pipeline with isolated responsibilities.\n- Evidence gates for tests, manual verification, review, and PR creation.\n- Retrospective learning so repeated failures become docs, playbooks, or enforcement.\n\nThe north star:\n\n\u003e Case exists to make agent-authored PRs reliable, reviewable, and self-improving.\n\n## Core Loop\n\nFrom a target repo:\n\n```bash\nca 1234\n```\n\nCase detects the repo, fetches the GitHub issue, creates task files, runs a baseline check, and dispatches the pipeline:\n\n```text\nscout -\u003e implementer -\u003e verifier -\u003e reviewer -\u003e closer -\u003e retrospective\n```\n\nFor unclear work, use the human steering path:\n\n```bash\nca --agent\nca --agent 1234\n```\n\n`ca --agent` starts an interactive orchestrator session. It can inspect context, fetch issues, help shape the task, create the task file, and then run the pipeline. It should not implement directly. This is the primary interface for “humans steer.”\n\nFor an existing task file:\n\n```bash\nca run --task .case/tasks/active/cli-1-issue-53.task.json\n```\n\nTo resume an interrupted issue run, re-run the same command:\n\n```bash\nca 1234\n```\n\nCase reuses the existing task when it finds one and resumes from stored state.\n\n## What Belongs\n\nCase should stay focused on the PR loop. A feature belongs when it does at least one of these:\n\n- Makes `ca \u003cissue\u003e` or `ca --agent \u003cissue\u003e` more likely to produce a correct PR.\n- Converts an observed agent failure into a repeatable guardrail.\n- Preserves context isolation, evidence, or resumability.\n- Can be tested hermetically without depending on one user's machine.\n\nCurrent non-goals:\n\n- Generic agent platform features.\n- Local dashboards and webhook services.\n- Human approval browser UI between pipeline phases.\n- Specialized reviewer fleets.\n- Ideation/spec execution as a first-class runtime.\n\nThose ideas may be revisited only after the core PR loop is boringly reliable.\n\n## Setup\n\nRequires [Bun](https://bun.sh) \u003e= 1.0.\n\n```bash\nbun install\nbun link\nca init\n```\n\n`ca init` creates `~/.config/case/` and migrates local state from the repo when run from the case checkout. Re-running it is safe.\n\nBuild a standalone binary:\n\n```bash\nbun run build:binary\ncp dist/ca /usr/local/bin/ca\n```\n\n`build:binary` regenerates the embedded package asset manifest before compiling. The resulting `dist/ca` is portable: agent prompts, docs, playbooks, and AST rules are bundled into the executable. The binary is `ca` because `case` is a reserved word in bash and zsh.\n\n## CLI\n\nPrimary commands:\n\n```bash\nca 1234                 # create or resume a GitHub issue run\nca DX-1234              # create or resume a Linear issue run\nca --agent              # interactive steering session\nca --agent 1234         # steering session with issue context\nca onboard \u003cpath\u003e                    # add a repo to projects.json\nca onboard \u003cpath\u003e --interview        # add a repo with an interactive interview\nca onboard \u003crepo\u003e --re-interview     # re-interview an already-onboarded repo\nca run --task \u003cfile\u003e    # run an existing task JSON\nca watch \u003ctask-slug\u003e    # live-tail the event log\n```\n\nAgent-facing commands:\n\n```bash\nca session \u003crepo-path\u003e --task \u003ctask.json\u003e\nca status \u003ctask.json\u003e [field value...]\nca mark-tested\nca mark-manual-tested\nca mark-reviewed --critical 0\nca update-memory --state \"...\" --approach \"...\" --file \u003cpath\u003e\nca upload \u003cfile\u003e\nca snapshot \u003cagent-name\u003e\nca create --repo \u003cname\u003e --title \u003ctitle\u003e --description \u003ctext\u003e --evidence \u003cexpectations\u003e\nca analyze-failure \u003ctask.json\u003e \u003cagent\u003e \u003cerror\u003e\nca bootstrap \u003crepo\u003e\nca check [--repo \u003crepo\u003e]\n```\n\nCommon flags:\n\n```bash\nca --model claude-opus-4-5 1234\nca run --task \u003cfile\u003e --mode unattended\nca run --task \u003cfile\u003e --dry-run\nca run --fresh 1234\n```\n\n## Storage Layout\n\nPackage-level config lives under `~/.config/case/`. Per-repo runtime state lives under each target repo's ignored `.case/` directory:\n\n```text\n~/.config/case/\n  config.json\n  projects.json\n  agent-versions/\n\n\u003ctarget-repo\u003e/.case/\n  active\n  learnings.md\n  amendments/\n  run-log.jsonl\n  tasks/\n    active/\n      \u003ctask-slug\u003e.md\n      \u003ctask-slug\u003e.task.json\n  \u003ctask-slug\u003e/\n    events/\n    plan.json\n    working-memory.json\n```\n\nOverride the config/cache directory with:\n\n```bash\nCASE_DATA_DIR=/tmp/case-test ca init\n```\n\nStatic package assets are versioned with Case and embedded into the standalone binary: `agents/`, markdown under `docs/`, and text rules under `ast-rules/`. When running from a checkout, disk files win so local prompt/doc edits are picked up immediately; set `CASE_PACKAGE_ROOT=/path/to/case` to force a specific checkout as the disk override.\n\nEach entry in `projects.json` may optionally include `credentials` (per-repo secrets needed for verification) and `verificationNotes` (free-form context the verifier should know about the repo).\n\nFor portable binary installs, keep `projects.json` in `~/.config/case/` via `ca init --projects \u003cpath\u003e` or `ca init --migrate-from \u003ccase-checkout\u003e`. Repo paths in a portable `projects.json` should be absolute or relative to that `projects.json` file.\n\n## Pipeline\n\nThe runtime uses a deterministic TypeScript pipeline executor for phase transitions. The LLMs do the work inside each phase; TypeScript decides which phase runs next.\n\nProfiles:\n\n- `standard`: scout, implement, verify, review, close, retrospective.\n- `tiny`: implement, review, close, retrospective. Use only for docs, typos, and mechanical config changes where independent verification is not useful.\n\nRevision loops are evaluator-driven. A verifier or reviewer rubric failure can send structured feedback back to the implementer. The default revision budget is two cycles. If consecutive cycles produce identical failure fingerprints (SHA-256 of failed categories + error summary), the pipeline aborts early instead of burning the remaining budget.\n\nEvery run writes an append-only event log under `\u003ctarget-repo\u003e/.case/\u003ctask-slug\u003e/events/`. `ca watch \u003ctask-slug\u003e` renders those events while a run is active.\n\nEvery task carries `evidenceExpectations` — the concrete artifacts the verifier must produce. The orchestrator writes these based on the target repo's `evidenceStrategy` so the verifier knows what counts as proof up front.\n\n## Agent Roles\n\n| Agent         | Responsibility                                                       | Does Not Do                         |\n| ------------- | -------------------------------------------------------------------- | ----------------------------------- |\n| Orchestrator¹ | Parses issues, creates tasks, runs baseline, dispatches the pipeline | Implement code                      |\n| Scout         | Explores the target repo read-only and returns structured findings   | Edit code, write files              |\n| Implementer   | Writes the fix, runs automated tests, commits                        | Manual browser testing, PR creation |\n| Verifier      | Tests the specific user-facing scenario and records evidence         | Edit code                           |\n| Reviewer      | Reviews the diff against golden principles and conventions           | Edit code or create PRs             |\n| Closer        | Creates the PR after evidence gates pass                             | Implement or test                   |\n| Retrospective | Records learnings and proposes harness improvements                  | Edit target repo code               |\n\n¹ The orchestrator runs as an LLM agent session via `ca --agent`, or as TypeScript runtime code for direct `ca \u003cissue\u003e` dispatch.\n\nThe key boundary is context isolation. Scout context is read-only exploration of the target repo; its structured findings (relevant files, patterns, test baseline) are synthesized by the orchestrator and injected into the implementer's prompt. Implementer context includes task details, playbooks, repo learnings, scout findings, and revision feedback. Verifier context is intentionally fresher. Reviewer context is focused on the diff and principles.\n\n## Evidence Gates\n\nEvidence markers live under the target repo's `.case/\u003ctask-slug\u003e/` directory:\n\n- `tested`: created by `ca mark-tested` from real test output.\n- `manual-tested`: created by `ca mark-manual-tested` from manual/browser verification evidence.\n- `reviewed`: created by `ca mark-reviewed --critical 0`.\n\nThe closer checks these markers before opening a PR. The point is not ceremony; it is making the PR auditable without trusting a chat transcript.\n\nEach repo declares an `evidenceStrategy` in `projects.json` that drives what the verifier produces:\n\n- `ui-screenshot`: Playwright before/after screenshots for user-facing UI changes.\n- `scenario-script`: a consumer script that exercises the specific user-facing scenario.\n- `test-output`: automated test output only (for libraries and non-UI code).\n\n## Self-Improvement\n\nAfter a run, the retrospective agent should leave the harness smarter:\n\n- Append tactical repo learnings under `\u003ctarget-repo\u003e/.case/learnings.md`.\n- Propose broader harness changes under `\u003ctarget-repo\u003e/.case/amendments/`.\n- Escalate repeated failures into docs, playbooks, conventions, or enforcement.\n\nRetrospective output is constrained. It should not expand the product surface by default. The fix for repeated agent failure is usually a clearer task, a better playbook, a sharper convention, or a mechanical guardrail.\n\n## Model Configuration\n\nConfigure models in `~/.config/case/config.json`:\n\n```json\n{\n  \"$schema\": \"https://raw.githubusercontent.com/workos/case/main/config.schema.json\",\n  \"models\": {\n    \"default\": { \"provider\": \"anthropic\", \"model\": \"claude-sonnet-4-20250514\" },\n    \"reviewer\": { \"provider\": \"google\", \"model\": \"gemini-2.5-pro\" },\n    \"verifier\": null\n  }\n}\n```\n\nPriority:\n\n```text\n--model flag \u003e explicit spawn options \u003e config file \u003e hardcoded default\n```\n\n## Repository Map\n\nTarget repos are listed in `~/.config/case/projects.json` (created by `ca init` + `ca onboard`). The schema is `projects.schema.json` in this repo.\n\nAdd a repo with:\n\n```bash\nca onboard \u003cpath\u003e                    # mechanical probe only\nca onboard \u003cpath\u003e --interview        # mechanical probe + interactive interview\nca onboard \u003crepo\u003e --re-interview     # update an existing entry by re-interviewing\n```\n\n`--interview` runs the interviewer agent after the mechanical probe to capture evidence strategy rationale, verification notes, conventions, and repo-specific learnings. The interview writes the seed `.case/learnings.md` and `CLAUDE.local.md` alongside the `projects.json` entry. `--re-interview` re-runs the interview for an existing repo and replaces its `projects.json` entry in place.\n\nThen add any needed architecture notes under `docs/architecture/` and verify with:\n\n```bash\nca check --repo \u003cname\u003e\n```\n\n## Development Checks\n\nFor case itself:\n\n```bash\nbun run typecheck\nbun test ./src/__tests__/\nbun run lint\nbun run format:check\n```\n\nFor target repos:\n\n```bash\nca bootstrap \u003crepo\u003e\nca check --repo \u003crepo\u003e\n```\n\n## Philosophy\n\nThe short version:\n\n- Humans steer. Agents execute.\n- The harness is the product; target repo code is the output.\n- When agents struggle, fix the harness.\n- Enforce mechanically, not rhetorically.\n- Test the specific fix, not the happy path.\n- Keep the tool small unless reliability demands complexity.\n\nSee [docs/philosophy.md](docs/philosophy.md) for the fuller version.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fworkos%2Fcase","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fworkos%2Fcase","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fworkos%2Fcase/lists"}