{"id":51280917,"url":"https://github.com/ek33450505/attest","last_synced_at":"2026-06-30T01:32:19.746Z","repository":{"id":366459569,"uuid":"1275673790","full_name":"ek33450505/attest","owner":"ek33450505","description":"DONE is a claim, not proof. A local, deterministic, zero-LLM Claude Code hook that verifies a subagent's Status: DONE / ## Handoff claim against the real git working-tree delta — and, opt-in, blocks a DONE whose claimed files never actually landed on disk. It adds no tokens, cannot itself hallucinate, and fails open on every doubt.","archived":false,"fork":false,"pushed_at":"2026-06-21T22:58:43.000Z","size":159,"stargazers_count":0,"open_issues_count":6,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-22T00:16:58.799Z","etag":null,"topics":["agent-reliability","claude-code","claude-code-hooks","developer-tools","llm-agents"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ek33450505.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-06-21T02:14:48.000Z","updated_at":"2026-06-21T22:58:47.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/ek33450505/attest","commit_stats":null,"previous_names":["ek33450505/attest"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/ek33450505/attest","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ek33450505%2Fattest","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ek33450505%2Fattest/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ek33450505%2Fattest/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ek33450505%2Fattest/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ek33450505","download_url":"https://codeload.github.com/ek33450505/attest/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ek33450505%2Fattest/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34949234,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-29T02:00:05.398Z","response_time":58,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent-reliability","claude-code","claude-code-hooks","developer-tools","llm-agents"],"created_at":"2026-06-30T01:32:18.727Z","updated_at":"2026-06-30T01:32:19.669Z","avatar_url":"https://github.com/ek33450505.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# attest\n\n[![CI](https://github.com/ek33450505/attest/actions/workflows/ci.yml/badge.svg)](https://github.com/ek33450505/attest/actions/workflows/ci.yml)\n[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](./LICENSE)\n\n\u003e **\"DONE\" is a claim, not proof. Grade the act, not the output.**\n\nA local, deterministic, **zero-LLM** Claude Code hook that verifies a subagent's\n`Status: DONE` / `## Handoff` claim against the **real git working-tree delta** —\nand, opt-in, **blocks** a `DONE` whose claimed files never actually landed on disk.\nIt adds no tokens, cannot itself hallucinate, and **fails open on every doubt**.\n\n## The insight\n\nEvery eval and observability tool grades the *output* or asks the model \"are you done?\" —\nself-report, the **one signal you cannot take on trust**. The git tree is the only ground\ntruth. So Attest verifies the *act*, not the output: it diffs the working tree before and\nafter the subagent runs and checks whether the files a `DONE` claims it changed actually\nchanged. Deterministic, read-only, and — because it never calls a model — it cannot\nfabricate its own verdict.\n\nThe real target is not a lying agent (well-trained agents resist that). It is the\n**silent write-failure**: a `Write` tool call that returns success but never lands on disk,\nbehind a confident `Status: DONE`.\n\n## Evidence\n\nAttest was built proof-first and validated against **real Claude Code v2.1.170** — not mocks.\n\n- **325 tests** — 304 Python (`unittest`) + 21 BATS (16 in `tests/hooks.bats`, 5 in\n  `tests/install.bats`). The Python suite runs green (`Ran 304 tests … OK`).\n- **Real captured payloads ship in the repo.** Four sanitized `SubagentStart`/`SubagentStop`\n  fixtures plus a transcript sample live in [`fixtures/`](./fixtures/) and are pinned byte-for-byte\n  by `tests/test_real_fixtures.py` — including the load-bearing safety case: an *honest* subagent\n  that created nothing and explained why in prose (mentioning a path, a `files_changed:` line, and\n  even the word `DONE`) from which the conservative parser correctly extracts **zero** claimed files.\n- **A live empirical battery on real Claude Code.** End-to-end `claude -p` dispatches confirmed\n  the boundary cases: an honest agent that changed nothing was correctly **not** blocked; a\n  multi-line `Status: DONE` with a real file was allowed and parsed; and a genuine false `DONE`\n  was **blocked — and the blocked subagent self-corrected.**\n\n\u003e **Honesty about that last result:** the self-correcting block is **non-deterministic** —\n\u003e well-trained agents resist fabricating claims, so the live \"lie → block → fix\" path can't be\n\u003e relied on to reproduce. The **deterministic** proof of blocking is the *mechanism test*\n\u003e (below) plus the unit suites (`tests/test_hook.py`, `tests/test_enforce.py`). Enforcement is\n\u003e **off by default**.\n\nSee **[docs/VALIDATION.md](./docs/VALIDATION.md)** for the full evidence dossier, and\n**[scripts/live-capture-test.sh](./scripts/live-capture-test.sh)** to re-run the capture\nharness yourself against your own Claude Code install.\n\n## How it works\n\nTwo hooks, three pure layers, one source of truth (git):\n\n1. **`SubagentStart`** snapshots the git working tree — `{path: sha256}` for every file that\n   differs from `HEAD` (modified / added / untracked / deleted).\n2. **`SubagentStop`** recomputes the delta, parses the subagent's final claim\n   (`## Handoff` block first, then an anchored `Status:` / `Files changed:` fallback — *never*\n   scraped from prose), and evaluates whether it is a **proven false `DONE`**: status `DONE`,\n   a claim actually present, and a claimed file that is absent from the delta **and** not\n   present on disk.\n3. In **enforce mode** only, a proven false `DONE` is **blocked** — Claude Code feeds the reason\n   back and the same subagent is forced to **continue and fix it**.\n\nThe claim parser is conservative by construction: a missing or prose-only claim yields\n`status=None` and is **never** treated as a false `DONE`. A path mentioned in prose never\nbecomes a claimed file.\n\n### What the report looks like\n\nIn detect mode (the default), the stop hook prints to stdout after every subagent completes:\n\n```text\nattest: stop: \u003ckey\u003e: CLAIMED [a.py] OBSERVED [a.py] -\u003e OK [source=payload]\nattest: stop: \u003ckey\u003e: CLAIMED [a.py, b.py] OBSERVED [a.py] -\u003e MISMATCH: b.py claimed-but-unchanged (would block in enforce mode) [source=payload]\nattest: stop: \u003ckey\u003e: CLAIMED [a.py] OBSERVED [a.py, c.py] -\u003e SCOPE_CREEP: c.py observed-but-unclaimed [source=payload]\nattest: stop: \u003ckey\u003e: claim source=none — cannot verify (never treating as false DONE)\n```\n\n`\u003ckey\u003e` is the agent identifier; `source=payload` or `source=transcript` shows where the\nclaim was read from. In enforce mode (`ATTEST_ENFORCE=1`) the human-readable lines move to\nstderr and the `(would block in enforce mode)` cases become real blocks.\n\n## Install\n\nHooks take effect on a **new session** — restart Claude Code after installing. The\n`SubagentStop` hook **must be synchronous** (`async: false`); an async hook cannot block.\n\n**As a Claude Code plugin (recommended):**\n\n```bash\n/plugin marketplace add https://github.com/ek33450505/attest\n/plugin install attest@attest\n```\n\n**Via Homebrew (CLI only — hooks still need the plugin or `install.sh`):**\n\n```bash\nbrew tap ek33450505/attest\nbrew install attest\n```\n\n**Manually** — `install.sh` idempotently merges the two hooks into\n`~/.claude/settings.json` (backing it up first), preserving any existing hooks:\n\n```bash\n./install.sh            # install (async:false, off by default)\n./install.sh --uninstall # remove only attest's entries\n```\n\nFull details, overrides, and uninstall semantics: **[docs/INSTALL.md](./docs/INSTALL.md)**.\n\n## Usage (CLI)\n\nThe CLI runs the same deterministic core standalone (stdlib only, Python 3):\n\n```bash\n# Take a before-snapshot of the working tree\npython -m attest snapshot --repo /path/to/repo \u003e before.json\n\n# (run your agent / make changes)\n\n# Verify a completion claim against the snapshot + current tree\npython -m attest verify --claim-file agent-output.md --before before.json --repo /path/to/repo\n\n# Version\npython -m attest --version\n```\n\n`verify` prints the verdict as JSON; it exits 0 on success (and `1` on input\nerrors such as a missing file or a non-git repo) and never emits a block decision.\n\n## Enforcement (opt-in)\n\nEnforcement is **off by default**. Set `ATTEST_ENFORCE=1` to let Attest **block** a proven\nfalse `DONE`. It blocks **only when every one** of these holds — any doubt returns *allow*\nwith a `reason_code`:\n\n- enforcement is on (`ATTEST_ENFORCE=1`);\n- the stopping agent's `agent_type` is in `ATTEST_ENFORCE_AGENTS` — when that allowlist is set (empty/unset imposes no agent-type restriction);\n- a unique `agent_id` is present;\n- the claim is a **refined** false `DONE` (status `DONE`, claim present, and a claimed file\n  that is absent from the git delta **and** not on disk under any resolution);\n- the git delta is **reliable** — both snapshots read git without error;\n- the tree was **clean** at the agent's start (the delta is cleanly attributable);\n- the per-agent retry cap is not yet reached (`block_count \u003c ATTEST_MAX_RETRIES`);\n- the session-wide backstop is not yet reached (`session_blocks \u003c ATTEST_SESSION_BLOCK_CEILING`);\n- `stop_hook_active` is not set (a subtractive fast-path — it can only *suppress* a block).\n\nOtherwise: **allow**.\n\n**It never blocks on doubt.** A non-git directory, a git error, a dirty start tree, a claimed\nfile already present on disk, a basename-matched changed file, a `.gitignore`'d or identical\nrewrite, a missing/prose-only claim, a missing snapshot, an absent `agent_id`, a failed counter\nwrite, or any internal exception — all **allow** the stop. Fail-open on doubt; fail-closed only\non proof.\n\n**It cannot loop.** A per-agent retry cap and a session-wide ceiling bound retries, and a block\nis emitted **only after both counters durably commit** — if either write can't be confirmed, the\nblock is suppressed rather than repeated (an *unrecorded* block is what loops). The hook shim\nalways exits 0; the block travels via pure stdout JSON\n(`{\"decision\":\"block\",\"reason\":\"…\"}`), never the exit code.\n\n| Env var | Default | Meaning |\n| --- | --- | --- |\n| `ATTEST_ENFORCE` | off | `1` enables blocking; anything else (unset/`0`/`true`/`yes`) is detect-only |\n| `ATTEST_ENFORCE_AGENTS` | (unset) | comma-separated `agent_type` allowlist; empty/unset = all agents eligible. When set (e.g. `code-writer,bash-specialist`), only those agent types are ever blocked — every other agent fails open. |\n| `ATTEST_MAX_RETRIES` | `1` | per-agent blocks before failing open (`0` = on but never blocks — kill switch) |\n| `ATTEST_SESSION_BLOCK_CEILING` | `10` | session-wide block backstop |\n| `ATTEST_STATE_DB` | `~/.attest/state.db` | snapshot + counter store |\n| `ATTEST_CAPTURE` | off | `1` dumps real payloads + transcripts (normalized + raw) to `fixtures/captured/` |\n| `ATTEST_CAPTURE_DIR` | — | redirect capture writes (keeps dumps out of the repo tree) |\n| `ATTEST_PYTHON` | `python3` | python binary the hook shims invoke |\n| `ATTEST_SETTINGS` | `~/.claude/settings.json` | `install.sh` only — target settings file |\n| `CAST_DB_PATH` | `~/.claude/cast.db` | mirror verdicts to a CAST `attestations` table — only if that DB file already exists (best-effort) |\n\n## The headline finding\n\n\u003e **The official Claude Code docs mark `SubagentStop` as non-blocking.** On **v2.1.170**, that\n\u003e is empirically false: a **synchronous** (`async:false`) `SubagentStop` command hook whose sole\n\u003e stdout is `{\"decision\":\"block\",\"reason\":…}` (exit 0) **does** force the subagent to continue.\n\u003e\n\u003e Proven by a deterministic **mechanism test** — a hook that blocks unconditionally exactly once on\n\u003e a trivial subagent produced `START → STOP(stop_hook_active=false) → [block] → STOP(stop_hook_active=true)`:\n\u003e one start, two stops. The flag flips `true` *only* because the framework continued a blocked agent.\n\u003e That is Attest proving its own thesis — documentation is a claim; the running system is ground truth.\n\u003e\n\u003e **Caveat:** this behavior is undocumented. `async:false` is required (async voids the block), and a\n\u003e future Claude Code version could drop it — in which case Attest degrades to detect-only / fail-open.\n\u003e Details: **[docs/VALIDATION.md](./docs/VALIDATION.md)**.\n\n## What it does not do\n\nAttest checks **one** thing deterministically: did the files a `DONE` claims it changed actually\nland in the git tree. It does **not** judge correctness, run your tests, or detect semantic\nwrongness. (`ran_tests` is parsed but informational — it never gates a block.)\n\n## More\n\n- **Limitations \u0026 honest trade-offs** — [docs/LIMITATIONS.md](./docs/LIMITATIONS.md)\n- **Architecture \u0026 design rationale** — [docs/DESIGN.md](./docs/DESIGN.md)\n- **License** — MIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fek33450505%2Fattest","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fek33450505%2Fattest","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fek33450505%2Fattest/lists"}