{"id":51436845,"url":"https://github.com/claude-code-expert/carve-harness","last_synced_at":"2026-07-05T07:07:35.574Z","repository":{"id":361538965,"uuid":"1254845203","full_name":"claude-code-expert/carve-harness","owner":"claude-code-expert","description":"Harness Engineering for Project","archived":false,"fork":false,"pushed_at":"2026-06-15T07:12:29.000Z","size":13869,"stargazers_count":7,"open_issues_count":0,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-15T09:06:03.508Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/claude-code-expert.png","metadata":{"files":{"readme":"README.en.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-31T04:20:07.000Z","updated_at":"2026-06-15T07:12:29.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/claude-code-expert/carve-harness","commit_stats":null,"previous_names":["claude-code-expert/carve-harness"],"tags_count":11,"template":false,"template_full_name":null,"purl":"pkg:github/claude-code-expert/carve-harness","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/claude-code-expert%2Fcarve-harness","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/claude-code-expert%2Fcarve-harness/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/claude-code-expert%2Fcarve-harness/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/claude-code-expert%2Fcarve-harness/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/claude-code-expert","download_url":"https://codeload.github.com/claude-code-expert/carve-harness/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/claude-code-expert%2Fcarve-harness/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":35146128,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-07-05T02:00:06.290Z","response_time":100,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-07-05T07:07:30.489Z","updated_at":"2026-07-05T07:07:35.116Z","avatar_url":"https://github.com/claude-code-expert.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"docs/carve-banner.svg\" alt=\"carve-harness — carve away the excess, keep the craft\" width=\"680\"\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\u003cb\u003eKeep what's essential. Carve away the rest.\u003c/b\u003e\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\u003ca href=\"./README.md\"\u003e한국어\u003c/a\u003e · \u003cb\u003eEnglish\u003c/b\u003e\u003c/p\u003e\n\n## Update\n\u003e **Changelog** — latest 3 shown; full history in [CHANGELOG.md](CHANGELOG.md)\n\u003e - `2026-06-21` **v1.8.0** — Removed skill double slash-exposure (`/\u003cid\u003e`+`/carve-\u003cid\u003e`): prefixed all 16 skill directories to `carve-\u003cid\u003e` so each collapses to a single `/carve-\u003cid\u003e` slash (built-in collision class gone), and retired the command-shim layer (skills support `$ARGUMENTS` natively). Catalog ids stay bare; re-running `carve install` auto-cleans the old paths + legacy shims from prior installs (hash-guarded)\n\u003e - `2026-06-15` **v1.7.0** — Procedure-enforcement skill (`workflow`·`/carve-workflow`): a Fable5-style 7-step loop (goal→decompose→criteria→assumptions→execute→verify→risks) + minimal output format · completion gate · escalation. Conducts existing assets (iterate·sprint-contract); net-new principles (distrust·externalized-decisions·scope-convergence) are injected always-on into CLAUDE.md·sprint-contract·squad-evaluator\n\u003e - `2026-06-15` **v1.6.0** — v2.0 roadmap M12 (closed-loop feedback: extract `src/metrics.ts` aggregation + designer demote suggestions + surface in `carve update`/`report`, recommendations unchanged) · M11 Phase A (bench measurement infra: `collect.mjs`·`gen-fixture.mjs`·`test-trigger.sh` trigger 17/17·`report.mjs` axes 3·4) · first publish since 1.4.1, so it also delivers v1.5.0's 7-component deletion + orphan auto-clean\n\n# carve-harness\n\n**carve-harness** is a harness-engineering tool for developers that assembles only the essentials for development — nothing more.\n\n\u003e A CLI that analyzes a project and interactively selects and installs a harness (skills, hooks, subagents) tailored to that project.\n\n**v1.8.0** · TypeScript (ESM, no build step) · Node \u003e=22.18 · 289 tests / ~88.5% coverage\n\n`carve` reads the codebase to detect the project type and tooling, then recommends suitable components.\nIt installs only what the user selects into `.claude/`. carve = carving general-purpose assets down to fit a project.\n\nOptimize per project: say **\"set up a harness for this project\"** to compose the full harness (with optional\ncomponent selection), use `harness-audit` to actually check the integrity of the current install, or use\n`harness-architect` to pick and prune components to fit the project.\n\nThe core behavior is verified in a single line:\n\n```\ncarve install → stack detection → (component selection) → generate assets into .claude/\n              → the generated verification hooks deterministically block dangerous commands with exit code 2\n```\n\n## Features\n\n- Token efficiency built in: codesight (structure-map MCP) and LSP (cclsp MCP) are auto-registered at install time — precise search instead of grep, with no separate installation.\n- Deterministic safety: dangerous commands (`rm -rf /`, fork bombs) and secret files (`.env`, keys) are forcibly blocked with exit code 2 (not just advisory).\n- Tailored selective installation: detection → recommendation → user selection. No bulk install; idempotent reinstall and clean removal.\n- anti-slop generation: AI slop in HTML, SVG, and documents is gated by a linter.\n- 100% preservation of Squad subagents: 9 specialists (+evaluator) with keyword routing and chaining.\n- Self-verification: before installing, the auditor scans generated assets for secrets, excessive permissions, hook injection, and shell syntax.\n- Zero build: runs `.ts` directly. Distributed via both npx and bash.\n\n## Cheatsheet — all commands\n\n\u003e An at-a-glance index of every command. For step-by-step install/removal see the **Install** and **Removal** sections below; for scenario usage see **Going deeper**. Flags not in the table don't exist (don't invent options).\n\n### CLI commands (global install)\n\n| Command | What it does | Key options |\n|---------|--------------|-------------|\n| `carve install` | Interactive selective install (detect → recommend → select; no bulk) | `--level \u003cminimal\\|standard\\|full\u003e` · `--only a,b` · `--lsp-servers` |\n| `carve init-claude` | Generate CLAUDE.md baseline + stack rules | `--lang \u003cen-ko\\|en\\|ko\u003e` (default `en-ko`) |\n| `carve list` | List installable / installed components | — |\n| `carve doctor` | Audit the install (security, permissions, shell syntax) | — |\n| `carve diff` | 3-way compare installed vs current carve assets (read-only) | — |\n| `carve update` | Refresh carve-updated assets in place; your edits kept as `.bak` | `--force` · `--yes` |\n| `carve migrate` | Promote carve-manifest schema v1 → v2 (lossless) | — |\n| `carve report` | Aggregate what the installed hooks actually blocked (opt-in, no network) | — |\n| `carve uninstall` | Clean removal (carve files only · restore `.bak` · preserve your settings) | — |\n| `carve --version` | Print version | `-v` |\n| `carve --help` | Print help | `-h` |\n\n### In a session (natural language or slash)\n\n| What you want | How to call |\n|---|---|\n| Commit message | \"write a commit message\" · `/carve-commit` |\n| Code review | \"review this\" · `/squad review` |\n| Session handoff | \"handoff\" · `/carve-handoff` |\n| Verify loop (`build→lint→test→typecheck`) | \"run the verify loop\" · `/verify` |\n| Auto-fix until green | \"fix it until it passes\" · `iterate` |\n| Procedural completion of long tasks (7 steps) | `/carve-workflow` |\n| Slop-free HTML / docs | \"make a slop-free html\" (gated by `check-slop` after generation) |\n| Call a Squad specialist | `/squad \u003cmember\u003e` — review · plan · refactor · qa · debug · docs · gitops · audit · evaluator |\n\n\u003e **Automatic hooks (no need to call)**: `block-destructive`·`protect-secrets` deterministically block dangerous commands / secret files with `exit 2`; `pre-commit-lint`·`pre-push-test` enforce before commit/push; `auto-format` runs after save; `precompact-handoff` preserves state before compaction.\n\n## Install — global install recommended\n\n\u003e Full manual: [INSTALL.md](./INSTALL.md) (Korean) · [INSTALL.en.md](./INSTALL.en.md) (English) — requirements, modes, troubleshooting.\n\nInstall `carve` as a permanent command and every `carve …` in these docs (especially the recurring `update`/`diff`/`doctor`) works as written. **For the full feature set, a global install is recommended.**\n\n\u003e **Two distinct installs**: `npm i -g carve-harness` installs the **carve CLI (the tool)** on your system once; `carve install` uses that CLI to install the **harness into the current project** (`.claude/`). The former is once per machine; the latter is per project.\n\n### 1. First install\n\n```bash\nnpm i -g carve-harness         # install the carve CLI (the tool) — once per machine\ncarve install                  # interactive selective install into the current project (detect → recommend → select)\ncarve init-claude              # CLAUDE.md baseline + language stack rules\ncarve doctor                   # inspect install (config, hook syntax)\n```\n\nAfter install, just **open Claude Code** in the project: hooks and MCP (codesight·LSP) turn on automatically, and skills/Squad are called via natural language or `/carve-\u003cname\u003e`.\n\n\u003e **Just trying it once (no global)**: `npx carve-harness@latest install` (prefix every run with `npx carve-harness@latest \u003ccmd\u003e`). But `npx` is one-shot — it leaves no `carve` on PATH, which is awkward for recurring commands like `update`/`uninstall`, so **`npm i -g` is recommended for full use**. (repo-clone/`curl` users can use the `install.sh` wrapper — see [INSTALL.md](./INSTALL.md).)\n\n### 2. Install options (non-interactive · forced selection)\n\nSkip the wizard and pin the level/components with flags. (For what the levels mean, see **Install levels** below.)\n\n```bash\ncarve install --level full                   # Force level (minimal|standard|full)\ncarve install --only commit,handoff,tdd      # Explicit selection (no bulk install)\ncarve install --lsp-servers                  # Auto-install LSP language servers\n```\n\n### 3. Update\n\nThe CLI (the tool) and the project harness are updated separately.\n\n```bash\nnpm i -g carve-harness@latest   # 1. update the carve CLI (the tool)\ncarve diff                      # 2. (optional) 3-way compare installed vs latest assets (read-only)\ncarve update                    # 3. update the project harness — carve-updated files only, your edits kept as .bak (--force · --yes)\n```\n\n#### Existing users (v1.3.4 hook-path fix)\n\nProjects installed with v1.3.3 or earlier have hook commands written as **relative paths** (`bash .claude/hooks/carve-*.sh`) in settings.json. When Claude Code runs a hook from a directory other than the project root, they fail with `No such file or directory`. From v1.3.4 they are registered with absolute paths (`$CLAUDE_PROJECT_DIR`).\n\nIf you already installed, fix it either way:\n\n```bash\n# Recommended — auto-migrate, no reinstall\nnpm i -g carve-harness@latest   # 1. get the fixed carve CLI (migration won't run without this)\ncarve update                    # 2. one-time in-place rewrite of carve hooks to absolute paths (idempotent · user hooks untouched)\n\n# Or — clean reinstall\ncarve uninstall \u0026\u0026 carve install\n```\n\n## Removal\n\nRemove just the harness, or the CLI (the tool) too.\n\n```bash\ncarve uninstall                 # 1. remove the project harness — carve files only · restore .bak · preserve your settings\nnpm uninstall -g carve-harness  # 2. (optional) remove the carve CLI (the tool)\n```\n\n## Daily workflow — natural language in a session\n\nFour commands cover everyday use. Call them in natural language or via `/carve-\u003cname\u003e`.\n\n| What you want | How to call | What it does |\n|---|---|---|\n| Commit message | \"write a commit message\" · `/carve-commit` | Generates a Conventional Commit (quick inline) |\n| Code review | \"review this\" · `/squad review` | Keyword auto-delegation → squad-review |\n| Session handoff | \"handoff\" · `/carve-handoff` | Leaves progress/decisions/next steps for the next session |\n| Remember a decision | \"remember this\" · `/memory` | Claude Code built-in memory |\n\nAnd the **block / protect / format hooks run automatically — no need to call them**: dangerous commands (`rm -rf /`, fork bombs) and secret files (`.env`, keys) are stopped with `exit 2` (not advisory), the formatter runs on save, and lint/test are enforced before commit/push.\n\n\u003e This is \"easy carve\" — 3 lines to install, 4 for daily use. Go below only when you want to dig deeper.\n\n## Going deeper — commands by scenario (the more you use, the more advanced)\n\nAdd them one at a time as needed. All are part of the install (by level) and cost no context until you call them (on-demand loading). Two ways to call: **skills** via natural language or `/carve-\u003cname\u003e`; **Squad specialists** via `/squad \u003cmember\u003e` or `/squad-\u003cmember\u003e` (the `squad-…` entries below are those members).\n\n**Code quality \u0026 verification**\n- `/verify` (Claude Code built-in) — `build→lint→test→typecheck` in one go (\"run the verify loop\")\n- `iterate` — diagnose→fix→re-run until tests are green, report only the final result (\"fix it until it passes\")\n- `workflow` (`/carve-workflow`) — redefine goal→decompose→criteria→assumptions→execute→verify→risks: a 7-step procedure for long-running tasks (\"run it as a procedure\", \"Fable\")\n- `squad-refactor` extract/simplify · `squad-debug` root cause · `squad-evaluator` independent evaluation against completion criteria (Self-Eval Blindspot)\n\n**Testing**\n- `test-gen` tests from UAT criteria · `tdd` red-green-refactor first · `squad-qa` test execution \u0026 QA report\n\n**Security**\n- `squad-audit` security audit \u0026 vulnerability scan *(the `security-scan` skill is deprecated as of v1.4.0 — consolidated into squad-audit)*\n\n**Release \u0026 collaboration**\n- `squad-gitops` commits/PRs/changelog · `squad-docs` docs · `squad-plan` planning/user stories *(PR body via built-in `/pr` or squad-gitops — the `pr`·`changelog` skills are inactive)*\n\n**Docs \u0026 visuals (anti-slop)**\n- When generating HTML, SVG, card news, reports, and slides, AI slop (gradients, glow, watermarks, etc.) is removed and `check-slop` gates deterministically (\"make a slop-free html\")\n\n**Multi-agent \u0026 cost**\n- `parallel-agents` 3–4 in parallel + git worktree isolation · `model-route` Haiku/Sonnet/Opus routing · `evaluator-tuning` few-shot evaluator correction (optional — install by explicit selection) *(`coordinator` is deprecated — parallel-agents suffices)*\n\n**Other helpers** — `caveman` ultra-compression (~75% fewer tokens) · `write-a-skill` skill scaffolding · `zoom-out` system-level view. *(tdd, caveman, write-a-skill, zoom-out are rewrites of [mattpocock/skills](https://github.com/mattpocock/skills) patterns, MIT)*\n\n**9 Squad specialists** — call via `/squad \u003cmember\u003e [task]` (e.g. `/squad review`) or directly `/squad-\u003cmember\u003e` (e.g. `/squad-refactor src/`); keyword auto-delegation also works. Members: review · plan · refactor · qa · debug · docs · gitops · audit · evaluator (the actual agent name is `squad-\u003cmember\u003e`).\n\n**Automatic hooks (event-based · no need to call)** — blocking hooks are deterministic `exit 2`, not advisory:\n`block-destructive` (dangerous commands) · `protect-secrets` (.env/keys) · `pre-commit-lint` (before commit) · `pre-push-test` (before push) · `auto-format` (after save) · `precompact-handoff` (persists state before compaction) · `slack-notify` (at session end, if a webhook is set) · `auto-commit` (optional, OFF by default).\n\n### Harness lifecycle management (CLI)\n\nCommands for managing the harness itself after install. You can ignore these day-to-day — reach for them only when a new carve ships or you want to inspect the install.\n\n```bash\ncarve list      # List installable / installed components\ncarve diff      # 3-way compare installed assets vs current carve assets (read-only)\ncarve update    # Refresh carve-updated assets in place, preserving your edits (--force · --yes)\ncarve migrate   # Promote carve-manifest schema v1 → v2\ncarve report    # Aggregate what the installed hooks actually blocked (opt-in, no network)\ncarve uninstall # Clean removal — removes only carve files, restores .bak, preserves your settings\n```\n\nWithin a session, the `harness-audit` skill checks install integrity (hook registration, shell syntax, assets).\n\n## Install levels (auto-determined by profile, forceable with `--level`)\n\nCore skills, the 9 Squad agents, and anti-slop are recommended at *every level*; what changes by level is the **number of hooks and additional skills**.\n- `minimal` — small CLI/library/batch: core + 9 Squad + anti-slop + **3 essential hooks** (block, protect, handoff)\n- `standard` (default) — general apps: minimal + **the remaining core hooks** (7 total: +lint, test, format, Slack)\n- `full` — standard + **additional skills** (iterate, test-gen, tdd, parallel-agents, model-route, etc. — deprecated/hidden components are excluded automatically)\n\nFor the force-level (`--level`), explicit-selection (`--only`), and LSP-auto-install commands, see **Install → Install options** above.\n\n\u003e The score (the number in parentheses in `carve list`, ≥75) is carve's internal usefulness assessment. For per-level defaults and the full component list, see [INSTALL.en.md](./INSTALL.en.md).\n\nSupported projects: CLI · web · mobile · responsive · desktop · batch.\n\n## CLAUDE.md baseline + stack rules (`carve init-claude`)\n\nAfter installation, running `carve init-claude` carves out a working-guideline baseline and per-language stack rules.\n\n- `.claude/CLAUDE.md` — stack-agnostic baseline: think before coding, simplicity, surgical changes, TDD, commit discipline, response control, hallucination guard, safety guardrails.\n- `.claude/rules/*.md` — 6 best-practice rules for the detected language (`techstack`, `project-structure`, `commands`, `code-style`, `safety`, `gotchas`) + a stack-agnostic `anti-ai-slop` rule (visual/document slop prevention).\n- The root `CLAUDE.md` is automatically wired to `@import` these (idempotent). They are loaded together each session.\n\nThe stack is auto-selected by the detected language (TypeScript/JavaScript, Python, Go, Rust, Java, Dart; otherwise `_default`). Package manager and test/lint commands are substituted with values detected from the project. Within a session, the harness-architect skill guides the same flow as the \"CLAUDE.md setup\" step.\n\n## anti-slop visual and document generation\n\nWhen creating HTML, SVG, card news, reports, slides, and documents, it removes AI-characteristic decorations (gradients, glow/colored shadows,\nglassmorphism, motion decorations, watermarks, marketing boilerplate) and builds hierarchy through size, spacing, alignment, and typography.\nThe rules are injected by the skill before generation, and after generation the `check-slop.mjs` linter checks deterministically.\nA script gates it — not the model's eyeballing (warning mode, with an exception path for intentional use).\n\n## Token efficiency (built in)\n\ncodesight and LSP are auto-registered at install time, so token-efficient search applies without the user installing anything separately.\n\n- codesight MCP: pre-maps the project structure (routes, schemas, dependencies) → eliminates the cost of repeated grep searches (measured ~11x on average for large codebases).\n- LSP (cclsp MCP): precise search via `findReferences`/`getDiagnostics` → about 500 tokens instead of 2,000+ tokens for grep.\n- Guidance is added to `flight-rules.md` and `CLAUDE.md` so that all skills and Squad subagents prefer these over grep.\n- Language server binaries are auto-installed for the detected language during interactive install (or `carve install --lsp-servers`).\n\n\u003e Verification of savings figures via large fixture benchmarks is planned. For small one-off tasks, the effect may be small due to the fixed MCP cost.\n\n## Safety\n\n- Dangerous commands (`rm -rf /`, fork bombs, etc.) and secret files (`.env`, keys) are blocked by PreToolUse hooks with exit code 2.\n- Lint before commit and test before push are enforced.\n- Before installation, the auditor scans generated assets for secret exposure, excessive permissions, and hook injection (must pass to install).\n\n## Architecture\n\n```\nanalyzer → catalog → (wizard selection) → designer → generator → auditor → installer\n```\n\nIt distinguishes two layers: Layer A is the carve CLI itself (`bin/`, `src/`, `assets/`, `vendor/`),\nand Layer B is the artifacts carve installs into the target project (`\u003cproject\u003e/.claude/`).\nFor details see [ARCHITECTURE.md](./ARCHITECTURE.md), and for requirements see [requirement.md](./requirement.md).\n\n## Development\n\nWritten in TypeScript (ESM), but with no build step during development. It runs `.ts` directly via type stripping in Node \u003e=22.18.\n(For distribution, type stripping is blocked in `node_modules`, so `prepack` compiles `.ts`→`.js` and ships it.)\n\n```bash\nnpm test          # Unit + E2E (node --test)\nnpm run test:cov  # Coverage gate (\u003e=80)\nnpm run check     # Type check (tsc --noEmit)\nnpm run build     # Distribution compile (tsconfig.build.json, in-place .js)\n```\n\nMilestone progress log: [docs/milestones/](./docs/milestones/)\n\n## Release (npm publishing)\n\nPublishing is **automatically published from main by GitHub Actions when a version tag (`vX.Y.Z`) is pushed** (`.github/workflows/release.yml`).\n`npm publish` automatically runs `prepublishOnly` (type check + tests) and `prepack` (build), so it will not publish if tests fail.\n\nFor the full order (develop development → main promotion → tag publishing), see **[docs/release/RELEASE.md](./docs/release/RELEASE.md)**.\n\n## Quantitative evaluation (internal measurement)\n\nInternally measured against 6 axes ([carve-harness-benchmark-criteria.md](./docs/guide/carve-harness-benchmark-criteria.md)).\nDeterministic items are reproduced via `node bench/run.mjs`. Measured 2026-05-31 · **at v1.1.0** (not yet re-measured for later versions; the measured axes are architecture-level, so largely still valid — figure re-validation is planned).\n\n**Evaluation axes**\n\n| Axis | What is measured | carve differentiator |\n|----|-----------|-------------|\n| 1. Speed/efficiency | Token, time, $, KV-cache, context injection cost | ★ Core — \"carved lightweight\" |\n| 2. Control/safety | Block accuracy, permission leak rate, false blocks, determinism | Deterministic hooks vs advisory (0% vs N% leakage) |\n| 3. Prompt verification | Trigger accuracy, false firing, routing, instruction adherence | Borrows the Squad test-router pattern |\n| 4. Context verification | Occupancy, compaction retention rate, premature completion, on-demand loading | Adheres to the 40% rule |\n| 5. Functional E2E | Skill firing, hook triggering, E2E pass, regression safety | Playwright verification |\n| 6. Composition quality | Composition accuracy (F1), over-generation, omission, idempotency, audit | ★ carve-specific — competing harnesses have nothing to measure here at all |\n\n**Measurement results**\n\n| Axis | Score | Measured value |\n|----|:--:|--------|\n| 1. Speed/efficiency | Pending | Install footprint full 49 → minimal selection 7 files (**85.7% reduction**) |\n| 2. Control/safety | **100** | Block 100% · leakage 0% · false block 0% · determinism 100% |\n| 3. Prompt verification | **100** | Keyword routing 100% · false firing 0% |\n| 4. Context verification | Pending | 14 on-demand skills split into individual files |\n| 5. Functional E2E | **100** | Tests 96/96 · hook firing 8/8 |\n| 6. Composition quality | **100** | Type-detection F1 100% · audit 0 issues · idempotency 100% · no over-generation |\n\n### Score rationale (why these results)\n\n- **2. Control/safety = 100**: 13 dangerous seeds (8 destructive commands + 5 secret files) were injected and all blocked with `exit 2` (block 100%, leakage 0%);\n  9 safe seeds had 0% false blocks, and `rm -rf /` repeated 5 times was blocked every time (determinism 100%). Because they are **deterministic code hooks** rather than advisory, leakage is structurally 0.\n- **3. Prompt verification = 100**: 8 keyword seeds (review, test, debug, security, refactoring, planning, docs, commit) were fed into the Squad router and\n  all delegated to the correct agent (routing 100%), with 0% false firing for 3 non-trigger seeds. (Instruction adherence requires an LLM session → only routing and false firing measured.)\n- **5. Functional E2E = 100**: all 96 unit+E2E tests pass, with syntax and `exit code` firing verified for 8 hooks. Includes PoC pass scenarios.\n  (Playwright live-app verification was replaced with harness-behavior E2E since there is no target app.)\n- **6. Composition quality = 100**: type-detection F1 100% across 5 fixtures (cli/web/mobile/desktop/batch), 0 auditor ERRORs in generated assets,\n  identical `settings.json` on reinstall (idempotency 100%), and no over-generation since only what was chosen with `--only` is installed.\n- **1. Speed/efficiency = Pending**: the structural basis of the carving effect (recommended 49 files → minimal selection 7 files, 85.7% reduction) was measured, but\n  the core metrics (token, time, $, KV-cache) require running the same task with other harnesses via an LLM to compare → score pending.\n- **4. Context = Pending**: the on-demand loading structure (14 skills in individual files) was measured, but occupancy, the 40% rule, compaction retention, and premature completion\n  require live-session measurement → score pending.\n\n\u003e Honest disclosure: the self-measurable axes 2, 3, 5, and 6 are deterministically full marks. The comparative/live metrics of axes 1 and 4 are\n\u003e withheld without estimation (criteria §10). Proving comparative advantage still requires running `bench/` against other harnesses.\n\u003e Per-metric one-line evaluation table: [carve-harness-benchmark-results.md](./docs/guide/carve-harness-benchmark-results.md).\n\n### Live cross-harness measurement (n=5, CRUD, same model, `claude -p`)\n\n| harness | $/task (median) | tokens (median) | E2E success | leak rate (axis 2) |\n|---------|:--:|:--:|:--:|:--:|\n| no-harness | $0.101 | 3,554 | 5/5 | 100% |\n| squad | $0.148 | 6,106 | 5/5 | 100% |\n| **carve** | $0.159 | 7,076 | 5/5 | **0%** |\n| ecc | $0.382 | 13,314 | 5/5 | — |\n\n\u003e **Scope/interpretation**: this is a measurement of one-off, simple CRUD (n=5) in a small project. In this range the fixed harness overhead (context, MCP) is large, so **it is normal for the harness to be at a disadvantage on token/$** (small projects rarely show token gains from a harness), and the token advantage appears in **medium-to-large codebases**. What carve proves here is not token savings but **safety (0% leakage), equal success, and being lighter than ECC**. Leak rate = the proportion of dangerous commands that pass unblocked (only carve has a deterministic blocking hook; ecc's \"—\" is not comparable due to a different mechanism).\n\n- **carve vs ECC**: cost **58%↓** · tokens **47%↓** · equal success (5/5) — ECC injects globally (129 skills + rules), while carve carves and installs only what is needed → empirically proving \"tailored lightweight.\"\n- **carve vs no-harness**: for one-off CRUD, context injection raises cost (1.57×), but carve's advantage is not tokens but **safety (0% leakage vs 100% — only carve has deterministic hooks)**.\n- The v1.0 codesight/LSP token-efficiency savings are an addition made after the above measurement, to be re-measured with large fixtures (small one-offs show little effect due to the fixed MCP cost).\n- Measurement method and all 28 metrics: [carve-harness-benchmark-results.md](./docs/guide/carve-harness-benchmark-results.md).\n\n## Credits\n\nSome additional skills (`tdd`, `caveman`, `write-a-skill`, `zoom-out`) were inspired by patterns from [mattpocock/skills](https://github.com/mattpocock/skills) (MIT) and\nrewritten into the carve format.\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fclaude-code-expert%2Fcarve-harness","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fclaude-code-expert%2Fcarve-harness","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fclaude-code-expert%2Fcarve-harness/lists"}