{"id":51070509,"url":"https://github.com/emre-t808/harness-harness","last_synced_at":"2026-06-23T10:00:52.776Z","repository":{"id":348707257,"uuid":"1198503916","full_name":"emre-t808/harness-harness","owner":"emre-t808","description":"The self-improving agentic harness for Claude Code. Coding agents harness LLMs — but who harnesses the coding agents?","archived":false,"fork":false,"pushed_at":"2026-04-02T16:32:00.000Z","size":86,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-03T01:49:35.438Z","etag":null,"topics":["agentic","ai-tools","claude-code","context-management","developer-tools","llm"],"latest_commit_sha":null,"homepage":"https://github.com/emre-t808/harness-harness","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/emre-t808.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-01T13:40:04.000Z","updated_at":"2026-04-02T16:32:04.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/emre-t808/harness-harness","commit_stats":null,"previous_names":["emre-t808/harness-harness"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/emre-t808/harness-harness","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emre-t808%2Fharness-harness","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emre-t808%2Fharness
-harness/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emre-t808%2Fharness-harness/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emre-t808%2Fharness-harness/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/emre-t808","download_url":"https://codeload.github.com/emre-t808/harness-harness/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emre-t808%2Fharness-harness/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34684686,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-23T02:00:07.161Z","response_time":65,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agentic","ai-tools","claude-code","context-management","developer-tools","llm"],"created_at":"2026-06-23T10:00:30.627Z","updated_at":"2026-06-23T10:00:52.762Z","avatar_url":"https://github.com/emre-t808.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Harness Harness\n\n**The self-improving agentic harness for Claude Code (which is a harness for Claude LLM).**\n\nCoding agents harness LLMs. 
But who harnesses these coding agents?\n\nMeet **Harness Harness** — a self-organizing harness that works alongside your existing hooks and keeps your agent **focused**, **updated**, **reminded**, **refueled**, and **learning** over time, automatically.\n\n\u003e **Disclaimer:** Harness Harness currently works with Claude Code only. Support for other coding agents is on the way. Use at your own risk (and enjoyment). All comments and suggestions are welcome.\n\n## What It Does\n\nClaude Code is a harness for the Claude LLM. Harness Harness is a harness for Claude Code. It observes what context your agent actually uses, measures its effectiveness, and continuously evolves to deliver better results — without you having to manually tune anything.\n\n- **Focused** — Intent-based routing loads only the rules that matter for each task. Frontend work gets design system rules. Backend work gets API conventions. No more dumping everything into every prompt.\n- **Updated** — Session state (objectives, decisions, blockers) persists to the filesystem. When Claude's context compacts, nothing is lost. Your agent always knows what it was doing.\n- **Reminded** — Budget-aware assembly fills four priority slots (identity, route rules, working memory, traces) within a token budget. The right rules are always present, in the right order.\n- **Refueled** — After compaction, critical context is re-injected automatically. Your agent recovers from context compression instead of losing its way.\n- **Learning** — Every rule is scored by how often Claude actually references it. Weekly analysis generates proposals to optimize your config. Rules that help get promoted. Rules that waste tokens get demoted. The harness improves itself.\n\n## How It Works\n\n```\nYou send a message\n    ↓\nSmart Assembler classifies intent (frontend? backend? 
docs?)\n    ↓\nLoads the matching route config with effectiveness-ranked rules\n    ↓\nFills budget slots: Identity → Route Rules → Working Memory → Traces\n    ↓\nInjects \u003charness-context\u003e into Claude's system prompt\n    ↓\nEvery tool call is traced (what was read, edited, referenced)\n    ↓\nSession ends → effectiveness scores calculated per rule\n    ↓\nWeekly analysis → proposals to promote/demote/rebalance\n    ↓\nYou review → approved changes update route configs\n    ↓\nNext session starts with a better harness\n```\n\n## Installation\n\n```bash\nnpm install -g harness-harness\n```\n\nOr use without installing:\n\n```bash\nnpx harness-harness init\n```\n\n**Requirements:** Node.js 20+, Python 3 (for trace capture hook), Claude Code CLI.\n\n## Quick Start\n\n```bash\n# 1. Navigate to your project\ncd my-project\n\n# 2. Initialize the harness\nharness-harness init\n\n# 3. Edit your routes with project-specific rules\n$EDITOR .harness/routes/general.md\n\n# 4. Use Claude Code normally — tracing starts automatically\n\n# 5. After a few sessions, check your dashboard\nharness-harness health\n\n# 6. 
Run analysis to generate optimization proposals\nharness-harness analyze\n```\n\n## Commands\n\n| Command | Description |\n|---------|-------------|\n| `init` | Scaffold harness into your project |\n| `health` | Show effectiveness dashboard |\n| `analyze` | Run effectiveness analysis (weekly or on-demand) |\n| `apply` | Apply approved route override proposals |\n| `cleanup` | Delete expired trace files (keeps summaries) |\n| `routes list` | List configured routes with budget breakdown |\n| `routes create \u003cname\u003e` | Create a new custom route |\n| `tail` | Stream `.harness/local/events.ndjson` — see hook activity live |\n| `explain [session]` | Reconstruct a session's hook timeline (default: most recent) |\n| `revert [event_id]` | List or roll back an autonomous change |\n\nAll mutating commands support `--dry-run` to preview changes.\n\n## Observability\n\nEvery hook invocation appends one structured NDJSON record to `.harness/local/events.ndjson`. The schema is small enough to scan by eye:\n\n```json\n{\"ts\":\"2026-05-09T07:30:01.000Z\",\"event_id\":\"evt_a3\",\"hook\":\"PostToolUse\",\"handler\":\"trace-capture.sh\",\"phase\":\"end\",\"exit_code\":0,\"session_id\":\"session-1778310466583\"}\n```\n\nPhases:\n- `start` / `end` — bracket a hook invocation; `explain` pairs them and computes duration\n- `error` — non-fatal failure; includes `step` name and `error` message (replaces silent catches)\n- `decision` — autonomous action taken; includes `decision.action`, `decision.rule`, `decision.reason`\n\n```bash\nharness-harness tail               # Follow the events log\nharness-harness explain --last     # Show last session's timeline\nharness-harness explain \u003csession\u003e  # Specific session\nharness-harness explain --last --json   # Machine-readable\n```\n\nA session whose hook crashed shows up as an `orphan` entry — a `start` record with no matching `end`. 
The log rotates at 10 MB.\n\n## Autonomous Rule Management\n\nAfter every weekly analysis, the harness can move rules between `must-load`, `load-if-budget`, and `skip` slots without human review. Four gates protect against false positives:\n\n| Gate | Promotion | Demotion |\n|------|-----------|----------|\n| Stability | rated above threshold for ≥ 3 weekly runs | — |\n| Sample size | injected in ≥ 5 sessions | ≥ 5 sessions on the same route |\n| Score | mean + 0.5σ Elo (or `weeks_above_threshold ≥ 3`) | average score = 0 |\n| Cool-down | same rule cannot move more than once per 7 days | same |\n\nEvery auto-application snapshots the affected route configs into `.harness/local/reverts/` and writes a `decision` event. To roll back any change:\n\n```bash\nharness-harness revert                    # list available reverts\nharness-harness revert auto_\u003cts\u003e_\u003crule\u003e   # restore a specific snapshot\n```\n\nTo disable autonomous rule management, set `{ \"autonomy\": { \"enabled\": false } }` in `.harness/config.json`.\n\n## Delegate Assembler\n\nProjects with their own context-assembly script can take over by setting:\n\n```json\n{ \"assembler\": { \"delegate\": \"scripts/my-assembler.js\" } }\n```\n\n`hh-assembler.js` will exec the configured script with stdin piped through and emit its output. If the delegate fails, it falls through to the harness-harness assembler. 
Useful when you'd otherwise be running two assemblers in parallel and getting two `\u003charness-context\u003e` blocks per prompt.\n\n## What Gets Created\n\n```\nyour-project/\n├── .harness/\n│   ├── config.json              # Harness configuration\n│   ├── routes/                  # Intent-based route configs\n│   │   ├── general.md           # Default fallback route\n│   │   ├── coding-frontend.md   # Frontend development\n│   │   ├── coding-backend.md    # Backend development\n│   │   └── coding-meta.md       # Tooling and harness work\n│   ├── memory/\n│   │   ├── harness-effectiveness.md  # Rule scores by route\n│   │   ├── route-overrides.md        # Pending proposals\n│   │   ├── trace-patterns.md         # Session pattern log\n│   │   └── work-status.md            # Current work state\n│   └── sessions/                # Session state directories\n└── .claude/\n    ├── hooks/                   # Auto-installed hooks\n    │   ├── hh-assembler.js      # Smart context assembler\n    │   ├── hh-trace-capture.sh  # Tool call tracer\n    │   ├── hh-session-summary.js # Session analysis\n    │   ├── hh-state-nudge.sh    # State update nudges\n    │   └── hh-assembler-fallback.sh\n    └── traces/                  # Raw trace data\n        └── {date}/\n            ├── {session}.jsonl\n            ├── {session}-manifest.json\n            └── {session}-summary.md\n```\n\n## Route Configuration\n\nRoutes are markdown files with YAML frontmatter that tell the assembler what to load for each type of task:\n\n```markdown\n---\nintent: coding:frontend\nbudget:\n  identity: 10\n  route_context: 35\n  working_memory: 10\n  traces: 10\n  reserved: 35\n---\n\n## Identity\nYour project description and key conventions.\n\n## Route Context\n### Must Load\n- UI-001: Use design tokens, not raw values\n- UI-002: All components must be accessible\n\n### Load If Budget Allows (ordered by effectiveness score)\n- docs/design-system.md (full)\n- docs/component-patterns.md (full)\n\n### 
Skip (low effectiveness for this route)\n- Backend-only rules\n```\n\n**Budget slots** divide the context window:\n- **Identity** (10-15%) — Project description, always loaded\n- **Route Context** (25-35%) — Task-specific rules and reference files\n- **Working Memory** (10-15%) — Session state, active work\n- **Traces** (10%) — Patterns from recent similar sessions\n- **Reserved** (35%) — Left for Claude's actual work\n\n## Custom Routes\n\nCreate routes for your project's specific workflows:\n\n```bash\n# Create a route for Python data science work\nharness-harness routes create coding-python\n\n# Create a route for infrastructure/DevOps\nharness-harness routes create infra\n\n# Edit the route with your rules\n$EDITOR .harness/routes/coding-python.md\n```\n\nAdd custom intent keywords in `.harness/config.json`:\n\n```json\n{\n  \"customIntents\": [\n    [\"coding:python\", [\"pandas\", \"numpy\", \"jupyter\", \"dataframe\", \"matplotlib\"]],\n    [\"infra\", [\"terraform\", \"docker\", \"k8s\", \"deploy\", \"ci\", \"pipeline\"]]\n  ]\n}\n```\n\n## Effectiveness Scoring\n\nEvery rule injected into Claude's context is scored at session end:\n\n| Score | Evidence | Meaning |\n|-------|----------|---------|\n| 0.0 | ignored | Rule was injected but never referenced |\n| 0.5 | implicit | Claude followed the rule without citing it |\n| 1.0 | referenced | Claude explicitly mentioned the rule |\n| 2.0 | prevented-mistake | Claude read an anti-pattern and avoided the mistake |\n\nOver time, these scores drive automatic optimization:\n- Rules scoring \u003e0.75 everywhere → promoted to Identity layer\n- Rules scoring \u003c0.10 in a route → proposed for demotion to Skip\n- \"Load If Budget Allows\" items → auto-reordered by score (no approval needed)\n\n## The Self-Improvement Loop\n\n```\n  Observe          Analyze          Propose          Review          Apply\n┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐\n│  Trace    │──→│  Weekly  │──→│ 
Generate │──→│  Human   │──→│  Update  │\n│  every    │   │  score   │   │ promote/ │   │ approve/ │   │  route   │\n│  tool     │   │  aggre-  │   │ demote/  │   │  reject  │   │  configs │\n│  call     │   │  gation  │   │ reorder  │   │          │   │          │\n└──────────┘   └──────────┘   └──────────┘   └──────────┘   └──────────┘\n      ↑                                                            │\n      └────────────────────────────────────────────────────────────┘\n                        Next session uses improved harness\n```\n\n**Safety guarantees:**\n- Rules with \"prevented-mistake\" evidence are never proposed for demotion\n- Outside autonomous rule management, only reordering is auto-applied; promotions/demotions require your approval\n- All changes are committed to git with full history\n- `--dry-run` flag on every mutating command\n\n## File-to-Rules Mapping\n\nMap your rule files to their IDs for effectiveness-based ordering:\n\n```json\n{\n  \"fileToRules\": {\n    \"design-system.md\": [\"UI-001\", \"UI-002\", \"UI-003\"],\n    \"api-conventions.md\": [\"API-001\", \"API-002\"],\n    \"testing-standards.md\": [\"TEST-001\", \"TEST-002\"]\n  }\n}\n```\n\nThis tells the assembler which rule IDs correspond to which files, enabling accurate effectiveness scoring and budget-aware loading.\n\n## FAQ\n\n**Does this slow down Claude Code?**\nNo. The assembler runs in \u003c100ms. Trace capture is async and adds \u003c20ms per tool call. Session summaries run at session end only.\n\n**Does this read my code or messages?**\nTraces capture tool names, file paths, output sizes, and (as of v0.4.0) the first 2KB of content written by `Edit` / `Write` tool calls (base64-encoded, stored as `response_snippet`). The snippet powers semantic verification of behavioral rules via the `content_includes` signal. If you prefer the v0.3.x behavior, set `trace.captureResponseSnippets: false` in `.harness/config.json`. 
Summaries still analyze which rules were referenced, not what you discussed.\n\n**Can I use this with other AI coding tools?**\nCurrently designed for Claude Code's hook system. Support for other coding agents is planned. The core libraries (intent classification, effectiveness scoring, budget assembly) are generic and could be adapted for any tool with hook/plugin support.\n\n**What happens if I delete .harness/?**\nClaude Code continues to work normally. You lose your route configs, effectiveness history, and session state. Traces in .claude/traces/ are unaffected.\n\n**Does this work with my existing hooks?**\nYes. Harness Harness is designed to work *alongside* your existing hooks, not replace them. All HH hooks are prefixed `hh-` to avoid collisions. The `init` command detects conflicts and offers three modes: `--merge` (install alongside), `--replace` (backup and replace), or `--trace-only` (just observe, don't inject context).\n\n## Team Usage\n\n### Initial Setup (team lead, once per repo)\n\n```bash\nharness-harness init\ngit add .harness/routes/ .harness/config.json .harness/memory/\ngit commit -m \"feat: add harness-harness config\"\n```\n\n### Developer Onboarding (each developer, on clone)\n\n```bash\ngit clone \u003crepo\u003e\nnpm install        # or: npm i -g harness-harness\nharness-harness init --local-only\n```\n\n### What to Commit vs Gitignore\n\n| Commit (shared team config) | Gitignore (per-developer) |\n|------------------------------|---------------------------|\n| `.harness/routes/` | `.harness/local/` |\n| `.harness/config.json` | `.claude/hooks/hh-*` |\n| `.harness/memory/harness-effectiveness.md` | `.claude/settings.json` |\n| `.harness/memory/route-overrides.md` | `.claude/traces/` |\n\n### Developer Overrides\n\nCreate files in `.harness/local/` to override team config without affecting others:\n\n```bash\n# Override a route's budget and rules\ncp .harness/routes/coding-backend.md .harness/local/routes/coding-backend.md\n# Edit 
.harness/local/routes/coding-backend.md with your preferences\n\n# Override project config (custom intents, fileToRules)\necho '{\"customIntents\": [[\"coding:python\", [\"pandas\"]]]}' \u003e .harness/local/config.json\n```\n\nDeveloper overrides are gitignored. No PR needed, no team impact.\n\n### Sharing Improvements\n\n1. Run `harness-harness analyze` locally\n2. Review proposals: `cat .harness/local/memory/route-overrides.md`\n3. If a proposal benefits the team, edit the shared route in `.harness/routes/`\n4. Open a PR for team review\n\n## License\n\nBusiness Source License 1.1 — Free for personal and non-commercial use. Commercial use requires a license. Converts to Apache 2.0 on April 1, 2029.\n\nSee [LICENSE](./LICENSE) for full terms.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Femre-t808%2Fharness-harness","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Femre-t808%2Fharness-harness","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Femre-t808%2Fharness-harness/lists"}