{"id":51172192,"url":"https://github.com/noamsto/agent-smith","last_synced_at":"2026-06-27T01:09:59.847Z","repository":{"id":361835118,"uuid":"1256037490","full_name":"noamsto/agent-smith","owner":"noamsto","description":"An Agent that propagates improvements into other agents — mines session history + audits freshness, then PRs fixes.","archived":false,"fork":false,"pushed_at":"2026-06-22T06:49:33.000Z","size":466,"stargazers_count":3,"open_issues_count":2,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-22T07:23:51.985Z","etag":null,"topics":["ai-agents","claude-code","developer-tools","duckdb","llm","meta-agent","nix","self-improving"],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/noamsto.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-06-01T12:01:57.000Z","updated_at":"2026-06-22T06:34:50.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/noamsto/agent-smith","commit_stats":null,"previous_names":["noamsto/agent-smith"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/noamsto/agent-smith","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/noamsto%2Fagent-smith","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/noamsto%2Fagent-smith/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/noam
sto%2Fagent-smith/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/noamsto%2Fagent-smith/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/noamsto","download_url":"https://codeload.github.com/noamsto/agent-smith/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/noamsto%2Fagent-smith/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34838089,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-26T02:00:06.560Z","response_time":106,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agents","claude-code","developer-tools","duckdb","llm","meta-agent","nix","self-improving"],"created_at":"2026-06-27T01:09:59.622Z","updated_at":"2026-06-27T01:09:59.807Z","avatar_url":"https://github.com/noamsto.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\n# 🕶️ agent-smith\n\n**An Agent whose only purpose is propagating improvements into other agents.**\n\n*It patrols the agents-matrix, finds the glitches where reality stuttered, and rewrites the rules so the déjà vu stops 
happening.*\n\n[![status](https://img.shields.io/badge/status-phase%201%20live-00ff41?labelColor=1a1a1a)](docs/HANDOFF.md)\n[![engine](https://img.shields.io/badge/engine-DuckDB%20%2B%20Opus-00ff41?labelColor=1a1a1a)](#the-two-ways-it-sees)\n[![built for](https://img.shields.io/badge/built%20for-Claude%20Code-8A2BE2?labelColor=1a1a1a)](https://docs.claude.com/en/docs/claude-code)\n[![packaged with](https://img.shields.io/badge/packaged%20with-Nix-5277C3?logo=nixos\u0026logoColor=white\u0026labelColor=1a1a1a)](https://nixos.org)\n[![license](https://img.shields.io/badge/license-MIT-00ff41?labelColor=1a1a1a)](LICENSE)\n\n\u003c/div\u003e\n\n---\n\n\u003e *\"Never send a human to do a machine's job.\"* — Agent Smith\n\u003e\n\u003e You keep telling your agents the same thing. *Read skeletons, not whole files.\n\u003e That flag was renamed. Don't retry the failing command.* They nod, and three\n\u003e sessions later they do it again. That repetition **is** the glitch in the\n\u003e matrix — and agent-smith exists to patch it at the source: the instructions\n\u003e themselves.\n\nagent-smith reads how your Claude Code agents *actually behaved* across hundreds\nof past sessions, checks whether what their instructions *claim* is still true in\nthe world, and opens pull requests that fix the agent — not the symptom.\n\nIt edits the things that steer agents: subagent definitions, skills, `CLAUDE.md`\nfiles, slash commands. Then it writes down **why**, and later checks whether the\nglitch actually stopped recurring.\n\n---\n\n## 🕶️ The two ways it sees\n\nagent-smith runs two intelligence tracks into one mind. 
One looks *backward* at\nwhat happened; the other looks *outward* at what's still true.\n\n```mermaid\nflowchart TD\n    subgraph A[\"🔍 TRACK A · corpus mining — what did the agents do wrong?\"]\n        direction LR\n        sh[\"session history\u003cbr/\u003e~hundreds of .jsonl\"] --\u003e|\"duckdb · jq · cheap\"| ext[\"extractor\"] --\u003e inc([\"incidents\"])\n    end\n\n    subgraph B[\"🌐 TRACK B · freshness audit (planned) — is what they claim still true?\"]\n        direction LR\n        art[\"the artifacts\u003cbr/\u003etools · flags · APIs · URLs\"] --\u003e claim[\"claim\u003cbr/\u003eextractor\"] --\u003e|\"web · docs · context7\"| exp[\"explorers\"] --\u003e ver([\"claim verdicts\"])\n    end\n\n    inc --\u003e AN\n    ver --\u003e AN\n\n    AN{{\"🧠 ANALYST · Opus\u003cbr/\u003eclusters glitches · diagnoses the fix · writes the reason\"}}\n    AN --\u003e|\"proposals + reason logs\"| AP[\"🤖 APPLIER\u003cbr/\u003efinds the repo that owns the artifact → opens a PR there\"]\n    AP --\u003e|\"after the PR merges\"| DV[\"🕶️ DÉJÀ-VU\u003cbr/\u003ere-mines later sessions — did the glitch stop?\"]\n\n    classDef track fill:#0d1117,stroke:#30363d,color:#8b949e;\n    classDef node fill:#0d1117,stroke:#00ff41,color:#c9d1d9;\n    classDef brain fill:#0d2818,stroke:#00ff41,color:#00ff41;\n    class sh,ext,inc,art,claim,exp,ver,AP,DV node;\n    class AN brain;\n    class A,B track;\n```\n\n**Track A — corpus mining.** Pure SQL (DuckDB) over your `.jsonl` session logs.\nNo model, no token cost, runs over everything. 
It hunts four kinds of glitch:\n\n| Signal | The tell |\n|--------|----------|\n| **tool_error** | a tool call came back as an error |\n| **retry** | the identical call re-issued within a few turns — and the earlier attempt **failed** (intentional successful re-runs don't count) |\n| **user_correction** | you said *\"no\"*, *\"actually…\"*, *\"revert that\"* — or interrupted |\n| **inefficiency** | unbounded whole-file reads where a skeleton would do |\n\n*Repeated guidance* — the same glitch across **≥3 distinct sessions** — is the\nanalyst's clustering threshold, applied on top of every signal: a pattern, not a\nfluke. Big clusters are fed to the Oracle as a **session-stratified sample**\n(breadth across sessions before depth) with truthful totals, so even a\n2,000-incident cluster fits in one diagnosis.\n\n**Track B — freshness audit** *(planned)*. The backward look can't catch a rule that was right\nwhen written and rotted since. So agent-smith reads the artifacts, extracts every\nexternal claim — a tool name, a CLI flag, a library API, a URL, a \"best practice\" —\nand **fans out one explorer per claim** to check it against the live world\n(`context7`, web search, changelogs). `changed` and `dead` claims become fixes.\n\nThe design bet: **the extractor is dumb and cheap, the analyst is smart and narrow.**\nCost scales with the number of glitches, not the size of your history.\n\n---\n\n## What it actually changes\n\nEvery fix is one of six moves. 
The interesting one is the fifth.\n\n| Fix | When | What it does |\n|-----|------|--------------|\n| **add** | no guidance exists | write the missing rule |\n| **strengthen** | the rule exists but gets ignored | raise it, sharpen it, make it imperative |\n| **fix-stale** | a flag/API/file the rule names has changed | correct the reference |\n| **remove** | the guidance contradicts itself or causes the glitch | cut it |\n| **escalate-out-of-instructions** | a *prose* rule keeps failing no matter how loud | stop asking nicely — propose a **hook** |\n| **skip** | the current harness/system prompt already enforces it | decline — redundant instructions are their own glitch source |\n\nThat fifth move is the whole philosophy: when a rule is reliably ignored, the\nanswer isn't a louder rule, it's **defining the error out of existence** —\nconverting a suggestion the model can rationalize past into deterministic\nenforcement the harness runs. agent-smith is allowed to propose that.\n\nNothing lands silently. Every change ships as a **draft pull request** against\nwhichever repo owns the artifact — gated by a deterministic **preflight** (title\nlint, exactly one commit over `origin/\u003cbase\u003e`, no files beyond what the editor\nreported) and a subagent **verify** pass — with a **reason log** entry: the\ndiagnosis, the evidence, the expected effect. 
You merge; nothing merges itself.\nLater, `déjà-vu` re-mines and records whether the glitch rate actually dropped.\n*Cause, effect, receipts.*\n\n---\n\n## 📦 Install\n\nagent-smith is a **Claude Code plugin** (this repo doubles as its own\nsingle-plugin marketplace):\n\n```\n/plugin marketplace add noamsto/agent-smith\n/plugin install agent-smith@agent-smith\n```\n\nThen run it:\n\n```\n/agent-smith:run          # the whole loop, autonomously → draft PRs\n/agent-smith:mine         # extractor → clusters\n/agent-smith:propose      # Oracle per cluster → proposals (review-only)\n/agent-smith:apply [\u003cid\u003e] # editor → verify → draft PR\n/agent-smith:status       # where things stand\n```\n\nFirst run bootstraps everything: the `extractor`/`analyst`/`applier` binaries\n(and the `duckdb` CLI, if you don't have one) download automatically for your\nOS/arch into `~/.cache/agent-smith/bin`. The only assumptions are `git` and an\nauthenticated `gh`. Binaries already on PATH — e.g. nix-installed — are used\nas-is, never downloaded over.\n\n\u003cdetails\u003e\n\u003csummary\u003eDeclarative install (settings.json)\u003c/summary\u003e\n\n```json\n{\n  \"extraKnownMarketplaces\": {\n    \"agent-smith\": { \"source\": { \"source\": \"github\", \"repo\": \"noamsto/agent-smith\" } }\n  },\n  \"enabledPlugins\": { \"agent-smith@agent-smith\": true }\n}\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003eNix / Home Manager (engine on PATH)\u003c/summary\u003e\n\nAdd the flake as an input and import the Home Manager module; it puts the\n`extractor`/`analyst`/`applier` binaries on PATH with their deps (duckdb, git,\ngh) already wrapped, so the plugin never has to download them:\n\n```nix\n# flake.nix\ninputs.agent-smith.url = \"github:noamsto/agent-smith\";\n\n# home.nix (or any Home Manager module)\nimports = [ inputs.agent-smith.homeManagerModules.default ];\nprograms.agent-smith.enable = true;\n```\n\nEnable the plugin itself via the `settings.json` 
keys above.\n\n\u003c/details\u003e\n\n## 🔧 Develop\n\n```bash\nnix develop                            # devshell: go, duckdb, jq, git, gh, goreleaser\ngo test ./...                          # full suite\nnix build .#default                    # → result/bin/{extractor,analyst,applier}\ngoreleaser release --snapshot --clean  # local release dry-run → dist/\n```\n\n---\n\n## Status\n\n\u003e *\"It is inevitable.\"*\n\n**Phase 1 is live.** The loop — extractor → analyst (Oracle) → applier — is built,\ntested, and proven end-to-end on a real corpus. Fittingly, the acceptance-bar\nglitch itself (agents ignoring the skeleton-first reading rule: 147 incidents\nacross 87 sessions) came out the other side as the **first real pull request** —\nproposing a PreToolUse hook, the `escalate-out-of-instructions` move, exactly as\ndesigned.\n\n📄 **Design:** [`docs/specs/2026-06-01-agent-smith-design.md`](docs/specs/2026-06-01-agent-smith-design.md) · **Working state:** [`docs/HANDOFF.md`](docs/HANDOFF.md)\n\n**Roadmap**\n\n- ✅ **Phase 1 — MVP.** Track A → analyst → draft PR + reason logs; the\n  `/agent-smith` plugin; pre-PR preflight. Acceptance bar met — and shipped as a\n  real PR.\n- ⏭ **Next.** Declarative install wiring ([#3](https://github.com/noamsto/agent-smith/issues/3)) ·\n  HTML status dashboard ([#2](https://github.com/noamsto/agent-smith/issues/2)) ·\n  **Track B** freshness audit.\n- 🔄 **Phase 2 — the loop.** `déjà-vu` trend validation; scheduled runs;\n  auto-commit for self-owned artifacts. The self-improving flywheel.\n- 🪝 **Phase 3 — the hook.** Inline capture so future mining gets even cheaper.\n\n---\n\n\u003cdiv align=\"center\"\u003e\n\n🕶️\n\n*There is no spoon. 
There is only the diff.*\n\n\u003c/div\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnoamsto%2Fagent-smith","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnoamsto%2Fagent-smith","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnoamsto%2Fagent-smith/lists"}