{"id":48085187,"url":"https://github.com/john-wilmes/claude-agentic-coding-playbook","last_synced_at":"2026-04-20T01:00:43.159Z","repository":{"id":339889127,"uuid":"1163661407","full_name":"john-wilmes/claude-agentic-coding-playbook","owner":"john-wilmes","description":"Evidence-based practices for LLM-assisted development. Install scripts, Claude Code configuration, and a research-backed best practices guide with 34 verified citations.","archived":false,"fork":false,"pushed_at":"2026-04-06T00:57:24.000Z","size":1641,"stargazers_count":0,"open_issues_count":1,"forks_count":1,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-04-06T02:31:35.187Z","etag":null,"topics":["agentic-development","ai-coding","best-practices","claude-code","code-quality","cursor","developer-tools","prompt-engineering","security","testing"],"latest_commit_sha":null,"homepage":null,"language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/john-wilmes.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-22T00:22:12.000Z","updated_at":"2026-04-06T00:57:27.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/john-wilmes/claude-agentic-coding-playbook","commit_stats":null,"previous_names":["john-wilmes/agentic-coding-playbook","john-wilmes/claude-agentic-coding-playbook"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:g
ithub/john-wilmes/claude-agentic-coding-playbook","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/john-wilmes%2Fclaude-agentic-coding-playbook","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/john-wilmes%2Fclaude-agentic-coding-playbook/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/john-wilmes%2Fclaude-agentic-coding-playbook/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/john-wilmes%2Fclaude-agentic-coding-playbook/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/john-wilmes","download_url":"https://codeload.github.com/john-wilmes/claude-agentic-coding-playbook/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/john-wilmes%2Fclaude-agentic-coding-playbook/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32028547,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-20T00:18:06.643Z","status":"ssl_error","status_checked_at":"2026-04-20T00:17:31.068Z","response_time":55,"last_error":"SSL_read: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agentic-development","ai-coding","best-practices","claude-code","code-quality","cursor","developer-tools","prompt-engineering","security","testing"],"created_at":"2026-04-04T15:14:13.314Z","updated_at":"2026-04-20T01:00:43.146Z","avatar_url":"https://github.com/john-wilmes.png","language":"JavaScript","funding_links":[],"categories":["Plugins"],"sub_categories":["All Plugins"],"readme":"[![Test Install Script](https://github.com/john-wilmes/claude-agentic-coding-playbook/actions/workflows/test-install.yml/badge.svg)](https://github.com/john-wilmes/claude-agentic-coding-playbook/actions/workflows/test-install.yml)\n\n# Agentic Coding Playbook\n\nEvidence-based practices for LLM-assisted software development. Built for [Claude Code](https://claude.com/claude-code). The principles in [best-practices.md](docs/best-practices.md) are conceptually portable, but all hooks, skills, and scripts target Claude Code specifically.\n\n## Why This Exists\n\nAI-assisted code contains 1.7x more issues than human-written code and introduces 10x more security findings in enterprise settings. Developer productivity gains range from -19% (experienced OSS developers) to +55% (controlled tasks) depending on task type, tool, and developer experience. 
The practices in this playbook are designed to capture the upside while mitigating the documented risks.\n\nKey findings from the research:\n\n| Metric | Value | Source |\n|--------|-------|--------|\n| AI code issues vs human code | 1.7x more | CodeRabbit (470 PRs) |\n| Security findings in AI-assisted teams | 10x increase | Apiiro (Fortune 50) |\n| Prompt injection success rate | 94% | PMC controlled study |\n| AI code review defect detection | 44-82% | Greptile, Macroscope benchmarks |\n| Fresh session vs exhausted session cost | ~10x cheaper | Anthropic |\n| Model routing savings (Haiku vs Opus) | 5-20x | Anthropic pricing |\n| Prompt cache hit savings | 90% | Anthropic |\n| Teams with AI review seeing quality gains | 81% vs 55% | Qodo |\n\nFull details with citations: [docs/best-practices.md](docs/best-practices.md)\n\n## How This Works\n\nMost Claude Code setups rely on CLAUDE.md instructions alone. Instructions are advisory — compliance ranges from ~50-90% in our testing (published research reports lower rates for complex instruction sets). This playbook takes a different approach:\n\n- **Designed for bypass mode.** The biggest productivity gains come from running Claude Code autonomously (`--dangerously-skip-permissions` or `bypassPermissions` in settings). Without guardrails, bypass mode means no safety net — destructive commands, runaway file creation, and context exhaustion go unchecked. With this playbook's hooks, you get deterministic enforcement of safety rules even when the agent has full permissions: prompt injection is blocked, context limits are enforced, destructive git operations are caught, and PII is redacted — all without permission prompts interrupting flow.\n- **Hooks enforce rules deterministically.** 35+ hooks — each fires on the relevant tool calls (not all on every call) — catching context exhaustion, prompt injection, sycophantic compliance patterns, and file bloat before they cause problems. 
Hooks achieve near-100% enforcement for deny rules and \u003e95% for advisory rules — where instructions alone cannot.\n- **Structured logging makes agent behavior observable.** Every hook decision is logged to JSONL. Analysis tools (`analyze-logs.js`) report context usage, stuck loops, model routing, and hook effectiveness per session — so you can measure what's working and what isn't.\n- **Practices are validated by running them.** The playbook is [dogfooded](docs/dogfooding.md) against real codebases with a 100-task framework. Bugs found during dogfooding (context guard effectiveness, task queue edge cases, implicit completion detection) feed directly back into the hooks and scripts.\n- **Zero npm dependencies.** All hooks use Node.js stdlib only. No `node_modules`, no build step, no supply chain risk.\n\n## Try a Single Hook\n\nWant to test the waters before a full install? Copy `context-guard.js` (context window monitoring) and its logger dependency:\n\n```bash\nmkdir -p ~/.claude/hooks\ncurl -fsSL https://raw.githubusercontent.com/john-wilmes/claude-agentic-coding-playbook/master/templates/hooks/context-guard.js -o ~/.claude/hooks/context-guard.js\ncurl -fsSL https://raw.githubusercontent.com/john-wilmes/claude-agentic-coding-playbook/master/templates/hooks/log.js -o ~/.claude/hooks/log.js\nchmod +x ~/.claude/hooks/context-guard.js ~/.claude/hooks/log.js\n```\n\nThen add to `~/.claude/settings.json`:\n\n```json\n{\n  \"hooks\": {\n    \"PostToolUse\": [\n      { \"matcher\": \"\", \"hooks\": [{ \"type\": \"command\", \"command\": \"node ~/.claude/hooks/context-guard.js\" }] }\n    ]\n  }\n}\n```\n\nThis gives you context window warnings after every tool call. 
The full install also registers `context-guard.js` on `SessionStart` and wires up the remaining hooks.\n\n## Quick Install\n\n```bash\ngit clone https://github.com/john-wilmes/claude-agentic-coding-playbook.git\ncd claude-agentic-coding-playbook\nchmod +x install.sh\n./install.sh\n```\n\n### Prerequisites\n\n- **Bash on Linux or macOS** (Windows requires [WSL](https://learn.microsoft.com/en-us/windows/wsl/install) — native CMD/PowerShell is not supported. The install script, hooks, and test suite all assume a POSIX environment.)\n- **Node.js 18+** (for hooks and test scripts; Node.js 22+ is required for the knowledge system — `knowledge-db.js` uses `node:sqlite`)\n- **git** (for version control and install script)\n\n### Install Options\n\n| Flag | Description |\n|------|-------------|\n| `--root \u003cpath\u003e` | Controls where `research/` is created (default: `~/Documents`). Config always goes to `~/.claude/` |\n| `--knowledge-repo \u003curl\u003e` | Git URL for a shared knowledge repository (cloned to `~/.claude/knowledge/`) |\n| `--wizard` | Interactive merge with your existing configuration |\n| `--force` | Overwrite existing files without prompting |\n| `--dry-run` | Preview what would be installed |\n| `--extras` | Install optional extras (e.g., SWE-Bench scripts, fleet indexer) |\n| `--uninstall` | Remove all installed playbook files from `~/.claude/` |\n\n### What Gets Installed\n\n```\n~/.claude/                             # Playbook configuration (always here)\n  CLAUDE.md                            #   Combined dev + research workflows\n  skills/\n    checkpoint/SKILL.md               #   /checkpoint - save state, commit, end session\n    create-project/SKILL.md           #   /create-project - scaffold a new project\n    investigate/SKILL.md              #   /investigate - full investigation lifecycle\n    learn/SKILL.md                    #   /learn - capture knowledge entries\n    playbook/SKILL.md                 #   /playbook - analyze and 
improve config\n    promote/SKILL.md                  #   /promote - promote lessons to global scope\n  hooks/                               #   35+ hooks — safety, quality, resource management (see docs/hooks.md)\n  rules/\n    hooks.md                          #   Hook development conventions (globs: templates/hooks/**)\n    testing.md                        #   Test conventions (globs: tests/**)\n    codebase-reference.md             #   Template: org-specific repo ownership (populate for your team)\n    operations.md                     #   Template: MCP tool/data access policy (populate for your org)\n  templates/\n    project-CLAUDE.md                 #   Template for project-level CLAUDE.md\n    knowledge/\n      pre-commit                      #   Git pre-commit hook (blocks secrets, large files)\n\n\u003cinstall-root\u003e/                        # e.g. ~/Documents (set with --root)\n  research/                            # Research/investigation workspace\n  project-a/                           # Dev project (created with /create-project)\n  project-b/                           # Dev project\n```\n\nThe installer **will not overwrite** existing skills or configuration without prompting. Use `--wizard` to analyze your current setup and merge intelligently.\n\n## What You Get\n\n### Skills\n\n**Development:**\n- **`/checkpoint`** -- Save all work, update memory with Current Work section, commit, push, and run a devil's advocate check. Designed for clean session handoffs.\n- **`/create-project`** -- Scaffold a new project with git, .gitignore, CLAUDE.md, AGENTS.md, GitHub repo. Projects are created as siblings to `.claude/`.\n- **`/playbook`** -- Analyze your CLAUDE.md configuration and suggest improvements. 
Modes: `global`, `project`, `check`.\n- **`/learn`** -- Capture a non-obvious lesson as a structured knowledge entry for future sessions.\n- **`/promote`** -- Promote a project-level lesson to global scope.\n\n**Research:**\n- **`/investigate`** -- Full investigation lifecycle with multi-agent evidence collection, synthesis, tagging, and PHI sanitization. Subcommands: `new`, `run`, `collect`, `synthesize`, `close`, `status`, `list`, `search`.\n\n### Hooks\n\nCLAUDE.md rules are advisory (~50-90% compliance in our testing; published research reports lower rates for complex instruction sets). Hooks are deterministic: they run scripts at specific points in the agent's workflow, with near-100% enforcement for deny rules and \u003e95% for advisory rules. See [Hook Reference](docs/hooks.md) for the full guide including configuration, customization, and the \"why hooks\" philosophy.\n\n**Session lifecycle:**\n- **Session start** -- Injects memory, knowledge entries, and git context. Warns when MEMORY.md or CLAUDE.md exceed size thresholds.\n- **Session end** -- Auto-commits memory changes, detects retrieval misses, archives stale knowledge entries.\n- **Pre-compact** -- Saves context state before `/compact` runs, preserving critical information.\n- **Post-compact** -- Re-injects memory and task context after auto-compaction.\n\n**Safety:**\n- **Prompt injection guard** -- Blocks high-confidence injection patterns in Bash commands (designed for zero false positives).\n- **Sanitize guard** -- Runtime PII/PHI detection and redaction. Scans tool output (PostToolUse) and blocks writes containing PII (PreToolUse). 
Opt-in per repo via `.claude/sanitize.yaml`.\n- **Skill guard** -- Validates skill invocations and prevents unauthorized skill execution.\n- **MCP safety** -- `mcp-data-guard` blocks PHI field access in MCP tool calls; `mcp-query-interceptor` intercepts and sanitizes MCP queries before execution; `mcp-result-advisor` scans MCP results for PHI leakage and advises on safe handling.\n\n**Quality:**\n- **Post-tool verify** -- Auto-runs project tests after Edit/Write on code files with debouncing.\n- **PR review guard** -- Enforces code review before merging. Blocks `gh pr merge` until CodeRabbit has reviewed the PR.\n- **Context guard** -- Dual-mode context window monitoring. Thresholds: 35% (suggest subagents), 42% (persist unsaved findings), 57% (warn user), 60% (advisory block, writes context-high flag), 75% (failsafe sentinel for claude-loop restart).\n- **Stuck detector** -- Detects and breaks agent loops when the same action repeats.\n- **Sycophancy detector** -- Detects behavioral patterns indicating sycophancy — rubber-stamping, compliance without investigation, shallow reviews. Warns via PostToolUse advisory.\n- **Evidence/reasoning** -- `evidence-gate` blocks research synthesis steps if supporting evidence is insufficient; `rejection-advisor` surfaces alternative interpretations when the agent accepts a framing without challenge.\n\n**Resource management:**\n- **Model router** -- Auto-selects Haiku/Sonnet/Opus for Task and Agent tool calls based on prompt signals. 
Warns when allowed-tools exceeds 10.\n- **Filesize guard** -- Warns when reading or writing large files that waste context.\n- **Bloat guard** -- Detects runaway file creation and flags unexpected project growth.\n- **Markdown size guard** -- Warns when CLAUDE.md or MEMORY.md approach size thresholds.\n- **Read-once dedup** -- Blocks re-reads of unchanged files (38-40% context savings, observed in author testing).\n- **Dedicated tool guard** -- Warns when a general-purpose tool (Bash) is used where a dedicated tool (Glob, Grep, Read) would be more efficient.\n- **Memory guards** -- `memory-accumulation-guard` warns when MEMORY.md grows faster than expected; `memory-index-guard` enforces structural conventions on the memory index to keep it navigable.\n- **Checkpoint discipline** -- Enforces periodic checkpointing; warns when a long session has no checkpoint.\n- **Protect main** -- Blocks direct commits and force-pushes to main/master.\n\n**Enforcement:**\n- **Checkpoint gate** -- Enforces checkpoint-exit and context-critical boundaries. Blocks sessions from continuing past failsafe thresholds without checkpointing.\n- **Multi-image guard** -- Blocks reading 2+ image files per session, guiding to subagent delegation for bulk image work.\n- **Orphan file guard** -- Blocks creating new files not referenced by any existing file. Prevents file bloat.\n- **MCP server guard** -- Advisory warning when `enableAllProjectMcpServers: true` in global settings. 
Warns once per session.\n\n**Knowledge:**\n- **Knowledge capture** -- Extracts reusable lessons from session activity for the knowledge database.\n- **Knowledge database** -- Retrieves relevant knowledge entries via BM25 search at session start.\n\n**Subagent and failure handling:**\n- **Subagent context** -- Injects project context and loop warnings into spawned subagents at SubagentStart.\n- **Subagent recovery** -- Detects truncated subagent output after Task tool calls and writes recovery state.\n- **Tool failure logger** -- Logs tool errors to `~/.claude/logs/tool-failures.jsonl` on PostToolUseFailure.\n- **Task completed gate** -- Quality gate on TaskCompleted: blocks teammate task completion if tests fail.\n- **Teammate idle** -- Nudges idle teammates to check their TaskList.\n\nUtility modules (`log.js`, `bm25.js`, `pii-detector.js`, `knowledge-capture.js`, `knowledge-db.js`) are shared libraries used by the hooks above. See [Hook Reference](docs/hooks.md) for details on every hook.\n\n### CLAUDE.md Rules\n\nThe `rules/` directory contains four files installed to `~/.claude/rules/`. `hooks.md` and `testing.md` are pre-populated conventions for this playbook. `codebase-reference.md` and `operations.md` are starter templates for org-specific customization: populate `codebase-reference.md` with your repo ownership map and key contacts, and `operations.md` with your MCP tool access policy and approved data sources. 
These files are included via Claude Code's glob-based rules system.\n\nThe combined CLAUDE.md includes:\n\n- **Dual workflow** -- Development (Explore-Plan-Code-Verify-Commit) and Research (Question-Collect-Synthesize-Close), auto-selected by working directory\n- **Reasoning standards** -- evidence-based debugging, two-hypothesis minimum, no cargo-culting\n- **Model routing** -- use Haiku for exploration, Sonnet for implementation, Opus for planning\n- **Testing as feedback loop** -- verify continuously, not just at the end\n- **Code review enforcement** -- review staged changes before every commit\n- **Evidence discipline** -- numbered observations with source, relevance, and 3-line max\n- **PII/PHI protection** -- Regex-based PII auto-sanitization for investigation files\n- **Security baseline** -- sandbox mode, credential protection, MCP server restrictions\n- **Efficiency rules** -- parallel tool calls, no re-reads, two-attempt limit\n- **Memory discipline** -- Current Work tracking for session continuity\n\n## Testing\n\nRun the full test suite:\n\n```bash\n# Hook tests (Node.js)\nfor t in tests/hooks/*.test.js; do node \"$t\" || exit 1; done\n\n# Script tests (Bash and Node.js)\nfor t in tests/scripts/*.test.sh; do bash \"$t\" || exit 1; done\nfor t in tests/scripts/*.test.js; do node \"$t\" || exit 1; done\n\n# Skills tests\nfor t in tests/skills/*.test.sh; do bash \"$t\" || exit 1; done \u0026\u0026 for t in tests/skills/*.test.js; do node \"$t\" || exit 1; done\n\n# Fleet tests (Node.js)\nfor t in tests/fleet/*.test.js; do node \"$t\" || exit 1; done\n\n# Investigation tests (Node.js)\nfor t in tests/investigate/*.test.js; do node \"$t\" || exit 1; done\n\n# Or all at once\nfor t in tests/hooks/*.test.js; do node \"$t\" || exit 1; done \u0026\u0026 for t in tests/fleet/*.test.js; do node \"$t\" || exit 1; done \u0026\u0026 for t in tests/scripts/*.test.sh; do bash \"$t\" || exit 1; done \u0026\u0026 for t in tests/scripts/*.test.js; do node \"$t\" || exit 
1; done \u0026\u0026 for t in tests/skills/*.test.sh; do bash \"$t\" || exit 1; done \u0026\u0026 for t in tests/skills/*.test.js; do node \"$t\" || exit 1; done \u0026\u0026 for t in tests/investigate/*.test.js; do node \"$t\" || exit 1; done\n```\n\n## CLI Scripts\n\nStandalone tools installed to `~/.local/bin/`:\n\n| Script | Description |\n|--------|-------------|\n| `q` | Lightweight CLI for direct Anthropic API Q\u0026A. Uses Haiku by default for fast, cheap answers. |\n| `qa` | File-capable agentic CLI using the Anthropic API with tool use (bash + text editor). No hooks or MCP. |\n| `claude-loop` | Auto-restart wrapper for Claude Code sessions. Supports `--task-queue`, `--status-json`, `--log-file`, and `--report`. |\n| `knowledge-consolidate` | Deduplicate and consolidate knowledge entries using the claude CLI for pairwise overlap analysis. |\n| `repo-fleet-index` | CLI wrapper for the repo fleet indexer and MCP server. Builds manifests and a digest across your repos. |\n| `sanitize.sh` | Redact PII/PHI from files using regex patterns (SSN, email, phone, credit card). Supports `--check` mode (detect without modifying) and falls back from Presidio to regex if Presidio is unavailable. |\n\n## Log Analysis\n\nHooks log decisions to `~/.claude/logs/YYYY-MM-DD.jsonl`. 
Analyze with:\n\n```bash\n# Full report\nnode scripts/analyze-logs.js\n\n# Filter by date range\nnode scripts/analyze-logs.js --since 2026-03-01\n\n# Filter by session or hook\nnode scripts/analyze-logs.js --session abc123 --hook context-guard\n\n# Session timeline — merges hook log events with transcript tool calls\nnode scripts/analyze-logs.js --timeline SESSION_ID --project-dir /path/to/project\n\n# Cross-session aggregate metrics\nnode scripts/analyze-logs.js --aggregate\n```\n\nOutput includes context-guard progression per session, stuck-detector triggers, model-router distribution, and prompt-injection blocks.\n\n`--timeline SESSION_ID` requires `--project-dir PATH` to locate transcript files. The timeline shows tool calls (with `[ERROR]` markers), hook interventions (`\u003c!\u003e` warn, `!!!` block/escalate, `---` info), context-guard percentages, and a summary line.\n\n`--aggregate` reports cross-session metrics: session count, context usage stats (avg/median/max), hook fire rates per session, and session health rates (stuck-detector triggers, sycophancy warnings, model routing distribution).\n\n## Existing Users\n\nIf you already have a `~/.claude/CLAUDE.md` and custom skills:\n\n```bash\n# Preview what would change\n./install.sh --dry-run\n\n# Interactive merge -- backs up your files, shows conflicts, lets you choose\n./install.sh --wizard\n```\n\nThe wizard will:\n1. Detect your existing CLAUDE.md and show its sections\n2. Offer to backup + replace, skip, or abort\n3. 
Skip skills that already exist (e.g., if you have your own `/checkpoint`)\n\n## Documentation\n\n- **[Best Practices Guide](docs/best-practices.md)** -- the full evidence-backed guide with 58 citations (57 with direct links, 1 via indirect reference)\n- **[Project CLAUDE.md Template](templates/project-CLAUDE.md)** -- starting point for per-project instructions\n- **[Dogfooding Guide](docs/dogfooding.md)** -- how to design and run a sustained dogfood campaign against real codebases, with a 100-task worked example\n- **[Dogfood Playbook](docs/dogfood-playbook.md)** -- manual interactive testing checklist for verifying the full user experience\n\n### Case Studies\n\n- **[Agent Failure Analysis](docs/case-study-agent-failure.md)** -- detailed post-mortem of a production agent failure with root cause analysis\n- **[Agent Failure Transcript](docs/transcript-2026-02-24-agent-failure.md)** -- raw session transcript from the failure event\n- **[Feature + Debugging Walkthrough](docs/transcript-2026-03-22-feature-with-debugging.md)** -- annotated session showing the Explore-Code-Verify workflow with a real debugging detour\n\n### Architecture\n\nThe playbook uses a single `combined` profile in `profiles/combined/` that covers both development and research workflows. 
The install script copies its `CLAUDE.md` and `skills/` to `~/.claude/`.\n\n## Benchmarks\n\nThe playbook includes a SWE-Bench benchmarking script that compares Claude Code's performance on real-world bug fixes with and without the playbook installed.\n\n```bash\n# Validate setup (no API calls)\nbash scripts/swe-bench.sh --dry-run\n\n# Run 5 SWE-Bench Lite tasks (estimate: $10-25 in API costs; varies by model and pricing)\nbash scripts/swe-bench.sh\n\n# Full 25-task run (estimate: $100-250; varies by model and pricing)\nbash scripts/swe-bench.sh --full\n```\n\nSee [docs/swe-bench-methodology.md](docs/swe-bench-methodology.md) for task selection, scoring, and limitations.\n\n## MCP Servers\n\nPHI-sanitizing MCP servers for safe AI-assisted queries against healthcare data stores. See [`mcp-servers/`](mcp-servers/) for setup and configuration.\n\n| Server | Data store | PHI protection |\n|--------|-----------|----------------|\n| `mongodb-sanitizer` | MongoDB | Drops PHI fields, redacts string values, Presidio NLP second pass |\n| `snowflake-sanitizer` | Snowflake | Drops PHI columns from SELECT results, read-only enforcement |\n| `datadog-sanitizer` | Datadog Logs | Strips names, emails, SSNs, tokens from log output (Python server) |\n| `slack-sanitizer` | Slack | Regex + Presidio redaction of emails, phones, SSNs, tokens; read-only |\n\nMongoDB, Snowflake, and Datadog use a shared `phi-config.yaml` to define which columns and tables are PHI — no code changes required to adapt to your data model. Slack applies string-level redaction (no field blocklist, as it is not a PHI database).\n\nThe Node.js servers (MongoDB, Slack, Snowflake) share modules in `mcp-servers/shared/`: `sanitizer-core.js`, `phi-config-loader.js`, and `phi-defaults.yaml` (with `phi-config.example.yaml` and `phi-defaults.json` for config scaffolding). 
These handle PHI config loading and sanitization centrally.\n\n## Roadmap\n\n- **Subagent overflow recovery (claude-loop)** -- When a subagent runs out of turns or context, detect the truncation via a PostToolUse hook on Task, write a state file with remaining work, and have claude-loop inject it as the prompt for a fresh session to finish the job.\n- **Multi-agent coordination testing** -- Dogfood test team workflows (TeamCreate, SendMessage, shared task lists) in real coding sessions to validate coordination patterns and discover emergent issues.\n\n## Limitations\n\n- **Claude Code only**: All hooks, skills, and scripts target Claude Code. The principles in `best-practices.md` are conceptually portable to Cursor, Copilot, etc., but the tooling is not.\n- **Hook startup overhead**: 35+ hooks are installed, but not all fire on every call — active hooks add ~50-100ms per tool call. Negligible for most workflows, noticeable in rapid-fire operations.\n- **CLAUDE.md budget**: The combined profile's CLAUDE.md consumes instruction budget. Projects with large existing CLAUDE.md files may hit the ~150-200 instruction line ceiling.\n- **Node.js 18+ required**: Hooks use Node.js built-in modules (`fs`, `path`, `os`, `crypto`, `child_process`) with CommonJS `require()`. Node.js 22+ is required for the knowledge system (`knowledge-db.js` uses `node:sqlite`).\n- **Single maintainer**: This is a personal project, not backed by a company or large team.\n\n## Contributing\n\nIssues and PRs welcome. See [CONTRIBUTING.md](CONTRIBUTING.md) for citation standards, style guide, and local testing instructions.\n\n## License\n\n[MIT](LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjohn-wilmes%2Fclaude-agentic-coding-playbook","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjohn-wilmes%2Fclaude-agentic-coding-playbook","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjohn-wilmes%2Fclaude-agentic-coding-playbook/lists"}