{"id":46059369,"url":"https://github.com/blackwell-systems/claudewatch","last_synced_at":"2026-03-04T06:07:47.673Z","repository":{"id":340876362,"uuid":"1168000685","full_name":"blackwell-systems/claudewatch","owner":"blackwell-systems","description":"Get measurably better at AI-assisted development. Generates CLAUDE.md improvements from session data, then proves they worked with before/after effectiveness scoring.","archived":false,"fork":false,"pushed_at":"2026-02-27T02:13:50.000Z","size":325,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-02-27T05:34:37.553Z","etag":null,"topics":["ai","ai-agents","ai-observability","analytics","anthropic","claude","claude-code","cli","developer-tools","go","golang","observability","sqlite"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/blackwell-systems.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-26T23:00:57.000Z","updated_at":"2026-02-27T02:13:48.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/blackwell-systems/claudewatch","commit_stats":null,"previous_names":["blackwell-systems/claudewatch"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/blackwell-systems/claudewatch","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blackwell-systems%2Fc
laudewatch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blackwell-systems%2Fclaudewatch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blackwell-systems%2Fclaudewatch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blackwell-systems%2Fclaudewatch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/blackwell-systems","download_url":"https://codeload.github.com/blackwell-systems/claudewatch/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blackwell-systems%2Fclaudewatch/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29967930,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-01T10:55:55.490Z","status":"ssl_error","status_checked_at":"2026-03-01T10:55:55.175Z","response_time":124,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","ai-agents","ai-observability","analytics","anthropic","claude","claude-code","cli","developer-tools","go","golang","observability","sqlite"],"created_at":"2026-03-01T11:00:54.132Z","updated_at":"2026-03-04T06:07:47.666Z","avatar_url":"https://github.com/blackwell-systems.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# claudewatch\n\n[![Blackwell 
Systems™](https://raw.githubusercontent.com/blackwell-systems/blackwell-docs-theme/main/badge-trademark.svg)](https://github.com/blackwell-systems)\n[![CI](https://github.com/blackwell-systems/claudewatch/actions/workflows/ci.yml/badge.svg)](https://github.com/blackwell-systems/claudewatch/actions)\n[![Release](https://img.shields.io/github/v/release/blackwell-systems/claudewatch)](https://github.com/blackwell-systems/claudewatch/releases/latest)\n[![Go Report Card](https://goreportcard.com/badge/github.com/blackwell-systems/claudewatch)](https://goreportcard.com/report/github.com/blackwell-systems/claudewatch)\n\nA dual observability layer for Claude Code — for developers, and for Claude itself.\n\nclaudewatch runs two layers at once. For developers: friction patterns, cost-per-outcome, agent success rates, and before/after CLAUDE.md effectiveness scoring — reads local files under `~/.claude/`, finds what's costing you time and money, fixes it, and measures the result. For Claude: a self-monitoring system that orients it at session start, alerts it mid-session when error loops or cost spikes are detected, and gives it queryable access to its own project history and live session health. Both layers run from the same local data, with no network calls and no telemetry.\n\n## Why\n\nEvery developer using AI tools is guessing at how to get better. You tweak your CLAUDE.md, try different prompting styles, maybe add a hook — and hope things improve. There's no feedback loop. Did that scope constraint actually reduce unrequested edits? Did the testing section cut your debugging cycles? You have no idea. You can't improve what you can't measure.\n\nClaude is guessing too. Every session starts fresh: no memory of which agent types failed on this project, what friction it generated last time, whether the approach it's about to take has a poor track record here. Claude makes decisions about parallelization, tool selection, and scope without any feedback from its own history. 
claudewatch closes that loop for both parties at once.\n\nThis is the layer nobody else occupies. LLM observability tools (LangSmith, Langfuse, Braintrust) give *humans* dashboards over API calls. claudewatch gives the *AI agent itself* queryable access to its own performance history and real-time session health — inside the session where decisions are being made. Post-hoc analytics for you, live self-reflection for Claude.\n\nBut queryable tools only help if Claude thinks to call them. claudewatch also runs a push layer: a SessionStart hook that injects a project health briefing before the first message, a PostToolUse hook that fires on error loops, context pressure, and cost spikes, and a behavioral contract in `~/.claude/CLAUDE.md` that tells Claude exactly what to do when those signals arrive. The result is a system that orients Claude at session start, alerts it mid-session when things go wrong, and gives it the vocabulary to respond — without requiring Claude to remember any of this from a previous conversation.\n\n## What claudewatch does\n\nClaude Code already records rich session data locally -- tool usage, friction events, satisfaction signals, agent lifecycles, commit patterns. claudewatch reads that data and turns it into actionable insights for both parties.\n\n**Give Claude a mirror.** `claudewatch install` writes a behavioral contract into `~/.claude/CLAUDE.md`. Two shell hooks — `claudewatch startup` (SessionStart) and `claudewatch hook` (PostToolUse) — orient Claude at session start and alert it mid-session when thresholds are crossed. The MCP server gives Claude queryable access to its own project health, agent history, and live session metrics. 
Together these form a self-monitoring layer that runs inside every Claude Code session: Claude knows what project it's on, what friction it generated last time, and when to stop and reassess — without requiring you to prompt it explicitly.\n\nFor multi-repo workflows, weighted attribution automatically routes sessions to their dominant project based on which files were actually touched, not just launch directory. Drift detection identifies when a session shifts from writing to reading-only (a signal you're stuck exploring). Factor analysis correlates session attributes against outcomes to answer \"what predicts success on this project?\" — all queryable by Claude mid-session.\n\n**Measure where you are.** `scan` scores every project's AI readiness. `metrics` shows session trends over time -- friction rate, correction rate, cost per outcome, model usage, cache efficiency, agent success rates. Cost-per-outcome connects your token spend to what you actually shipped: cost per commit, cost per file modified, and whether successful sessions cost more or less than failed ones. Model usage analysis shows which models are consuming your budget and flags overspend. 
Project confidence scoring tells you where Claude knows enough to act vs where it's stuck reading -- a proxy for whether your CLAUDE.md gives the AI enough context to be productive.\n\n```\n$ claudewatch metrics --days 30\n\n Session Trends (30 days)\n ---------------------------------------------------------------\n Sessions            42          (1.4/day)\n Avg duration        38 min\n Friction rate       32%         down from 45%\n Satisfaction        3.8/5       up from 3.2\n Commits/session     4.2\n Cost/commit         $1.42       down from $2.10\n Cost/session        $4.28\n\n Tool Usage\n ---------------------------------------------------------------\n Edit                38%         most used\n Bash                24%\n Read                19%\n Grep/Glob           12%\n Task (agents)        7%\n\n Agent Performance\n ---------------------------------------------------------------\n Total spawned       47          (1.2/session)\n Success rate        83%         up 5%\n Background ratio    68%\n Avg duration        42s\n Avg tokens/agent    12,400\n\n By type:\n  Explore            18  (92% success)  avg 15s\n  general-purpose    14  (71% success)  avg 68s\n  Plan                8  (88% success)  avg 45s\n  documentation       7  (86% success)  avg 52s\n\n Token Usage\n ---------------------------------------------------------------\n Total tokens        18.4M\n Input               14.2M\n Output               4.2M\n Cache hit rate       62%\n Input/output ratio   3.4:1\n Avg tokens/session   438K\n\n Model Usage\n ---------------------------------------------------------------\n claude-sonnet-4     $48.20 (78% of spend)   16.1M tokens (87%)\n claude-opus-4       $12.80 (21% of spend)    1.8M tokens (10%)\n claude-haiku-4       $0.60  (1% of spend)    0.5M tokens  (3%)\n\n ⚠ Potential savings: $9.40 if Opus usage moved to Sonnet\n\n Project Confidence\n ---------------------------------------------------------------\n shelfctl              score: 74  
read: 28%  write: 52%  explore: 15%\n bubbletea-components  score: 68  read: 35%  write: 42%  explore: 25%\n crosschain-verifier   score: 31  read: 72%  write: 12%  explore: 80%\n   ⚠ low confidence — Claude spends most time reading, CLAUDE.md may need more context\n```\n\n**Find what's hurting you.** `gaps` surfaces missing context (no CLAUDE.md, no hooks, no testing section), recurring friction patterns, and stale problems that have persisted for weeks. `suggest` ranks improvements by impact so you know what to fix first.\n\n```\n$ claudewatch suggest --limit 3\n\n #1  Add scope constraints to shelfctl CLAUDE.md        impact: 8.4\n     Unrequested edits in 55% of sessions. Adding \"do not add\n     features beyond what is asked\" reduced this to 12% in similar\n     projects.\n\n #2  Skip plan mode for TUI features                    impact: 7.1\n     Plan agent killed in 40% of TUI sessions. Direct implementation\n     with a task list achieves the same outcome faster.\n\n #3  Add post-edit lint hook                             impact: 6.3\n     Tool errors from lint failures in 38% of Go sessions. A\n     PreToolUse hook running go vet catches these before they\n     cascade into multi-cycle debugging loops.\n```\n\n**Fix it automatically.** `fix` generates CLAUDE.md patches from your actual session data -- not templates, not guesses. Seven data-driven rules inspect your friction patterns, tool usage, agent kill rates, and zero-commit streaks to produce targeted additions. The `--ai` flag calls the Claude API for project-specific content grounded in your real usage.\n\n**Track whether it worked.** `track` snapshots your metrics to SQLite and diffs against previous snapshots so you can see exactly what changed. 
`watch` runs in the background and alerts you when friction spikes or quality degrades.\n\n```\n$ claudewatch track --compare\n\n Metric                  Before     Now        Delta\n ---------------------------------------------------------------\n Friction rate           45%        28%        -17%  (improved)\n Agent success rate      71%        89%        +18%  (improved)\n Avg corrections/session 2.4        0.8        -1.6  (improved)\n Commits/session         3.1        4.6        +1.5  (improved)\n Zero-commit sessions    18%        5%         -13%  (improved)\n```\n\n**Prove it with effectiveness scoring.** `metrics` automatically scores your CLAUDE.md changes -- it splits sessions at the modification timestamp, compares before/after on friction, tool errors, goal achievement, and cost per commit, then produces a -100 to +100 effectiveness score. Did adding that scope constraint actually reduce unrequested edits? Now you know.\n\n```\n$ claudewatch metrics --effectiveness\n\n CLAUDE.md Effectiveness\n ---------------------------------------------------------------\n Project             Score   Verdict      Changed\n shelfctl              +72   effective    2026-01-15\n crosschain-verifier   +34   effective    2026-01-20\n bubbletea-components   -8   neutral      2026-01-22\n\n shelfctl (detailed):\n   Friction rate       45% → 28%     -17%  (improved)\n   Tool errors/session 4.2 → 1.1     -74%  (improved)\n   Goal achievement    62% → 89%     +44%  (improved)\n   Cost/commit         $2.10 → $1.42 -32%  (improved)\n```\n\n## Multi-agent workflow analytics\n\nclaudewatch parses session transcripts to extract agent lifecycle data that isn't available anywhere else -- not in Claude Code's UI, not in the API, not in any third-party tool.\n\nIt reconstructs agent spans from JSONL transcripts: launch to completion, success or kill, parallel or sequential, duration and token cost. 
From that raw data it computes success rates by agent type, parallelization ratios, correction rates, and cost per task.\n\nA plan agent with a 40% kill rate is a signal that plan mode is costing you sessions for that project type. An explore agent that succeeds 95% of the time tells you to delegate more search tasks. claudewatch surfaces these patterns so you can adjust your workflow based on evidence, not intuition.\n\n## Installation\n\n**Homebrew (macOS/Linux):**\n```bash\nbrew install blackwell-systems/tap/claudewatch\n```\n\n**Direct download:**\n```bash\n# Download latest release for your platform\n# https://github.com/blackwell-systems/claudewatch/releases/latest\n\n# macOS/Linux: extract and move to PATH\ntar -xzf claudewatch_*_$(uname -s)_$(uname -m).tar.gz\nsudo mv claudewatch /usr/local/bin/\n\n# Windows: extract ZIP and add to PATH\n```\n\n**From source (requires Go 1.26+):**\n```bash\ngo install github.com/blackwell-systems/claudewatch/cmd/claudewatch@latest\n```\n\n**Build from source:**\n```bash\ngit clone https://github.com/blackwell-systems/claudewatch.git\ncd claudewatch\nmake build\n```\n\n## Quick start\n\n```bash\n# Get a baseline on all your projects\nclaudewatch scan\n\n# Find what's costing you time\nclaudewatch gaps\n\n# See what to fix first\nclaudewatch suggest --limit 5\n\n# Generate CLAUDE.md improvements from your session data\nclaudewatch fix myproject --dry-run   # preview first\nclaudewatch fix myproject             # apply interactively\n\n# Measure whether it helped\nclaudewatch track\n# ... 
work for a week ...\nclaudewatch track --compare\n```\n\n**Enable Claude's self-monitoring layer** (run once):\n\n```bash\n# Write the behavioral contract into ~/.claude/CLAUDE.md\nclaudewatch install\n\n# Add hooks to ~/.claude/settings.json\n# SessionStart: injects project health briefing before first message\n# PostToolUse: fires on error loops, context pressure, cost spikes\n```\n\n```json\n{\n  \"hooks\": {\n    \"SessionStart\": [{\"hooks\": [{\"type\": \"command\", \"command\": \"claudewatch startup\"}]}],\n    \"PostToolUse\":  [{\"hooks\": [{\"type\": \"command\", \"command\": \"claudewatch hook\"}]}]\n  }\n}\n```\n\nThen add the MCP server to `~/.claude.json`:\n\n```json\n{\n  \"mcpServers\": {\n    \"claudewatch\": {\n      \"command\": \"claudewatch\",\n      \"args\": [\"mcp\"]\n    }\n  }\n}\n```\n\n## Documentation\n\n| | |\n|---|---|\n| 📗 [Quickstart](docs/quickstart.md) | Install, baseline, fix, measure — the full cycle in one guide |\n| 📘 [CLI Reference](docs/cli.md) | All commands and flags: `scan`, `metrics`, `gaps`, `suggest`, `fix`, `track`, `log`, `watch`, `hook`, `startup`, `install` |\n| 📙 [MCP Reference](docs/mcp.md) | All 26 MCP tools, setup, recommended usage pattern, and data freshness notes |\n| 📕 [Effectiveness Scoring](docs/effectiveness.md) | How CLAUDE.md before/after scoring works, how to read verdicts, and what to do with regressions |\n\n---\n\n## Commands\n\n| Command | What it does |\n|---------|-------------|\n| `scan` | Score every project's AI readiness (0-100) |\n| `metrics` | Session trends: friction, cost per outcome, model usage, token breakdown, effectiveness scoring, agents, task planning |\n| `gaps` | What's missing: context, hooks, stale friction patterns |\n| `correlate` | Correlate session attributes against outcomes (friction, commits, cost, etc.) 
to find what predicts success |\n| `suggest` | Ranked improvements with impact scores |\n| `fix` | Generate and apply CLAUDE.md patches from session data |\n| `track` | Snapshot metrics to SQLite, diff against previous |\n| `log` | Inject custom metrics (scale, boolean, counter, duration) |\n| `watch` | Background daemon with desktop alerts on friction spikes |\n| `mcp` | Run an MCP stdio server — gives Claude real-time access to its own session metrics |\n| `hook` | PostToolUse shell hook — checks for error loops, context pressure, and cost spikes; exits 2 with a self-contained alert if action is needed |\n| `startup` | SessionStart shell hook — prints a compact briefing into Claude's context: project health, session count, friction level, MCP tool manifest |\n| `install` | Write the claudewatch behavioral contract into `~/.claude/CLAUDE.md`, delimited by markers; idempotent |\n\n### `claudewatch fix`\n\nThis is the command that closes the loop. Two modes:\n\n- **Rule-based** (default): Seven rules inspect friction patterns, tool usage, agent kill rates, and zero-commit rates. No external dependencies.\n- **AI-powered** (`--ai`): Calls the Claude API to generate project-specific content grounded in your session data and project structure. 
Requires `ANTHROPIC_API_KEY`.\n\n```bash\nclaudewatch fix shelfctl              # rule-based, interactive\nclaudewatch fix shelfctl --dry-run    # preview without applying\nclaudewatch fix shelfctl --ai         # AI-powered generation\nclaudewatch fix --all                 # fix all projects scoring \u003c 50\n```\n\n### `claudewatch watch`\n\nBackground monitoring with desktop notifications via Notification Center on macOS and libnotify on Linux.\n\n```bash\nclaudewatch watch                     # foreground, ctrl-c to stop\nclaudewatch watch --daemon            # background with PID file\nclaudewatch watch --interval 5m       # custom check interval\nclaudewatch watch --stop              # stop background daemon\n```\n\nNotifies on: friction spikes, new stale patterns, agent kill rate increases, zero-commit streaks.\n\n### `claudewatch mcp`\n\nClaude doesn't understand itself. It has no native access to its own session history, cost, friction patterns, or agent timing — that data lives in JSONL transcript files that require significant domain knowledge to parse correctly. Claude could read those files directly, but doing so burns context budget on infrastructure, and some data is structurally misleading without correction (background agent completion timestamps, for example, require joining across two different JSONL entry types to get accurate durations).\n\nclaudewatch is the mirror that lets Claude see itself. The MCP server transforms raw transcript data into structured, queryable tools so that Claude can ask \"how long did that parallel agent run take?\" or \"what has this session cost so far?\" and get an answer it can immediately reason about — without leaving the session, without parsing JSONL, and without spending context on plumbing.\n\nThe 26 MCP tools operate at two time scales:\n\n- **Historical** — project health, agent performance, friction patterns, effectiveness scores. 
Claude queries its own track record to make better decisions: \"plan agents get killed 40% of the time on this project, skip plan mode.\"\n- **Live** — token velocity, commit-to-attempt ratio, tool error rate, friction events. Claude monitors its own session in real time: \"I'm generating errors at a 30% rate, slow down and read more before editing.\"\n\nNo other tool gives an AI agent queryable access to its own performance data. This is what makes the MCP server qualitatively different from the CLI commands: it closes the feedback loop *inside* the session where decisions are being made.\n\nRun claudewatch as an MCP ([Model Context Protocol](https://modelcontextprotocol.io)) stdio server.\n\n```bash\nclaudewatch mcp                    # start MCP server on stdio\nclaudewatch mcp --budget 20        # enable daily budget tracking ($20 limit)\n```\n\n**Configure in Claude Code** by adding to `~/.claude.json`:\n\n```json\n{\n  \"mcpServers\": {\n    \"claudewatch\": {\n      \"command\": \"/usr/local/bin/claudewatch\",\n      \"args\": [\"mcp\", \"--budget\", \"20\"]\n    }\n  }\n}\n```\n\n**Tools exposed (26 tools across 7 categories):**\n\n*Session \u0026 cost:*\n\n| Tool | Description |\n|------|-------------|\n| `get_session_stats` | Current session: cost, tokens, duration, project |\n| `get_cost_budget` | Today's estimated spend vs your daily budget |\n| `get_cost_summary` | Aggregated cost data: today, this week, all time, by project |\n| `get_recent_sessions` | Last N sessions with friction scores and cost |\n\n*Live self-reflection (real-time, current session):*\n\n| Tool | Description |\n|------|-------------|\n| `get_session_dashboard` | All live metrics in one call: token velocity, commit ratio, context pressure, cost velocity, tool errors, friction patterns. Replaces 6 individual tool calls with one round-trip. 
|\n| `get_token_velocity` | Tokens/minute with 10-min windowed rate — flowing, slow, or idle |\n| `get_commit_attempt_ratio` | Git commits vs Edit/Write attempts — efficient, normal, or guessing |\n| `get_live_tool_errors` | Error rate, errors by tool, consecutive errors, severity |\n| `get_live_friction` | Friction events detected so far — retries, error bursts, tool failures |\n| `get_context_pressure` | Context window utilization — comfortable, filling, pressure, or critical |\n| `get_cost_velocity` | Cost burn rate over the last 10 minutes — efficient, normal, or burning |\n| `get_drift_signal` | Drift detection — classifies last 20 tool calls as exploring, implementing, or drifting (stuck reading without writing) |\n\n*Project \u0026 pattern analysis:*\n\n| Tool | Description |\n|------|-------------|\n| `get_project_health` | Friction rate, agent success rate, zero-commit rate, top errors |\n| `get_project_comparison` | All projects ranked side by side — health, friction, CLAUDE.md status |\n| `get_suggestions` | Ranked improvement suggestions by impact score |\n| `get_stale_patterns` | Chronic friction that recurs across sessions with no CLAUDE.md fix |\n| `get_project_anomalies` | Detect sessions with abnormal cost or friction using z-score analysis — auto-refreshing baselines adapt to workflow changes |\n| `get_regression_status` | Check if project friction rate or cost has regressed beyond baseline threshold |\n\n*Agent \u0026 workflow analytics:*\n\n| Tool | Description |\n|------|-------------|\n| `get_agent_performance` | Agent metrics: success rate, duration, tokens by type |\n| `get_effectiveness` | CLAUDE.md before/after effectiveness scores per project |\n| `get_session_friction` | Friction events for a specific session |\n| `get_saw_sessions` | SAW parallel agent sessions with wave and agent counts |\n| `get_saw_wave_breakdown` | Per-wave timing and agent status for a SAW session |\n| `get_cost_attribution` | Break down token cost by tool type for 
a session — which tools consumed your budget |\n\n*Multi-project analysis:*\n\n| Tool | Description |\n|------|-------------|\n| `get_session_projects` | Weighted per-repo breakdown for sessions touching multiple repos — shows cost and activity distribution across projects |\n\n*Factor analysis:*\n\n| Tool | Description |\n|------|-------------|\n| `get_causal_insights` | Correlate session attributes (has_claude_md, is_saw, tool_call_count) against outcomes (friction, commits, cost) to find what predicts success |\n\n*Session management:*\n\n| Tool | Description |\n|------|-------------|\n| `set_session_project` | Override project attribution for a session |\n\n### Self-reflection architecture\n\nclaudewatch closes the feedback loop for Claude through three components that work at different layers of persistence. Understanding how they fit together matters for setup.\n\n**The push/pull problem**\n\nMCP tools are *pull* — Claude must think to call them. If Claude doesn't realize it's in trouble, it won't query `get_live_friction`. Hooks are *push* — they fire automatically after every tool use and inject signals whether Claude thinks to look or not. CLAUDE.md is *persistent* — behavioral rules that Claude Code loads at the start of every session and that remain in context regardless of how deep the conversation grows.\n\nEach component covers a gap the others leave open:\n\n**1. Startup briefing** (`claudewatch startup` as a SessionStart hook)\n\nFires at session start and prints a compact briefing directly into Claude's context: project name, session count, friction level, CLAUDE.md status, agent success rate, a context-specific tip, the full MCP tool manifest, and a PostToolUse hook reminder. This orients Claude to the project and the tools available before the first user message. 
Because it's injected context, it erodes as the conversation grows — useful for orientation at the start, not for behavioral rules that need to survive a 100-turn session.\n\nTwo elements of the briefing are dynamic. First, an optional regression warning line appears between the tip line and the tools line when the project's friction rate or avg cost has exceeded 1.5× its stored baseline — it is omitted entirely when the project is within baseline. Second, the tip is friction-based by default but is replaced with a SAW-correlation insight (`tip: SAW reduces zero-commit rate (X% vs Y% without)`) when there are ≥10 SAW sessions and ≥10 non-SAW sessions and the data shows a meaningful difference in zero-commit rate.\n\n**2. Behavioral contract** (`claudewatch install` → `~/.claude/CLAUDE.md`)\n\n`claudewatch install` writes a block of instructions into `~/.claude/CLAUDE.md`, delimited by `\u003c!-- claudewatch:start --\u003e` / `\u003c!-- claudewatch:end --\u003e` markers. The block tells Claude what to do when it sees the startup briefing (call `get_project_health` to calibrate) and what to do when the PostToolUse hook fires (stop, call `get_session_dashboard`). Without this, Claude sees the briefing but has no standing instruction to act on it. CLAUDE.md is loaded by Claude Code at session start and remains in context for the full session — it's where behavioral rules belong. Re-running `claudewatch install` updates the section in place; it's idempotent.\n\n**3. Reactive alerts** (`claudewatch hook` as a PostToolUse hook)\n\nFires after every tool use, rate-limited to once per 30 seconds via `~/.cache/claudewatch-hook.ts`. Checks three conditions in priority order: (1) three or more consecutive tool errors, (2) context pressure at \"pressure\" or \"critical\", (3) cost velocity \"burning\". Exits 0 silently if all clear. 
If a condition is met, exits 2 with a self-contained stderr message that names the MCP server, the tool to call (`get_session_dashboard`), and what that tool returns — so Claude with zero prior context about claudewatch knows exactly what to do. When a consecutive error alert fires, the message also names the chronic friction pattern if one is detected (a friction type appearing in \u003e30% of the project's last 10 sessions with no recent CLAUDE.md update), surfacing it as `(chronic: {type} in N% of recent sessions)` so Claude knows whether this is a systemic issue or an isolated event.\n\n**Why CLAUDE.md persistence matters**\n\nInjected context from the startup hook erodes as the conversation grows. By turn 50 it's buried under newer content. CLAUDE.md is loaded by Claude Code at the start of every session and remains in context regardless of depth. The behavioral rules — \"when the hook fires, stop and call get_session_dashboard\" — need to persist for the full session. The dynamic project data only needs to be fresh at session start.\n\n**Setup**\n\n```bash\n# Install behavioral contract into ~/.claude/CLAUDE.md\nclaudewatch install\n```\n\nAdd hooks to `~/.claude/settings.json`:\n\n```json\n{\n  \"hooks\": {\n    \"SessionStart\": [{\"hooks\": [{\"type\": \"command\", \"command\": \"claudewatch startup\"}]}],\n    \"PostToolUse\": [{\"hooks\": [{\"type\": \"command\", \"command\": \"claudewatch hook\"}]}]\n  }\n}\n```\n\n### `claudewatch metrics --json`\n\nMachine-readable JSON export for all metrics sections. 
Use for time-series analysis, cost dashboards, CI/CD integration, or custom queries.\n\n```bash\nclaudewatch metrics --json                        # full export\nclaudewatch metrics --days 7 --json \u003e week.json   # save to file\n```\n\n**Exported sections:**\n- `velocity` - productivity metrics (commits/session, files modified, lines added)\n- `efficiency` - tool usage, error rates, interruptions\n- `satisfaction` - weighted scores, outcome distribution\n- `agents` - agent performance by type, success/kill rates\n- `tokens` - input/output tokens, ratios, per-session averages\n- `commits` - commit patterns, zero-commit rate, detailed session list\n- `conversation` - correction rate, long message frequency\n- `confidence` - project confidence scores, read/write ratios\n- `friction_trends` - stale/improving/worsening friction patterns\n- `cost_per_outcome` - cost per commit/file/session, goal achievement (cache-adjusted)\n- `effectiveness` - CLAUDE.md before/after effectiveness scoring\n- `planning` - task completion rates and file churn intensity\n\n**Example queries:**\n\n```bash\n# Track cost trends\nclaudewatch metrics --json | jq '.cost_per_outcome.avg_cost_per_commit'\n\n# Find low-confidence projects\nclaudewatch metrics --json | jq '.confidence.projects[] | select(.confidence_score \u003c 40)'\n\n# Monitor effectiveness\nclaudewatch metrics --json | jq '.effectiveness[] | {project: .project_name, score, verdict}'\n\n# Export for analysis\nclaudewatch metrics --days 30 --json \u003e baseline.json\n# ... make CLAUDE.md changes ...\nclaudewatch metrics --days 30 --json \u003e after.json\n# Compare in Python/R/Excel\n```\n\n## Data sources\n\nAll data is read from local files. 
claudewatch never writes to these paths, never modifies them, and never reads anything outside `~/.claude/`.\n\n| Source | What it contains |\n|--------|-----------------|\n| `~/.claude/history.jsonl` | Conversation history |\n| `~/.claude/usage-data/session-meta/` | Session metadata (tools, commits, languages) |\n| `~/.claude/usage-data/facets/` | Session analysis (friction, satisfaction, goals) |\n| `~/.claude/stats-cache.json` | Aggregate token usage and cache statistics |\n| `~/.claude/todos/` | Task lists created during sessions (completion tracking) |\n| `~/.claude/file-history/` | File edit snapshots per session (churn analysis) |\n| `~/.claude/settings.json` | Global settings, hooks, permissions |\n| `~/.claude/projects/` | Project-specific settings and session transcripts |\n| `~/.claude/commands/` | Custom slash commands |\n\n## Privacy\n\nZero network calls by default; the one exception is `fix --ai`, which calls the Claude API when you opt in. Reads only local files under `~/.claude/`. Writes to a local SQLite database for snapshot storage, to the CLAUDE.md files you patch with `fix` or `install`, and to the hook rate-limit file `~/.cache/claudewatch-hook.ts`. No telemetry, no analytics, no crash reporting, no update checks. Outside of `--ai`, nothing leaves your machine.\n\n## Development\n\nPure Go, no CGO. Cross-compiles to linux/darwin/windows on amd64 and arm64.\n\n```bash\nmake build      # compile to bin/claudewatch\nmake test       # run all tests\nmake vet        # go vet\nmake lint       # golangci-lint\nmake snapshot   # goreleaser snapshot build (all platforms)\n```\n\n## License\n\nDual-licensed under [MIT](LICENSE) and [Apache 2.0](LICENSE-APACHE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fblackwell-systems%2Fclaudewatch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fblackwell-systems%2Fclaudewatch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fblackwell-systems%2Fclaudewatch/lists"}