{"id":50937403,"url":"https://github.com/olafrv/nopii","last_synced_at":"2026-06-17T10:32:29.323Z","repository":{"id":364402510,"uuid":"1267673377","full_name":"olafrv/nopii","owner":"olafrv","description":"nopii — PII-redaction proxy for the Anthropic API in Claude Code (CLI)","archived":false,"fork":false,"pushed_at":"2026-06-12T21:42:53.000Z","size":13458,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-12T23:16:14.878Z","etag":null,"topics":["ai","claude","claude-code","nodejs","pii","pnpm"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/olafrv.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-06-12T18:58:26.000Z","updated_at":"2026-06-12T21:44:06.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/olafrv/nopii","commit_stats":null,"previous_names":["olafrv/nopii"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/olafrv/nopii","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/olafrv%2Fnopii","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/olafrv%2Fnopii/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/olafrv%2Fnopii/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/olafrv%2Fnopii/manifests",
"owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/olafrv","download_url":"https://codeload.github.com/olafrv/nopii/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/olafrv%2Fnopii/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34445182,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-17T02:00:05.408Z","response_time":127,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","claude","claude-code","nodejs","pii","pnpm"],"created_at":"2026-06-17T10:32:26.269Z","updated_at":"2026-06-17T10:32:29.293Z","avatar_url":"https://github.com/olafrv.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# nopii — PII-redaction proxy for the Anthropic API in Claude Code (CLI)\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"src/img/nopii-logo-dark.png\" alt=\"nopii logo\" width=\"420\"\u003e\n\u003c/p\u003e\n\n```\nClaude Code ──► nopii proxy (redacted) ──► api.anthropic.com\nClaude Code ◄── nopii proxy (rehydrated) ◄── api.anthropic.com\n```\n\nRedact personally identifiable information (PII) from your prompts **before they\nreach the Anthropic API**. `nopii` is a thin reverse proxy that sits between Claude\nCode and `api.anthropic.com`. 
It detects PII locally (GLiNER + regex), replaces it\nwith stable placeholder tokens, forwards the sanitized request to Claude, and\nrestores the original values in Claude's response so your experience is unchanged.\n\n\u003e **WARNING:** `ANTHROPIC_BASE_URL` redirection is a **`claude` CLI or Agent SDK** feature — Claude Desktop,\nthe VS Code extension, and the **claude.ai web app** do not honour that variable, so none of them can be\nrouted to the `nopii` proxy.\n\n## What gets redacted\n\n- **Only the user prompt.** `nopii` rewrites `role: \"user\"` messages — plain text,\n  `text` blocks, and `tool_result` content you feed back (on by default; disable\n  with `REDACT_TOOL_RESULTS=false`). System\n  prompts and assistant turns are untouched.\n- Placeholders are **deterministic**: a token such as `\u003cPERSON_3f9a2b10\u003e` has the form\n  `\u003cTYPE_HASH\u003e`, where `HASH` is `sha256(value)[:8]`.\n  The same value always maps to the same token, so multi-turn conversations stay\n  consistent and prompt caching keeps working.\n- Claude's response is **rehydrated**: tokens in streamed text become the original\n  values again; tokens inside tool-call JSON inputs are restored with proper JSON\n  escaping so tool calls don't break.\n\n## Authentication\n\n`nopii` supports two ways to authenticate, set with `AUTH_MODE`. 
**OAuth (Option A) is\nthe more PII-protective choice**: its token isn't granted the `file upload` scope, so there\nis simply no path for an unredacted file to leave your machine (see Option A's corollary).\n`passthrough` (Option B) is the zero-config **default** and the simplest to set up, but an\nAPI key reaches **any** endpoint — including the file endpoints nopii can't redact.\n\n| | `oauth` (Option A, recommended) | `passthrough` (Option B, default) |\n|---|---|---|\n| What you use | Your Claude Pro/Max **subscription** | A pay-as-you-go **API key** |\n| Billing | Flat monthly subscription | Per-token (separate from any subscription) |\n| How auth works | nopii holds its own OAuth token and injects it | Claude Code's `x-api-key` is forwarded untouched |\n| PII surface | No file-upload path (scope declined) | Key reaches any endpoint, incl. file upload |\n| Setup | `pnpm run oauth-login` once (below) | Create a key (below) |\n\n## Setup\n\nRequires Node (version pinned in `.nvmrc`) and pnpm (provided by corepack).\n\n```bash\ncorepack enable               # provides pnpm (version pinned in package.json)\npnpm install --frozen-lockfile\ncp .env.example .env          # options documented inline; adjust as needed\n\n# Download the GLiNER ONNX weights into model/ (~392 MB, not committed)\npnpm run model:download       # -\u003e model/gliner_medium-v2.1/onnx/model_fp16.onnx\n```\n\n\u003e Prefer to fetch the weights by hand, or grab a different variant? See\n\u003e [model/README.md](./model/README.md).\n\n\u003e This project uses **pnpm** with supply-chain-security controls — see\n\u003e [PNPM_SECURITY.md](./PNPM_SECURITY.md). 
Use pnpm, not npm.\n\nThen pick an auth option below.\n\n### Option A — your Claude Pro/Max subscription (OAuth, recommended)\n\nUse your existing subscription instead of paying per token:\n\n```bash\n# in the shell you run the proxy:\npnpm run oauth-login  # opens browser -\u003e approve -\u003e tokens saved to ~/.nopii\nexport AUTH_MODE=oauth\npnpm start\n```\n\n\u003e **PNPM login:** it's `pnpm run oauth-login`, not `pnpm login` — `login` is a built-in\n\u003e pnpm command (it logs into the npm registry), so it would never run this script.\n\n\u003e **Security note:** in oauth mode your subscription tokens are stored **in\n\u003e plaintext** at `~/.nopii/credentials.json` (mode `0600`). Treat that file like a\n\u003e password. Override the location with `NOPII_CREDENTIALS_DIR`.\n\nnopii reads Claude Code's authentic request (its system prompt, beta headers and\nfingerprints are real, since the client *is* Claude Code), swaps in your OAuth\nBearer token, and refreshes it automatically (including a one-shot retry on a 401).\nWhen the refresh token finally expires, just `pnpm run oauth-login` again.\n\nThe consent screen nopii shows is **shorter** than the one the real Claude Code CLI\nshows. That's intentional: nopii requests only the two scopes a redaction proxy needs\n(`user:inference user:profile`), so it can forward inference on your subscription and\nread your profile — nothing more. The capabilities Claude Code asks for but nopii does\n**not**:\n\n| Consent prompt line | Scope | Requested by nopii? 
|\n|---|---|---|\n| Contribute to your Claude subscription usage | `user:inference` | ✅ yes |\n| Access your Anthropic profile information | `user:profile` | ✅ yes |\n| Access your Claude Code sessions | session | ❌ no |\n| Use and manage your connectors | connectors | ❌ no |\n| Upload files on your behalf | file upload | ❌ no |\n\nOverride the requested scopes with `OAUTH_SCOPES` if you ever need the broader grant.\n\n\u003e **Corollary — declining `file upload` is protective, not a gap.** nopii redacts\n\u003e *text only* (user-turn `text` and `tool_result` blocks of `/v1/messages`); it cannot\n\u003e redact file contents. By not requesting the `file upload` scope, the OAuth token\n\u003e simply *can't* upload files — so there is no unredacted file path in oauth mode. The\n\u003e standing limitation is independent of OAuth: inline `image`/`document` blocks (e.g. a\n\u003e pasted screenshot or PDF) are passed through unredacted, and in `passthrough` mode a\n\u003e Files-API upload (`/v1/files`) is transparent passthrough too. File text that Claude\n\u003e Code inlines into `tool_result`/`text` blocks *is* redacted.\n\n### Option B — API key (default mode)\n\n1. Sign in at **[console.anthropic.com](https://console.anthropic.com)**.\n2. **Settings → Billing** → add a payment method or buy prepaid credits (the API\n   is billed separately from any Pro/Max subscription).\n3. **Settings → API Keys → Create Key** and copy the `sk-ant-...` value.\n\n```bash\n# in the shell you run the proxy:\npnpm start\n```\n\n**What stops the PII leak here — and what doesn't.** The API key only authenticates and\nbills the request; it is **not** what protects your data — nopii's redaction is (the same\nsanitized request goes upstream regardless of which key you use). What an API key can and\ncan't do for you:\n\n| Control | Protects your PII? 
| What it actually does |\n|---|---|---|\n| nopii redaction (`text` / `tool_result` blocks) | ✅ yes | Strips PII before the request leaves your machine — the actual protection |\n| Capability/endpoint scoping (e.g. \"inference-only, no files\") | ❌ unavailable | No such Anthropic key exists; unlike Option A's `file upload` scope, a key reaches **any** endpoint |\n| Workspace [spend / rate limits](https://platform.claude.com/docs/en/manage-claude/workspaces) | ❌ no | Caps cost \u0026 throughput — limits blast radius if the key leaks, not what data is sent |\n| Workspace read-only vs full access | ❌ no | Limits what a leaked key can do — blast radius, not data |\n\n\u003e **Text-only limit — to avoid a leak.** Because no key can block file endpoints and nopii\n\u003e redacts *text* only, inline `image`/`document` blocks and Files-API (`/v1/files`) uploads\n\u003e pass through **unredacted** (see Option A's corollary). Keep PII out of pasted\n\u003e screenshots/PDFs and any file-upload path. For hardening, point nopii at a dedicated\n\u003e low-spend-cap workspace so a leaked key can't run up an unbounded bill or touch other\n\u003e projects.\n\n## Proxy Claude Code\n\n```bash\nexport ANTHROPIC_BASE_URL=http://localhost:8788\nexport ANTHROPIC_API_KEY=sk-ant-...\nclaude\n```\n\nOr persist it in `~/.claude/settings.json`, then run `claude`:\n\n```json\n{\n  \"env\": {\n    \"ANTHROPIC_BASE_URL\": \"http://localhost:8788\"\n  }\n}\n```\n\n## Verify redaction end-to-end\n\nNow anything you type containing a name, email, phone number, IP, etc. is replaced\nwith a token before it leaves your machine. Verify the proxy is live:\n\n```bash\ncurl -s http://localhost:8788/healthz\n# {\"ok\":true,\"upstream\":\"https://api.anthropic.com\"}\n```\n\nWith `NODE_ENV=development DEBUG=true` set, the proxy logs how many PII spans it\nredacted per request (counts only — never the values). 
You can also exercise the\nMessages API directly through the proxy (this `x-api-key` example assumes the default\n`passthrough` mode; in `oauth` mode the proxy supplies its own token, so the header is ignored):\n\n```bash\ncurl -s http://localhost:8788/v1/messages \\\n  -H \"x-api-key: $ANTHROPIC_API_KEY\" \\\n  -H \"anthropic-version: 2023-06-01\" \\\n  -H \"content-type: application/json\" \\\n  -d '{\n    \"model\": \"claude-opus-4-8\",\n    \"max_tokens\": 256,\n    \"messages\": [\n      {\n        \"role\": \"user\",\n        \"content\": \"Email Sarah Chen at sarah.chen@acme.com about Tuesday.\"\n      }\n    ]\n  }'\n```\n\nAnthropic receives `Email \u003cPERSON_xxxxxxxx\u003e at \u003cEMAIL_xxxxxxxx\u003e about Tuesday.`\nYou get a reply with the real name and email restored.\n\n\n## Run in a container (isolated from your host login)\n\nIf you don't want nopii's setup to disturb the `claude` you already use (e.g. you're\nlogged into claude.ai on the host), run both the proxy **and** Claude Code in\ncontainers. 
The containerised `claude` **never touches your host's `~/.claude` login** —\nits state lives in a repo-local, gitignored `data/.claude/` (history, project settings) plus\n`data/.claude.json` (onboarding: theme, API-key approval, folder trust) instead, so it stays\nisolated from your host while persisting across `stop`/`start` — no re-onboarding each run.\n(No source is mounted, so it can't see your repo either — this is for exercising the\nproxy/auth path, not editing host files.)\n\n```bash\n./claude-nopii.sh start   # start proxy, drop into claude (builds if missing)\n                          # `start` is default; extra args pass to claude\n./claude-nopii.sh shell   # bash prompt in the claude container (no TUI)\n./claude-nopii.sh build   # rebuild images after changing deps/Dockerfiles\n./claude-nopii.sh log     # print proxy logs (add -f to follow)\n./claude-nopii.sh stop    # tear down proxy and containers when done\n```\n\n**Anything after `start` is passed straight to `claude`** — so you can run any\n`claude` subcommand in the container instead of opening the interactive TUI. This\nis how you configure the container's own (isolated) claude without needing a shell:\n\n```bash\n# list / add / remove MCP servers for the containerised claude\n./claude-nopii.sh start mcp list\n./claude-nopii.sh start mcp add \u003cname\u003e -s user -- npx -y \u003cserver-pkg\u003e\n./claude-nopii.sh start mcp remove \u003cname\u003e\n\n# any other claude subcommand works too\n./claude-nopii.sh start --version\n./claude-nopii.sh start config ls\n```\n\nMCP servers added this way persist in `data/.claude.json` and run **inside the\nclaude container**, so the command must be runnable there (an `npx` stdio server,\nor an HTTP/SSE URL the container can reach — `proxy:8788` in-network,\n`host.docker.internal` for host services). 
Drop the args to get the TUI back.\n\nFor interactive admin work, `./claude-nopii.sh shell` drops you into a bash prompt\nin the claude container (same isolated mounts) instead of the TUI — handy for a\nseries of `claude mcp …`/`claude config …` commands or poking around. Pass a command\nto run-and-exit: `./claude-nopii.sh shell -c 'claude mcp list'`.\n\nThe **proxy** mounts your OAuth tokens from `~/.nopii` (read-write so token refresh\npersists), plus `./model` and live `./src`; **claude** mounts only `./data/.claude` and\n`./data/.claude.json` for its own state. To watch redaction happen, set `NODE_ENV=development` and `DEBUG=true` in `.env`\n(the proxy logs span **counts** only, never values) and run `./claude-nopii.sh log`.\n\n## Development\n\nRun the proxy with auto-reload (loads `.env`):\n\n```bash\npnpm dev                      # or: pnpm start\n# [nopii] proxy listening on http://localhost:8788 -\u003e https://api.anthropic.com\n```\n\nTests:\n\n```bash\npnpm test                                  # GLiNER leak-check (needs the model)\nnode --test test/rehydrate.test.js         # rehydration logic (no model needed)\n```\n\n`test/leak-check.js` is your CI gate against redaction regressions — add fixtures\nfrom real prompts as you find gaps. `test/rehydrate.test.js` covers the tricky\nstreaming path, including tokens split across SSE deltas and JSON-escaped tool\ninputs.\n\n### Leak statistics over a real PII dataset\n\nFor broader recall/precision numbers than the handful of fixtures, run the detector\nagainst the public [ai4privacy](https://huggingface.co/datasets/ai4privacy/pii-masking-300k)\ndataset. 
It's a benchmark, not a CI gate (the gate stays `leak-check.js`).\n\nFetch the dataset (~100 MB, gitignored under `datasets/`, mirroring the Hugging\nFace repo path), then run the benchmark:\n\n```bash\npnpm run dataset:download            # data/train/1english_openpii_30k.jsonl\npnpm run leak-stats                  # 1000 records, stride-sampled (~2 min)\npnpm run leak-stats -- --limit 5000  # bigger sample\npnpm run leak-stats -- --limit 0     # the full dataset (slow)\npnpm run leak-stats -- --json        # machine-readable\n```\n\nOther splits/languages: `pnpm run dataset:download -- --file german_openpii_30k.jsonl`,\nthen `pnpm run leak-stats -- --file datasets/.../data/train/german_openpii_30k.jsonl`.\n\nIt reports the **leak rate** (gold PII spans with no overlapping detection — the\nprivacy headline) in two scopes (labels nopii targets, vs every gold span),\nper-label coverage, strict per-type precision/recall/F1, and over-redaction. The\ndataset-label→nopii-type map lives at the top of `test/leak-stats.mjs`; adjust it if\nyou change the entity set in `src/ner.js`.\n\nSee [docs/LEAK_TEST.md](./docs/LEAK_TEST.md) for a sample run, how to read every\nsection, and concrete ways to improve the detection rate.\n\n### Wipe regenerable artifacts\n\nTo rebuild from a clean slate, `make wipe` deletes every git-untracked and\ngitignored path — `node_modules/`, the GLiNER weights, `datasets/`, caches, logs,\ntmp, and the container's generated Claude state (`data/.claude*`, re-created on\nnext login). It **preserves** `.env` (your secrets/config) and `OLAF.md`; lists\nexactly what will be removed; and asks for confirmation before deleting. 
Nothing\ngit-tracked is touched either (so committed files like `.vscode/settings.json`\nand `data/.claude/.gitkeep` survive).\n\n```bash\nmake            # show targets (default)\nmake wipe       # list, confirm, then delete untracked/ignored artifacts\n```\n\nAfterwards, re-run `pnpm install`, `pnpm run model:download`, and (if needed)\n`pnpm run dataset:download`.\n\n### Scan for committed secrets\n\nThis repo ships a [gitleaks](https://github.com/gitleaks/gitleaks) config\n(`.gitleaks.toml`) that keeps all of gitleaks' default rules and narrowly\nallowlists only the **public** Claude Code OAuth `client_id` (a UUID-shaped\nliteral that the heuristic rules flag, but which is not a secret — the flow is\nPKCE with no client secret). Install gitleaks (`brew install gitleaks`), then:\n\n```bash\nmake scan          # scan the FULL git history for secrets\nmake scan-staged   # scan only staged changes (fast; pre-commit use)\n```\n\n`make scan` exits non-zero if anything is found, so it doubles as a CI gate. To\nblock secrets before they're committed, wire `make scan-staged` into a git\npre-commit hook (see below).\n\n#### Pre-commit hook\n\nThe robust, shareable option is a tracked git hook activated via\n`core.hooksPath` — it guards **every** commit, no matter who makes it or which\ntool they use:\n\n```bash\ngit config core.hooksPath .githooks   # one-time, per clone\n# .githooks/pre-commit runs `make scan-staged` and aborts the commit on a leak\n```\n\n## Deploy as a shared server\n\n```bash\n# context is the repo root; weights in model/ are copied in the image\ndocker build -t nopii -f docker/Dockerfile .   \ndocker run -p 8788:8788 nopii\n```\n\nTeammates set `ANTHROPIC_BASE_URL=https://your-host:8788`.\n\n**Is multi-user on one endpoint actually feasible?** Yes — but **only in\n`passthrough` mode**, and with caveats. 
It works because nopii keeps **no per-user\nstate**: each request carries its own Anthropic API key (forwarded untouched, never\nstored), and the redaction mapping is **request-scoped and deterministic**, so there\nis no cross-user state to leak or collide — concurrent users and multiple replicas\nare fine with no shared store. What you must accept before sharing the endpoint:\n\n- **The host sees every user's raw prompt in memory** (that's the whole point of the\n  proxy). The operator — and anything that can read the process — sees unredacted PII\n  for *all* users. Run it behind TLS, restrict network access, and treat the host as\n  sensitive. If `DEBUG` is on, its masked token→value logs span all users; never\n  enable it on a shared host.\n- **nopii has no authentication of its own.** Anyone who can reach the port can send\n  prompts through it (billed to whatever key they supply). Put it behind your own\n  network controls / a gateway — nopii won't gate access for you.\n- **No per-user isolation or rate limiting.** One user can't see another's mapping\n  (request-scoped), but there's no quota, tenancy, or audit boundary between them.\n\nDo **not** deploy `AUTH_MODE=oauth` as a shared server — it would bill every request\nto one subscription and expose that account's tokens; oauth mode is meant for a\nsingle local user.\n\n## Limitations \u0026 trade-offs\n\n- **Detection is not perfect.** GLiNER + regex catch common PII; domain-specific\n  identifiers and implied PII can slip through. Tune the threshold, add regex\n  patterns, and grow the fixture set. Review only sanitized samples.\n- **Latency.** Each redacted request runs local NER inference (from a few milliseconds\n  up to seconds for long prompts). The model is warmed at startup to avoid cold-start spikes.\n- **Auth.** Two modes via `AUTH_MODE`: `passthrough` (API key, forwarded untouched) and\n  `oauth` (your Pro/Max subscription, tokens held and refreshed by nopii — see *Authentication*\n  above). 
Both are verified end-to-end for the **`claude` CLI**; Claude Desktop, the VS\n  Code extension, and the claude.ai web app can't be routed through the proxy.\n- **Anthropic API only.** nopii is built for the Anthropic `/v1/messages` request shape\n  with bearer / `x-api-key` auth. Pointing `ANTHROPIC_UPSTREAM_URL` at a non-Anthropic\n  gateway such as **AWS Bedrock** (or Vertex) is **not a validated path** — their request\n  shape and signing (Bedrock uses SigV4, not a forwarded key) differ. And where such a\n  gateway authenticates with long-lived **cloud credentials**, those carry the same\n  *reaches-any-endpoint* PII surface as Option B's API key (the credential isn't what\n  protects your data — redaction is).\n- **Fail-closed by default.** If detection errors, the request is blocked so PII\n  cannot leak. Flip `FAIL_OPEN=true` only if availability matters more than privacy.\n- **Mapping is in-process and request-scoped.** No PII is persisted. In a\n  multi-replica deployment each request is self-contained (deterministic tokens),\n  so no shared store is required.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Folafrv%2Fnopii","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Folafrv%2Fnopii","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Folafrv%2Fnopii/lists"}