{"id":51168136,"url":"https://github.com/packyme/privacy-filter","last_synced_at":"2026-06-28T02:00:47.216Z","repository":{"id":363501677,"uuid":"1253048130","full_name":"packyme/privacy-filter","owner":"packyme","description":"LLM privacy gateway in Go — millisecond-latency PII and secret redaction. Used in production by PackyCode.","archived":false,"fork":false,"pushed_at":"2026-06-09T06:06:51.000Z","size":76,"stargazers_count":171,"open_issues_count":1,"forks_count":18,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-09T08:11:32.894Z","etag":null,"topics":["ai-gateway","data-redaction","gitleaks","golang","grpc","http-server","llm-security","middleware","pii-detection","privacy","redact","secrets-detection"],"latest_commit_sha":null,"homepage":"https://www.packyapi.com","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/packyme.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-29T05:34:55.000Z","updated_at":"2026-06-09T07:42:10.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/packyme/privacy-filter","commit_stats":null,"previous_names":["packyme/privacy-filter"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/packyme/privacy-filter","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/packyme%2Fprivacy-filter","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/packyme%2Fpr
ivacy-filter/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/packyme%2Fprivacy-filter/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/packyme%2Fprivacy-filter/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/packyme","download_url":"https://codeload.github.com/packyme/privacy-filter/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/packyme%2Fprivacy-filter/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34874557,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-28T02:00:05.809Z","response_time":54,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-gateway","data-redaction","gitleaks","golang","grpc","http-server","llm-security","middleware","pii-detection","privacy","redact","secrets-detection"],"created_at":"2026-06-26T22:00:19.668Z","updated_at":"2026-06-28T02:00:47.208Z","avatar_url":"https://github.com/packyme.png","language":"Go","funding_links":[],"categories":["LLMOps","Go","*Ops for AI"],"sub_categories":["LLM Gateways \u0026 Proxies","LLMOps"],"readme":"# Privacy Filter (Go)\n\n**English** | [简体中文](README.zh-CN.md)\n\nStrip sensitive user data (PII / secrets) from text before it reaches an LLM.\nPure Go, no model, no GPU, no CGO — a single static binary, 
millisecond latency on text of any length.\n\n🌐 Running in production at [PackyCode](https://www.packyapi.com) — the privacy-compliance component of an API relay service.\n\n---\n\n## Three ways to use it\n\n1. **Core package**: `import \"privacyfilter/filter\"` straight into your gateway — redaction is one function call, no HTTP hop.\n2. **HTTP service**: `cmd/http`, REST API.\n3. **gRPC service**: `cmd/grpc`, interface in `proto/filter.proto`.\n\nThe latter two are thin wrappers around the `filter` core package.\n\n---\n\n## Two detection layers\n\n| Layer | Covers | Technique |\n|---|---|---|\n| Structured PII | Email, phone, national ID, bank card (Luhn-checked), IP | Regex |\n| Secrets / credentials | API keys, tokens, private keys, passwords written in prose, unknown high-entropy strings | gitleaks ruleset (keyword pre-filter) + contextual regex + Shannon-entropy fallback |\n\nEach layer emits `(start, end, placeholder)` spans → spans are merged and de-overlapped → the text is rebuilt in a single pass.\nPlaceholders are typed and carry the entity kind — `[邮箱]` (email), `[电话]` (phone), `[身份证]` (national ID), `[银行卡]` (bank card), `[IP]`, `[密钥]` (secret) — and are irreversible (no un-redaction).\n\n\u003e No person / place / organization name recognition — that needs an NER model, which costs seconds of CPU time on long text and was removed per requirements.\n\u003e High-risk identity data (national ID, bank card, secrets, etc.) 
is fully covered by regex.\n\n---\n\n## Layout\n\n```\nprivacy-filter/\n├── go.mod / go.sum\n├── filter/                  core package (import directly from a gateway)\n│   ├── filter.go            Filter / New / Redact\n│   ├── pii.go               structured PII\n│   ├── secrets.go           gitleaks + context + entropy\n│   └── filter_test.go\n├── cmd/\n│   ├── http/main.go         HTTP service\n│   └── grpc/main.go         gRPC service\n├── proto/filter.proto       gRPC interface definition\n├── gen/filterpb/            protoc-generated code\n├── rules/gitleaks.toml      gitleaks ruleset\n├── scripts/fetch_rules.sh   ruleset update script\n└── Dockerfile\n```\n\n---\n\n## Build\n\n```bash\ngo build -o bin/server-http ./cmd/http\ngo build -o bin/server-grpc ./cmd/grpc\ngo test ./...                          # run all tests\n```\n\n---\n\n## Usage\n\n### 1. Core package (recommended for gateways)\n\n```go\nimport \"privacyfilter/filter\"\n\n// Create once at startup; concurrency-safe, reuse globally.\nf, err := filter.New(\"rules/gitleaks.toml\")   // pass \"\" to use the built-in fallback rules\n\n// Per request\nres := f.Redact(userPrompt)\nforwardToLLM(res.Redacted)                    // forward the redacted text to the LLM\n```\n\n`filter.Result`: `Redacted` (redacted text), `Hit`, `Count`, `Entities` (hit details,\nincluding type and byte offsets).\n\n\u003e To consume this package from your own gateway module: put it in the same monorepo, or add\n\u003e `replace privacyfilter =\u003e ../privacy-filter` to the gateway's go.mod. The `filter` package\n\u003e depends only on `BurntSushi/toml`.\n\n### 2. 
HTTP service\n\n```bash\n./bin/server-http                    # default :8088\n```\n\n```bash\ncurl http://127.0.0.1:8088/health\ncurl -X POST http://127.0.0.1:8088/redact -H 'Content-Type: application/json' \\\n  -d '{\"text\":\"我的邮箱是 a@b.com，密码是 Hunter2xy\"}'\n# {\"redacted\":\"我的邮箱是 [邮箱]，密码是 [密钥]\",\"hit\":true,\"count\":2,\"entities\":[...],\"elapsed_ms\":0.08}\n```\n\nEndpoints: `GET /health`, `POST /redact`, `POST /redact/batch` (`{\"texts\":[...]}`).\n\n### 3. gRPC service\n\n```bash\n./bin/server-grpc                    # default :8089\n```\n\nService `filter.v1.PrivacyFilter`, methods `Redact` / `RedactBatch`, defined in `proto/filter.proto`.\nGenerate a client from that proto on the gateway side. To regenerate the code in this repo:\n\n```bash\nprotoc -I. --go_out=. --go_opt=module=privacyfilter \\\n       --go-grpc_out=. --go-grpc_opt=module=privacyfilter proto/filter.proto\n```\n\n---\n\n## Configuration (environment variables)\n\n| Variable | Default | Description |\n|---|---|---|\n| `PF_PORT` | `8088` | HTTP listen port |\n| `PF_GRPC_PORT` | `8089` | gRPC listen port |\n| `PF_GITLEAKS_TOML` | `rules/gitleaks.toml` | path to the gitleaks rules file |\n\n---\n\n## Performance (local benchmark, synthetic high-density PII text — worst case)\n\n| Text length | Latency |\n|---|---|\n| ~50 B | ~0.01ms |\n| ~2 KB | ~0.46ms |\n| ~32 KB | ~9ms |\n\nBoth layers are O(n). 
Real prompts (PII is never this dense) are faster.\n\n---\n\n## Integration notes (gateway side)\n\n- With the core-package import there is no HTTP/gRPC hop, hence no timeout and no fail-open/closed concerns.\n- If you use the HTTP/gRPC service: set a 150–300ms timeout; on failure, prefer fail-closed (reject the request rather than forwarding the raw text).\n\n---\n\n## Notes\n\n- All **222 gitleaks rules compile natively** in Go (Go's `regexp` is RE2, the same engine gitleaks uses;\n  an earlier Python port lost 26 rules to RE2-incompatible syntax).\n- Go `regexp` runs in linear time — no catastrophic backtracking (ReDoS) risk.\n- Go's `regexp` (RE2) does not support look-around assertions, so digit boundaries for phone / national ID etc. are\n  enforced by manual post-match validation.\n- No person / place / organization recognition. If added later, prefer rules (a Chinese-address regex is\n  feasible; names are better anchored by context).\n- The entropy fallback can mis-flag git SHAs, long base64 strings, etc. — tune the threshold or add an allowlist.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpackyme%2Fprivacy-filter","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpackyme%2Fprivacy-filter","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpackyme%2Fprivacy-filter/lists"}