{"id":43918179,"url":"https://github.com/aragossa/pii-shield","last_synced_at":"2026-04-18T20:07:13.208Z","repository":{"id":336117576,"uuid":"1146314565","full_name":"aragossa/pii-shield","owner":"aragossa","description":"Zero-code K8s sidecar for log sanitization. Detects secrets via Entropy Analysis, preserves JSON integrity, and redacts PII deterministically. 🛡️","archived":false,"fork":false,"pushed_at":"2026-04-08T10:48:14.000Z","size":175,"stargazers_count":50,"open_issues_count":0,"forks_count":4,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-08T12:24:46.280Z","etag":null,"topics":["devsecops","entropy","gdpr","golang","json","kubernetes","log-sanitization","log-sanitizer","logging","pii-redaction","security","sidecar","soc2"],"latest_commit_sha":null,"homepage":"https://pii-shield.com/","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aragossa.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-01-30T22:51:32.000Z","updated_at":"2026-04-08T10:48:19.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/aragossa/pii-shield","commit_stats":null,"previous_names":["aragossa/pii-shield"],"tags_count":22,"template":false,"template_full_name":null,"purl":"pkg:github/aragossa/pii-shield","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aragossa%2Fpii-shield","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposi
tories/aragossa%2Fpii-shield/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aragossa%2Fpii-shield/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aragossa%2Fpii-shield/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aragossa","download_url":"https://codeload.github.com/aragossa/pii-shield/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aragossa%2Fpii-shield/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31982756,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-18T17:30:12.329Z","status":"ssl_error","status_checked_at":"2026-04-18T17:29:59.069Z","response_time":103,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["devsecops","entropy","gdpr","golang","json","kubernetes","log-sanitization","log-sanitizer","logging","pii-redaction","security","sidecar","soc2"],"created_at":"2026-02-06T22:03:04.917Z","updated_at":"2026-04-18T20:07:13.200Z","avatar_url":"https://github.com/aragossa.png","language":"Go","funding_links":[],"categories":["Security"],"sub_categories":["HTTP Clients"],"readme":"# PII-Shield 🛡️\n\n**Zero-code log sanitization sidecar for Kubernetes.**\nPrevents data leaks (GDPR/SOC2) by redacting PII from logs *before* they leave the 
pod.\n\n![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)\n![Docker Pulls](https://img.shields.io/docker/pulls/thelisdeep/pii-shield)\n![Go Report Card](https://goreportcard.com/badge/github.com/aragossa/pii-shield?v=1)\n![Go Reference](https://pkg.go.dev/badge/github.com/aragossa/pii-shield.svg)\n![Build Status](https://github.com/aragossa/pii-shield/actions/workflows/test.yml/badge.svg)\n![Coverage Status](https://codecov.io/gh/aragossa/pii-shield/branch/main/graph/badge.svg)\n![GitHub release (latest SemVer)](https://img.shields.io/github/v/release/aragossa/pii-shield?sort=semver)\n[![Artifact Hub](https://img.shields.io/endpoint?url=https://artifacthub.io/badge/repository/pii-shield)](https://artifacthub.io/packages/search?repo=pii-shield)\n\n\"Don't let PII poison your AI models.\" PII-Shield ensures that sensitive data never reaches your training dataset, saving you from GDPR-forced model retraining.\n\n## Why PII-Shield?\n\nDevelopers often forget to mask sensitive data. Traditional regex filters in Fluentd/Logstash are slow, hard to maintain, and consume expensive CPU on log aggregators.\n\n**PII-Shield sits right next to your app container:**\n- **Production Ready:** Optimized for Kubernetes sidecars with **ultra-low memory allocations** (zero-GC overhead on hot paths) and deterministic O(1) regex matching.\n- **Context-Aware Entropy Analysis:** Detects high-entropy secrets even without keys (e.g. `Error: ... 44saCk9...`) by analyzing context keywords.\n- **Custom Regex Rules:** Deterministic redaction for structured data (UUIDs, IDs) that overrides entropy checks, ensuring 100% compliance for known patterns.\n- **100% Accuracy:** Verified against \"Wild\" stress tests including binary garbage, JSON nesting, and multilingual logs.\n- **Deterministic Hashing:** Replaces secrets with unique hashes (e.g., `[HIDDEN:a1b2c]`), allowing QA to correlate errors without seeing the raw data.\n- **Drop-in:** No code changes required. 
Works with any language (Node, Python, Java, Go).\n- **Whitelist Support:** Explicitly allow safe patterns (e.g., git hashes, system IDs) using `PII_SAFE_REGEX_LIST` to prevent false positives.\n\n## Trusted By\n\n**GuardSpine** (AI Governance Kernel) integrated PII-Shield's **In-Process WASM** to sanitize sensitive evidence trails directly within their Node.js and Python agents.\n\n\u003e We chose the WASM architecture to ensure **zero network overhead** and **\u003c1ms latency**. PII-Shield runs directly in-process, preserving the referential integrity of our hash chains while keeping logs compliant.\n\n## Performance Considerations\n\nWhile PII-Shield is highly optimized, deep inspection of complex logs requires careful attention to configuration.\n- **Text Logs:** Extremely fast (\u003e100k lines/s).\n- **JSON Logs:** Zero-allocation parsing (no `encoding/json` overhead). The scanner manually parses JSON structures to ensure high throughput (~7MB/s) without memory spikes.\n- **Recommendation:** Usage is safe for high throughput. 
We use recursion safeguards to prevent stack overflows on deeply nested JSON.\n\n## Installation\n\n### Helm Chart (Recommended for Kubernetes)\nA complete demonstrational sidecar pipeline is available via our official Helm repository:\n\n```bash\nhelm repo add pii-shield https://aragossa.github.io/pii-shield/\nhelm install my-scanner pii-shield/pii-shield\n```\nThis deploys PII-Shield configured as a live log-redaction pipe with an ultra-lightweight footprint (30Mi Memory / 50m CPU).\n\n### Docker\nGet the latest lightweight image from Docker Hub:\n```bash\ndocker pull thelisdeep/pii-shield:latest\n```\n\n### Build from Source\n\nYou can build the binary directly from the source code:\n\n```bash\ngo build -o pii-shield ./cmd/cleaner/main.go\n```\n\n## Configuration\nSee [CONFIGURATION.md](CONFIGURATION.md) for a full list of environment variables, including:\n- `PII_SALT`: Custom HMAC salt (Required for production).\n- `PII_ADAPTIVE_THRESHOLD`: Enable dynamic entropy baselines.\n- `PII_DISABLE_BIGRAM_CHECK`: Optimize for non-English logs.\n- `PII_CUSTOM_REGEX_LIST`: Custom regex rules for deterministic redaction.\n- `PII_SAFE_REGEX_LIST`: Whitelist regex rules to ignore (matches are returned as-is).\n\n### Entropy Sensitivity Table (Default Threshold: 3.6)\n\n| Entropy | Data Type | Example |\n|---------|-----------|---------|\n| **0.0 - 3.0** | Common words, repeats | `password`, `admin`, `111111` |\n| **3.0 - 3.6** | CamelCase, partial hashes | `ProgramCampaignInstanceJob`, `8f3a11b2c` |\n| **3.6 - 4.5** | Paths, UUIDs, Weak Passwords | `/opt/application/runtime`, `P@ssw0rd2026!` |\n| **4.5 - 5.0** | Medium Tokens | `E8s9d_2kL1` |\n| **5.0+** | High Entropy Keys | (SHA-256, API Keys) |\n\n## Quick Start\n1. Test Locally (CLI)\nYou can pipe any log output through PII-Shield to see it in action immediately:\n\n```bash\n# Emulate a log with a sensitive password\necho \"Error: User password=MySecretPass123! 
failed login\" | docker run -i --rm thelisdeep/pii-shield:latest\n\n# Output: Error: User password=[HIDDEN:8f3a11] failed login\n```\n\n2. Kubernetes (Sidecar Pattern)\nTo use PII-Shield as a pipe wrapper for your application, use an `initContainer` to copy the binary into a shared volume.\n\n```yaml\napiVersion: v1\nkind: Pod\nmetadata:\n  name: secure-app\nspec:\n  volumes:\n  - name: bin-dir\n    emptyDir: {}\n  \n  # 1. Copy the PII-Shield binary to a shared volume\n  initContainers:\n  - name: install-shield\n    image: thelisdeep/pii-shield:latest\n    command: [\"cp\", \"/bin/pii-shield\", \"/opt/bin/pii-shield\"]\n    volumeMounts:\n    - name: bin-dir\n      mountPath: /opt/bin\n\n  # 2. Run your app and pipe logs through PII-Shield\n  containers:\n  - name: my-app\n    image: my-app:1.0\n    command: [\"/bin/sh\", \"-c\"]\n    # Pipe stderr/stdout through the sanitizer\n    args: [\"./start-app.sh 2\u003e\u00261 | /opt/bin/pii-shield\"] \n    volumeMounts:\n    - name: bin-dir\n      mountPath: /opt/bin\n```\n\n## Managing PII-Shield across dozens of clusters?\nWe are building a hosted Control Plane with centralized rule management, Slack alerting, and redaction analytics. \n\n[![Join the Waitlist](https://img.shields.io/badge/Join_the_Waitlist-PII--Shield_Cloud-blue?style=for-the-badge)](https://tally.so/r/PdY7Ze)\n\n## Verification\nThis project is verified with a comprehensive suite:\n1. **Unit Tests**: Cover edge cases, multilingual support, and JSON integrity.\n2. **Fuzzing**: Native Go fuzzing ensures crash safety against invalid inputs.\n3. **Smoke Testing**: `./run_smoke.sh` validates 100% detection accuracy on mixed workloads.\n\n## License\nDistributed under the Apache 2.0 License. 
See `LICENSE` for more information.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faragossa%2Fpii-shield","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faragossa%2Fpii-shield","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faragossa%2Fpii-shield/lists"}