https://github.com/aragossa/pii-shield
Zero-code K8s sidecar for log sanitization. Detects secrets via Entropy Analysis, preserves JSON integrity, and redacts PII deterministically. 🛡️
devsecops entropy gdpr golang json kubernetes log-sanitization log-sanitizer logging pii-redaction security sidecar soc2
- Host: GitHub
- URL: https://github.com/aragossa/pii-shield
- Owner: aragossa
- License: apache-2.0
- Created: 2026-01-30T22:51:32.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2026-04-08T10:48:14.000Z (2 months ago)
- Last Synced: 2026-04-08T12:24:46.280Z (2 months ago)
- Topics: devsecops, entropy, gdpr, golang, json, kubernetes, log-sanitization, log-sanitizer, logging, pii-redaction, security, sidecar, soc2
- Language: Go
- Homepage: https://pii-shield.com/
- Size: 171 KB
- Stars: 50
- Watchers: 0
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-go - pii-shield - Zero-code log sanitization sidecar for Kubernetes that redacts PII from logs. (Security / HTTP Clients)
- fucking-awesome-go - pii-shield - Zero-code log sanitization sidecar for Kubernetes that redacts PII from logs. (Security / HTTP Clients)
- awesome-go-with-stars - pii-shield - Zero-code log sanitization sidecar for Kubernetes that redacts PII from logs. | 2026-03-16 | (Security / HTTP Clients)
README
# PII-Shield 🛡️
**Zero-code log sanitization sidecar for Kubernetes.**
Prevents data leaks (GDPR/SOC2) by redacting PII from logs *before* they leave the pod.
"Don't let PII poison your AI models." PII-Shield ensures that sensitive data never reaches your training dataset, saving you from GDPR-forced model retraining.
## Why PII-Shield?
Developers often forget to mask sensitive data. Traditional regex filters in Fluentd/Logstash are slow, hard to maintain, and consume expensive CPU on log aggregators.
**PII-Shield sits right next to your app container:**
- **Production Ready:** Optimized for Kubernetes sidecars with **ultra-low memory allocations** (zero-GC overhead on hot paths) and regex matching with predictable, linear-time cost.
- **Context-Aware Entropy Analysis:** Detects high-entropy secrets even without keys (e.g., `Error: ... 44saCk9...`) by analyzing context keywords.
- **Custom Regex Rules:** Deterministic redaction for structured data (UUIDs, IDs) that overrides entropy checks, ensuring 100% compliance for known patterns.
- **100% Accuracy on Stress Tests:** Verified against the project's "Wild" stress tests, which include binary garbage, JSON nesting, and multilingual logs.
- **Deterministic Hashing:** Replaces secrets with unique hashes (e.g., `[HIDDEN:a1b2c]`), allowing QA to correlate errors without seeing the raw data.
- **Drop-in:** No code changes required. Works with any language (Node, Python, Java, Go).
- **Whitelist Support:** Explicitly allow safe patterns (e.g., git hashes, system IDs) using `PII_SAFE_REGEX_LIST` to prevent false positives.
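To make the deterministic hashing idea concrete, here is a minimal sketch of how a salted, stable placeholder could be derived. The Configuration section describes `PII_SALT` as an HMAC salt; everything else here (HMAC-SHA256, hex encoding, the six-character tag, and the function name `redactToken`) is an assumption for illustration and may differ from what PII-Shield actually does.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// redactToken returns a stable placeholder for secret. The same salt and
// secret always produce the same placeholder, so repeated occurrences of a
// value can be correlated without revealing it.
// Assumptions: HMAC-SHA256 and a 6-character hex tag (not confirmed by the project).
func redactToken(salt, secret string) string {
	mac := hmac.New(sha256.New, []byte(salt))
	mac.Write([]byte(secret))
	sum := hex.EncodeToString(mac.Sum(nil))
	return "[HIDDEN:" + sum[:6] + "]"
}

func main() {
	salt := "example-salt" // hypothetical value; in production this would come from PII_SALT

	first := redactToken(salt, "MySecretPass123!")
	second := redactToken(salt, "MySecretPass123!")

	fmt.Println(first)
	fmt.Println(first == second) // true: identical input yields an identical placeholder
}
```

Keying the hash with a secret salt matters because an unsalted hash of a low-entropy value (a short password, a phone number) could be reversed by brute force. Truncating the tag keeps logs readable at the cost of a small collision probability.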
## Trusted By
**GuardSpine** (AI Governance Kernel) integrated PII-Shield's **In-Process WASM** to sanitize sensitive evidence trails directly within their Node.js and Python agents.
> We chose the WASM architecture to ensure **zero network overhead** and **<1ms latency**. PII-Shield runs directly in-process, preserving the referential integrity of our hash chains while keeping logs compliant.
## Performance Considerations
While PII-Shield is highly optimized, deep inspection of complex logs requires careful attention to configuration.
- **Text Logs:** Extremely fast (>100k lines/s).
- **JSON Logs:** Zero-allocation parsing (no `encoding/json` overhead). The scanner manually parses JSON structures to ensure high throughput (~7MB/s) without memory spikes.
- **Recommendation:** Safe to use on high-throughput workloads. Recursion safeguards prevent stack overflows on deeply nested JSON.
## Installation
### Helm Chart (Recommended for Kubernetes)
A complete demonstration sidecar pipeline is available via the official Helm repository:
```bash
helm repo add pii-shield https://aragossa.github.io/pii-shield/
helm install my-scanner pii-shield/pii-shield
```
This deploys PII-Shield configured as a live log-redaction pipe with an ultra-lightweight footprint (30Mi Memory / 50m CPU).
### Docker
Get the latest lightweight image from Docker Hub:
```bash
docker pull thelisdeep/pii-shield:latest
```
### Build from Source
You can build the binary directly from the source code:
```bash
go build -o pii-shield ./cmd/cleaner/main.go
```
## Configuration
See [CONFIGURATION.md](CONFIGURATION.md) for a full list of environment variables, including:
- `PII_SALT`: Custom HMAC salt (Required for production).
- `PII_ADAPTIVE_THRESHOLD`: Enable dynamic entropy baselines.
- `PII_DISABLE_BIGRAM_CHECK`: Optimize for non-English logs.
- `PII_CUSTOM_REGEX_LIST`: Custom regex rules for deterministic redaction.
- `PII_SAFE_REGEX_LIST`: Whitelist regex rules to ignore (matches are returned as-is).
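The interaction between the safe list, custom rules, and entropy analysis can be sketched as a simple precedence check. The README states that custom rules override entropy checks and that safe-list matches are returned as-is; the relative order of safe and custom rules, the per-token granularity, and all names below (`classify`, `safePatterns`, `customPatterns`) are assumptions, and the format of the environment variables is not modeled here.

```go
package main

import (
	"fmt"
	"regexp"
)

type decision string

const (
	keep         decision = "keep"          // return the token unchanged
	redact       decision = "redact"        // always replace the token
	checkEntropy decision = "check-entropy" // fall through to entropy analysis
)

// Hypothetical example patterns, not PII-Shield defaults.
var (
	// Full 40-character git commit hashes are considered safe.
	safePatterns = []*regexp.Regexp{regexp.MustCompile(`^[0-9a-f]{40}$`)}
	// Lowercase UUIDs are always redacted.
	customPatterns = []*regexp.Regexp{
		regexp.MustCompile(`^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$`),
	}
)

// classify decides what to do with a single token.
// Assumed order: safe patterns win, then custom patterns, then entropy.
func classify(token string, safe, custom []*regexp.Regexp) decision {
	for _, re := range safe {
		if re.MatchString(token) {
			return keep
		}
	}
	for _, re := range custom {
		if re.MatchString(token) {
			return redact
		}
	}
	return checkEntropy
}

func main() {
	tokens := []string{
		"da39a3ee5e6b4b0d3255bfef95601890afd80709", // git-style hash
		"123e4567-e89b-12d3-a456-426614174000",     // UUID
		"hello",
	}
	for _, t := range tokens {
		fmt.Println(classify(t, safePatterns, customPatterns))
	}
	// Prints: keep, redact, check-entropy (one per line)
}
```

See CONFIGURATION.md for the actual rule syntax and evaluation order.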
### Entropy Sensitivity Table (Default Threshold: 3.6)
| Entropy | Data Type | Example |
|---------|-----------|---------|
| **0.0 - 3.0** | Common words, repeats | `password`, `admin`, `111111` |
| **3.0 - 3.6** | CamelCase, partial hashes | `ProgramCampaignInstanceJob`, `8f3a11b2c` |
| **3.6 - 4.5** | Paths, UUIDs, Weak Passwords | `/opt/application/runtime`, `P@ssw0rd2026!` |
| **4.5 - 5.0** | Medium Tokens | `E8s9d_2kL1` |
| **5.0+** | High Entropy Keys | (SHA-256, API Keys) |
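For intuition about what an entropy score measures, the sketch below computes plain per-character Shannon entropy in bits. This is only the textbook formula, not PII-Shield's scoring: the table's values evidently include further adjustments, since a 10-character string of distinct characters such as `E8s9d_2kL1` cannot exceed log2(10) ≈ 3.32 under the raw formula. The function name `shannonEntropy` is hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// shannonEntropy returns the Shannon entropy of s in bits per character,
// computed from the relative frequency of each rune in the string.
func shannonEntropy(s string) float64 {
	counts := make(map[rune]int)
	total := 0
	for _, r := range s {
		counts[r]++
		total++
	}
	if total == 0 {
		return 0
	}
	entropy := 0.0
	for _, c := range counts {
		p := float64(c) / float64(total)
		entropy -= p * math.Log2(p)
	}
	return entropy
}

func main() {
	for _, s := range []string{"111111", "password", "admin"} {
		fmt.Printf("%-10s %.2f\n", s, shannonEntropy(s))
	}
	// Prints:
	// 111111     0.00
	// password   2.75
	// admin      2.32
}
```

Repeated characters pull the score down, which is why common words and repeats land in the lowest band, while random tokens with many distinct characters score higher.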
## Quick Start
1. Test Locally (CLI)
You can pipe any log output through PII-Shield to see it in action immediately:
```bash
# Emulate a log with a sensitive password
echo "Error: User password=MySecretPass123! failed login" | docker run -i --rm thelisdeep/pii-shield:latest
# Output: Error: User password=[HIDDEN:8f3a11] failed login
```
2. Kubernetes (Sidecar Pattern)
To use PII-Shield as a pipe wrapper for your application, use an `initContainer` to copy the binary into a shared volume.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: secure-app
spec:
  volumes:
    - name: bin-dir
      emptyDir: {}
  # 1. Copy the PII-Shield binary to a shared volume
  initContainers:
    - name: install-shield
      image: thelisdeep/pii-shield:latest
      command: ["cp", "/bin/pii-shield", "/opt/bin/pii-shield"]
      volumeMounts:
        - name: bin-dir
          mountPath: /opt/bin
  # 2. Run your app and pipe logs through PII-Shield
  containers:
    - name: my-app
      image: my-app:1.0
      command: ["/bin/sh", "-c"]
      # Pipe stderr/stdout through the sanitizer
      args: ["./start-app.sh 2>&1 | /opt/bin/pii-shield"]
      volumeMounts:
        - name: bin-dir
          mountPath: /opt/bin
```
```
## Managing PII-Shield across dozens of clusters?
We are building a hosted Control Plane with centralized rule management, Slack alerting, and redaction analytics.
https://tally.so/r/PdY7Ze
## Verification
This project is verified with a comprehensive suite:
1. **Unit Tests**: Cover edge cases, multilingual support, and JSON integrity.
2. **Fuzzing**: Native Go fuzzing ensures crash safety against invalid inputs.
3. **Smoke Testing**: `./run_smoke.sh` validates 100% detection accuracy on mixed workloads.
## License
Distributed under the Apache 2.0 License. See `LICENSE` for more information.