https://github.com/prithvi1236/redactor
Redacts sensitive data from the clipboard before you paste into ChatGPT, Claude, or any external AI. 100% local: no secrets leave your machine.
desktop-app local-ai ner privacy python security
- Host: GitHub
- URL: https://github.com/prithvi1236/redactor
- Owner: prithvi1236
- License: mit
- Created: 2026-05-27T06:14:17.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2026-05-27T06:42:22.000Z (about 1 month ago)
- Last Synced: 2026-05-27T08:22:59.663Z (about 1 month ago)
- Topics: desktop-app, local-ai, ner, privacy, python, security
- Language: Python
- Homepage:
- Size: 12.7 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# ClipRedact
ClipRedact is a local clipboard privacy utility that redacts sensitive data before you paste into external AI tools, and restores the originals after you get a response — entirely on your machine.



## Why This Exists
Most sensitive data leaks don't happen through hacks. They happen through convenience — copying a log, a config snippet, or a work note into ChatGPT without checking what's in it.
API keys, client names, internal metrics, and credentials routinely end up in prompts sent to servers you don't control. ClipRedact intercepts that moment: press a hotkey before you paste, and only the redacted version leaves your clipboard. The model on the other end never sees the original.
No data is sent to any server. No cloud service is involved in redaction. The NER model runs locally on your CPU or GPU.
## How It Works
Redaction runs as a two-layer pipeline. Structural secrets are caught instantly by regex. Contextual entities like names and organisations are caught by a local NER model, but only when context signals they are sensitive — so public figures and well-known companies are left alone.
```text
┌─────────────────────┐
│   Clipboard Input   │
└─────────┬───────────┘
          │
          v
┌─────────────────────┐
│  Layer 1: Regex     │  API keys, JWTs, emails, phones, IPs,
│  (instant)          │  password key=value pairs, conn strings
└─────────┬───────────┘
          │
          v
┌─────────────────────┐
│  Layer 2: Local NER │  PERSON / ORG with trigger-word gating
│  (dslim/bert-base)  │  + public-org allowlist
└─────────┬───────────┘
          │
          v
┌─────────────────────┐
│  Clipboard Output   │  Safe-to-paste redacted text
└─────────┬───────────┘
          │
          v
┌─────────────────────┐
│  Restore Hotkey     │  Swap placeholders back to originals
└─────────────────────┘
```
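As a rough illustration of Layer 1, the sketch below replaces regex matches with typed, numbered placeholders and reuses the same placeholder when a value repeats. The two patterns and the function name `redact_structural` are assumptions made for this example; the actual rules in `redactor.py` may be broader or stricter.

```python
import re

# Hypothetical patterns for illustration only; the real rule set may differ.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "API_KEY": re.compile(r"\bsk_(?:live|test)_[A-Za-z0-9]{16,}\b"),
}


def redact_structural(text: str) -> tuple[str, dict[str, str]]:
    """Replace regex matches with numbered placeholders.

    Returns the redacted text and a placeholder -> original mapping.
    """
    mapping: dict[str, str] = {}    # placeholder -> original
    seen: dict[str, str] = {}       # original -> placeholder (deduplication)
    counters: dict[str, int] = {}   # per-label running count

    for label, pattern in PATTERNS.items():
        def substitute(match: re.Match, label: str = label) -> str:
            original = match.group(0)
            if original not in seen:
                counters[label] = counters.get(label, 0) + 1
                placeholder = f"[{label}_{counters[label]}]"
                seen[original] = placeholder
                mapping[placeholder] = original
            return seen[original]

        text = pattern.sub(substitute, text)
    return text, mapping
```

Returning the mapping alongside the text is what makes a later restore step possible without re-scanning the original input.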
## Redaction Examples
| Before (copied text) | After (clipboard content) |
|---|---|
| `Email me at sarah@acme.com and use key sk_live_1234567890abcdef.` | `Email me at [EMAIL_1] and use key [API_KEY_1].` |
| `our CEO Sarah Connor joined Acme Corp this quarter.` | `our CEO ⟪PERSON·a1b2⟫ joined ⟪ORG·3c4d⟫ this quarter.` |
| `postgres://admin:pass@db.com/prod` | `[CONN_STRING_1]` |
| `postgres://user@db.com/analytics` | `postgres://user@db.com/analytics` *(no credentials — unchanged)* |
| `Google released a new model yesterday.` | `Google released a new model yesterday.` *(public org — unchanged)* |
| `tell me about India` | `tell me about India` *(no trigger word — unchanged)* |
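The two connection-string rows differ only in whether a password is embedded. One way to express that check, assuming URL-style strings and using only the standard library, is sketched below; `has_embedded_credentials` is a hypothetical helper name, not necessarily what `redactor.py` uses.

```python
from urllib.parse import urlsplit


def has_embedded_credentials(candidate: str) -> bool:
    """Return True when a URL-style connection string carries a password."""
    try:
        parts = urlsplit(candidate)
    except ValueError:
        # Malformed input (e.g. a broken IPv6 literal) is not treated as a secret here.
        return False
    # user:password@host is sensitive; user@host alone is left untouched.
    return bool(parts.scheme and parts.hostname and parts.password)
```

Under this rule, `postgres://admin:pass@db.com/prod` would be masked, while `postgres://user@db.com/analytics` would pass through unchanged.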
## Quickstart
Install dependencies and start the utility in three steps.
**1. Install Python 3.11 or newer.**
**2. Install dependencies:**
```bash
pip install -r requirements.txt
```
> NER uses **ONNX Runtime** via [Optimum](https://huggingface.co/docs/optimum) (no PyTorch). On first run, `dslim/bert-base-NER` is downloaded and exported to ONNX if needed (~400 MB once, then cached).
**3. Start the redactor:**
```bash
python redactor.py
```
The app runs silently in the background. The terminal prints a redaction map each time the hotkey fires (placeholder → original, local only).
## Redact and Restore Flow
**Redacting before you paste:**
1. Copy text in any app (`Ctrl+C`).
2. Press the **redact hotkey** (`Ctrl+Shift+X`).
3. Paste the redacted version into your AI tool (`Ctrl+V`).
**Restoring after the model responds:**
1. Copy the AI response (`Ctrl+C`).
2. Press the **restore hotkey** (`Ctrl+Shift+Z`).
3. Paste the fully restored text back into your own tools (`Ctrl+V`).
Mappings are held in memory for the duration of the session (see `SESSION_TTL`). Restarting the process clears all mappings.
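The restore step can be pictured as a small in-memory store that forgets entries once they are older than `SESSION_TTL`. The class below is a minimal sketch under that assumption; `MappingStore` and its methods are illustrative names, and the injectable `clock` parameter exists only to make expiry easy to test.

```python
import time

SESSION_TTL = 3600  # seconds; mirrors the documented default


class MappingStore:
    """In-memory placeholder -> original store with time-based expiry."""

    def __init__(self, ttl: float = SESSION_TTL, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._entries: dict[str, tuple[str, float]] = {}

    def add(self, placeholder: str, original: str) -> None:
        """Remember a placeholder together with its creation time."""
        self._entries[placeholder] = (original, self._clock())

    def restore(self, text: str) -> str:
        """Swap unexpired placeholders in `text` back to their originals."""
        now = self._clock()
        # Drop entries older than the TTL before restoring.
        self._entries = {
            placeholder: (original, created)
            for placeholder, (original, created) in self._entries.items()
            if now - created < self._ttl
        }
        for placeholder, (original, _) in self._entries.items():
            text = text.replace(placeholder, original)
        return text
```

Because nothing is written to disk, a new process starts with an empty store, which matches the restart behaviour described above.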
## Hotkey Reference
| Action | Platform | Default Hotkey |
|---|---|---|
| Redact | Windows / Linux | `Ctrl + Shift + X` |
| Redact | macOS | `Cmd + Shift + X` |
| Restore | Windows / Linux | `Ctrl + Shift + Z` |
| Restore | macOS | `Cmd + Shift + Z` |
macOS requires Accessibility permission in System Settings → Privacy & Security → Accessibility.
To change a hotkey, edit the `HOTKEY` or `HOTKEY_RESTORE` constants in `redactor.py`.
## Configuration
All tuneable constants are at the top of `redactor.py`.
| Constant | Default | Description |
|---|---|---|
| `HOTKEY` | `Ctrl+Shift+X` (`Cmd+Shift+X` on macOS) | Global redact shortcut |
| `HOTKEY_RESTORE` | `Ctrl+Shift+Z` (`Cmd+Shift+Z` on macOS) | Global restore shortcut |
| `SESSION_TTL` | `3600` | Seconds before a mapping expires. Mappings are in-memory only — they do not survive a process restart. |
| `NER_WINDOW_SIZE` | `5` | Words scanned before and after an entity for trigger words |
| `NER_MIN_CONFIDENCE` | `0.85` | Minimum NER confidence score to act on an entity |
| `NER_TRIGGER_WORDS` | *(see source)* | Context words that enable PERSON / ORG masking (e.g. `our`, `ceo`, `client`) |
| `NER_PUBLIC_ORG_ALLOWLIST` | *(see source)* | Organisations never masked regardless of context |
| `NER_TOKEN_HEX_BYTES` | `2` | Byte length of random suffix in NER placeholders |
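To show how `NER_TOKEN_HEX_BYTES` relates to placeholders such as `⟪PERSON·a1b2⟫`, here is a minimal sketch: 2 random bytes render as 4 hex characters. The helper name `make_ner_placeholder` and the collision check are assumptions for illustration, not a description of the actual implementation.

```python
import secrets

NER_TOKEN_HEX_BYTES = 2  # 2 bytes -> 4 hex characters, as in ⟪PERSON·a1b2⟫


def make_ner_placeholder(label: str, in_use: set[str]) -> str:
    """Build a placeholder with a random hex suffix that is unique within `in_use`."""
    while True:
        suffix = secrets.token_hex(NER_TOKEN_HEX_BYTES)
        candidate = f"⟪{label}·{suffix}⟫"
        # Two bytes give only 65,536 suffixes, so retry on the rare collision.
        if candidate not in in_use:
            in_use.add(candidate)
            return candidate
```

A random suffix, unlike a running counter, does not reveal how many entities of a given type the original text contained.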
## Running Tests
```bash
python test_redactor.py # full suite — regex + NER (dslim/bert-base-NER)
python test_redactor.py --quick # core subset (~22 layer cases + all state tests)
python test_redactor.py --verbose # also prints redacted output per case
```
The first run downloads and exports the ONNX NER model (~400 MB, cached afterwards); model warm-up then takes ~3–8 s. Requires `optimum`, `onnxruntime`, and `transformers` (see install above).
## Edge Cases
**Masked correctly:**
- `our CEO Sarah Connor` → masked (`our` is a trigger within 5 words of the entity)
- `John Doe, our CTO` → masked (trigger word after the entity — bidirectional window)
- Same entity appearing twice → same placeholder reused (deduplication)
- `postgres://admin:pass@db.com/prod` → masked (credentials present)
**Intentionally not masked:**
- `tell me about India` → not masked (no trigger word near entity)
- `Google released a model` → not masked (public org allowlist)
- `postgres://user@db.com/analytics` → not masked (no embedded credentials)
- `who is the CEO of Microsoft?` → not masked (public org, no possessive trigger)
- Hypothetical phrasing without a named entity (e.g. `what if someone earned $X`)
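The masking decisions listed above can be approximated by a single gating function. The sketch below assumes the NER model has already produced an entity span, label, and confidence score; `should_mask`, the label names, and the small trigger and allowlist sets are illustrative stand-ins for the fuller values in the source.

```python
import re

NER_WINDOW_SIZE = 5
NER_MIN_CONFIDENCE = 0.85
NER_TRIGGER_WORDS = {"our", "ceo", "client"}        # illustrative subset
NER_PUBLIC_ORG_ALLOWLIST = {"google", "microsoft"}  # illustrative subset
MASKED_LABELS = {"PERSON", "ORG"}


def should_mask(text: str, start: int, end: int, label: str, score: float) -> bool:
    """Decide whether the entity at text[start:end] should be redacted."""
    if label not in MASKED_LABELS or score < NER_MIN_CONFIDENCE:
        return False
    if label == "ORG" and text[start:end].lower() in NER_PUBLIC_ORG_ALLOWLIST:
        return False
    # Bidirectional window: the nearest words before and after the entity.
    words_before = re.findall(r"\w+", text[:start].lower())
    words_after = re.findall(r"\w+", text[end:].lower())
    window = (
        words_before[max(0, len(words_before) - NER_WINDOW_SIZE):]
        + words_after[:NER_WINDOW_SIZE]
    )
    return any(word in NER_TRIGGER_WORDS for word in window)
```

Checking the allowlist before the trigger window is what keeps `who is the CEO of Microsoft?` unmasked in this sketch, even though `ceo` appears near the entity.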
## Known Limitations
- Mappings are **in-memory only**. If the process is restarted between redacting and restoring, the mapping is gone and the restore hotkey will report no matching session entries.
- The NER model downloads approximately **400 MB** on first run. Subsequent runs use the local cache.
- NER warm-up takes **3 to 8 seconds** at startup. It runs in a background thread, so the hotkey stays responsive throughout.
- Regex email detection can match **hypothetical email examples** in plain text (e.g. `user@domain.tld`).
- Non-text clipboard payloads (images, files) are ignored.
- NER detection is **English-only**. Names and organisations in other languages may not be caught.
## Using ClipRedact on macOS (built app)
After running `build/macos/build.sh`, distribute only **`dist/ClipRedact.app`**.
| Launch method | Hotkeys work? | Notes |
|---------------|---------------|--------|
| **Double-click `ClipRedact.app`** | After Accessibility is granted | Normal user path |
| **`dist/redactor` from Terminal** | Often works immediately | Inherits **Terminal’s** Accessibility permission — not the same as the `.app` |
| **Tray menu → Redact clipboard now** | Always (no hotkey permission) | Use if hotkeys do nothing |
### First-time setup (required for hotkeys)
1. Open **ClipRedact.app** (menu bar icon appears top-right).
2. **System Settings → Privacy & Security → Accessibility** → enable **ClipRedact** (or `redactor`).
3. Quit and reopen the app if hotkeys still do nothing.
4. Copy text, then press **⌘ + Shift + X** to redact (the macOS default).
NER model cache is stored in `~/Library/Application Support/ClipRedact/hf_cache` (not inside the `.app`).
## Building a Windows `.exe`
PyInstaller must run **on Windows** (cross-compiling a `.exe` from macOS is not supported).
### GitHub Actions (no Windows PC)
Push this repo to GitHub. A workflow in [`.github/workflows/build.yml`](.github/workflows/build.yml) builds on `windows-latest` when you push to `main`, push a `v*` tag, or run it manually.
1. Open your repo on GitHub → **Actions** → **Build Windows exe** → **Run workflow** (or push to `main`).
2. Wait for the job to finish (first run downloads the NER model and may take ~15–30 minutes).
3. Open the completed run → **Artifacts** → download **ClipRedact-windows** (`ClipRedact-win64.zip`).
4. Unzip and run `ClipRedact\ClipRedact.exe` (ship the whole folder, not the `.exe` alone).
### Local build on Windows
**1. Prerequisites:** Windows 10/11, [Python 3.11+](https://www.python.org/downloads/) (check “Add to PATH”).
**2. Build** (Command Prompt or PowerShell from the repo root):
```bat
build\windows\build.bat
```
Or in PowerShell:
```powershell
.\build\windows\build.ps1
```
The script creates `.venv-build`, installs dependencies, pre-downloads the NER model, and runs PyInstaller.
**3. Output:** `dist\ClipRedact\ClipRedact.exe` plus DLLs in the same folder. **Zip the whole `ClipRedact` folder** when sharing — do not ship the `.exe` alone.
**4. First run on another PC:** Double-click `ClipRedact.exe`. It runs in the system tray (no console window). The NER model is cached in `hf_cache` next to the executable (~400 MB once). Internet is required on first run if the model was not pre-downloaded at build time.
**5. Debug build** (show console + errors): edit `clipredact.spec` and set `console=True`, then rebuild.
| Artifact | Approx. size |
|----------|----------------|
| `dist\ClipRedact\` folder (zipped) | ~150–350 MB |
| After first-run `hf_cache\` | +~400 MB |
Manual build:
```bat
python -m venv .venv-build
.venv-build\Scripts\activate
pip install -r requirements-build.txt
pyinstaller clipredact.spec --noconfirm
```
## Roadmap
- Undo last redaction — single hotkey to revert clipboard to pre-redaction state.
- Mapping persistence via OS keychain (macOS Keychain, Windows Credential Manager) for cross-session restore.
- Settings UI for categories, trigger words, and hotkey remapping.
- macOS notarization / code signing for wide distribution.
## Contributing
Contributions are welcome if they improve detection quality, reduce false positives, or enhance developer ergonomics.
1. Fork the repo and create a feature branch.
2. Add or update tests in `test_redactor.py`.
3. Run the full test suite before opening a pull request.
4. Include a clear problem statement and before/after behaviour in the PR description.
## License
MIT License.