{"id":44040649,"url":"https://github.com/solentlabs/har-capture","last_synced_at":"2026-04-02T19:13:45.576Z","repository":{"id":335378583,"uuid":"1145531087","full_name":"solentlabs/har-capture","owner":"solentlabs","description":"Capture and sanitize HAR (HTTP Archive) files with deep PII removal. Perfect for support diagnostics, security reviews, and test fixtures.","archived":false,"fork":false,"pushed_at":"2026-03-29T14:42:07.000Z","size":671,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-29T16:41:59.993Z","etag":null,"topics":["bug-reports","devtools","har","http-archive","pii","playwright","privacy","python","sanitization","security","support-tools","zero-dependencies"],"latest_commit_sha":null,"homepage":"https://solentlabs.io/har-capture","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/solentlabs.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-01-29T22:33:25.000Z","updated_at":"2026-03-29T14:36:41.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/solentlabs/har-capture","commit_stats":null,"previous_names":["solentlabs/har-capture"],"tags_count":21,"template":false,"template_full_name":null,"purl":"pkg:github/solentlabs/har-capture","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/solentlabs%2Fhar-capture","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/solentlabs%2Fhar-capture/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/solentlabs%2Fhar-capture/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/solentlabs%2Fhar-capture/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/solentlabs","download_url":"https://codeload.github.com/solentlabs/har-capture/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/solentlabs%2Fhar-capture/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31314153,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-02T12:59:32.332Z","status":"ssl_error","status_checked_at":"2026-04-02T12:54:48.875Z","response_time":89,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bug-reports","devtools","har","http-archive","pii","playwright","privacy","python","sanitization","security","support-tools","zero-dependencies"],"created_at":"2026-02-07T20:37:30.983Z","updated_at":"2026-04-02T19:13:45.570Z","avatar_url":"https://github.com/solentlabs.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# har-capture\n\n[![PyPI version](https://img.shields.io/pypi/v/har-capture)](https://pypi.org/project/har-capture/)\n[![Downloads](https://img.shields.io/pypi/dm/har-capture)](https://pypi.org/project/har-capture/)\n[![codecov](https://codecov.io/gh/solentlabs/har-capture/branch/main/graph/badge.svg)](https://codecov.io/gh/solentlabs/har-capture)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![AI Assisted](https://img.shields.io/badge/AI-Claude%20Assisted-5A67D8.svg)](https://claude.ai)\n\nCapture and sanitize [HAR (HTTP Archive)](https://w3c.github.io/web-performance/specs/HAR/Overview.html) files with deep PII removal. Perfect for support diagnostics, security reviews, and test fixtures.\n\n## Quick Start\n\n\u003cdetails open\u003e\n\u003csummary\u003e\u003cb\u003eWindows\u003c/b\u003e\u003c/summary\u003e\n\n1. Install Python from the [Microsoft Store](https://apps.microsoft.com/detail/9NRWMJP3717K) or [python.org](https://www.python.org/downloads/)\n1. Open PowerShell and run:\n\n```bash\npip install har-capture[full]\npython -m har_capture https://example.com\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003emacOS / Linux\u003c/b\u003e\u003c/summary\u003e\n\n```bash\npip install har-capture[full]\nhar-capture https://example.com\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eAlready have a HAR file?\u003c/b\u003e\u003c/summary\u003e\n\n```bash\npip install har-capture\nhar-capture sanitize myfile.har\n```\n\n\u003c/details\u003e\n\n______________________________________________________________________\n\n## Why har-capture?\n\nChrome DevTools now sanitizes cookies and auth headers, but HAR files contain **much more sensitive data**: IP addresses, MAC addresses, emails, passwords in form bodies, serial numbers, device names, WiFi credentials, session tokens, and API keys.\n\n**How har-capture compares:**\n\n| Feature                               | har-capture | DevTools | Google/Cloudflare |\n| ------------------------------------- | ----------- | -------- | ----------------- |\n| Deep sanitization (IPs, MACs, emails) | ✅          | ❌       | ❌                |\n| Correlation-preserving hashes         | ✅          | ❌       | ❌                |\n| Interactive review                    | ✅          | ❌       | Varies            |\n| Custom patterns                       | ✅          | ❌       | Limited           |\n| Local + CLI automation                | ✅          | No CLI   | Varies            |\n\n**Key benefits:**\n\n- **Zero dependencies** - Core sanitization uses only Python stdlib\n- **Format-preserving hashes** - Track the same device across requests without exposing real values\n- **One-command workflow** - Capture, sanitize, and compress in a single step\n\n[See detailed comparison with all tools →](docs/COMPARISON.md)\n\n______________________________________________________________________\n\n## See It In Action\n\n**1. Sanitization report** — 84 values auto-redacted across 9 PII categories:\n\n![Sanitization Report](https://raw.githubusercontent.com/solentlabs/har-capture/main/docs/images/sanitization-report.png)\n\n**2. Flagged values for review** — passwords, fields, WiFi SSIDs, and phone numbers detected automatically:\n\n![Flagged Values for Review](https://raw.githubusercontent.com/solentlabs/har-capture/main/docs/images/flagged-values-table.png)\n\n**3. Interactive redaction picker** — high-confidence items pre-selected, you choose the rest:\n\n![Redact Picker](https://raw.githubusercontent.com/solentlabs/har-capture/main/docs/images/redact-picker.png)\n\n______________________________________________________________________\n\n## Installation\n\n```bash\n# Core only (sanitization - zero dependencies)\npip install har-capture\n\n# With browser capture support\npip install har-capture[capture]\nplaywright install chromium\n\n# Full installation (recommended)\npip install har-capture[full]\n```\n\n______________________________________________________________________\n\n## Usage\n\n### Command Line\n\n```bash\n# Capture and sanitize (interactive review always enabled)\nhar-capture https://example.com\n\n# Sanitize existing HAR\nhar-capture sanitize capture.har\n\n# Validate for PII leaks\nhar-capture validate capture.har\n```\n\n[Full CLI reference →](docs/CLI_REFERENCE.md)\n\n### Python API\n\n```python\nfrom har_capture.sanitization import sanitize_html, sanitize_har_file\nfrom har_capture.sanitization.report import HeuristicMode\n\n# Sanitize HTML (correlation-preserving by default)\nclean_html = sanitize_html(raw_html)\n\n# Sanitize with consistent salt (correlate across captures)\nclean_html = sanitize_html(raw_html, salt=\"my-secret-key\")\n\n# Enable heuristic detection for WiFi, SSIDs, device names\nclean_html = sanitize_html(raw_html, heuristics=HeuristicMode.REDACT)\n\n# Sanitize HAR file\nsanitize_har_file(\"capture.har\")  # → capture.sanitized.har\n\n# Custom patterns (e.g., modem serials, customer IDs)\ncustom = {\"patterns\": {\"modem_sn\": {\"regex\": r\"SN[0-9]{10}\", \"replacement_prefix\": \"MODEM\"}}}\nsanitize_har_file(\"capture.har\", custom_patterns=custom)\n```\n\n______________________________________________________________________\n\n## Documentation\n\n- **[Comparison with Other Tools](docs/COMPARISON.md)** - DevTools, Google, Cloudflare, Edgio\n- **[Correlation-Preserving Redaction](docs/CORRELATION.md)** - How format-preserving hashing works\n- **[PII Categories](docs/PII_CATEGORIES.md)** - What gets sanitized\n- **[Custom Patterns](docs/CUSTOM_PATTERNS.md)** - Add organization-specific patterns\n- **[CLI Reference](docs/CLI_REFERENCE.md)** - Detailed command documentation\n- **[Interactive Sanitization](docs/INTERACTIVE_SANITIZATION.md)** - Review edge cases manually\n\n______________________________________________________________________\n\n## Use Cases\n\n- **Support diagnostics** - Users submit sanitized HAR files without exposing credentials\n- **Security review** - Validate HAR files for PII leaks before sharing\n- **Test fixtures** - Generate reproducible traffic captures\n- **Modem debugging** - Capture router/modem traffic with sensitive data removed\n\n______________________________________________________________________\n\n## What Gets Sanitized\n\n| Category        | Examples              | Output                                               |\n| --------------- | --------------------- | ---------------------------------------------------- |\n| **Network**     | IPs, MACs             | `192.168.1.1` → `10.255.42.17`                       |\n| **Personal**    | Emails, phones        | `user@example.com` → `user_a1b2@redacted.invalid`    |\n| **Credentials** | Passwords, tokens     | `password=secret` → `password=PASS_a1b2c3d4`         |\n| **Device**      | Serials, WiFi, SSIDs  | `SN123456` → `SERIAL_a1b2c3d4`                       |\n| **HTTP**        | Auth headers, cookies | `Cookie: session=xyz` → `Cookie: session=TOKEN_a1b2` |\n\n[See complete PII categories list →](docs/PII_CATEGORIES.md)\n\n______________________________________________________________________\n\n## Platform Support\n\n| Component    | Windows | macOS | Linux |\n| ------------ | ------- | ----- | ----- |\n| Sanitization | ✅      | ✅    | ✅    |\n| Validation   | ✅      | ✅    | ✅    |\n| CLI          | ✅      | ✅    | ✅    |\n| Capture      | ✅      | ✅    | ✅    |\n\n______________________________________________________________________\n\n## Contributing\n\nContributions welcome! See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.\n\n______________________________________________________________________\n\n## License\n\nMIT License - see [LICENSE](LICENSE) for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsolentlabs%2Fhar-capture","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsolentlabs%2Fhar-capture","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsolentlabs%2Fhar-capture/lists"}