{"id":35365628,"url":"https://github.com/tenuo-ai/safe_unzip","last_synced_at":"2026-01-13T19:19:27.152Z","repository":{"id":331271507,"uuid":"1124973534","full_name":"tenuo-ai/safe_unzip","owner":"tenuo-ai","description":"Secure zip extraction for Rust \u0026 Python. Prevents Zip Slip, Zip Bombs, and symlink attacks.","archived":false,"fork":false,"pushed_at":"2026-01-03T22:31:16.000Z","size":229,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-01-05T07:51:01.540Z","etag":null,"topics":["archive","extraction","file-security","python","rust","security","zip","zip-bomb","zip-slip"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tenuo-ai.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-12-30T00:02:14.000Z","updated_at":"2026-01-03T22:25:53.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/tenuo-ai/safe_unzip","commit_stats":null,"previous_names":["tenuo-ai/safe_unzip"],"tags_count":6,"template":false,"template_full_name":null,"purl":"pkg:github/tenuo-ai/safe_unzip","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tenuo-ai%2Fsafe_unzip","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tenuo-ai%2Fsafe_unzip/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tenuo-ai%2Fsafe_
unzip/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tenuo-ai%2Fsafe_unzip/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tenuo-ai","download_url":"https://codeload.github.com/tenuo-ai/safe_unzip/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tenuo-ai%2Fsafe_unzip/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28397826,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-13T14:36:09.778Z","status":"ssl_error","status_checked_at":"2026-01-13T14:35:19.697Z","response_time":56,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["archive","extraction","file-security","python","rust","security","zip","zip-bomb","zip-slip"],"created_at":"2026-01-02T01:53:08.352Z","updated_at":"2026-01-13T19:19:27.145Z","avatar_url":"https://github.com/tenuo-ai.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# safe_unzip\n\nSecure archive extraction. 
Supports **ZIP** (core), **TAR**, and **7z** (optional features).\n\n## The Problem\n\nZip files can contain malicious paths that escape the extraction directory:\n\n```python\nimport zipfile\nzipfile.ZipFile(\"evil.zip\").extractall(\"/var/uploads\")\n# Extracts ../../etc/cron.d/pwned → /etc/cron.d/pwned 💀\n```\n\nThis is [Zip Slip](https://snyk.io/research/zip-slip-vulnerability), and Python's default behavior is still vulnerable.\n\n### \"But didn't Python fix this?\"\n\nSort of. Python added warnings and `ZipInfo.filename` sanitization in 2014. In Python 3.12+, `tarfile` gained a `filter` parameter (`zipfile.ZipFile.extractall` does not accept one):\n\n```python\n# The \"safe\" way — but who knows this exists?\nimport tarfile\ntarfile.open(\"evil.tar\").extractall(\"/var/uploads\", filter=\"data\")\n```\n\nThe problem: **the safe option is opt-in**. The default is still vulnerable. Most developers don't read the docs carefully enough to discover `filter=\"data\"`.\n\n`safe_unzip` makes security the default, not an afterthought.\n\n## The Solution\n\n```rust\nuse safe_unzip::extract_file;\n\nextract_file(\"/var/uploads\", \"evil.zip\")?;\n// Err(PathEscape { entry: \"../../etc/cron.d/pwned\", ... })\n```\n\n```python\n# Python bindings — same safety\nfrom safe_unzip import extract_file\n\nextract_file(\"/var/uploads\", \"evil.zip\")\n# Raises: PathEscapeError\n```\n\n**Security is the default.** No special flags, no opt-in safety. Every path is validated. 
Malicious archives are rejected, not extracted.\n\n## Why Not Just Use `zip` / `tar` / `zipfile`?\n\nBecause **archive extraction is a security boundary**, and most libraries treat it as a convenience function.\n\n| Library | Default Behavior | Safe Option |\n|---------|------------------|-------------|\n| Python `zipfile` | Vulnerable | `filter=\"data\"` (opt-in, obscure) |\n| Python `tarfile` | Vulnerable | `filter=\"data\"` (opt-in, Python 3.12+) |\n| Rust `zip` | Vulnerable | Manual path validation |\n| Rust `tar` | Vulnerable | Manual path validation |\n| `safe_unzip` | **Safe by default** | N/A — always safe |\n\nIf you're extracting untrusted archives, you need a library designed for that threat model.\n\n## Who Should Use This\n\n- **Backend services** handling user-uploaded zip files\n- **CI/CD systems** unpacking third-party artifacts  \n- **SaaS platforms** with file import features\n- **Forensics / malware analysis** pipelines\n- **Anything running as a privileged user**\n\nIf your zip files only come from trusted sources you control, the standard `zip` crate is fine. 
If users can upload archives, use `safe_unzip`.\n\n## Features\n\n- **CLI Tool** — `safe_unzip` command with `--list`, `--verify`, limits, and filtering\n- **Archive Verification** — Check CRC32 integrity without extracting\n- **Multi-Format Support** — ZIP (core), TAR, and 7z (feature flags)\n- **Partial Extraction** — Extract specific files with `only()` or glob patterns\n- **Progress Callbacks** — Monitor extraction progress (Rust API)\n- **Async API** — Optional tokio-based async extraction (feature flag)\n- **Zip Slip Protection** — Path traversal attacks blocked via [path_jail](https://crates.io/crates/path_jail)\n- **Zip Bomb Protection** — Configurable limits on size, file count, and path depth\n- **Strict Size Enforcement** — Catches files that decompress larger than declared\n- **Filename Sanitization** — Blocks control characters and Windows reserved names\n- **Symlink Handling** — Skip or reject symlinks (no symlink-based escapes)\n- **Secure Overwrite** — Removes symlinks before overwriting to prevent symlink attacks\n- **Atomic File Creation** — TOCTOU-safe file creation using `O_EXCL`\n- **Overwrite Policies** — Error, skip, or overwrite existing files\n- **Filter Callback** — Extract only the files you want\n- **Two-Pass Mode** — Validate everything before writing anything\n- **Permission Stripping** — Removes setuid/setgid bits on Unix\n\n## Installation\n\n**CLI:**\n```bash\ncargo install safe_unzip --features cli\n```\n\n**Rust:**\n```toml\n[dependencies]\nsafe_unzip = \"0.1\"\n```\n\n**Python:**\n```bash\npip install safe-unzip\n```\n\n### Feature Flags\n\n| Feature | Default | Description |\n|---------|---------|-------------|\n| `tar` | ❌ | TAR/TAR.GZ extraction |\n| `async` | ❌ | Tokio-based async API |\n| `sevenz` | ❌ | 7z extraction (heavier deps) |\n\n```toml\n# ZIP only (smallest, ~30 deps)\nsafe_unzip = \"0.1\"\n\n# With TAR support (~40 deps)\nsafe_unzip = { version = \"0.1\", features = [\"tar\"] }\n\n# With async API\nsafe_unzip = { 
version = \"0.1\", features = [\"async\"] }\n\n# Kitchen sink (~85 deps)\nsafe_unzip = { version = \"0.1\", features = [\"tar\", \"async\", \"sevenz\"] }\n```\n\n\u003e **Note:** Python bindings always include TAR support.\n\n### Python Bindings\n\nThe Python bindings are **thin wrappers** over the Rust implementation via PyO3. This means:\n\n- ✅ **Identical security guarantees** — same code path, same validation\n- ✅ **Identical limits** — same defaults (1GB total, 10K files, 100MB per file)\n- ✅ **Identical semantics** — same error conditions, same behavior\n- ✅ **No re-implementation** — Python calls Rust directly, no logic duplication\n\nSecurity reviewers: the Python API is a direct binding, not a port.\n\n## CLI Usage\n\n```bash\n# Extract archive to destination\nsafe_unzip archive.zip -d /var/uploads\n\n# List contents without extracting\nsafe_unzip archive.zip --list\n\n# Verify integrity (CRC32 check)\nsafe_unzip archive.zip --verify\n\n# With limits\nsafe_unzip archive.zip -d /var/uploads --max-size 100M --max-files 1000\n\n# Glob filtering\nsafe_unzip archive.zip -d /var/uploads --include \"**/*.py\" --exclude \"**/test_*\"\n\n# Partial extraction\nsafe_unzip archive.zip -d /var/uploads --only README.md --only LICENSE\n\n# Verbose output\nsafe_unzip archive.zip -d /var/uploads -v\n```\n\n## Quick Start\n\n```rust\nuse safe_unzip::extract_file;\n\n// Extract with safe defaults\nlet report = extract_file(\"/var/uploads\", \"archive.zip\")?;\nprintln!(\"Extracted {} files ({} bytes)\", \n    report.files_extracted, \n    report.bytes_written\n);\n```\n\n## Usage Examples\n\n### Basic Extraction\n\n```rust\nuse safe_unzip::Extractor;\n\nlet report = Extractor::new(\"/var/uploads\")?\n    .extract_file(\"archive.zip\")?;\n```\n\n### Create Destination if Missing\n\n```rust\nuse safe_unzip::Extractor;\n\n// Extractor::new() errors if destination doesn't exist (catches typos)\n// Extractor::new_or_create() creates it automatically\nlet report = 
Extractor::new_or_create(\"/var/uploads/new_folder\")?\n    .extract_file(\"archive.zip\")?;\n\n// The convenience functions (extract_file, extract) also create automatically\nuse safe_unzip::extract_file;\nextract_file(\"/var/uploads/new_folder\", \"archive.zip\")?;\n```\n\n### Custom Limits (Prevent Zip Bombs)\n\n```rust\nuse safe_unzip::{Extractor, Limits};\n\nlet report = Extractor::new(\"/var/uploads\")?\n    .limits(Limits {\n        max_total_bytes: 500 * 1024 * 1024,  // 500 MB total\n        max_file_count: 1_000,                // Max 1000 files\n        max_single_file: 50 * 1024 * 1024,   // 50 MB per file\n        max_path_depth: 10,                   // No deeper than 10 levels\n    })\n    .extract_file(\"archive.zip\")?;\n```\n\n### Filter by Extension\n\n```rust\nuse safe_unzip::Extractor;\n\n// Only extract images\nlet report = Extractor::new(\"/var/uploads\")?\n    .filter(|entry| {\n        entry.name.ends_with(\".png\") || \n        entry.name.ends_with(\".jpg\") ||\n        entry.name.ends_with(\".gif\")\n    })\n    .extract_file(\"archive.zip\")?;\n\nprintln!(\"Extracted {} images, skipped {} other files\",\n    report.files_extracted,\n    report.entries_skipped\n);\n```\n\n### Partial Extraction (New in v0.1.5)\n\nExtract specific files by name or glob pattern:\n\n```rust\nuse safe_unzip::Extractor;\n\n// Extract only specific files\nlet report = Extractor::new(\"/var/uploads\")?\n    .only(\u0026[\"README.md\", \"LICENSE\"])\n    .extract_file(\"archive.zip\")?;\n\n// Include by glob pattern\nlet report = Extractor::new(\"/var/uploads\")?\n    .include_glob(\u0026[\"**/*.py\", \"**/*.rs\"])\n    .extract_file(\"archive.zip\")?;\n\n// Exclude by glob pattern\nlet report = Extractor::new(\"/var/uploads\")?\n    .exclude_glob(\u0026[\"**/__pycache__/**\", \"**/*.pyc\"])\n    .extract_file(\"archive.zip\")?;\n```\n\n**Python:**\n```python\nfrom safe_unzip import Extractor\n\n# Extract only specific files\nreport = 
Extractor(\"/var/uploads\").only([\"README.md\", \"LICENSE\"]).extract_file(\"archive.zip\")\n\n# Include by pattern\nreport = Extractor(\"/var/uploads\").include_glob([\"**/*.py\"]).extract_file(\"archive.zip\")\n\n# Exclude by pattern  \nreport = Extractor(\"/var/uploads\").exclude_glob([\"**/__pycache__/**\"]).extract_file(\"archive.zip\")\n```\n\n### Progress Callbacks\n\nMonitor extraction progress:\n\n```rust\nuse safe_unzip::Extractor;\n\nlet report = Extractor::new(\"/var/uploads\")?\n    .on_progress(|p| {\n        println!(\"[{}/{}] {} ({} bytes)\",\n            p.entry_index + 1,\n            p.total_entries,\n            p.entry_name,\n            p.entry_size\n        );\n    })\n    .extract_file(\"archive.zip\")?;\n```\n\n**Python:**\n```python\nfrom safe_unzip import Extractor\n\ndef show_progress(p):\n    print(f\"[{p['entry_index']+1}/{p['total_entries']}] {p['entry_name']}\")\n\nExtractor(\"/var/uploads\").on_progress(show_progress).extract_file(\"archive.zip\")\n\n# Or with tqdm for a progress bar\nfrom tqdm import tqdm\nentries = list_zip_entries(\"archive.zip\")\npbar = tqdm(total=len(entries))\ndef update_bar(p):\n    pbar.update(1)\n    pbar.set_description(p['entry_name'])\nExtractor(\"/var/uploads\").on_progress(update_bar).extract_file(\"archive.zip\")\npbar.close()\n```\n\n### Archive Verification\n\nCheck archive integrity without extracting:\n\n```rust\nuse safe_unzip::verify_file;\n\n// Verify CRC32 for all entries\nlet report = verify_file(\"archive.zip\")?;\nprintln!(\"Verified {} entries ({} bytes)\", \n    report.entries_verified, \n    report.bytes_verified\n);\n```\n\nOr with the Extractor (useful if you want to verify then extract):\n\n```rust\nuse safe_unzip::Extractor;\n\nlet extractor = Extractor::new(\"/var/uploads\")?;\n\n// First verify\nextractor.verify_file(\"archive.zip\")?;\n\n// Then extract (re-reads archive, but guarantees integrity)\nextractor.extract_file(\"archive.zip\")?;\n```\n\n### Overwrite 
Policies\n\n```rust\nuse safe_unzip::{Extractor, OverwritePolicy};\n\n// Skip files that already exist\nlet report = Extractor::new(\"/var/uploads\")?\n    .overwrite(OverwritePolicy::Skip)\n    .extract_file(\"archive.zip\")?;\n\n// Or overwrite them\nlet report = Extractor::new(\"/var/uploads\")?\n    .overwrite(OverwritePolicy::Overwrite)\n    .extract_file(\"archive.zip\")?;\n\n// Default: Error if file exists\nlet report = Extractor::new(\"/var/uploads\")?\n    .overwrite(OverwritePolicy::Error)  // This is the default\n    .extract_file(\"archive.zip\")?;\n```\n\n### Symlink Policies\n\n```rust\nuse safe_unzip::{Extractor, SymlinkPolicy};\n\n// Default: silently skip symlinks\nlet report = Extractor::new(\"/var/uploads\")?\n    .symlinks(SymlinkPolicy::Skip)\n    .extract_file(\"archive.zip\")?;\n\n// Or reject archives containing symlinks\nlet report = Extractor::new(\"/var/uploads\")?\n    .symlinks(SymlinkPolicy::Error)\n    .extract_file(\"archive.zip\")?;\n```\n\n### Extraction Modes\n\n| Mode | Speed | On Failure | Use When |\n|------|-------|------------|----------|\n| `Streaming` (default) | Fast (1 pass) | Partial files remain | Speed matters; you'll clean up on error |\n| `ValidateFirst` | Slower (2 passes) | No files if validation fails | Can't tolerate partial state |\n\n**⚠️ Neither mode is truly atomic.** If extraction fails mid-write (e.g., disk full), partial files remain regardless of mode. `ValidateFirst` only prevents writes when *validation* fails (bad paths, limits exceeded), not when I/O fails during extraction.\n\n```rust\nuse safe_unzip::{Extractor, ExtractionMode};\n\n// Two-pass extraction:\n// 1. Validate ALL entries (no disk writes)\n// 2. Extract (only if validation passed)\nlet report = Extractor::new(\"/var/uploads\")?\n    .mode(ExtractionMode::ValidateFirst)\n    .extract_file(\"untrusted.zip\")?;\n```\n\nUse `ValidateFirst` when you can't tolerate partial state from malicious archives. 
Use `Streaming` (default) when speed matters and you can clean up on error.\n\n### Extracting from Memory\n\n```rust\nuse safe_unzip::Extractor;\nuse std::io::Cursor;\n\nlet zip_bytes: Vec\u003cu8\u003e = download_zip_somehow();\nlet cursor = Cursor::new(zip_bytes);\n\nlet report = Extractor::new(\"/var/uploads\")?\n    .extract(cursor)?;\n```\n\n### TAR Extraction (New in v0.1.2)\n\n```rust\nuse safe_unzip::{Driver, TarAdapter};\n\n// Extract a .tar file\nlet report = Driver::new(\"/var/uploads\")?\n    .extract_tar_file(\"archive.tar\")?;\n\n// Extract a .tar.gz file\nlet report = Driver::new(\"/var/uploads\")?\n    .extract_tar_gz_file(\"archive.tar.gz\")?;\n\n// With options\nlet report = Driver::new(\"/var/uploads\")?\n    .filter(|entry| entry.name.ends_with(\".txt\"))\n    .validation(safe_unzip::ValidationMode::ValidateFirst)\n    .extract_tar_file(\"archive.tar\")?;\n```\n\nThe new `Driver` API provides a unified interface for all archive formats with the same security guarantees.\n\n### 7z Extraction (Requires `sevenz` Feature)\n\nEnable the `sevenz` feature:\n\n```toml\n[dependencies]\nsafe_unzip = { version = \"0.1\", features = [\"sevenz\"] }\n```\n\n```rust\nuse safe_unzip::{Driver, SevenZAdapter};\n\n// Extract a .7z file\nlet report = Driver::new(\"/var/uploads\")?\n    .extract_7z_file(\"archive.7z\")?;\n\n// Or from bytes\nlet report = Driver::new(\"/var/uploads\")?\n    .extract_7z_bytes(\u0026seven_z_bytes)?;\n```\n\n**Note:** 7z archives are fully decompressed into memory before extraction, so large archives may use significant RAM.\n\n**Python:**\n```python\nfrom safe_unzip import extract_7z_file, Extractor\n\n# Simple extraction\nreport = extract_7z_file(\"/var/uploads\", \"archive.7z\")\n\n# With options\nreport = Extractor(\"/var/uploads\").extract_7z_file(\"archive.7z\")\n```\n\n### Async Extraction (New)\n\nEnable the `async` feature for tokio-based async extraction:\n\n```toml\n[dependencies]\nsafe_unzip = { version = \"0.1\", features = 
[\"async\"] }\n```\n\n```rust\nuse safe_unzip::r#async::{extract_file, extract_tar_file, AsyncExtractor};\n\n#[tokio::main]\nasync fn main() -\u003e Result\u003c(), safe_unzip::Error\u003e {\n    // Simple async extraction\n    let report = extract_file(\"/var/uploads\", \"archive.zip\").await?;\n    \n    // TAR extraction\n    let report = extract_tar_file(\"/var/uploads\", \"archive.tar\").await?;\n    \n    // With options\n    let report = AsyncExtractor::new(\"/var/uploads\")?\n        .max_total_bytes(500 * 1024 * 1024)\n        .max_file_count(1000)\n        .extract_file(\"archive.zip\")\n        .await?;\n    \n    Ok(())\n}\n```\n\nConcurrent extraction of multiple archives:\n\n```rust\nuse safe_unzip::r#async::{extract_file, extract_tar_bytes};\n\nlet (zip_result, tar_result) = tokio::join!(\n    extract_file(\"/uploads/a\", \"first.zip\"),\n    extract_tar_bytes(\"/uploads/b\", tar_data),\n);\n```\n\nThe async API uses `spawn_blocking` internally, so extraction runs in a thread pool without blocking the async runtime.\n\n### Python Async API\n\nPython async support uses `asyncio.to_thread()` to run extraction in a thread pool:\n\n```python\nimport asyncio\nfrom safe_unzip import async_extract_file, AsyncExtractor\n\nasync def main():\n    # Simple async extraction\n    report = await async_extract_file(\"/var/uploads\", \"archive.zip\")\n    \n    # TAR extraction\n    from safe_unzip import async_extract_tar_file\n    report = await async_extract_tar_file(\"/var/uploads\", \"archive.tar\")\n    \n    # With options\n    report = await (\n        AsyncExtractor(\"/var/uploads\")\n        .max_total_mb(500)\n        .max_files(1000)\n        .extract_file(\"archive.zip\")\n    )\n\nasyncio.run(main())\n```\n\nConcurrent extraction:\n\n```python\nasync def extract_all(archives):\n    tasks = [\n        async_extract_file(f\"/uploads/{i}\", path)\n        for i, path in enumerate(archives)\n    ]\n    return await asyncio.gather(*tasks)\n```\n\n## 
Security Model\n\n| Threat | Attack Vector | Defense |\n|--------|---------------|---------|\n| **Zip Slip** | Entry named `../../etc/cron.d/pwned` | `path_jail` validates every path |\n| **Zip Bomb (size)** | 42KB → 4PB expansion | `max_total_bytes` limit + streaming enforcement |\n| **Zip Bomb (count)** | 1 million empty files | `max_file_count` limit |\n| **Zip Bomb (lying)** | Declared 1KB, decompresses to 1GB | Strict size reader detects mismatch |\n| **Symlink Escape** | Symlink to `/etc/passwd` | Skip or reject symlinks |\n| **Symlink Overwrite** | Create symlink, then overwrite target | Symlinks removed before overwrite |\n| **Path Depth** | `a/b/c/.../1000levels` | `max_path_depth` limit |\n| **Invalid Filename** | Control chars, `CON`, `NUL` | Filename sanitization |\n| **Overwrite** | Replace sensitive files | `OverwritePolicy::Error` default |\n| **Setuid** | Create setuid executables | Permission bits stripped |\n| **Encrypted Archives** | Password handling complexity | Rejected (see [Encrypted Archives](#encrypted-archives)) |\n\n## Default Limits\n\n| Limit | Default | Description |\n|-------|---------|-------------|\n| `max_total_bytes` | 1 GB | Total uncompressed size |\n| `max_file_count` | 10,000 | Number of files |\n| `max_single_file` | 100 MB | Largest single file |\n| `max_path_depth` | 50 | Directory nesting depth |\n\n## Error Handling\n\n```rust\nuse safe_unzip::{extract_file, Error};\n\nmatch extract_file(\"/var/uploads\", \"archive.zip\") {\n    Ok(report) =\u003e {\n        println!(\"Success: {} files\", report.files_extracted);\n    }\n    Err(Error::PathEscape { entry, detail }) =\u003e {\n        eprintln!(\"Blocked path traversal in '{}': {}\", entry, detail);\n    }\n    Err(Error::TotalSizeExceeded { limit, would_be }) =\u003e {\n        eprintln!(\"Archive too large: {} bytes (limit: {})\", would_be, limit);\n    }\n    Err(Error::FileTooLarge { entry, size, limit }) =\u003e {\n        eprintln!(\"File '{}' too large: {} bytes 
(limit: {})\", entry, size, limit);\n    }\n    Err(Error::FileCountExceeded { limit }) =\u003e {\n        eprintln!(\"Too many files (limit: {})\", limit);\n    }\n    Err(Error::AlreadyExists { path }) =\u003e {\n        eprintln!(\"File already exists: {}\", path);\n    }\n    Err(Error::InvalidFilename { entry, .. }) =\u003e {\n        eprintln!(\"Invalid filename: {}\", entry);\n    }\n    Err(Error::EncryptedEntry { entry }) =\u003e {\n        eprintln!(\"Encrypted entry not supported: {}\", entry);\n    }\n    Err(e) =\u003e {\n        eprintln!(\"Extraction failed: {}\", e);\n    }\n}\n```\n\n## Limitations\n\n### Format Limitations\n\n- **ZIP, TAR, 7z only** — RAR not supported\n- **Requires seekable input for ZIP** — ZIP format requires reading the central directory at the end\n- **TAR is sequential** — TAR files are read in order; `ValidateFirst` mode caches entries in memory\n- **No encrypted archives** — See below\n\n### Encrypted Archives\n\n`safe_unzip` does not support password-protected zip files. Encrypted entries are rejected with `Error::EncryptedEntry`.\n\nIf you need to extract encrypted archives:\n1. Decrypt first using the `zip` crate directly\n2. Then extract with `safe_unzip`\n\nThis is intentional—encryption handling is outside our security scope. Password management, key derivation, and cryptographic validation are complex domains that deserve dedicated tooling.\n\n### Extraction Behavior\n\n- **Partial state in Streaming mode** — If extraction fails mid-way, already-extracted files remain on disk. Use `ExtractionMode::ValidateFirst` to validate before writing.\n- **Filters not applied during validation** — In `ValidateFirst` mode, limits are checked against ALL entries. Filtered entries still count toward limits. 
This is conservative: validation may reject archives that would succeed with filtering.\n\n### Security Scope\n\nThese threats are **not fully addressed** (by design or complexity):\n\n| Limitation | Reason |\n|------------|--------|\n| **Case-insensitive collisions** | On Windows/macOS, `File.txt` and `file.txt` map to the same file. We don't track extracted names to detect this. |\n| **Unicode normalization** | `café` (NFC) vs `cafe\u0301` (NFD) appear identical but are different bytes. Full normalization requires ICU. |\n| **Concurrent extraction** | If multiple threads/processes extract to the same destination, race conditions can occur. Use file locking or separate destinations. |\n| **Sparse file attacks** | Not applicable to zip format. |\n| **Hard links** | Zip format doesn't support hard links. |\n| **Device files** | Zip format doesn't support special device files. |\n\n### TOCTOU Mitigations\n\nFor `OverwritePolicy::Error` and `OverwritePolicy::Skip`, we use **atomic file creation** (`O_CREAT | O_EXCL`) instead of check-then-create. 
This eliminates race conditions between checking if a file exists and creating it.\n\nFor `OverwritePolicy::Overwrite`, symlinks are removed before writing to prevent symlink-following attacks, but there's a brief window between removal and creation.\n\n### Filename Restrictions\n\nFilenames with any of the following are **rejected** for security:\n\n- Control characters (including null bytes)\n- Backslashes (`\\`) — prevents Windows path separator confusion\n- Paths longer than 1024 bytes\n- Path components longer than 255 bytes\n- Windows reserved names: `CON`, `PRN`, `AUX`, `NUL`, `COM1-9`, `LPT1-9`\n\n## Development\n\n### Fuzzing\n\nWe use [cargo-fuzz](https://github.com/rust-fuzz/cargo-fuzz) with two targets:\n\n```bash\n# Install cargo-fuzz (requires nightly)\ncargo install cargo-fuzz\n\n# Run the main extraction fuzzer\ncargo +nightly fuzz run fuzz_extract\n\n# Run the adapter fuzzer (tests parsing layer)\ncargo +nightly fuzz run fuzz_zip_adapter\n```\n\nFuzzing targets are in `fuzz/fuzz_targets/`. Run fuzzing before releases to catch parsing edge cases.\n\n## License\n\nMIT OR Apache-2.0\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftenuo-ai%2Fsafe_unzip","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftenuo-ai%2Fsafe_unzip","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftenuo-ai%2Fsafe_unzip/lists"}