{"id":45144497,"url":"https://github.com/byte271/6cy","last_synced_at":"2026-02-23T04:01:49.156Z","repository":{"id":339029685,"uuid":"1159477625","full_name":"byte271/6cy","owner":"byte271","description":"High-performance, streaming-first container format with per-block codec polymorphism and robust data recoverability. Reference implementation in Rust.","archived":false,"fork":false,"pushed_at":"2026-02-20T04:09:59.000Z","size":240,"stargazers_count":24,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-02-21T05:50:34.589Z","etag":null,"topics":["codec-polymorphism","compression","container-format","data-integrity","lz4","rust","specification","storage-engine","streaming-data","zstd"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/byte271.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-16T19:22:37.000Z","updated_at":"2026-02-20T10:23:36.000Z","dependencies_parsed_at":"2026-02-21T02:09:17.413Z","dependency_job_id":null,"html_url":"https://github.com/byte271/6cy","commit_stats":null,"previous_names":["byte271/6cy"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/byte271/6cy","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/byte271%2F6cy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/byte271%2F6c
y/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/byte271%2F6cy/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/byte271%2F6cy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/byte271","download_url":"https://codeload.github.com/byte271/6cy/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/byte271%2F6cy/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29704400,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-21T23:35:04.139Z","status":"online","status_checked_at":"2026-02-22T02:00:08.193Z","response_time":110,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["codec-polymorphism","compression","container-format","data-integrity","lz4","rust","specification","storage-engine","streaming-data","zstd"],"created_at":"2026-02-20T01:08:40.796Z","updated_at":"2026-02-22T03:00:45.251Z","avatar_url":"https://github.com/byte271.png","language":"Rust","readme":"\u003cdiv align=\"center\"\u003e\r\n\r\n# .6cy Container Format\r\n\r\n**v0.3.0** · Reference implementation in Rust\r\n\r\n[![License: Apache-2.0](https://img.shields.io/badge/license-Apache--2.0-blue.svg)](LICENSE)\r\n[![Spec License: CC BY 4.0](https://img.shields.io/badge/spec-CC%20BY%204.0-green.svg)](LICENSE-SPEC)\r\n\r\n\u003c/div\u003e\r\n\r\n---\r\n\r\n**.6cy** is a binary archive 
format built around four hard guarantees:\r\n\r\n- **Every block is self-describing.** Magic, version, codec UUID, sizes, and two\r\n  independent checksums live in each 84-byte block header. A reader can parse\r\n  any single block in isolation.\r\n- **Checksums are mandatory.** A CRC32 covers the header; a BLAKE3 covers the\r\n  content. Neither can be disabled. Corruption is caught before any allocation.\r\n- **Codec identity is frozen.** Each codec is identified by a permanent 128-bit\r\n  UUID stored verbatim on disk. Short numeric IDs are an in-process optimization\r\n  only and are never written to files.\r\n- **No runtime negotiation.** The superblock declares all required codec UUIDs\r\n  upfront. A decoder either has every one or fails immediately — no fallback,\r\n  no partial decode.\r\n\r\n---\r\n\r\n## Benchmark — 6cy (LZMA) vs 7-Zip (LZMA2 level 1)\r\n\r\nTested on **AMD Ryzen 9 6900HX** (8C/16T, 3301 MHz), 16 GB RAM,\r\nWindows 11 Home, 10 GiB synthetic binary file, 3 runs each.\r\nFull methodology in [`BENCHMARK.md`](BENCHMARK.md).\r\n\r\n| Metric | **6cy LZMA** | 7z LZMA2 L1 |\r\n|--------|-------------|-------------|\r\n| Pack time (avg) | **13.0 s** | 34.6 s |\r\n| Unpack time (avg) | 47.0 s | **8.6 s** |\r\n| Archive size | **960 KiB** | 1 527 KiB |\r\n| Pack throughput | **0.767 GiB/s** | 0.289 GiB/s |\r\n| Unpack throughput | 0.213 GiB/s | **1.162 GiB/s** |\r\n| Pack CPU (avg) | **76.7 %** | 97.4 % |\r\n| Compression ratio | **10 919 : 1** | 6 868 : 1 |\r\n\r\n**6cy packs 2.66× faster** and produces a **37% smaller** archive.  
\r\n**7z decompresses 5.46× faster** — 7z's LZMA2 decompressor is a mature\r\nhand-optimized C++ implementation; `lzma-rs` is pure Rust (correctness-first).\r\nA future release will evaluate an optional liblzma FFI backend for the\r\ndecompression path.\r\n\r\n---\r\n\r\n## Features\r\n\r\n- **Content-addressable deduplication** — identical 4 MiB chunks are written\r\n  once; subsequent references cost only an 84-byte `BlockRef`. No codec pass\r\n  for duplicate chunks.\r\n- **Four codecs** — Zstd (default), LZ4, Brotli, LZMA. Each identified by a\r\n  frozen UUID; short IDs never leave the process.\r\n- **Solid mode** — multiple files compressed together as one block for maximum\r\n  ratio on small/similar files.\r\n- **AES-256-GCM block encryption** — Argon2id key derivation (64 MiB, 3 passes).\r\n  The archive UUID serves as the KDF salt so the same password yields a\r\n  different key for every archive.\r\n- **Chunked streaming** — files of any size are split into configurable chunks\r\n  (default 4 MiB). Random access spans chunk boundaries correctly.\r\n- **Reconstructible index** — the FILE INDEX is written last. If it is missing\r\n  or corrupt, `6cy scan` rebuilds the file list by reading only block headers\r\n  forward from byte 256, without decompressing any payload.\r\n- **Plugin C ABI** — third-party codecs load via a frozen C ABI\r\n  (`plugin_abi/sixcy_plugin.h`, ABI version 1). 
Explicit buffer contracts,\r\n  declared thread safety, no shared allocator.\r\n\r\n---\r\n\r\n## Project Layout\r\n\r\n```\r\nsixcy/\r\n├── Cargo.toml                   # version 0.3.0, Apache-2.0\r\n├── LICENSE                      # Apache-2.0 (code)\r\n├── LICENSE-SPEC                 # CC BY 4.0 (spec.md)\r\n├── README.md                    # this file\r\n├── BENCHMARK.md                 # detailed benchmark report\r\n├── CHANGELOG.md                 # version history\r\n├── CONTRIBUTING.md              # how to contribute\r\n├── SECURITY.md                  # threat model and disclosure policy\r\n├── spec.md                      # binary format specification (CC BY 4.0)\r\n├── plugin_abi/\r\n│   └── sixcy_plugin.h           # frozen C ABI for codec plugins\r\n└── src/\r\n    ├── main.rs                  # CLI (6cy binary)\r\n    ├── lib.rs                   # crate root + re-exports\r\n    ├── archive.rs               # high-level Archive API\r\n    ├── block.rs                 # block header encode/decode\r\n    ├── superblock.rs            # superblock (offset 0, 256 bytes)\r\n    ├── plugin.rs                # Rust wrapper for C plugin ABI\r\n    ├── codec/mod.rs             # frozen UUID registry + built-in codecs\r\n    ├── crypto/mod.rs            # AES-256-GCM + Argon2id\r\n    ├── index/mod.rs             # FileIndex, BlockRef\r\n    ├── io_stream/mod.rs         # SixCyWriter, SixCyReader, scan_blocks\r\n    └── recovery/mod.rs          # RecoveryMap + checkpoints\r\n```\r\n\r\n---\r\n\r\n## Getting Started\r\n\r\n### Prerequisites\r\n\r\n- [Rust](https://www.rust-lang.org/tools/install) stable (1.70+)\r\n- No C toolchain required — all dependencies are pure Rust\r\n\r\n### Build\r\n\r\n```bash\r\ngit clone https://github.com/byte271/6cy.git\r\ncd 6cy\r\ncargo build --release\r\n# binary: target/release/6cy  (Linux/macOS)\r\n# binary: target\\release\\6cy.exe  (Windows)\r\n```\r\n\r\n---\r\n\r\n## CLI Reference\r\n\r\n### `pack` — create an 
archive\r\n\r\n```bash\r\n# Single file, Zstd (default)\r\n6cy pack -o archive.6cy -i file.bin\r\n\r\n# Multiple files, LZMA codec\r\n6cy pack -o archive.6cy -i a.bin -i b.bin -i c.bin --codec lzma\r\n\r\n# Solid block (all inputs compressed together)\r\n6cy pack -o archive.6cy -i *.txt --codec zstd --solid\r\n\r\n# Encrypted (AES-256-GCM, Argon2id key derivation)\r\n6cy pack -o archive.6cy -i secret.bin --password \"my passphrase\"\r\n\r\n# Custom chunk size (default 4096 KiB = 4 MiB)\r\n6cy pack -o archive.6cy -i huge.bin --chunk-size 8192\r\n\r\n# Full options\r\n6cy pack --output archive.6cy \\\r\n         --input file1.bin --input file2.bin \\\r\n         --codec lzma \\\r\n         --level 3 \\\r\n         --chunk-size 4096 \\\r\n         --solid \\\r\n         --password \"secret\"\r\n```\r\n\r\n**Available codecs:** `zstd` (default) · `lz4` · `brotli` · `lzma` · `none`\r\n\r\n### `unpack` — extract an archive\r\n\r\n```bash\r\n# Extract to current directory\r\n6cy unpack archive.6cy\r\n\r\n# Extract to specific directory\r\n6cy unpack archive.6cy -C output/\r\n\r\n# Extract encrypted archive\r\n6cy unpack archive.6cy -C output/ --password \"my passphrase\"\r\n```\r\n\r\n### `list` — list contents\r\n\r\n```bash\r\n6cy list archive.6cy\r\n# Name                       Size    Compressed  Chunks  First block hash\r\n# readme.txt                 4096          1024       1  a1b2c3...\r\n# data.bin              10485760       2097152       3  deadbe...\r\n```\r\n\r\n### `info` — archive metadata\r\n\r\n```bash\r\n6cy info archive.6cy\r\n# ── .6cy Archive ─────────────────────────────────────────\r\n#   Path           archive.6cy\r\n#   Format version 3\r\n#   UUID           550e8400-e29b-41d4-a716-446655440000\r\n#   Encrypted      false\r\n#   Index offset   41943296 B\r\n#   Index size     2048 B\r\n#   Files          5\r\n#   Root hash      a3f2...\r\n#   Required codecs (2):\r\n#     4a8f2e1c-9b3d-4f7a-c2e8-6d5b1a0f3c9e (lzma)\r\n#     
b28a9d4f-5e3c-4a1b-8f2e-7c6d9b0e1a2f (zstd)\r\n```\r\n\r\n### `scan` — reconstruct index from block headers\r\n\r\n```bash\r\n# Recover file list without the INDEX block (partial/truncated archives)\r\n6cy scan archive.6cy\r\n# Scan recovered 3 file(s) from block headers:\r\n#   id=00000000  chunks=3  size=12582912  name=file_00000000\r\n#   id=00000001  chunks=1  size=4096      name=file_00000001\r\n```\r\n\r\n### `optimize` — re-compress at maximum ratio\r\n\r\n```bash\r\n6cy optimize archive.6cy -o archive_max.6cy          # Zstd level 19 (default)\r\n6cy optimize archive.6cy -o archive_max.6cy --level 19\r\n```\r\n\r\n---\r\n\r\n## Library API\r\n\r\nAdd to `Cargo.toml`:\r\n\r\n```toml\r\n[dependencies]\r\nsixcy = { path = \".\" }   # or version once published to crates.io\r\n```\r\n\r\n### Create an archive\r\n\r\n```rust\r\nuse sixcy::archive::{Archive, PackOptions};\r\nuse sixcy::codec::CodecId;\r\n\r\n// Default options: Zstd level 3, 4 MiB chunks, no encryption\r\nlet mut ar = Archive::create(\"output.6cy\", PackOptions::default())?;\r\nar.add_file(\"readme.txt\", b\"Hello, world!\")?;\r\nar.add_file_with_codec(\"data.bin\", \u0026data, CodecId::Lzma)?;\r\nar.finalize()?;  // MUST be called — writes INDEX block and patches superblock\r\n```\r\n\r\n### Solid blocks\r\n\r\n```rust\r\nar.begin_solid(CodecId::Zstd)?;\r\nar.add_file(\"a.txt\", \u0026a)?;\r\nar.add_file(\"b.txt\", \u0026b)?;\r\nar.add_file(\"c.txt\", \u0026c)?;\r\nar.end_solid()?;   // flushes the combined block\r\nar.finalize()?;\r\n```\r\n\r\n### Encryption\r\n\r\n```rust\r\nlet opts = PackOptions {\r\n    password: Some(\"my passphrase\".into()),\r\n    ..PackOptions::default()\r\n};\r\nlet mut ar = Archive::create(\"secret.6cy\", opts)?;\r\nar.add_file(\"private.bin\", \u0026data)?;\r\nar.finalize()?;\r\n\r\n// Open\r\nlet mut ar = Archive::open_encrypted(\"secret.6cy\", \"my passphrase\")?;\r\nlet data = ar.read_file(\"private.bin\")?;\r\n```\r\n\r\n### Read an archive\r\n\r\n```rust\r\nlet 
mut ar = Archive::open(\"output.6cy\")?;\r\n\r\n// List all files\r\nfor info in ar.list() {\r\n    println!(\"{}: {} bytes ({} blocks)\", info.name, info.original_size, info.block_count);\r\n}\r\n\r\n// Read a whole file\r\nlet data = ar.read_file(\"readme.txt\")?;\r\n\r\n// Random access (spans chunk boundaries)\r\nlet mut buf = [0u8; 4096];\r\nlet n = ar.read_at(\"data.bin\", 1_048_576, \u0026mut buf)?;\r\n\r\n// Extract everything\r\nar.extract_all(\"./output/\")?;\r\n```\r\n\r\n### Metadata\r\n\r\n```rust\r\nprintln!(\"UUID:      {}\", ar.uuid());\r\nprintln!(\"Root hash: {}\", ar.root_hash_hex());\r\n```\r\n\r\n### Index reconstruction (no INDEX block)\r\n\r\n```rust\r\nuse sixcy::io_stream::SixCyReader;\r\nuse std::fs::File;\r\n\r\nlet mut reader = SixCyReader::new(File::open(\"partial.6cy\")?)?;\r\nlet reconstructed = reader.scan_blocks()?;\r\nfor record in \u0026reconstructed.records {\r\n    println!(\"{}: {} bytes\", record.name, record.original_size);\r\n}\r\n```\r\n\r\n---\r\n\r\n## Block Header Layout (v1, 84 bytes)\r\n\r\nAll fields are little-endian.\r\n\r\n```\r\n[ 0]  4 B  magic            0x424C434B  (\"BLCK\")\r\n[ 4]  2 B  header_version   = 1\r\n[ 6]  2 B  header_size      = 84\r\n[ 8]  2 B  block_type       0=Data  1=Index  2=Solid\r\n[10]  2 B  flags            0x0001=Encrypted\r\n[12] 16 B  codec_uuid       frozen 16-byte UUID (LE field order)\r\n[28]  4 B  file_id          0xFFFFFFFF for Solid/Index blocks\r\n[32]  8 B  file_offset      byte offset in decompressed file\r\n[40]  4 B  orig_size        uncompressed bytes\r\n[44]  4 B  comp_size        on-disk bytes\r\n[48] 32 B  content_hash     BLAKE3 of uncompressed plaintext\r\n[80]  4 B  header_crc32     CRC32([0..80])  ← verified first, always\r\n```\r\n\r\n---\r\n\r\n## Codec UUIDs (frozen forever)\r\n\r\n| Codec  | UUID |\r\n|--------|------|\r\n| None   | `00000000-0000-0000-0000-000000000000` |\r\n| Zstd   | `b28a9d4f-5e3c-4a1b-8f2e-7c6d9b0e1a2f` |\r\n| LZ4    | 
`3f7b2c8e-1a4d-4e9f-b6c3-5d8a2f7e0b1c` |\r\n| Brotli | `9c1e5f3a-7b2d-4c8e-a5f1-2e6b9d0c3a7f` |\r\n| LZMA   | `4a8f2e1c-9b3d-4f7a-c2e8-6d5b1a0f3c9e` |\r\n\r\nUUIDs are never reused. A deprecated codec keeps its UUID permanently.\r\n\r\n---\r\n\r\n## Running Tests\r\n\r\n```bash\r\ncargo test\r\n```\r\n\r\n## Running Benchmarks\r\n\r\n```bash\r\ncargo bench\r\n```\r\n\r\n---\r\n\r\n## Security\r\n\r\nSee [`SECURITY.md`](SECURITY.md) for the threat model, hardening details, and\r\nthe vulnerability disclosure policy.\r\n\r\n---\r\n\r\n## Container Format Specification\r\n\r\nSee [`spec.md`](spec.md) for the binary format specification.\r\n\r\n## Contributing\r\n\r\nSee [`CONTRIBUTING.md`](CONTRIBUTING.md).\r\n\r\n---\r\n\r\n## License\r\n\r\nThe **reference implementation** (all `.rs` files, `Cargo.toml`,\r\n`plugin_abi/sixcy_plugin.h`) is licensed under the\r\n**[Apache License 2.0](LICENSE)**.\r\n\r\nThe **format specification** (`spec.md`) is licensed under\r\n**CC BY 4.0** — you may implement the format in any language\r\nand share implementations freely, provided you attribute the original\r\nspecification.\r\n\r\n\r\n\r\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbyte271%2F6cy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbyte271%2F6cy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbyte271%2F6cy/lists"}