{"id":50425474,"url":"https://github.com/pmarreck/blar","last_synced_at":"2026-05-31T10:03:49.164Z","repository":{"id":355727950,"uuid":"1229338589","full_name":"pmarreck/blar","owner":"pmarreck","description":"BLAR archive format — built on BLIP","archived":false,"fork":false,"pushed_at":"2026-05-15T20:04:07.000Z","size":12550,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"yolo","last_synced_at":"2026-05-15T23:03:17.543Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Zig","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pmarreck.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-05T00:12:47.000Z","updated_at":"2026-05-15T20:04:13.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/pmarreck/blar","commit_stats":null,"previous_names":["pmarreck/blar"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/pmarreck/blar","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pmarreck%2Fblar","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pmarreck%2Fblar/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pmarreck%2Fblar/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pmarreck%2Fblar/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pmarreck","download_url":"https://codeload.github.com/pmarreck/blar/tar.gz/refs/heads/yolo","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pmarreck%2Fblar/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33726722,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-31T02:00:06.040Z","response_time":95,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-05-31T10:03:48.416Z","updated_at":"2026-05-31T10:03:49.148Z","avatar_url":"https://github.com/pmarreck.png","language":"Zig","funding_links":[],"categories":[],"sub_categories":[],"readme":"# blar — BLAR archive format and tool\n\n[![built with garnix](https://img.shields.io/endpoint.svg?url=https%3A%2F%2Fgarnix.io%2Fapi%2Fbadges%2Fpmarreck%2Fblar)](https://garnix.io/repo/pmarreck/blar)\n\n`blar` is a deterministic, integrity-verified, structurally introspectable archive format with a `tar`-like CLI. Think `tar`, but with BLAKE3-128 outer + xxHash64 inner integrity, O(1) random access via end-of-container index tables, built-in LZMA2 compression and AEAD encryption, structural inspection via path expressions (`peek`/`poke`), JSON round-tripping (`to-json`/`from-json`), printable-binary text representation, and transparent container expansion (PDF, PNG, JPEG-via-JXL, ZIP, tar, gzip, BMP, TGA, TIFF, GIF, WAV/AIFF→FLAC, FITS, DICOM, NIfTI).\n\nThe archive format is called **BLAR**. The encoding it builds on is **BLIP** — see [pmarreck/BLIP](https://github.com/pmarreck/BLIP) for the underlying length-prefixed integer encoding and generic container envelope.\n\nSister project: [pmarreck/mini_blar](https://github.com/pmarreck/mini_blar) — a constrained subset of the BLAR format with its own independent implementation.\n\n## Build\n\nRequires [Nix](https://nixos.org/) with flakes enabled. All dependencies (Zig 0.15.2, libFLAC, libjxl, hyperfine) are provided hermetically.\n\n```bash\n./test           # run tests\n./build          # build (ReleaseFast)\n./build --debug  # build (Debug)\n./bm             # run benchmarks\n```\n\n## Usage\n\n### Smart defaults (no subcommand needed)\n\n```bash\nblar myproject/              # create archive (detects directory)\nblar archive.blar            # extract (detects .blar extension)\nblar -z myproject/           # create with compression\n```\n\n### Explicit commands\n\n```bash\nblar create -o archive.blar myproject/        # archive a directory tree\nblar create myproject                         # default output: myproject.blar\nblar create -o archive.blar file1 file2       # archive individual files\nblar list archive.blar                        # list entries\nblar extract archive.blar -C output_dir/      # extract restoring perms+structure\nblar verify archive.blar                      # BLAKE3-128 outer + per-file + Merkle\nblar info archive.blar                        # metadata (counts, sizes)\nblar cat archive.blar path/to/file.txt        # print single file to stdout\nblar peek archive.blar \"[1][0]\" --type        # navigate structure\nblar poke archive.blar \"[1][0][1]\" --value x  # modify a leaf value\nblar to-json archive.blar | jq ...            # JSON interchange\nblar from-json -o b.blar \u003c input.json\n```\n\n### Tar-style shortcuts (hyphen optional)\n\n```bash\nblar cf archive.blar myproject/   # create\nblar tf archive.blar              # list\nblar xf archive.blar              # extract\nblar Vf archive.blar              # verify\nblar If archive.blar              # info\nblar pf archive.blar file.txt     # cat (print)\nblar kf archive.blar \"[1][0]\"     # peek\nblar Kf archive.blar \"[1][0][1]\"  # poke\nblar jf archive.blar              # to-json\nblar Jf input.json -o out.blar    # from-json\n```\n\n### Segmentation\n\n```bash\nblar create --segment-size 100M -o archive.blar myproject/   # split during create\nblar create --segment-count 4   -o archive.blar myproject/   # equal-sized N segments\nblar split archive.blar --size 100M                          # split existing archive\nblar join  archive.blar.000 archive.blar.001 ...             # rejoin segments\nblar reassemble archive.blar.000                             # autodetect tail\n```\n\n`list`, `extract`, `verify`, `info` all auto-detect segment files.\n\n## Inspecting archives (`peek`)\n\nNavigate the binary structure with jq-like path expressions. Every container, metadata key, and hash is addressable.\n\n```bash\nblar peek archive.blar \"[1][0]\" --type            # FILE\nblar peek archive.blar \"[1][0].count\"             # 2\nblar peek archive.blar \"[1][0].hash\"              # a1b2c3d4e5f6a7b8\nblar peek archive.blar \"[1][0][0][pa]\"            # file path\nblar peek archive.blar \"[1][0][0][md]\"            # 0644 (mode as octal)\nblar peek archive.blar \"[1][0][0][mt]\"            # 2026-02-24T10:30:00.123456789Z\nblar peek archive.blar \"[1][0][1]\" --raw          # raw file content\nblar peek archive.blar \"[1][0][1]\" --hex          # 0x68656c6c6f...\nblar peek archive.blar \".hash\"                    # archive's stored checksum\n```\n\nArchive structure: `ARRAY[DATA(magic), ARRAY[FILE[DICT{metadata}, DATA{content}], ...]]`.\n\n`--raw` writes raw bytes (printable-binary-encoded with stderr warning if stdout is a tty). `--hex` shows payload hex for leaves and container checksum for aggregates. `--json` emits JSON with printable-binary identity check on strings. `--type` is shorthand for `.type`.\n\n## Modifying archives (`poke`)\n\nSame path syntax as `peek`, but *sets* values. The entire archive is re-serialized with all hashes, offsets, and index tables recomputed automatically.\n\n```bash\necho -n \"new content\" | blar poke archive.blar \"[1][0][1]\"\nblar poke archive.blar \"[1][0][0][pa]\" --value \"renamed.txt\"   # rename\nblar poke archive.blar \"[1][0][1]\" -i data.bin                 # read from file\nblar poke archive.blar \"[1][0][1]\" --value \"x\" -o modified.blar   # write to new file\nblar poke archive.blar \"[1][0][1]\" --value \"x\" --backup           # .bak before overwrite\n```\n\n## JSON interchange (`to-json` / `from-json`)\n\nConvert any archive to JSON, manipulate with `jq`, convert back. Byte-identical round-tripping for uncompressed/unencrypted archives — same files in produces same bytes out.\n\n```bash\nblar to-json archive.blar | jq '.entries[].path'\n\n# Modify content\nblar to-json a.blar \\\n  | jq '(.entries[] | select(.path==\"hello.txt\")).content = \"new\"' \\\n  | blar from-json -o b.blar\n\n# Add / remove / rename / chmod via jq edits, all the usual moves.\n\n# Re-apply compression and/or encryption when materializing\nBLAR_PASSWORD=secret blar to-json encrypted.blar \\\n  | BLAR_PASSWORD=secret blar from-json -z -e -o b.blar\n```\n\nBinary content is encoded via printable-binary in JSON strings. All hashes, offsets, and index tables recompute on `from-json`. `to-json` decompresses and decrypts transparently — `from-json -z`/`-e` re-applies them.\n\n## Encryption\n\nPer-container AEAD encryption as an LP attribute. Encryption is applied after compression and before checksumming.\n\n- **Ciphers**: AES-256-GCM (default, hardware-accelerated), ChaCha20-Poly1305 (constant-time on all platforms)\n- **KDF**: Argon2id (default, 64 MiB / 3 iter / parallelism 4), PBKDF2-SHA256 (600,000 iter, portable fallback)\n\n```bash\nblar create -e -o secret.blar myproject/                 # AES-256-GCM + Argon2id\nblar create -e chacha -o secret.blar myproject/          # ChaCha20-Poly1305\nblar create -e --kdf pbkdf2 -o secret.blar myproject/    # PBKDF2 instead\nblar create -z -e -o secret.blar myproject/              # compress + encrypt\nBLAR_PASSWORD=mysecret blar list secret.blar             # decrypt on read\nblar list secret.blar                                    # interactive prompt on stderr\n```\n\nA wrong password fails the AEAD tag check — no ambiguity about whether decryption succeeded.\n\n## blar vs tar\n\n| | `blar` | `tar` |\n|---|---|---|\n| Determinism | Byte-identical output guaranteed by spec | Format-dependent; GNU/BSD/POSIX produce different bytes |\n| Integrity | BLAKE3-128 outer + xxHash64 inner + Merkle hashes | None built-in |\n| Random access | O(1) via end-of-container index tables | Sequential scan only |\n| Per-file overhead | ~165 bytes | 1024 bytes minimum (512-byte aligned) |\n| Metadata | Extensible key-value, any container type as value | Fixed header fields (or pax extended) |\n| Typed values | UTF8, DATA, ARRAY, DICT, MAP, FILE, DIR | Byte ranges in fixed-width fields |\n| Nesting | Recursive | Flat |\n| Encryption | Built-in AEAD | None — layer external |\n| Compression | Built-in LZMA2 per-container with format-aware expansion | External, whole-archive |\n| Introspection | `peek`/`poke` with path expressions | `tar tf` only |\n| JSON interop | Full round-trip via `to-json`/`from-json` + `jq` | None |\n| Text-safe transport | Printable-binary encoding | Binary only |\n| Specification | Single canonical spec | v7, ustar, pax, GNU, BSD — multiple incompatible |\n\n**Where tar wins**: ubiquity. Every Unix system has it.\n\n**Where blar wins**: correctness guarantees, structural transparency, and dramatically smaller archives via container expansion.\n\n## Compression granularity\n\nLP attributes apply to any container, so blar supports the full spectrum:\n\n- **Per-file**: COMP on each FILE/DATA. Preserves O(1) random access.\n- **Solid**: COMP on the body ARRAY. Best ratio for similar files; whole-archive decompress to read one file.\n- **Grouped**: organize files by content type into sub-arrays, compress each group independently. Different algorithms (or none) per group; e.g. skip COMP for already-compressed video.\n\n```\nARRAY (archive)\n├── DATA (magic)\n└── DICT (body, keyed by content type)\n    ├── \"image/png\"   → ARRAY [FILE, FILE, ...]   ← COMP=lzma2\n    ├── \"text/plain\"  → ARRAY [FILE, FILE, ...]   ← COMP=lzma2\n    └── \"video/mp4\"   → ARRAY [FILE, FILE, ...]   ← no COMP\n```\n\n## Transparent container expansion\n\n`blar` automatically detects and decomposes known file formats during archiving so LZMA2 can compress them effectively. Extraction reconstructs the original file byte-identically. This is transparent — you archive a PDF, you extract a PDF.\n\n| Format | What happens | Notes |\n|---|---|---|\n| **PDF** | JPEGs → JXL (lossless), zlib streams decompressed, structural shell preserved | byte-identical; up to 70% smaller on text-heavy PDFs |\n| **PNG** | Raw pixels → lossless JXL; metadata chunks (tEXt/iCCP/pHYs) preserved | pixel-identical with metadata |\n| **ZIP** | Per-entry decompression; structure preserved | byte-identical |\n| **gzip** | Decompressed inside; recompressed on extract | content-identical (level/strategy not preserved) |\n| **tar** | Decomposed into constituent files; headers preserved as metadata | byte-identical; lets LZMA2 group across tar boundary |\n| **BMP/TGA/TIFF** | Raw pixels → JXL; compact header metadata | byte-identical; ~90–97% savings |\n| **GIF** | Pure-Zig LZW decoder → JXL pixels; original GIF stored as metadata if smaller | byte-identical |\n| **WAV/AIFF** | PCM → FLAC via libFLAC; non-PCM chunks preserved | byte-identical; 50–60% on real audio |\n| **FITS / DICOM / NIfTI** | Pixel/voxel data → JXL; text headers / DICOM tags preserved | byte-identical; compressed/encapsulated DICOM left as-is |\n\nReal PDF results:\n\n| PDF | Original | blar (no expand) | blar (expand) | Savings |\n|---|---|---|---|---|\n| Far Side Vol I (673 JPEGs) | 158 MB | ~155 MB | 121 MB | 23% smaller |\n| Slaughterhouse-Five (text-only) | 876 KB | ~840 KB | 780 KB | 16% smaller |\n| Beginning Lua Programming | 8.6 MB | ~8.2 MB | 2.5 MB | 70% smaller |\n\n```bash\nblar create -z -o archive.blar documents/                       # expand by default\nblar create -z --no-expand-containers -o archive.blar docs/     # disable\nblar list archive.blar                                          # types: p=PDF, n=PNG, j=JPEG,\n                                                                # b=BMP, a=TGA, i=TIFF, f=GIF,\n                                                                # w=WAV/AIFF, s=FITS, m=DICOM,\n                                                                # g=gzip, t=tar, z=ZIP, d=dir, -=file\n```\n\n## C FFI\n\n`blar` is a hexagonal-design app: a pure-Zig core with no I/O, exposed via a C FFI in `src/blar.h`, with a C CLI dogfooding the FFI.\n\n```c\n#include \"blar.h\"\n\nblar_file_entry files[] = {\n    { \"hello.txt\", 9, (uint8_t*)\"Hello!\\n\", 7 },\n};\nuint8_t *archive;\nsize_t archive_len;\nblar_create(files, 1, /*flags=*/0, \u0026archive, \u0026archive_len);\nbool ok = blar_verify(archive, archive_len);\nblip_free(archive, archive_len);   /* freer comes from BLIP */\n```\n\nThe BLIP encoding/container primitives (`blip_encode`, `blip_decode`, `blip_peek`, `blip_poke`, `blip_segment_*`, etc.) are provided by the BLIP dep — `#include \u003cblip.h\u003e` for those.\n\n## License\n\nMIT — see [LICENSE](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpmarreck%2Fblar","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpmarreck%2Fblar","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpmarreck%2Fblar/lists"}