{"id":50213474,"url":"https://github.com/thicclatka/tetration","last_synced_at":"2026-05-26T07:01:01.383Z","repository":{"id":360122937,"uuid":"1222214533","full_name":"thicclatka/tetration","owner":"thicclatka","description":"New file format for tensors","archived":false,"fork":false,"pushed_at":"2026-05-25T05:09:31.000Z","size":6553,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-25T06:27:46.383Z","etag":null,"topics":["cli","data","fileformat","mmap","tensors"],"latest_commit_sha":null,"homepage":"https://github.com/thicclatka/tetration","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/thicclatka.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE-APACHE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-04-27T06:39:20.000Z","updated_at":"2026-05-25T05:12:54.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/thicclatka/tetration","commit_stats":null,"previous_names":["thicclatka/tetration"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/thicclatka/tetration","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thicclatka%2Ftetration","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thicclatka%2Ftetration/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thicclatka%2Ftetration/releases","manifests_url":"https://repos.ecosys
te.ms/api/v1/hosts/GitHub/repositories/thicclatka%2Ftetration/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/thicclatka","download_url":"https://codeload.github.com/thicclatka/tetration/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thicclatka%2Ftetration/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33508317,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T03:12:49.672Z","status":"ssl_error","status_checked_at":"2026-05-26T03:12:47.976Z","response_time":63,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cli","data","fileformat","mmap","tensors"],"created_at":"2026-05-26T07:01:00.242Z","updated_at":"2026-05-26T07:01:01.293Z","avatar_url":"https://github.com/thicclatka.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Tetration\n\n[![Crates.io](https://img.shields.io/crates/v/tetration.svg)](https://crates.io/crates/tetration)\n[![docs.rs](https://img.shields.io/docsrs/tetration)](https://docs.rs/tetration)\n![Build](https://github.com/thicclatka/tetration/workflows/Build/badge.svg)\n![Rust](https://img.shields.io/badge/rust-1.95-orange.svg)\n\n[_For those who are more cur..._](https://bookshop.org/p/books/book-of-numbers-a-novel-joshua-cohen/af5aa739b0fac506?ean=9780812986655\u0026next=t)\n\n**_STILL IN DEVELOPMENT 
— layout v1 and query JSON may change before 1.0._**\n\n**HDF5-shaped** persistence (many large arrays in one durable file), **Zarr-shaped** chunking (regular grid, per-chunk compression, parallel I/O)—in a **single mmap-friendly `.tet` file**, not a directory of shard blobs.\n\n## What it does today (v1)\n\n- **On-disk layout** — superblock, dataset directory, chunk index, raw or zstd payloads ([`docs/layout_v1.md`](docs/layout_v1.md)).\n- **Mmap + read planning** — logical slices → chunk coordinates → [`ReadPlan`](https://docs.rs/tetration/latest/tetration/query/struct.ReadPlan.html).\n- **JSON query + execute** — flat query documents, streaming reductions, tier-C stats, spill export ([`docs/query_engine.md`](docs/query_engine.md)).\n- **Import** — `tet convert` from HDF5, NetCDF, Zarr v3 directory stores.\n- **File health** — `tet verify` (quick scan; **`--deep`** decodes every chunk), `tet repair` (plan / `--apply` safe fixes).\n- **CLI** — `tet info`, `tet verify`, `tet repair`, `tet query`, `tet qhist`, `tet convert`.\n\n**Wire dtypes** (tags `1`–`10`, row-major chunks): **`f32`**, **`f64`**, **`i32`**, **`i64`**, **`u8`**, **`u16`**, **`i16`**, **`u32`**, **`f16`**, **`u64`**. Booleans import as **`u8`**. 
See [`docs/layout_v1.md`](docs/layout_v1.md#dataset-record-concatenated-variable-length-per-record).\n\n## Quick start\n\n### macOS — Homebrew (recommended)\n\nOne-time tap (this repo ships `Formula/tetration.rb`; pulls in **HDF5** and **NetCDF** for `tet convert`):\n\n```bash\nbrew tap thicclatka/tetration https://github.com/thicclatka/tetration\nbrew install tetration\ntet --help\n```\n\nUpgrade later: `brew upgrade tetration`.\n\nFrom a local clone (no tap): `brew install --build-from-source Formula/tetration.rb`\n\n### `cargo install`\n\n**Default features** need **system HDF5 and NetCDF** dev libraries (`.h5` / `.nc` convert; Zarr v3 is Rust + bundled **zstd**):\n\n| Platform             | Typical packages                                                                                                                |\n| -------------------- | ------------------------------------------------------------------------------------------------------------------------------- |\n| **Debian / Ubuntu**  | `libhdf5-dev`, `libnetcdf-dev`, `pkg-config`, `build-essential`                                                                 |\n| **macOS (Homebrew)** | `brew install hdf5 netcdf pkg-config`                                                                                           |\n| **Windows**          | OpenSSL + NetCDF/HDF5 (e.g. 
[vcpkg](https://vcpkg.io/) or conda-forge); see [`.github/scripts/`](.github/scripts/) for CI hints |\n\n```bash\ncargo install tetration\n```\n\nWithout HDF5/NetCDF libs: **`cargo install tetration --no-default-features`** — `tet info` / `tet query` on `.tet` files and **Zarr** import still work.\n\n### Build from source\n\n```bash\ngit clone https://github.com/thicclatka/tetration.git\ncd tetration\ncargo build --release\nexport PATH=\"$PWD/target/release:$PATH\"   # or: alias tet=\"$PWD/target/release/tet\"\n```\n\n### First commands\n\n```bash\ntet convert volume.h5 volume.tet          # HDF5 / NetCDF / Zarr v3 → .tet\n\ntet info volume.tet\ntet verify volume.tet              # quick scan (decodes up to 128 chunks on large files)\ntet verify --deep volume.tet -q    # decodes every chunk payload\ntet query '{\"dataset\":\"\u003cname\u003e\",\"mean\":[]}' -t volume.tet -x -q   # \u003cname\u003e from info output\n```\n\n**Daily driver:** plan + execute with readable stdout:\n\n```bash\ntet query q.json -t data.tet -x -q              # one-line aggregate\ntet query q.json -t data.tet -x --format stats  # slim JSON (no chunk list)\ntet query q.json -t data.tet --format plan      # catalog + read_plan only\n```\n\nQuery JSON is **flat** (e.g. `\"mean\": []`, `\"spill\": \"slice.bin\"`); nested `\"operation\"` objects are rejected. 
Details: [query document](docs/query_engine.md#query-document-json).\n\n## `tet` commands\n\nFull flag lists: **`tet -h`** and **`tet \u003ccommand\u003e -h`** (always match the installed binary).\n\n| Command                                        | Alias  | Role                                                       |\n| ---------------------------------------------- | ------ | ---------------------------------------------------------- |\n| [`tet info`](#tet-info) `\u003cpath.tet\u003e`           | —      | Summarize a file (default: dataset table)                  |\n| [`tet verify`](#tet-verify) `\u003cpath.tet\u003e`       | —      | Layout health check (exit 1 on failure); `--json` / `-q`   |\n| [`tet repair`](#tet-repair) `\u003cpath.tet\u003e`       | —      | Plan or apply safe in-place fixes (e.g. bad footer)        |\n| [`tet query`](#tet-query) `[QUERY]`            | `q`    | Validate JSON; optional catalog + execute against `-t`     |\n| [`tet qhist`](#tet-qhist) `[list\\|run]`        | `hist` | Recent queries (platform cache; **not** the `.tet` footer) |\n| [`tet convert`](#tet-convert) `\u003cin\u003e \u003cout.tet\u003e` | —      | HDF5 / NetCDF / Zarr v3 → `.tet`                           |\n\n### `tet info`\n\n| Flag                                                                 | Effect                                                            |\n| -------------------------------------------------------------------- | ----------------------------------------------------------------- |\n| _(default)_                                                          | Dataset catalog table                                             |\n| `--json`                                                             | Full pretty JSON (superblock, catalog, chunks, history)           |\n| `-q`, `--quiet`                                                      | One-line summary                                                  |\n| `--all`                                  
                            | All text sections                                                 |\n| `--layout` / `--execution` / `--datasets` / `--chunks` / `--history` | One section each (`--history` = convert footer; not `qhist`)      |\n| `-n`, `--limit N`                                                    | Max chunk rows with `--chunks` or `--all` (default 32; `0` = all) |\n| `--dataset`, `--grep`                                                | Case-insensitive filters on dataset name (and dtype for `--grep`) |\n\n### `tet verify`\n\n| Flag        | Effect                                                                               |\n| ----------- | ------------------------------------------------------------------------------------ |\n| _(default)_ | Human-readable check list + summary (decodes up to **128** chunks on large files)    |\n| `--deep`    | Decode **every** chunk payload (not just the quick sample)                           |\n| `--repair`  | After verify, apply safe in-place repairs for repairable findings (see `tet repair`) |\n| `--json`    | Pretty JSON [`TetVerifyReport`](src/verify/report.rs)                                |\n| `-q`        | One line (`status=ok` / `failed`)                                                    |\n\nExit code **1** when verification fails (CI-friendly). 
Manual smoke fixtures: [`fixtures/small/tet/README.md`](fixtures/small/tet/README.md).\n\n### `tet repair`\n\n| Flag           | Effect                                                                   |\n| -------------- | ------------------------------------------------------------------------ |\n| _(default)_    | Plan from verify recommendations (no writes)                             |\n| `--apply CODE` | Apply fix (repeatable); today: `footer_invalid` strips a bad `THST` tail |\n| `--dry-run`    | With `--apply`, show changes without writing                             |\n| `--json`       | Pretty JSON plan or repair report                                        |\n\n### `tet query`\n\n`QUERY`: path to `.json`, inline JSON, `-` for stdin, or omit to read stdin.\n\n| Flag                | Effect                                                                                                        |\n| ------------------- | ------------------------------------------------------------------------------------------------------------- |\n| `-t`, `--tet PATH`  | Attach catalog / read plan (required for `-x`)                                                                |\n| `-x`, `--execute`   | Decode tiles, run `operation`, attach `execution`                                                             |\n| `--format`          | `full` (default), `json`, `stats`, `plan`, `quiet`                                                            |\n| `-q`, `--quiet`     | Shorthand for `--format quiet` (one-line stdout)                                                              |\n| `--preview N`       | Cap preview sample values when executing (`--preview-f32` alias; default 64 for full/json, 0 for quiet/stats) |\n| `--spill-allow DIR` | Extra spill roots (repeatable; needs `-x` and `-t`)                                                           |\n\n### `tet qhist`\n\nStored under the platform cache (`query_history.jsonl`), not in the `.tet` file. 
Env: `TET_NO_QUERY_HISTORY`, `TET_QUERY_HISTORY_FILE`, `TET_QUERY_HISTORY_MAX`. Details: [`GETTING_STARTED.md` — qhist](GETTING_STARTED.md#cli-query-history-tet-qhist).\n\n| Subcommand / flag                                                | Effect                                                                                                              |\n| ---------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------- |\n| `list` _(default)_                                               | Compact table of recent queries                                                                                     |\n| `run N`                                                          | Re-run saved row (`1` = newest in filtered view); honors today's `--format` / `-q`; `-t` / `-x` / `--plan` override |\n| `--clear`                                                        | Remove the history file                                                                                             |\n| `list --all`, `--dataset`, `--tet`, `--mode`, `--grep`, `--json` | Filters / full JSON export on `list`                                                                                |\n\n### `tet convert`\n\n| Input   | Sniff / extensions                                        |\n| ------- | --------------------------------------------------------- |\n| HDF5    | `.h5`, `.hdf5`, `.hdf`, `.he2`, `.he5`, or file signature |\n| NetCDF  | `.nc`, `.netcdf`, `.nc4`, `.nc3`, `.cdf`, or signature    |\n| Zarr v3 | Directory with root `zarr.json`                           |\n\n| Flag       | Effect                                                                         |\n| ---------- | ------------------------------------------------------------------------------ |\n| `--jobs N` | Parallel chunk read workers (`0` = host `available_parallelism`, capped at 64) |\n\nMore examples 
and roadmap: [`GETTING_STARTED.md`](GETTING_STARTED.md).\n\n## Documentation map\n\n| Doc                                            | Contents                                                                               |\n| ---------------------------------------------- | -------------------------------------------------------------------------------------- |\n| [`GETTING_STARTED.md`](GETTING_STARTED.md)     | Phased checklist, verification, CLI history, what's next                               |\n| [`docs/layout_v1.md`](docs/layout_v1.md)       | Wire layout, superblock, chunk index, footer history                                   |\n| [`docs/query_engine.md`](docs/query_engine.md) | Planning, execution strategies, spill allowlist, JSON security                         |\n| [`fixtures/README.md`](fixtures/README.md)     | Test tensors, convert fixtures, [`small/tet/`](fixtures/small/tet/) verify/query smoke |\n\n## Design stance (short)\n\n**Partial I/O is the default case** — mmap payload regions, touch only chunks that intersect the selection, parallel decode across disjoint tiles. Full-array loads into RAM are not required for planning or tier-A/B aggregates.\n\n**JSON is the control plane**, not the storage encoding: hosts validate input, cap size, and enforce spill path policy ([security notes](docs/query_engine.md#json-security-input-and-output)).\n\n**Non-goals (v1):** SQL-on-files, arbitrary codec plugins, GPU codecs in the file format. GPU use is “materialize on CPU (or spill), then copy to device” in bindings—see Phase 10 in [`GETTING_STARTED.md`](GETTING_STARTED.md). 
**Phase 8** (file health + wire dtypes through **`u64`/`f16`**) is **done**; **Phase 9** is richer query ops and interchange export; Python wheels and a narrow C ABI are Phase 11; the layout spec is the portable floor.\n\n## Library use\n\n```toml\n[dependencies]\ntetration = \"0.1\"\n```\n\n```rust\nuse tetration::prelude::*;\n// or: tetration::layout::mmap_file_read, tetration::query::{parse_query_json, …}\n```\n\nEmbedders get the full [`QueryResponse`](https://docs.rs/tetration/latest/tetration/query/struct.QueryResponse.html); the CLI uses [`format_query_response`](https://docs.rs/tetration/latest/tetration/query/fn.format_query_response.html) for stdout modes. **Session API:** [`TetWriterSession`](https://docs.rs/tetration/latest/tetration/catalog/struct.TetWriterSession.html), [`TetFile`](https://docs.rs/tetration/latest/tetration/catalog/struct.TetFile.html), [`execute_query_json`](https://docs.rs/tetration/latest/tetration/query/fn.execute_query_json.html) (or [`prelude`](https://docs.rs/tetration/latest/tetration/prelude/index.html)). **Verify / repair:** [`verify_tet_file`](https://docs.rs/tetration/latest/tetration/verify/fn.verify_tet_file.html), [`repair_tet_file`](https://docs.rs/tetration/latest/tetration/repair/fn.repair_tet_file.html) (`VerifyOptions::deep_decode` mirrors `tet verify --deep`). **Examples:** `cargo run --example create_and_query`, `inspect_catalog`, `session_write`, `gen_small_tet_fixtures` (tracked [`fixtures/small/tet/`](fixtures/small/tet/)). Next query-semantics work: [GETTING_STARTED.md — Phase 9](GETTING_STARTED.md#phase-9--query-ops--interchange-later).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthicclatka%2Ftetration","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fthicclatka%2Ftetration","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthicclatka%2Ftetration/lists"}