{"id":50780982,"url":"https://github.com/kylefoxaustin/ratchet","last_synced_at":"2026-06-12T03:03:27.866Z","repository":{"id":359012594,"uuid":"1222096243","full_name":"kylefoxaustin/ratchet","owner":"kylefoxaustin","description":"SoC sizing engine: workload model + KPI evaluator. Powers nightjar, keyhole, skippy.","archived":false,"fork":false,"pushed_at":"2026-06-05T06:07:41.000Z","size":168,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-05T08:05:54.687Z","etag":null,"topics":["benchmarking","edge-ai","npu","performance-modeling","python","sizing","soc"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kylefoxaustin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-27T03:22:07.000Z","updated_at":"2026-06-05T06:07:44.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/kylefoxaustin/ratchet","commit_stats":null,"previous_names":["kylefoxaustin/ratchet"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/kylefoxaustin/ratchet","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kylefoxaustin%2Fratchet","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kylefoxaustin%2Fratchet/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kylefoxaustin%2Fratchet/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kylefoxaustin%2Fratchet/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kylefoxaustin","download_url":"https://codeload.github.com/kylefoxaustin/ratchet/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kylefoxaustin%2Fratchet/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34226631,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-12T02:00:06.859Z","response_time":109,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmarking","edge-ai","npu","performance-modeling","python","sizing","soc"],"created_at":"2026-06-12T03:03:26.188Z","updated_at":"2026-06-12T03:03:27.860Z","avatar_url":"https://github.com/kylefoxaustin.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ratchet\n\nGeneric SoC sizing engine — the shared foundation for an edge-SoC sizing\necosystem. Pure-Python primitives for what-if analysis of edge-class\napplication processors: a canonical NPU/GPU tier registry, a 4-level dtype\ncapability taxonomy, an LLM performance-projection API, an anchor-secrets\noverlay for private silicon measurements, an LLM catalog schema, calibration\nprovenance, plus the carried-forward sliders, KPIs, subsystem demand\ncalculators, instrumentation probes, and Parquet workload-record schema.\n\nDesigned to be shared across multiple sizer sites:\n\n- [`nightjar`](https://github.com/kylefoxaustin/nightjar) — drone software\n  stack + edge-SoC sizer (rescue-bird use case)\n- `personal-ai-assistant-sizer` (PAI sizer) — LLM-only edge sizer\n- `keyhole` / `keyhole-sizer` — video sizer\n- `personal-ai-framework` (Skippy) — agentic-AI framework\n- `drone-sizer` — planned\n\nratchet owns the canonical engine. Each consuming site composes its own visible\ntier ladder from the registry and supplies its own model catalog, subsystem\ndemand calculators, KPI definitions, and slider catalog. Surfaces pin\n`ratchet\u003e=0.2.0,\u003c0.3.0`; engine additions follow a rule-of-three (no new\nprimitive until ≥2 surfaces demonstrate the need).\n\n## Canonical tiers\n\n`TIERS` is the canonical registry (8 named tiers). Surfaces select a visible\nsubset; they never define new `Hardware` except via `make_custom_tier()` or the\n`hw_with_memory()` overlay.\n\n| Tier                          | Class     | Notes                                  |\n|-------------------------------|-----------|----------------------------------------|\n| NPU Low-LP4                   | edge      | LPDDR4X                                |\n| NPU Low-LP5-32bit             | edge      | LPDDR5, 32-bit bus                     |\n| NPU Low-LP5-64bit             | edge      | LPDDR5, 64-bit bus                     |\n| NPU Low-LP5X                  | edge      | LPDDR5X                                |\n| NPU i.MX 95 (ground truth)    | edge      | measured silicon (anchor attachment)   |\n| NPU Mid                       | edge      | LPDDR5X, mid-class NPU                  |\n| NPU High                      | edge      | LPDDR5X, high-class NPU                 |\n| RTX 5090 (reference, measured)| reference | Blackwell sm_120; FP4 reference silicon |\n\n## Install (dev)\n\n```bash\ngit clone https://github.com/kylefoxaustin/ratchet.git\ncd ratchet\npip install -e \".[dev,gpu]\"\npytest          # 230 tests\n```\n\nConsumer sites depend on ratchet via local pip install during development:\n\n```bash\n# from a sibling directory\npip install -e ../ratchet\n```\n\n## Layout\n\n```\nratchet/\n├── tiers/        Hardware dataclass + canonical TIERS registry +\n│                 memory-upgrade overlay + custom-tier factory\n├── precision/    4-level capability taxonomy + dtype dispatch +\n│                 deployment-path classifier\n├── projection/   LLM projection API (4-path cascade), result types,\n│                 memory feasibility, workload-pattern overlay\n├── anchors/      anchor-secrets loader + post-projection overlay for\n│                 private silicon measurements (runtime, never in source)\n├── catalog/      LLMModel schema + quant byte tables (content per-surface)\n├── calibration/  CalibrationSource provenance + silicon-class defaults\n├── engine/       primitives: Slider, SubsystemDemand, KpiResult, llm_demand\n├── whatif/       one consumer of the engine: point/sweep/pareto runner\n├── probes/       Parquet writer + per-op / GPU / NVENC / glass-to-glass probes\n└── schemas/      WorkloadRecord dataclass + PyArrow schema\n```\n\nSurfaces import the public API from `ratchet` directly (not from submodules).\nThe `engine/` and `whatif/` split is deliberate: a sizer can use the engine\nprimitives for one-shot evaluation without going through the what-if runner.\n\n## Usage\n\n```python\nfrom ratchet import project_llm, NPU_MID, Projected, WontFit, DtypeMismatch\nfrom ratchet.catalog.reference import QWEN3_30B_A3B_MOE_Q4\n\nresult = project_llm(QWEN3_30B_A3B_MOE_Q4, NPU_MID, \"rag_qa\",\n                     prompt_tokens=4800, decode_tokens=400)\nmatch result:\n    case Projected(decode_tok_s=t, source=s):\n        print(f\"{t} tok/s ({s})\")\n    case WontFit(required_gb=r, available_gb=a):\n        print(f\"Won't fit: {r:.1f} GB needed, {a:.1f} available\")\n    case DtypeMismatch(retargeting_hint=h):\n        print(h)\n```\n\n### Precision: capability vs. runtime realization\n\nThe capability taxonomy answers *\"can this silicon execute this dtype, and how\nwell?\"* (`tensor_native` / `tensor_compat` / `cuda_core` / `unsupported`),\nseparately from whether a given **runtime** can realize the win. FP4 is the\nclearest case: native on Blackwell sm_120, but its compute win is realized only\non a mature runtime (vLLM ≥ 0.22 / TensorRT-LLM) — on an immature one\n(llama.cpp today) FP4 behaves like INT4 weight-only (ADR 016).\n\n```python\nfrom ratchet import RTX_5090_REFERENCE, deployment_path_for_tier\n\ndeployment_path_for_tier(RTX_5090_REFERENCE, \"nvfp4\", \"fresh_compile\")\n# -\u003e \"native_fast\"\ndeployment_path_for_tier(RTX_5090_REFERENCE, \"nvfp4\", \"fresh_compile\", \"immature\")\n# -\u003e \"fp4_runtime_immature\"   (silicon is native, runtime can't realize the win)\n\n# project_llm(model, hw, workload, fp4_runtime_maturity=\"immature\")\n#   models an FP4 model as INT4 weight-only (prefill falls to the bf16 floor).\n#   Default is \"mature\" — non-breaking.\n```\n\n## Status\n\n**v0.2.6** — current. Engine consumed by pinned versions across surfaces;\nbackward compatible with v0.1.0 (all prior imports still work). See\n[`docs/decisions/`](docs/decisions) for the **16 ADRs** covering engine-level\ndesign choices, and [`docs/design/`](docs/design) for the design specs.\n\nVersion history:\n\n- **v0.2.0** — engine-consolidation release: absorbs the canonical tier\n  registry, capability taxonomy, projection API, anchor-secrets system, and LLM\n  catalog schema the ecosystem surfaces had evolved independently.\n- **v0.2.1** — anchor-loader correction (Amendment 3): the real `npu_anchors.py`\n  contract is canonical, superseding the design-doc §9 sketch.\n- **v0.2.2** — tier registry corrected to production silicon specs.\n- **v0.2.3** — BW-scale the private anchor overlay on memory-upgrade clones.\n- **v0.2.4** — correct NPU i.MX 95 TDP 8 → 10 W (Amendment 6).\n- **v0.2.5** — NVFP4/MXFP4 (FP4) added as a first-class compute dtype, distinct\n  from weight-only INT4 (a memory format that dequantizes to bf16).\n- **v0.2.6** — FP4's compute win is runtime-conditional (ADR 016): new\n  `fp4_runtime_maturity` projection axis, defaulting to `\"mature\"`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkylefoxaustin%2Fratchet","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkylefoxaustin%2Fratchet","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkylefoxaustin%2Fratchet/lists"}