{"id":50963614,"url":"https://github.com/lantos1618/zen-holotype","last_synced_at":"2026-06-18T17:33:09.017Z","repository":{"id":361424005,"uuid":"1254301476","full_name":"lantos1618/zen-holotype","owner":"lantos1618","description":"zen-holotype: an everything-is-a-type compiler for a Zen-flavoured language. One trie where module imports, type-checking, and pointer-direction safety are a single fits() operation. tree-sitter front end, C back end.","archived":false,"fork":false,"pushed_at":"2026-06-15T21:05:33.000Z","size":4170,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-15T23:08:19.231Z","etag":null,"topics":["compiler","programming-language","tree-sitter","type-system"],"latest_commit_sha":null,"homepage":null,"language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lantos1618.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-30T11:53:11.000Z","updated_at":"2026-06-13T13:25:47.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/lantos1618/zen-holotype","commit_stats":null,"previous_names":["lantos1618/zen-holotype"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/lantos1618/zen-holotype","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lantos1618%2Fzen-holotype","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lantos1618%2Fzen-holotype/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lantos1618%2Fzen-holotype/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lantos1618%2Fzen-holotype/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lantos1618","download_url":"https://codeload.github.com/lantos1618/zen-holotype/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lantos1618%2Fzen-holotype/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34501475,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-18T02:00:06.871Z","response_time":128,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["compiler","programming-language","tree-sitter","type-system"],"created_at":"2026-06-18T17:33:05.509Z","updated_at":"2026-06-18T17:33:09.009Z","avatar_url":"https://github.com/lantos1618.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# zen\n\n**zen** is a tiny, **self-hosted** compiler for a small [Zen](https://github.com/lantos1618/zenlang)-flavoured\nlanguage, built to test one idea: **pin down what every value _is_ with type structure,\nand you lock out everything it isn't.** The compiler already applies that to names,\nfunctions, generics, and numeric fits; pointer direction/nullability are still converging\non the same model.\n\nThe compiler is written in Zen and compiles itself: `cc` builds a `zenc` binary from\ncommitted C, and `zenc` re-emits that C byte-for-byte. C is the intentional\nintermediate/bootstrap target today — not a defect or a host-language fallback. There is\n**no Python and no tree-sitter** in the build — see [Build \u0026 run](#build--run).\n\n\u003e Every path resolves to exactly **one** canonical node — the single definition that\n\u003e *is* the meaning of a name — and diamond imports collapse onto it.\n\n## What we're actually doing: structure *is* the constraint\n\nThe target is not a pile of checks that hunt for bad programs — no separate null pass,\nborrow pass, and linker-shaped namespace pass. We do the opposite: **describe exactly\nwhat each thing is, and let that description lock out everything it isn't.** A type is a\nclosed door; \"checking\" is confirming the key fits the lock.\n\nTake one annotation. The intended shape of `Ptr\u003cVec\u003e` is not \"a pointer\" — it's three\nlocks at once:\n\n```\n   Ptr \u003c Vec \u003e\n    │     └──── points at THIS type only      (a different struct? rejected)\n    ├──────── read-only   →  mutation locked out      (write needs MutPtr)\n    └──────── non-null    →  absence locked out       (null needs Option\u003c…\u003e)\n```\n\nThe desired capability model is **opt-in**. Didn't write `MutPtr`? Mutation should be\nlocked out. Didn't write `Option`? Null should be unrepresentable. The same move scales:\na **path** locks identity (`core.vec.Vec` is one node, so you can't mean a different\n`Vec`), and a **function signature** locks its call sites (only values whose locks match\nthe parameter get in).\n\nCurrent implementation note: the parser accepts `Ptr\u003cT\u003e`, `MutPtr\u003cT\u003e`, and `RawPtr\u003cT\u003e`,\nbut the checker/backend currently collapse them to one pointer type and enforce invariant\npointee equality. Nullable values are modeled through library enums/raw pointers today; the\nfull pointer-direction/nullability lattice is the direction, not fully enforced shipping\nbehavior yet.\n\nSo the three things a compiler usually does separately — resolve names, check types, prove\npointer safety — are meant to become the single act of **fitting a key to a lock**. The\ncurrent compiler already uses that shape for function calls, generics, numeric widening,\nand invariant pointer pointees.\n\n## How it works\n\n```\n   lex.zen ──tokens──► parse_*.zen ──► compiler.genc AST ──► check.zen ──► genc_emit.zen ──► C ──► cc\n   (all compiler stages are ordinary Zen, in zen/compiler/)\n```\n\n```\n   core/vec.zen   ops.zen   main.zen\n        │\n        ▼  compiler.lex + compiler.parse  (lexer + recursive-descent parser, in Zen)\n   ┌──────────┐\n   │   AST    │   compiler.genc Expr / Stmt / Decl values\n   └────┬─────┘\n        │  insert every decl at its path\n        ▼\n  ╔═══════════ Namespace — ONE trie (path = identity) ═══════╗\n  ║  root                                                    ║\n  ║   ├─ core.vec.Vec         (struct)                       ║   diamond imports\n  ║   ├─ ops.len   ops.cap    (fns)                          ║   collapse to ONE\n  ║   └─ main.area   main.main  (fns)                        ║   node, for free\n  ╚═════════════════════╤════════════════════════════════════╝\n                        │  resolve refs · infer() each body · fits() each call\n                        ▼\n              ┌─────────────────────┐\n     PASS ✓ ◄─┤  fits(given, want)? ├─► FAIL ✗   reported, excluded from codegen\n              └──────────┬──────────┘   type mismatch ✗\n                         ▼               (numeric widening + structural equality today)\n              ┌─────────────────────┐\n              │     lower to C      │   pointers erase to C pointers\n              └──────────┬──────────┘\n                        ▼  cc\n                    build/vecdemo   ──►   12\n```\n\n## Why it pays off\n\nFolding name-resolution, type-checking, and pointer-safety toward one `fits()` relation\nisn't just tidy — it buys real things:\n\n- **Imports are becoming structural.** Today `std.internal.resolve` flattens the `std`/`compiler`\n  import closure, dedups by module and top-level name, and gives deterministic\n  first-definition behavior. Namespace binds now prefix direct module exports so two bound\n  modules can safely share short names. The trie/path model is still the broader direction.\n- **Pointer safety is moving into type-checking.** Numeric widening and invariant pointer\n  pointees are checked by `fits()` today. Pointer direction and nullability are spelled in\n  source, but full enforcement of those axes is still pending.\n- **Low runtime cost.** Implemented pointer forms erase to plain C pointers, and checked\n  program structure lowers directly to C. Library `Opt`/`Result` values remain explicit\n  user-level enums where tags are part of the chosen representation.\n- **It stays small, and it's its own proof.** The checker and validator are written in Zen,\n  and the compiler compiles itself (a deterministic fixpoint).\n\nThe trade: it leans on **nominal** identity (a type *is* its path) and asks you to write\nevery pointer's direction and nullability down. As the checker catches up to that surface,\nthose axes can stay in one pass instead of becoming separate analyses.\n\n## The whole compiler, in four ideas\n\n**1. The path/trie model is the namespace direction.**\nConceptually, a file's path *is* its name. `core/vec.zen` defining `Vec` becomes the\none node `core.vec.Vec` — so every import of it lands on that same node.\n\n```\ncore/vec.zen     Vec*: { len: i32, cap: i32 }      →  defines node  core.vec.Vec\n\nops/area.zen     { Vec } = core.vec    ─┐\nmain.zen         { Vec } = core.vec    ─┴─►  both resolve to that ONE node\n                                             (a diamond import — never duplicated)\n\ntarget conflict?  two files both define  core.vec.Vec\n```\n\nThe trie model is the direction for names and imports. Today `std.internal.resolve` is the\nself-hosted loader that walks a program's `{ … } = std.X` imports, gathers the\ntransitive closure, dedups module/name collisions, and hands `zenc` one flat module —\nsee [Modules \u0026 imports](#modules--imports).\n\n**2. Pointers are types. `fits()` is where that logic is landing.**\nThe target shape has direction (`Ptr`/`MutPtr`/`RawPtr`) and nullability\n(`Option\u003cT\u003e`, no bare null) as axes of the type, so the same check that resolves\neverything else can also lock pointer direction and reject nulls. Today, `fits()`\nenforces numeric widening plus structural equality, including invariant pointer\npointees; direction/nullability enforcement is still pending.\n\n```\n DIRECTION              NULLABILITY\n   MutPtr   (subtype)     Option\u003cT\u003e   nullable\n     |                       |\n    Ptr      read-only       T         nonnull\n```\n\n```\ntarget fits(given, want):\n    nonnull T    where Option\u003cT\u003e wanted   -\u003e ok      (T \u003c= Option\u003cT\u003e)\n    Option\u003cT\u003e    where plain    T wanted   -\u003e REJECT  (the null guard)\n    MutPtr\u003cT\u003e    where Ptr\u003cT\u003e   wanted     -\u003e ok      (MutPtr \u003c= Ptr)\n    Ptr\u003cT\u003e       where MutPtr\u003cT\u003e wanted    -\u003e REJECT  (direction locked)\n```\n\n**3. The type system lowers to plain C.** Implemented pointer forms erase to C pointers,\nand checked structure lowers directly. The source language still branches with `.match`\nonly; the C backend is free to lower checked\nmatches to target-level `if`/`else` or `?:` because those are backend details, not Zen\nsyntax.\n\n**4. The compiler is Zen, and self-hosting.** Lexer, parser, checker, and the C\nbackend are all ordinary Zen modules in `zen/compiler/` (`lex`, `parse*`, `check`, `genc*`).\n`zenc` compiles them to C; fed its **own** sources it re-emits byte-for-byte the committed\n`bootstrap/zenc.gen.c` — a deterministic **fixpoint**. New backend = new walk over the same\nAST (a partial JavaScript backend, `compiler.genjs`, already exists alongside the\nintentional bootstrap C backend, `compiler.genc`).\n\n## Build \u0026 run\n\nThe compiler is the `zenc` binary. `cc` builds it from committed C; nothing else is needed.\n\n```sh\nmake -f bootstrap/Makefile zenc        # cc bootstrap/{zenc.gen.c,zenrt.c,driver.c} -o zenc\n./zenc path/to/flat.zen \u003e out.c        # plain emit: read flat Zen → emit C on stdout\necho 'add* = (a: i32, b: i32) i32 { a + b }' | ./zenc \u003e out.c\n```\n\nPlain emit mode is deliberately small: it expects **one already-flat module**, does not load\n`std` imports from disk, and is not the validating user-program path. Use the checked CLI\nmodes for programs:\n\n```sh\n./zenc check prog.zen                  # resolve std imports, type-check, no binary\n./zenc build prog.zen -o prog          # resolve std imports, type-check, emit C, link with cc\n./zenc run prog.zen                    # same as build, then run the temporary binary\n```\n\n`build`/`run` require `main = () i32 { … }`; `check` accepts library-like modules without\n`main`. A program with `{ … } = std.X` imports is flattened by the self-hosted loader inside\nthose checked modes — see [Modules \u0026 imports](#modules--imports).\n\n`check`/`build`/`run` also accept a project directory containing `zen.toml`:\n\n```toml\npackage = \"hello\"\nroot = \"src\"\nmain = \"main.zen\"\nout = \"hello\"\nccflags = \"native.c\"\n```\n\nThe compiler resolves that to `\u003cproject\u003e/\u003croot\u003e/\u003cmain\u003e`. `build \u003cproject-dir\u003e` uses the\nmanifest's `out` path when `-o` is omitted. `ccflags` is passed through to `cc`, so a project\ncan add native support files or compiler/linker flags while this package layer is still small.\n\n**Regenerate the committed C** after editing any graph-listed bootstrap compiler source under\n`zen/compiler/{lex,parse*,check*,check_validate,genc*}.zen` or the loader sources\n`zen/std/{io,resolve}.zen` — the binary reads `bootstrap/sources.txt` and rebuilds its own C,\nwith no Python. The manifest order is checked against the resolver graph's SCC order.\n\n```sh\nmake -f bootstrap/Makefile regen       # builds zenc, then: ./zenc --build-self bootstrap/zenc.gen.c .\ngit diff --quiet bootstrap/zenc.gen.c  # the fixpoint: the regenerated C must be byte-identical\n```\n\n**Tests** — the **binary-only oracle**. `pytest` here is just the test *runner*: it drives\nthe compiled `zenc` (and a check-mode build of it) as subprocesses and imports **zero**\ncompiler code. It is the correctness reference while a Zen-native oracle is brought up.\n\n```sh\npip install -r requirements-dev.txt    # only pytest (no mypy, no compiler deps)\npytest tests/                          # emit/run parity, reject-parity, the fixpoint, modules, traits, genjs\n```\n\n## Foreign bindings \u0026 the prelude\n\nA program is built from three layers — what's *implicitly there*, what *just links*, and\nwhat you must *import*. Keeping that boundary explicit is the point.\n\n- **The compiler-emitted head.** Every emitted translation unit opens with the `zslice`\n  typedef (`typedef struct { void* ptr; int64_t len; } zslice;` — the `[T]` fat pointer)\n  and the C `stdint`/`stdbool` types. You write nothing to get these.\n- **Intrinsics — handled inline by the backend**, never declared or imported:\n  `slice`, `addr`, `load`, `store`, `offset`, `cstr`, `null_ptr`, `load_i64`, `store_i64`,\n  `atomic_add_i64`, and `sizeof(T)`. They lower to raw C (a pointer deref, a struct\n  literal), so they need no binding.\n- **Foreign bindings — a bodyless function IS a C extern.** `malloc = (n: i64) RawPtr\u003cu8\u003e`\n  with no `{ … }` body binds the libc symbol `malloc`; the checker learns the signature and\n  the backend emits a forward declaration. libc symbols (`malloc`, `putchar`, `strlen`, …)\n  then **just link** — the system headers define them. No `extern` keyword.\n- **The header *is* a function.** `zen/std/io/c.zen`'s `libc() [Decl]` builds those bodyless\n  bindings *as AST* and `compiler.genc.genModule(libc())` emits exactly the C prototypes a TU\n  needs — the bindings live in **one** Zen module instead of being re-prototyped in every\n  file. (`std.mem.raw`, `std.io.file`, `std.core.result` still re-declare the handful of\n  symbols they each need at the top, which is the scatter `std.io.c` is gathering.)\n- **std modules — you must import them.** `std.mem.raw`, `std.text.str`, `std.text.string`, `std.mem.alloc`,\n  `std.collections.vec`, `std.collections.iter`, … are ordinary Zen you bring in with `{ … } = std.X`; they are\n  checked and lowered like your own code.\n\nThe ownership rule: Zen-owned memory takes an explicit allocator from program setup; FFI handles sit\nbelow that allocator discipline and must be wrapped with the matching release operation as soon as\nthey cross back in. The checker now rejects same-body local use after `Own\u003cT\u003e.release_in(...)`,\n`Rc\u003cT\u003e.drop_in(...)`, or `Arc\u003cT\u003e.drop_in(...)`; see **[MEMORY_MODEL.md](MEMORY_MODEL.md)** for the exact current\nscope and remaining gaps.\n\n## Modules \u0026 imports\n\nImports are a destructuring of a module path: `{ a, b } = std.X` binds `a` and `b` from\n`zen/std/X.zen`. The file-based CLI modes (`zenc check`, `zenc build`, `zenc run`, `zenc emit`) call\n`zen/std/internal/resolve.zen` before parsing, so std imports resolve from disk and the program is\nthen parsed as one flattened module:\n\n- it reads the program's `{ … } = std.X` import lines, follows each edge to\n  `zen/std/X.zen`, and gathers the **transitive closure**;\n- it strips the import lines and concatenates each module's body **once** (per-module dedup\n  breaks import cycles; a final per-**name** pass keeps the first definition of each\n  top-level name, so a cross-module clash like `string.free` vs `mem.free` resolves the same\n  way \"nearest defining module wins\" would);\n- namespace binds such as `c = std.io.c` or `left = left` rewrite qualified calls to\n  alias-prefixed symbols, so bound modules can both export a natural name like `thing` or\n  `Box` without colliding in the flattened output;\n- the result is one flat module handed to the normal parse/check/codegen pipeline.\n\nThe bare filter form (`zenc file.zen` or stdin) remains lower-level: it expects the source to\nalready be flat and emits C without the `std` import-loading/check/build wrapper. Use\n`zenc emit \u003cfile.zen\u003e` when inspecting generated C for normal source files with imports or\nnamespace binds.\nThe resolver also understands `compiler.X` for internal compiler/std dependencies such as\n`std.internal.ast` building values from `compiler.genc`; normal user-facing imports should stay in\nthe `std` namespace.\n\n## Errors are values\n\nZen is `.match`-only — **no exceptions, no stack unwinding** (hidden control flow is\nbanned). A fallible call returns a `Result\u003cT, E\u003e` (`std.core.result`): `.Ok(T)` or `.Err(E)`,\nwhich the caller `.match`es. `.match` *is* the catch; `return .Err(e)` propagates by value.\nAn optional value is `Opt\u003cT\u003e` (`.Some` / `.None`); the standard FFI error is `IoError`. The\nboundary helpers `ok_if` / `ok_ptr` lift a raw C sentinel (a negative rc, a null pointer)\ninto a `Result`. `panic` is the explicit, greppable abort for invariant breaks — *not* the\ndefault path.\n\n## What it covers\n\nThe language now covers structs and **generic data types** (`Box\u003cT\u003e` — the type-arg\ninferred from field values, monomorphized to concrete C), **user enums** (`|`-separated\nvariants, C tagged unions), **generic functions** (`id\u003cT\u003e` — type-args inferred by\nunification, **monomorphized**), **traits / constrained generics** (keyword-free: a trait\nis a record of signatures `Area*: { … }`, an impl is `Vec.impl(Area, { … })`,\n`\u003cT: Trait\u003e` — bound methods dispatch to the concrete impl; an unsatisfied bound is a type\nerror), **`.match`** with payload-binding, exhaustiveness, and **literal patterns** on\n`i32`/`bool` (so with **recursion** the language is Turing-complete — `fact`/`fib` compile\nand run; there is no source-level `if` statement), **return-type inference** (omit the\nreturn type and it's inferred from the body, across calls), `Ptr/MutPtr/RawPtr` and\n`Option`, `i32`/`i64`/`u8`/`bool` with widening, the full operator set\n(`+ - * / %  ==  \u003c \u003e \u003c= \u003e=  \u0026\u0026 ||  !`, each operand-checked), `x := v` let-bindings, the\nsingle `loop` iteration construct, mutation, slices `[T]`, a heap-allocating `String`/`Vec`\non an explicit allocator, and **metaprogramming as values** (build AST with `std.internal.ast` →\nemit with `compiler.genc.genModule` — no `@emit` pragma). Checked CLI errors report the\nsource path, source location for expression errors and trait-conformance impl errors, a stable\nerror kind (`error[arity]`), a message, a source-line caret when the source maps cleanly, and\na hint; the checker now exposes the CLI-compatible\n`CheckDiagnostic { code, kind, source_offset, span_width, count, message, hint }` plus the\nfirst-class Zen value `Diagnostic { code, kind, span: SourceSpan, count, message, hint }`.\n\nSee **[SPEC.md](SPEC.md)** for the current language behavior,\n**[FEATURES.md](FEATURES.md)** for the full inventory,\n**[ERROR_POLICY.md](ERROR_POLICY.md)** for the stdlib Result/error contract,\n**[JS_BACKEND.md](JS_BACKEND.md)** for the experimental JavaScript backend scope,\n**[ARCHITECTURE.md](ARCHITECTURE.md)** for how the self-hosted compiler is structured,\n**[VISION.md](VISION.md)** for the why, and **[CHANGELOG.md](CHANGELOG.md)** for history.\n\n## Layout\n\n| path | role |\n|---|---|\n| `zen/compiler/lex.zen` | the lexer — `scan(src, pos)` over a `str`, slice-free |\n| `zen/compiler/parse.zen` + `parse_expr` / `parse_stmt` / `parse_type` | recursive-descent parser → `compiler.genc` AST |\n| `zen/compiler/check.zen` + `check_validate.zen` | resolver + the `fits()` validator |\n| `zen/compiler/genc.zen` + `mono` + `genc_emit` | shared AST + monomorphization + C backend |\n| `zen/compiler/genjs.zen` | an experimental JavaScript backend over the *same* AST |\n| `zen/std/{mem,str,string,alloc,vec,iter}.zen` | the runtime stdlib (allocator, slices, strings, iterators) |\n| `zen/std/{c,result,cown,drop,io,resolve}.zen` | bindings, errors-as-values, FFI-memory rule, module loader |\n| `bootstrap/` | `zenc.gen.c` (committed emitted C) + `sources.txt` (graph/SCC-checked bootstrap manifest) + `zenrt.c`/`driver.c`/`Makefile` |\n| `tests/` | the binary-only oracle (pytest as runner; imports no compiler code) |\n\nInspired by treeform's [jsony](https://github.com/treeform/jsony) (parse straight\ninto typed objects, hook-based) and the syntax of\n[zenlang](https://github.com/lantos1618/zenlang).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flantos1618%2Fzen-holotype","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flantos1618%2Fzen-holotype","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flantos1618%2Fzen-holotype/lists"}