{"id":50866487,"url":"https://github.com/karpeleslab/kataan","last_synced_at":"2026-06-15T02:00:55.547Z","repository":{"id":362387944,"uuid":"1258813479","full_name":"KarpelesLab/kataan","owner":"KarpelesLab","description":null,"archived":false,"fork":false,"pushed_at":"2026-06-11T03:08:46.000Z","size":6568,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-06-11T03:14:33.831Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/KarpelesLab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":"ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"github":["MagicalTux"]}},"created_at":"2026-06-04T00:16:56.000Z","updated_at":"2026-06-11T03:08:50.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/KarpelesLab/kataan","commit_stats":null,"previous_names":["karpeleslab/kataan"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/KarpelesLab/kataan","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KarpelesLab%2Fkataan","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KarpelesLab%2Fkataan/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KarpelesLab%2Fkataan/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KarpelesLab%2Fkataan/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/KarpelesLab","download_url":"https://codeload.github.com/KarpelesLab/kataan/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KarpelesLab%2Fkataan/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34344440,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-15T02:00:07.085Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-06-15T02:00:54.719Z","updated_at":"2026-06-15T02:00:55.537Z","avatar_url":"https://github.com/KarpelesLab.png","language":"Rust","funding_links":["https://github.com/sponsors/MagicalTux"],"categories":[],"sub_categories":[],"readme":"# Kataan\n\nA high-performance **JavaScript (ECMAScript) engine written in pure Rust**, with\nno foreign code on the critical path. Kataan is usable three ways — as a Rust\nlibrary, as a C library, and as a standalone command-line tool — the same\ntri-modal model proven out in the sibling projects\n[`purecrypto`](https://github.com/KarpelesLab/purecrypto) (cryptography) and\n[`rsurl`](https://github.com/KarpelesLab/rsurl) (HTTP/curl).\n\n\u003e **Status: running and broadly conformant; advanced tiers in active build-out.**\n\u003e The lexer and the full ECMAScript parser are complete, and **two execution\n\u003e engines** run real programs and are checked to agree on every test:\n\u003e\n\u003e - a **tree-walking interpreter** (the default / corpus engine), and\n\u003e - a **register bytecode VM** (the primary path for `kataan run` and the C ABI),\n\u003e   compiling nearly all of the common language directly — every operator,\n\u003e   objects/arrays, method calls with `call`/`apply`/`bind`, `new`/`new.target`,\n\u003e   all loops + `for-of`/`for-in`/`switch`/`try`-`catch`-`finally`,\n\u003e   closures (incl. mutual recursion), destructuring, rest/spread, **classes**\n\u003e   with `extends`/`super` and getters/setters, generators (incl. `yield*` and\n\u003e   `.throw()`), and `async`/`await` — falling back to the tree-walker for the\n\u003e   handful of constructs it doesn't yet compile.\n\u003e\n\u003e A **dual-path Test262-style conformance corpus (520/520) passes on both\n\u003e engines**, covering closures, classes/inheritance (incl. `extends` of native\n\u003e errors), optional chaining, the iterator protocol, `Map`/`Set`/`WeakMap`,\n\u003e `Symbol` (incl. `Symbol.hasInstance`), `BigInt`, `Promise` + async/await,\n\u003e `Proxy`/`Reflect` (incl. the `ownKeys` trap driving `Object.keys`/`values`/\n\u003e `entries`/`for-in`), typed arrays, `Date`, an in-house `RegExp`, and a large\n\u003e standard library (Math, JSON, Object/Array/String/Number). Compiled bytecode can\n\u003e be serialized, reloaded, and run without the source.\n\u003e\n\u003e Three advanced tiers are real and tested, though each has named work remaining:\n\u003e\n\u003e - a **machine-code JIT** (x86-64 / Linux, behind `jit`) with an optimizing\n\u003e   integer path (four-pass optimizer + register allocator) and a float path\n\u003e   covering `+ - * / %`, comparisons, control flow, and the SSE-expressible\n\u003e   `Math` intrinsics (`sqrt`/`abs`/`min`/`max`/`floor`/`ceil`/`trunc`), emitting\n\u003e   into W^X memory via raw syscalls; object/string ops stay interpreted;\n\u003e - a pure-Rust, `no_std` **WebAssembly engine** — full MVP plus sign-extension,\n\u003e   saturating conversion, bulk-memory, multi-value, and typed structured\n\u003e   control — with a JS↔WASM boundary (`validate`/`compile`/`instantiate`, the\n\u003e   `Module`/`Instance`/`Global`/`Memory` objects, host-function imports, and\n\u003e   stateful instances), driven by a `.wast`/WAT spec harness (a spec-derived\n\u003e   corpus, not yet the full upstream suite);\n\u003e - a **zero-copy \"D′\" snapshot tier** atop the moving GC: a verified codec that\n\u003e   `mmap`-reloads a heap (eleven reference cell kinds, cross-kind cycles,\n\u003e   insertion-order-preserving) and runs a restored closure both in place and\n\u003e   reloaded into a fresh runtime.\n\u003e\n\u003e Kataan works as a CLI/REPL, a Rust library, and a C library (`kt_eval`). See\n\u003e the [roadmap](ROADMAP.md) for the remaining road to a complete engine.\n\n## Why\n\nModern JavaScript engines (V8, JavaScriptCore, SpiderMonkey) all rely on the\nsame handful of techniques. Kataan commits to the full set from the\narchitecture stage rather than retrofitting them:\n\n- **NaN-boxed values** — every JS value in 64 bits, `Copy`, dense on the stack.\n- **Hidden classes (shapes) + inline caches** — property access becomes a slot\n  load, not a hash probe; the single biggest lever for real-world JS speed.\n- **Register-based bytecode VM** — fewer instructions than a stack VM, and\n  JIT-friendly by construction.\n- **Interned atoms + rope strings** — O(1) key comparison, non-quadratic\n  string building.\n- **A precise, generational, moving GC** — bump allocation makes `new` nearly\n  free.\n- **Tiered execution** — a fast interpreter first, then a baseline JIT, then an\n  optimizing JIT driven by inline-cache type feedback.\n\nThe language core is **sans-I/O** and `no_std + alloc`; the host runtime (event\nloop, timers, `fetch`, `crypto`, modules) is a separate layer on top, so the\nengine stays embeddable. See [`ROADMAP.md`](ROADMAP.md) for the road ahead — the\nremaining work to a complete JS+WASM engine and the design invariants behind it.\n\n## Pure Rust, no foreign code\n\nKataan depends on no C libraries. Where it needs cryptography or networking it\nreuses sibling **pure-Rust** Karpelès Lab crates:\n\n- [`purecrypto`](https://github.com/KarpelesLab/purecrypto) — `crypto.subtle` /\n  WebCrypto, `crypto.getRandomValues`, `randomUUID`, and TLS.\n- [`rsurl`](https://github.com/KarpelesLab/rsurl) — HTTP/HTTPS transport behind\n  `fetch` and the Node `http(s)` compatibility layer.\n\n`unsafe` is quarantined: the crate is `unsafe_code = \"deny\"` (not `forbid`),\nand only the `ffi` module plus a small, audited set of VM hot-path primitives\nopt back in with a scoped `#[allow(unsafe_code)]` and a safety comment.\n\n## Try it\n\nThe CLI runs JavaScript today:\n\n```console\n$ cargo run -- run -e '\nclass Animal { constructor(n){ this.n = n } speak(){ return `${this.n} makes a sound` } }\nclass Dog extends Animal { speak(){ return `${this.n} barks` } }\nconsole.log(new Dog(\"Rex\").speak());\nconsole.log([1,2,3,4].filter(x =\u003e x % 2).map(x =\u003e x*x).reduce((a,b)=\u003ea+b, 0));\nconsole.log(JSON.stringify({ ok: true, items: [...new Set([1,1,2,3])] }));\n'\nRex barks\n10\n{\"ok\":true,\"items\":[1,2,3]}\n```\n\nIt also exposes each pipeline stage, and an interactive REPL:\n\n```console\n$ cargo run -- lex    -e 'x =\u003e x * 2'  # token stream\n$ cargo run -- parse  -e 'x =\u003e x * 2'  # AST dump\n$ cargo run -- disasm -e '1 + 2 * 3'   # register bytecode\n$ cargo run -- repl                    # interactive session\n$ cargo run -- --help\n```\n\nThe `disasm` command shows the register bytecode the compiler emits:\n\n```console\n$ cargo run -- disasm -e 'let s = 0; let i = 0; while (i \u003c 3) { s += i; i += 1; } s'\nchunk #0 \"\u003cmain\u003e\"  (regs=14, params=0)\n     0  LoadInt     r0, 0\n     ...\n     6  Lt          r6, r4, r5\n     7  JumpIfFalse r6, +9\n     ...\n    16  Jump        -13\n    18  Return      r13\n```\n\n## Use as a Rust library\n\n```rust\nuse kataan::parser::Parser;\nuse kataan::interp::Interp;\n\nlet program = Parser::parse_program(\"const sq = x =\u003e x * x; sq(8)\").unwrap();\nlet mut interp = Interp::new();\nassert_eq!(interp.run(\u0026program).unwrap().to_js_string(), \"64\");\n```\n\nThe lower stages are available directly too:\n\n```rust\nuse kataan::lexer::{Lexer, TokenKind};\n\nlet tokens = Lexer::new(\"let answer = 42;\").tokenize().unwrap();\nassert_eq!(tokens.first().unwrap().text(\"let answer = 42;\"), \"let\");\nassert_eq!(tokens.last().unwrap().kind, TokenKind::Eof);\n```\n\n### Feature flags\n\n| Feature   | Default | Description                                                        |\n|-----------|:-------:|--------------------------------------------------------------------|\n| `std`     |   ✓     | Standard library; implies `alloc`. Needed by the host runtime/CLI. |\n| `alloc`   |   ✓     | Heap-backed types; the minimum for the pure language core.         |\n| `regex`   |   ✓     | In-house regular-expression engine.                                |\n| `intl`    |   ✓     | In-house `Intl`-lite (collation, number/date formatting).          |\n| `module`  |   ✓     | ESM + CommonJS module loader.                                      |\n| `host`    |   ✓     | Host runtime: event loop, timers, console, encoding, URL, streams. |\n| `fetch`   |         | `fetch` / Node `http(s)` over `rsurl`.                             |\n| `crypto`  |         | `crypto.getRandomValues` / WebCrypto over `purecrypto`.            |\n| `jit`     |         | Machine-code JIT (x86-64/Linux): optimizing integer + float paths. |\n| `ffi`     |         | The C ABI (the only place broad `unsafe` is allowed).             |\n| `cli`     |   ✓     | The `kataan` command-line tool.                                   |\n\nBuild the bare `no_std` language core with:\n\n```console\ncargo build --no-default-features --features alloc\n```\n\n## Use as a C library\n\n```console\ncargo rustc --lib --release --features ffi --crate-type staticlib   # libkataan.a\ncargo rustc --lib --release --features ffi --crate-type cdylib      # libkataan.so\n```\n\nThe header is [`include/kataan.h`](include/kataan.h); a runnable example lives\nin [`tests/ffi_smoke.c`](tests/ffi_smoke.c). The C ABI follows the `purecrypto`\nconventions — `KtStatus` return codes, the in/out length convention, opaque\nhandles, and a panic catch at every boundary.\n\n## License\n\nMIT © 2026 Karpelès Lab Inc. See [LICENSE](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkarpeleslab%2Fkataan","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkarpeleslab%2Fkataan","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkarpeleslab%2Fkataan/lists"}