{"id":50677675,"url":"https://github.com/kassane/mos-toolchains-research","last_synced_at":"2026-06-08T16:35:02.484Z","repository":{"id":362333215,"uuid":"1258342219","full_name":"kassane/mos-toolchains-research","owner":"kassane","description":"AI Research about LLVM-MOS based toolchains support","archived":false,"fork":false,"pushed_at":"2026-06-03T22:04:49.000Z","size":130,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-03T22:19:05.507Z","etag":null,"topics":["c","cpp","d","llvm","llvm-mos","mos6502","research","rust","toolchain","zig"],"latest_commit_sha":null,"homepage":"","language":"Shell","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"unlicense","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kassane.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-06-03T13:43:55.000Z","updated_at":"2026-06-03T22:04:52.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/kassane/mos-toolchains-research","commit_stats":null,"previous_names":["kassane/mos-research-ai","kassane/mos-toolchains-research"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/kassane/mos-toolchains-research","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kassane%2Fmos-toolchains-research","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kassane%2Fmos-toolchains-research/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kassane%2Fmos-toolchains-research/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kassane%2Fmos-toolchains-research/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kassane","download_url":"https://codeload.github.com/kassane/mos-toolchains-research/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kassane%2Fmos-toolchains-research/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34071657,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-08T02:00:07.615Z","response_time":111,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["c","cpp","d","llvm","llvm-mos","mos6502","research","rust","toolchain","zig"],"created_at":"2026-06-08T16:35:01.962Z","updated_at":"2026-06-08T16:35:02.473Z","avatar_url":"https://github.com/kassane.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# mos-research-ai\n\nResearch \u0026 test bed for **cross-language FFI on the MOS 6502** (and the wider\n65xx family), riding the shared **LLVM-MOS** backend. Four LLVM frontends in\nscope — clang (C/C++), rustc, zig, ldc2 (D) — all *unofficial / non-upstream*\nports that reach the 6502 through some build of `llvm-mos/llvm-mos`.\n\nThe central question:\n\n\u003e Four LLVM-frontend toolchains target the 6502 through forks of LLVM-MOS. Does\n\u003e that shared backend actually give a shared ABI — can the languages call each\n\u003e other on a real (simulated) 6502, and can their LLVM IRs and binaries be mixed?\n\nShort answer, established empirically here (every claim is a re-runnable\nexperiment that executes on the `mos-sim` 6502 simulator):\n\n\u003e **Yes for scalars (every width), pointers and callbacks — five co-linkable\n\u003e language objects (C, C++, Rust, D, Zig) share one C ABI and run correctly in a\n\u003e single 6502 binary (`experiments/02, 13`).** The holes are narrow and specific:\n\u003e (1) *types* — the keyword `int` is 16-bit in C but 32-bit in D/Rust/Zig (Zig's\n\u003e `c_int` used to be 32-bit too, now fixed to 16-bit = C's); (2) *struct layout* — Zig's\n\u003e `extern struct` over-aligns (`u32`→4-byte) so structs corrupt unless fields are\n\u003e `align(1)`; (3) *one call-ABI corner, now closed* — **by-value structs ≤4 bytes**\n\u003e are decomposed into registers (the MOS C ABI) by **all five** frontends; D and Rust\n\u003e were the indirect holdouts and **both were fixed in their callconv rebuilds**\n\u003e (`experiments/12`). **Fix: cross the boundary with fixed-width scalars; passing\n\u003e aggregates by pointer stays the version-proof choice (no ABI-stability promise).**\n\nSee **[Research.md](Research.md)** for the write-up and **[docs/](docs/)** for the\nevidence. Toolchain quirks worth knowing up front are in **[CLAUDE.md](CLAUDE.md)**.\n\n## The four toolchains\n\n| Lang | Toolchain | Version | LLVM | Triple / CPU |\n|------|-----------|---------|------|--------------|\n| C / C++ | LLVM-MOS SDK | clang **23.0.0git** | **23** | `--target=mos -mcpu=mos6502` |\n| Rust | Rust-MOS | rustc **1.98.0-dev** | **23** | `--target mos-unknown-none -Ctarget-cpu=mos6502` |\n| Zig | Zig-MOS | **0.17.0-mos-dev** | **22** | `-target mos-freestanding -mcpu mos6502` |\n| D | LDC2-MOS | LDC **1.42.0** (DMD 2.112.1) | **23** | `--mtriple=mos -mcpu=mos6502 -mattr=…` |\n\nAll four emit the **byte-identical** LLVM data layout, which is the whole basis\nfor interop:\n\n```\ne-m:e-p:16:8-p1:8:8-i16:8-i32:8-i64:8-f32:8-f64:8-a:8-Fi8-n8\n```\n\n(little-endian; 16-bit pointers; **address space 1 = 8-bit zero-page pointers**;\nevery scalar byte-aligned; 8-bit native int.) Two LLVM *versions* are in play —\nSDK+Rust+D on 23, Zig on 22 — so they form two **clusters** for bitcode/LTO\npurposes (docs/04), but ELF objects link freely across both.\n\n## Quickstart\n\n```bash\nscripts/setup.sh          # download the 4 toolchains into /home/user/tools (~360 MB)\nsource scripts/env.sh     # export $ZIG $LDC $RUSTC $SDKBIN $MOS_MATTR …\nscripts/run-all.sh        # build+run every experiment on mos-sim (expect 0 failing)\n```\n\nEach `experiments/NN-*/run.sh` is self-contained and ends by executing its\nbinary on `mos-sim` (exit code = its own pass/fail). The toolchains live\n**outside** the repo and are never committed (`.gitignore`).\n\n## Experiments\n\n| # | Dir | What it proves |\n|---|-----|----------------|\n| 01 | `ir-datalayout` | All 4 frontends emit one identical datalayout; `int`→`i16`(C) vs `i32`(D/Rust/Zig) |\n| 02 | `ffi-matrix` | C+C+++Rust+D+Zig in **one** 6502 binary, run on sim, incl. D→Rust \u0026 Zig→C cross-calls |\n| 03 | `int-width` | Runtime `sizeof` table: the `int`/`long`/`c_int` divergences (and what agrees) |\n| 04 | `llvm-ir-mix` | LLVM-23 `clang`/`lld` merges Zig's LLVM-22 textual IR (C/D/Rust are 23); LTO across languages |\n| 05 | `codegen-cycles` | Same loop, 5 languages: identical result, different instruction count \u0026 cycles |\n| 06 | `cpu-features` | `ldc -mcpu` leaves features empty (ldc#4919 class); `-mattr` is benign for base mos6502 |\n| 07 | `comptime-abi` | Compile-time ABI assertions; C/D/Rust byte-align vs Zig natural-align |\n| 08 | `struct-abi` | Struct round-trip: Zig `extern struct` corrupts (over-aligns); `align(1)` fixes it; zero-page AS |\n| 09 | `zero-cost` | Monomorphized generic + higher-order callable: C++ ties C exactly, lambdas inline away |\n| 10 | `tmp-parity` | factorial(10) via constexpr/consteval/CTFE/const-fn folds at `-O0` (lang guarantee); C doesn't |\n| 11 | `dwarf-parity` | Debug info: clang=DWARF5/others=DWARF4, addr_size=4 (deliberate), no CFI (unforceable; designed upstream); `returnaddress` gap both clusters; Rust-dev G_UCMP |\n| 12 | `byval-struct` | By-value struct ABI: **all five** decompose ≤4B into registers — D \u0026 Rust were the indirect holdouts, both fixed in their callconv rebuilds |\n| 13 | `scalar-callback-abi` | i64 round-trip, signed negate, function-pointer callbacks: shared across all 5 |\n| 14 | `feature-probe` | Capability matrix: inline-asm (all 4 — rust via `asm_experimental_arch`), interrupts, atomics(8-bit), multi-CPU, SIMD ✗ |\n| 15 | `std-support` | Stdlib reach: C libc, C++ STL subset, Zig std (richest), Rust `alloc::Vec`, D core.stdc+ldc |\n| 16 | `mos-sim-realworld` | Interactive stdin→stdout filter (libc `getchar` + Zig FFI uppercase) + `$FFF0` cycles |\n| 17 | `zigcc-rust-linker` | `zig cc` as Rust's linker: compiles MOS objs but hits the LLVM-22/23 bitcode cluster wall |\n| 18 | `embed-file` | Compile-time file embedding 6 ways (`#embed`/`include_bytes!`/`import`/`@embedFile`/`.incbin`) → identical bytes |\n| 19 | `reflection` | Compile-time reflection: D \u0026 Zig enumerate fields/names; C/C++/Rust manage only `sizeof` |\n| 20 | `mmio-hal` | MMIO register parity (mos-hardware/mega65-libc pattern): all 5 frontends emit identical `sta $fff9` |\n| 21 | `safety` | `@safe`/borrow rejection battery (D \u0026 Rust) vs C (none); runtime bounds check: Rust traps, Zig traps w/ `mos_panic` (default handler crashes LLVM-22) |\n| 22 | `raii-scopeguard` | Scope-guard/RAII LIFO cleanup in all 5 (zero-cost); Zig `errdefer`, D move-semantics \u0026 `extern(C++,class)` |\n| 23 | `dynamic-debug` | Runtime PC→source on the sim: `mos-sim --profile`/`--trace` PCs symbolize back via `llvm-symbolizer` (DWARF line tables are usable) |\n| 24 | `benchmarks` | Canonical kernels (BYTE sieve / recursive fib / CRC-16) in all 5: per-kernel cycles + size (codegen spread, size/speed inverts); Zig `std.hash.crc` + `std.crypto` SHA-256 + `std.math` on a 6502; 6502-vs-65C02 |\n| 25 | `global-asm-symbols` | Real-world asm in all 5: the llvm-mos-sdk iNES global-asm linker-symbol trick (`asm(\".globl x\\nx=N\")` → absolute symbol) — clang/C++ file-scope, Rust `global_asm!`, Zig `comptime asm`, D `__asm_trusted`; verified absolute via `llvm-nm` + read on mos-sim; + inline-asm MMIO putchar w/ clobbers |\n| 26 | `float-runtime` | Float math at **runtime**: soft-float `+-*/` runs in all; `sqrt` lowers to a `sqrtf` libcall the SDK lacks → Zig/D/C compile but don't link; the Rust **`libm` crate** runs sqrt natively + (exported as C `sqrtf`) gives all four **parity** → `sqrt(2)·100=141` on mos-sim |\n| 27 | `importc` | D's **ImportC** (pass a `.c` to `ldc2`) compiles C for MOS with a **16-bit `int`** (rebuilt LDC fixes the width — dlang-mos-hello-world#1); needs `-gcc=\u003cmos clang\u003e`. IR vs `mos-clang`/`zig cc`: int agrees, by-value structs lower by D rules (first-class aggregate) not C (scalars) — but C↔ImportC FFI (incl. by-value struct) still works on mos-sim |\n\n\u003e This repo studies *unofficial* 6502 support. None of these targets are upstream\n\u003e in clang/rustc/zig/ldc; pin one toolchain set (the versions above) — there is no\n\u003e cross-version ABI-stability promise on LLVM-MOS.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkassane%2Fmos-toolchains-research","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkassane%2Fmos-toolchains-research","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkassane%2Fmos-toolchains-research/lists"}