https://github.com/pyrex41/shen-rust
Shen language port in Rust — 134/134 kernel conformance; tree-walker + AOT-compiled kernel + bytecode VM + opt-in GC; first-class AWS Cedar authorization integration.
- Host: GitHub
- URL: https://github.com/pyrex41/shen-rust
- Owner: pyrex41
- License: other
- Created: 2026-06-01T14:41:10.000Z (21 days ago)
- Default Branch: main
- Last Pushed: 2026-06-21T04:20:12.000Z (1 day ago)
- Last Synced: 2026-06-21T05:21:54.882Z (1 day ago)
- Topics: authorization, bytecode-vm, cedar, compiler, functional-programming, garbage-collector, interpreter, klambda, programming-language, rust, shen, theorem-prover
- Language: Rust
- Size: 2.26 MB
- Stars: 2
- Watchers: 0
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# shen-rust
[CI workflow](https://github.com/pyrex41/shen-rust/actions/workflows/ci.yml)
A port of the [Shen](https://shenlanguage.org/) programming language to Rust,
with native [AWS Cedar](https://www.cedarpolicy.com/) authorization integration.
Shen is a functional language with an integrated logic engine and an optional,
very expressive type system (a sequent-calculus theorem prover). `shen-rust`
boots the upstream **ShenOSKernel-41.2** and passes its full conformance suite —
**134 / 134 kernel tests**, in every execution mode.
The name follows the Shen-port convention (`shen-cl`, `shen-go`, `shen-ocaml`):
the engine is `shen-rust`. The name `shen-cedar` is reused for the Shen **+**
Cedar integration that ships in `examples/`.
## Quick start
```sh
# Build the workspace
cargo build --release
# REPL
cargo run --release --bin shen-rust
# Long-running / served mode (enables the bytecode VM — ~2.3× warm)
cargo run --release --bin shen-rust -- --served
# Run the upstream Shen kernel conformance suite (expect 134/0)
cargo run --release --bin shen-rust -- --kernel-tests
# Standard Shen launcher protocol (extension-launcher.kl), as in shen-cl:
# shen-rust eval [-q] [-l FILE] [-e EXPR] [-s KEY VALUE] [-r]
# shen-rust script FILE [ARGS...]
# shen-rust repl | --help | --version
# e.g. host stage 1 of the Ratatoskr tree-shaker from its repo root:
# shen-rust eval -l ratatoskr.shen -e '(ratatoskr.shake ["tests/fib.shen"] "out")'
# (note: omit -q for workloads that write files through `pr` — upstream
# *hush* semantics silence `pr` on every stream, unlike shen-cl whose
# native pr override ignores *hush* entirely)
target/release/shen-rust eval -e "(+ 1 2)"
```
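For driving the launcher from a Rust host, a minimal sketch follows. It only assembles the `eval` flags listed in the comment above (`-q`, `-l FILE`, `-e EXPR`) and shells out with `std::process::Command`. The `EvalOpts`, `eval_args`, and `run_eval` names are hypothetical helpers, not part of the port, and the `-l` before `-e` ordering simply copies the Ratatoskr example.

```rust
use std::process::Command;

/// Options for the `eval` subcommand, mirroring the flags shown above.
/// (Hypothetical helper type, not an API of shen-rust.)
#[derive(Default)]
pub struct EvalOpts<'a> {
    pub quiet: bool,           // -q
    pub load: Option<&'a str>, // -l FILE
    pub expr: Option<&'a str>, // -e EXPR
}

/// Build the argument vector for `shen-rust eval ...`.
pub fn eval_args(opts: &EvalOpts) -> Vec<String> {
    let mut args = vec!["eval".to_string()];
    if opts.quiet {
        args.push("-q".to_string());
    }
    if let Some(file) = opts.load {
        args.push("-l".to_string());
        args.push(file.to_string());
    }
    if let Some(expr) = opts.expr {
        args.push("-e".to_string());
        args.push(expr.to_string());
    }
    args
}

/// Spawn the binary at `binary` (for example "target/release/shen-rust")
/// and return whatever it wrote to stdout.
pub fn run_eval(binary: &str, opts: &EvalOpts) -> std::io::Result<String> {
    let output = Command::new(binary).args(eval_args(opts)).output()?;
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

Calling `run_eval("target/release/shen-rust", &EvalOpts { expr: Some("(+ 1 2)"), ..Default::default() })` would mirror the last line of the shell snippet, assuming the release binary has been built.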
Rustup users get the pinned toolchain from `rust-toolchain.toml`
automatically; a Nix flake (`nix develop`) provides a dev shell.
## Documentation
| Doc | Contents |
|---|---|
| [DEMO.md](DEMO.md) | **executable tour** — every code block ran for real; re-check it with [showboat](https://github.com/simonw/showboat) (`showboat verify DEMO.md`) |
| [STATUS.md](STATUS.md) | current state, milestones, known limitations |
| [ARCHITECTURE.md](ARCHITECTURE.md) | layering, value representation, execution tiers, Cedar bridge |
| [PERFORMANCE.md](PERFORMANCE.md) | how the gap to `shen-cl` was closed (17× → ~3×), and what remains |
| [BENCHMARKS.md](BENCHMARKS.md) | the numbers, with methodology — all reproducible from `scripts/` and `benches/` |
| `design/` | internal working notes: decision records, handoffs, falsified experiments |
## Execution engine
The same Shen semantics run on tiers chosen for the workload:
| Tier | When | Notes |
|---|---|---|
| **Tree-walker** | default | Allocation-light interpreter over the KL AST. Best for one-shot runs. |
| **AOT kernel** | build time | The 21 kernel KL files are compiled to Rust ahead of time by `crates/klcompile` — control flow lowered (self-tail-calls → loops; `if`/`let`/`cond` and ~18 primitives inlined). |
| **Bytecode VM** | `--served` / `SHEN_RUST_VM=1` | Runtime closures (`defun`/`lambda`/`freeze`) compile to bytecode. **~2.3× faster than the tree-walker on warm / served workloads** (load a theory once, serve many requests), where the compile cost amortizes. Not the bare default because a one-shot run can't amortize it. See `scripts/warm-bench.sh`. |
| **AOT overlay** | opt-in, per `.shen` file | Known `.shen` files are compiled to Rust offline (`scripts/codegen-shen-aot.sh`, same klcompile) and committed; after a normal load (all side effects live) the host swaps the loaded defuns for the native versions via a verified manifest (`Interp::install_overlay_if_match` — source hash + kernel digest, silent fallback on mismatch). **~3.1× over the VM, ~11.7× over the tree-walker on served authz workloads** (`benches/authz_served.rs`). |
| **Cranelift JIT** | `--features jit`, `SHEN_RUST_JIT=1` | Experimental; native codegen for runtime closures. Wins on compute loops but was falsified for the type-checker's CPS continuations — kept gated/off. See `design/jit-productionization-plan.md`. |
Every tier is differentially tested against the tree-walker and held at 134 passing / 0 failing kernel tests.
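To illustrate what differential testing between tiers means, here is a small sketch under stated assumptions: a toy arithmetic language stands in for KL, `tree_walk` plays the reference tier, and `compile` plus `run_vm` play a bytecode tier. None of these names come from the port; the point is only the shape of the check, which feeds the same deterministic inputs to both tiers and requires identical results.

```rust
/// Toy expression language standing in for KL (hypothetical).
#[derive(Debug, Clone)]
pub enum Expr {
    Num(i64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Reference tier: evaluate the AST directly.
pub fn tree_walk(e: &Expr) -> i64 {
    match e {
        Expr::Num(n) => *n,
        Expr::Add(a, b) => tree_walk(a).wrapping_add(tree_walk(b)),
        Expr::Mul(a, b) => tree_walk(a).wrapping_mul(tree_walk(b)),
    }
}

#[derive(Debug, Clone, Copy)]
pub enum Op {
    Push(i64),
    Add,
    Mul,
}

/// Second tier, step 1: lower the AST to stack bytecode (post-order).
pub fn compile(e: &Expr, out: &mut Vec<Op>) {
    match e {
        Expr::Num(n) => out.push(Op::Push(*n)),
        Expr::Add(a, b) => {
            compile(a, out);
            compile(b, out);
            out.push(Op::Add);
        }
        Expr::Mul(a, b) => {
            compile(a, out);
            compile(b, out);
            out.push(Op::Mul);
        }
    }
}

/// Second tier, step 2: run the bytecode on an operand stack.
pub fn run_vm(code: &[Op]) -> i64 {
    let mut stack: Vec<i64> = Vec::new();
    for op in code {
        match *op {
            Op::Push(n) => stack.push(n),
            Op::Add | Op::Mul => {
                let b = stack.pop().expect("operand");
                let a = stack.pop().expect("operand");
                stack.push(match *op {
                    Op::Add => a.wrapping_add(b),
                    _ => a.wrapping_mul(b),
                });
            }
        }
    }
    stack.pop().expect("result")
}

/// Deterministic generator (xorshift), so failures are reproducible.
pub fn gen_expr(seed: &mut u64, depth: u32) -> Expr {
    *seed ^= *seed << 13;
    *seed ^= *seed >> 7;
    *seed ^= *seed << 17;
    let pick = *seed % 3;
    if depth == 0 || pick == 0 {
        return Expr::Num((*seed % 10) as i64);
    }
    let left = Box::new(gen_expr(seed, depth - 1));
    let right = Box::new(gen_expr(seed, depth - 1));
    if pick == 1 { Expr::Add(left, right) } else { Expr::Mul(left, right) }
}

/// The differential check: both tiers must agree on every generated input.
pub fn differential_check(cases: u32) -> Result<(), String> {
    let mut seed = 0x9E37_79B9_7F4A_7C15u64;
    for _ in 0..cases {
        let e = gen_expr(&mut seed, 4);
        let mut code = Vec::new();
        compile(&e, &mut code);
        let (want, got) = (tree_walk(&e), run_vm(&code));
        if want != got {
            return Err(format!("tiers disagree on {:?}: {} vs {}", e, want, got));
        }
    }
    Ok(())
}
```

The design choice worth noting is that the simplest tier acts as the oracle: any faster tier is only trusted to the extent that it reproduces the oracle's answers.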
**Memory**: `Value` is a word-sized `Copy` tagged `u64` over a non-moving
mark-sweep GC heap. Collection is opt-in (`SHEN_RUST_GC=1`): the heap stays
grow-only by default (protects one-shot latency), and in GC mode a long-running
embedding gets a **bounded heap** — collection runs at interpreter safepoints
with hybrid roots (precise interpreter tables + a conservative native-stack
scan, aarch64 macOS/Linux). On a 20k-request served loop:
grow-only ≈ 482 MB and climbing; GC ≈ 26 MB flat, wall-time neutral
(`cargo bench --bench gc_boundedness`, machine-checked).
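The two ideas in this paragraph, a word-sized tagged value and a non-moving mark-sweep heap, can be sketched in a few dozen lines. This is an illustration only: the one-bit tag layout, the cons-cell-only heap, and the explicit `roots` slice are assumptions made for brevity, and the sketch has no conservative stack scan or safepoints.

```rust
/// A word-sized `Copy` value. Low bit 0 = small integer, low bit 1 = heap
/// cell index. (Assumed layout for illustration, not the port's encoding.)
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct Value(u64);

impl Value {
    pub fn int(n: i64) -> Value {
        Value((n as u64) << 1)
    }
    pub fn cell(index: usize) -> Value {
        Value(((index as u64) << 1) | 1)
    }
    pub fn as_int(self) -> Option<i64> {
        // Arithmetic right shift restores the sign of the payload.
        if self.0 & 1 == 0 { Some((self.0 as i64) >> 1) } else { None }
    }
    pub fn as_cell(self) -> Option<usize> {
        if self.0 & 1 == 1 { Some((self.0 >> 1) as usize) } else { None }
    }
}

/// Non-moving heap of cons cells: a freed slot is reused, never relocated,
/// so a `Value` holding an index stays valid across collections.
pub struct Heap {
    cells: Vec<Option<(Value, Value)>>, // None marks a free slot
    free: Vec<usize>,
}

impl Heap {
    pub fn new() -> Heap {
        Heap { cells: Vec::new(), free: Vec::new() }
    }

    pub fn cons(&mut self, head: Value, tail: Value) -> Value {
        let cell = Some((head, tail));
        match self.free.pop() {
            Some(i) => {
                self.cells[i] = cell;
                Value::cell(i)
            }
            None => {
                self.cells.push(cell);
                Value::cell(self.cells.len() - 1)
            }
        }
    }

    pub fn live(&self) -> usize {
        self.cells.iter().filter(|c| c.is_some()).count()
    }

    /// Mark everything reachable from `roots`, then sweep the rest.
    pub fn collect(&mut self, roots: &[Value]) {
        let mut marked = vec![false; self.cells.len()];
        let mut work: Vec<Value> = roots.to_vec();
        while let Some(v) = work.pop() {
            if let Some(i) = v.as_cell() {
                if !marked[i] {
                    marked[i] = true;
                    if let Some((h, t)) = self.cells[i] {
                        work.push(h);
                        work.push(t);
                    }
                }
            }
        }
        for (i, slot) in self.cells.iter_mut().enumerate() {
            if slot.is_some() && !marked[i] {
                *slot = None;
                self.free.push(i);
            }
        }
    }
}
```

In a served loop, calling `collect` between requests with the long-lived values as roots keeps `live()` bounded by what is actually reachable, which is the effect the figures above describe. Without the call, the vector only grows.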
## Shen + Cedar
Cedar is AWS's authorization-policy language. `shen-rust` combines with it on
two levels:
**1. Cedar as first-class Shen values.** The engine embeds the `cedar-policy`
crate and exposes ~15 `cedar.*` primitives (`crates/shen-rust/src/cedar/`), so
Shen *programs* can parse, author, evaluate, and validate Cedar policies
directly — Cedar handles travel as ordinary Shen values:
```shen
(set p (cedar.parse-policy "permit(principal, action, resource);"))
(set ps (cedar.policy-set-add (cedar.empty-policy-set) (value p)))
(cedar.is-authorized (value ps) (cedar.empty-entities)
(cedar.make-request (cedar.make-entity-uid "User" "alice")
(cedar.make-entity-uid "Action" "read")
(cedar.make-entity-uid "Doc" "d1") ()))
;; => Allow
```
**2. Worked integration patterns.** `examples/shen-cedar-authz` shows three ways
the engine (in served / VM mode) and Cedar combine on the Rust side, natively in
one process:
```sh
cargo run -p shen-cedar-authz --example gate # Cedar gates Shen evaluation
cargo run -p shen-cedar-authz --example verify # Shen reasons ABOUT Cedar policies
cargo run -p shen-cedar-authz --example generate # Cedar generated FROM a Shen spec
```
- **gate** — each request `(principal, action, resource, shen-source)` is
authorized by Cedar (schema-validated); only on `Allow` does the served VM
evaluate the source.
- **verify** — Shen-authored, hierarchy-aware analysis of a Cedar `PolicySet`:
flags dead (shadowed) permits and partial conflicts that Cedar's per-request
evaluator won't surface, cross-checked against the live authorizer.
- **generate** — a committed Shen spec (`spec/authz.shen`) is the source of
truth; the Shen engine computes the transitive role-grant closure; the host
renders, strict-validates, and enforces the resulting Cedar policy.
See the crate's [README](examples/shen-cedar-authz/README.md) for details.
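The **generate** pattern hinges on a transitive closure over role grants. The sketch below shows that computation in plain Rust rather than in Shen, under assumptions that are not taken from the example crate: the spec is modelled as two maps (`inherits` and `grants`), and the `Role` and `Action` entity type names in the rendered policy text are hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// For each role, collect its own actions plus those of every role it
/// inherits from, directly or transitively. Cycles are tolerated.
pub fn effective_grants(
    inherits: &BTreeMap<&str, Vec<&str>>,
    grants: &BTreeMap<&str, Vec<&str>>,
) -> BTreeMap<String, BTreeSet<String>> {
    let mut out = BTreeMap::new();
    for &role in inherits.keys() {
        // Depth-first walk over the inheritance graph, `role` included.
        let mut reachable: BTreeSet<&str> = BTreeSet::new();
        let mut stack = vec![role];
        while let Some(r) = stack.pop() {
            if reachable.insert(r) {
                if let Some(parents) = inherits.get(r) {
                    stack.extend(parents.iter().copied());
                }
            }
        }
        let actions: BTreeSet<String> = reachable
            .iter()
            .filter_map(|r| grants.get(*r))
            .flatten()
            .map(|a| a.to_string())
            .collect();
        out.insert(role.to_string(), actions);
    }
    out
}

/// Render one Cedar `permit` statement per (role, action) pair.
/// Entity type names `Role` and `Action` are assumed for illustration.
pub fn render_cedar(effective: &BTreeMap<String, BTreeSet<String>>) -> String {
    let mut out = String::new();
    for (role, actions) in effective {
        for action in actions {
            out.push_str(&format!(
                "permit(principal in Role::\"{}\", action == Action::\"{}\", resource);\n",
                role, action
            ));
        }
    }
    out
}
```

A host following the pattern described above would still pass the rendered text through Cedar's own parser and validator before enforcing it; this sketch stops at producing the text.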
## shengen — formal backpressure
`crates/shengen-rust` compiles Shen sequent-calculus specs (`specs/`) into Rust
**guard types**, so a spec change becomes a compile error rather than silent
drift — Shen as a formal-spec language for other code. The specs are also
re-type-checked by the engine itself on every CI run (`scripts/shen-check.sh`).
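As a rough picture of what a guard type buys, consider the hand-written sketch below. It is not output of `shengen-rust`; the rule it imagines ("a discount needs a non-empty name and a percentage in 0..=100") and all type names are hypothetical. Each premise becomes a type that can only be built through a checking constructor, and the conclusion type takes those as parameters, so adding a premise to the spec changes a signature and breaks callers at compile time.

```rust
mod guards {
    /// Premise 1: the name is non-empty after trimming.
    pub struct NonEmptyName(String); // private field: no bypassing `new`

    impl NonEmptyName {
        pub fn new(raw: &str) -> Option<NonEmptyName> {
            let trimmed = raw.trim();
            if trimmed.is_empty() {
                None
            } else {
                Some(NonEmptyName(trimmed.to_string()))
            }
        }
    }

    /// Premise 2: the percentage lies in 0..=100.
    #[derive(Clone, Copy)]
    pub struct Percent(u8);

    impl Percent {
        pub fn new(raw: u8) -> Option<Percent> {
            if raw <= 100 { Some(Percent(raw)) } else { None }
        }
    }

    /// Conclusion: constructible only from values that already passed
    /// their checks, so holding a `Discount` is evidence both premises held.
    pub struct Discount {
        name: NonEmptyName,
        pct: Percent,
    }

    impl Discount {
        pub fn new(name: NonEmptyName, pct: Percent) -> Discount {
            Discount { name, pct }
        }

        pub fn describe(&self) -> String {
            format!("{}: {}% off", self.name.0, self.pct.0)
        }

        /// Apply the discount; u128 avoids overflow in the intermediate product.
        pub fn apply(&self, price_cents: u64) -> u64 {
            let cut = (price_cents as u128 * self.pct.0 as u128 / 100) as u64;
            price_cents - cut
        }
    }
}
```

Outside the `guards` module, `guards::Percent(250)` does not compile because the field is private, which is the property that makes the type a guard rather than a plain wrapper.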
## Layout
```
crates/shen-rust/ the engine: KL runtime, kernel boot, tree-walker, VM, AOT, JIT
crates/klcompile/ build-time KL → Rust AOT compiler for the kernel
crates/shengen-rust/ Shen sequent-calc specs → Rust guard types (backpressure)
bin/shen-rust/ the REPL / CLI (`--served`, `--kernel-tests`)
examples/shen-cedar-authz/ Shen + Cedar integration (gate / verify / generate)
kernel/ vendored ShenOSKernel-41.2 (klambda + conformance tests)
specs/ backpressure specs in Shen sequent-calculus syntax
design/ architecture + performance design notes
scripts/ gates.sh (CI), benches, cross-port + warm benchmarks
```
## Performance
The reference target is the upstream `shen-cl` (SBCL) port. Two metrics, two
answers (paired interleaved runs, 2026-06-10, Apple M-series):
**One-shot** (`--kernel-tests`, boot + load + run + exit):
| Config | Wall | vs shen-cl |
|---|---:|---:|
| shen-cl (SBCL) | ≈ 1.0 s | 1× |
| shen-rust, bare | ≈ 3.0 s | **~3.0× off** |
| shen-rust + tc-cache (`SHEN_RUST_TC_CACHE=`, warm) | ≈ 1.0 s | **at parity / ahead** |
The bare gap is structural — the boxed-`Value` + interpreted-dispatch model,
not a single hot spot. A 2026-06-10 profiling round stripped the runtime's
own overheads (split-TLS heap access, direct-mapped call-target interning,
thin-LTO build) for ~18 % cumulative; what remains is the model, where each
local lever measures ≤ ~8 %. The tc-cache win is typecheck-verdict
memoization, not raw speed; it's off by default.
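Verdict memoization of the kind described here can be pictured with a short sketch. The assumptions are loud ones: the cache key is the full source text, the verdict is a two-valued enum, and the type checker is passed in as a closure. How the real tc-cache keys, stores, or invalidates entries is not stated above and is not modelled.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    WellTyped,
    IllTyped,
}

/// Remembers the verdict for each source text already checked.
/// (Hypothetical structure; a real cache would likely also key on the
/// kernel version and persist across processes.)
pub struct TcCache {
    verdicts: HashMap<String, Verdict>,
    pub misses: usize,
}

impl TcCache {
    pub fn new() -> TcCache {
        TcCache { verdicts: HashMap::new(), misses: 0 }
    }

    /// Return the cached verdict, or run `typecheck` once and remember it.
    pub fn check<F>(&mut self, source: &str, typecheck: F) -> Verdict
    where
        F: FnOnce(&str) -> Verdict,
    {
        if let Some(v) = self.verdicts.get(source) {
            return *v;
        }
        self.misses += 1;
        let verdict = typecheck(source);
        self.verdicts.insert(source.to_string(), verdict);
        verdict
    }
}
```

This makes the distinction in the paragraph concrete: the second check of the same source costs a map lookup, so the win comes from skipping work, not from doing the work faster.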
**Served** (long-lived process: load once, serve many evaluations) — the
funded direction, where the tiers stack:
| Tier | vs tree-walk loaded | vs VM loaded |
|---|---:|---:|
| bytecode VM (`--served`) | ~2.3× | 1× |
| **AOT overlay** (committed `.shen` → native) | ~5.3–11.7× | **~1.9–3.2×** |
On served spec code the overlay leaves the interpreter entirely (the
SBCL-shaped answer for that niche): `benches/authz_served.rs` measures
~3.1× over the VM-loaded arm. For long-running served processes,
`SHEN_RUST_GC=1` bounds the heap (see "Execution engine" above). Full history
and the GC / value-representation / JIT ladder live in `PERFORMANCE.md`,
`BENCHMARKS.md`, and `design/perf-*.md`.
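The overlay is only installed when a manifest matches (source hash plus kernel digest, with silent fallback otherwise). A minimal sketch of that decision follows. It is not `Interp::install_overlay_if_match`: the `Manifest` fields, the `choose_tier` function, and the use of the standard library's `DefaultHasher` as a stand-in for whatever digest the port uses are all assumptions.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// What was recorded when the native code was generated offline.
pub struct Manifest {
    pub source_hash: u64,
    pub kernel_digest: u64,
}

/// Stand-in digest. `DefaultHasher` is not cryptographic; a real manifest
/// would presumably use a stronger hash.
pub fn digest(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

#[derive(Debug, PartialEq, Eq)]
pub enum Tier {
    NativeOverlay,
    Interpreted,
}

/// Use the native overlay only if both the loaded source and the running
/// kernel match what the overlay was compiled against; otherwise fall back
/// without raising an error.
pub fn choose_tier(manifest: &Manifest, loaded_source: &[u8], kernel: &[u8]) -> Tier {
    let source_ok = manifest.source_hash == digest(loaded_source);
    let kernel_ok = manifest.kernel_digest == digest(kernel);
    if source_ok && kernel_ok {
        Tier::NativeOverlay
    } else {
        Tier::Interpreted
    }
}
```

Falling back silently trades visibility for safety: a stale overlay can never run against edited source, at the cost that a mismatch shows up only as lost speed.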
## Development
```sh
./scripts/gates.sh # full CI: fmt+clippy, build, test, shen-check, audits, kernel-tests
```
### Two test tiers (port vs. canonical)
The suite is split into two distinct tiers, kept separate on purpose:
- **Port-authored tests** — `cargo test --workspace`. Rust unit tests plus the
integration suites under `crates/shen-rust/tests/`. These are OURS: they
exercise the evaluator, primitives, I/O, error catchability, the
reader/eval robustness corpus, the stdlib, the bytecode-VM differential, and
the CLI launcher. The port-authored suites that mirror the shen-go test
layout, category for category, are:
| file | covers |
| --- | --- |
| `cli_launcher.rs` | spawns the release binary: `eval -e`/`-l`, `script`, `--version`/`--help`, piped-EOF clean exit (timeout-guarded), the `-q`/`*hush*` file-write divergence, adversarial input without a backtrace |
| `primitives_coverage.rs` | 50+ assertions over the native KL primitives via `Interp::eval` (arithmetic/float, cons/hd/tl, predicates, string ops, intern/value/set, absvector, hash, get-time) |
| `io_coverage.rs` | open/close, write→read round trip, read-byte EOF = −1, read-file, get-time |
| `error_robustness.rs` | the error-catchability contract on BOTH the tree-walker and the bytecode-VM path |
| `reader_fuzz.rs` | a deterministic seeded corpus of malformed input through reader+eval; asserts no panic, only catchable errors (no `rand`, no cargo-fuzz) |
| `library.rs` | booted-kernel stdlib functions (reverse/append/map/remove/element?/length/…) |
- **Canonical kernel suite** — the `kernel-tests` gate (`scripts/kernel-tests.sh`).
This runs the VENDORED ShenOSKernel conformance suite (`kernel/tests/`,
134 tests) end to end through the binary and is NOT modified by the port.
It is the conformance bar; the port-authored tier is the regression net
around the parts the kernel suite doesn't reach (CLI, adversarial inputs,
the VM path, I/O edges).
Coverage of the port-authored tier: `scripts/coverage.sh` (a `cargo-llvm-cov`
wrapper that skips gracefully if the tool isn't installed).
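The approach attributed to `reader_fuzz.rs` above (a deterministic seeded corpus, no `rand`, asserting that malformed input yields catchable errors and no panic) can be sketched as follows. The `read` function is a toy parenthesis-and-string checker standing in for the real reader, and the alphabet, lengths, and xorshift generator are illustrative choices, not those of the actual suite.

```rust
/// Toy reader (hypothetical): counts top-level forms, rejects unbalanced
/// parentheses and unterminated strings with an error instead of panicking.
pub fn read(input: &str) -> Result<usize, String> {
    let (mut depth, mut forms, mut in_string) = (0usize, 0usize, false);
    for c in input.chars() {
        match c {
            '"' => in_string = !in_string,
            _ if in_string => {}
            '(' => depth += 1,
            ')' => {
                if depth == 0 {
                    return Err("unexpected ')'".to_string());
                }
                depth -= 1;
                if depth == 0 {
                    forms += 1;
                }
            }
            _ => {}
        }
    }
    if in_string {
        return Err("unterminated string".to_string());
    }
    if depth != 0 {
        return Err("unclosed '('".to_string());
    }
    Ok(forms)
}

/// xorshift step: tiny, dependency-free, fully determined by the seed.
fn next(state: &mut u64) -> u64 {
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    *state
}

/// Generate the same corpus for the same seed, every run.
pub fn corpus(seed: u64, cases: usize) -> Vec<String> {
    const ALPHABET: &[u8] = b"()\"\\ ab1+-[]";
    let mut state = seed.max(1); // xorshift must not start at zero
    let mut out = Vec::with_capacity(cases);
    for _ in 0..cases {
        let len = (next(&mut state) % 24) as usize;
        let mut s = String::new();
        for _ in 0..len {
            let i = (next(&mut state) % ALPHABET.len() as u64) as usize;
            s.push(ALPHABET[i] as char);
        }
        out.push(s);
    }
    out
}

/// Run the corpus; any panic is reported as a failure with the input.
pub fn fuzz(seed: u64, cases: usize) -> Result<(usize, usize), String> {
    let (mut ok, mut err) = (0, 0);
    for input in corpus(seed, cases) {
        match std::panic::catch_unwind(|| read(&input)) {
            Ok(Ok(_)) => ok += 1,
            Ok(Err(_)) => err += 1,
            Err(_) => return Err(format!("reader panicked on {:?}", input)),
        }
    }
    Ok((ok, err))
}
```

Because the corpus is a pure function of the seed, a failing input can be reproduced by rerunning with the same seed, which is the practical reason to prefer this over a randomized fuzzer in a CI gate.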
## License
[BSD-3-Clause](LICENSE) for the port code (© 2026 Reuben Brooks). The vendored
Shen kernel under `kernel/` is © 2010–2022 Mark Tarver and retains its own
BSD-3-Clause license (`kernel/LICENSE.txt`).