https://github.com/pyrex41/shen-rust

Shen language port in Rust — 134/134 kernel conformance; tree-walker + AOT-compiled kernel + bytecode VM + opt-in GC; first-class AWS Cedar authorization integration.
authorization bytecode-vm cedar compiler functional-programming garbage-collector interpreter klambda programming-language rust shen theorem-prover
Host: GitHub
URL: https://github.com/pyrex41/shen-rust
Owner: pyrex41
License: other
Created: 2026-06-01T14:41:10.000Z (about 1 month ago)
Default Branch: main
Last Pushed: 2026-06-21T04:20:12.000Z (25 days ago)
Last Synced: 2026-06-21T05:21:54.882Z (25 days ago)
Topics: authorization, bytecode-vm, cedar, compiler, functional-programming, garbage-collector, interpreter, klambda, programming-language, rust, shen, theorem-prover
Language: Rust
Size: 2.26 MB
Stars: 2
Watchers: 0
Forks: 0
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE

# shen-rust

[![CI](https://github.com/pyrex41/shen-rust/actions/workflows/ci.yml/badge.svg)](https://github.com/pyrex41/shen-rust/actions/workflows/ci.yml)

A port of the [Shen](https://shenlanguage.org/) programming language to Rust,
with native [AWS Cedar](https://www.cedarpolicy.com/) authorization integration.

Shen is a functional language with an integrated logic engine and an optional,
very expressive type system (a sequent-calculus theorem prover). `shen-rust`
boots the upstream **ShenOSKernel-41.2** and passes its full conformance suite —
**134 / 134 kernel tests**, in every execution mode.

The name follows the Shen-port convention (`shen-cl`, `shen-go`, `shen-ocaml`):
the engine is `shen-rust`. The name `shen-cedar` is reused for the Shen **+**
Cedar integration that ships in `examples/`.

## Quick start

```sh
# Build the workspace
cargo build --release

# REPL
cargo run --release --bin shen-rust

# Long-running / served mode (enables the bytecode VM — ~2.3× warm)
cargo run --release --bin shen-rust -- --served

# Run the upstream Shen kernel conformance suite (expect 134/0)
cargo run --release --bin shen-rust -- --kernel-tests

# Standard Shen launcher protocol (extension-launcher.kl), as in shen-cl:
#   shen-rust eval [-q] [-l FILE] [-e EXPR] [-s KEY VALUE] [-r]
#   shen-rust script FILE [ARGS...]
#   shen-rust repl | --help | --version
# e.g. host stage 1 of the Ratatoskr tree-shaker from its repo root:
#   shen-rust eval -l ratatoskr.shen -e '(ratatoskr.shake ["tests/fib.shen"] "out")'
# (note: omit -q for workloads that write files through `pr` — upstream
#  *hush* semantics silence `pr` on every stream, unlike shen-cl whose
#  native pr override ignores *hush* entirely)
target/release/shen-rust eval -e "(+ 1 2)"
```

Rustup users get the pinned toolchain from `rust-toolchain.toml`
automatically; a Nix flake (`nix develop`) provides a dev shell.

## Documentation

| Doc | Contents |
|---|---|
| [DEMO.md](DEMO.md) | **executable tour** — every code block ran for real; re-check it with [showboat](https://github.com/simonw/showboat) (`showboat verify DEMO.md`) |
| [STATUS.md](STATUS.md) | current state, milestones, known limitations |
| [ARCHITECTURE.md](ARCHITECTURE.md) | layering, value representation, execution tiers, Cedar bridge |
| [PERFORMANCE.md](PERFORMANCE.md) | how the gap to `shen-cl` was closed (17× → ~3×), and what remains |
| [BENCHMARKS.md](BENCHMARKS.md) | the numbers, with methodology — all reproducible from `scripts/` and `benches/` |
| `design/` | internal working notes: decision records, handoffs, falsified experiments |

## Execution engine

The same Shen semantics run on tiers chosen for the workload:

| Tier | When | Notes |
|---|---|---|
| **Tree-walker** | default | Allocation-light interpreter over the KL AST. Best for one-shot runs. |
| **AOT kernel** | build time | The 21 kernel KL files are compiled to Rust ahead of time by `crates/klcompile` — control flow lowered (self-tail-calls → loops; `if`/`let`/`cond` and ~18 primitives inlined). |
| **Bytecode VM** | `--served` / `SHEN_RUST_VM=1` | Runtime closures (`defun`/`lambda`/`freeze`) compile to bytecode. **~2.3× faster than the tree-walker on warm / served workloads** (load a theory once, serve many requests), where the compile cost amortizes. Not the bare default because a one-shot run can't amortize it. See `scripts/warm-bench.sh`. |
| **AOT overlay** | opt-in, per `.shen` file | Known `.shen` files are compiled to Rust offline (`scripts/codegen-shen-aot.sh`, same klcompile) and committed; after a normal load (all side effects live) the host swaps the loaded defuns for the native versions via a verified manifest (`Interp::install_overlay_if_match` — source hash + kernel digest, silent fallback on mismatch). **~3.1× over the VM, ~11.7× over the tree-walker on served authz workloads** (`benches/authz_served.rs`). |
| **Cranelift JIT** | `--features jit`, `SHEN_RUST_JIT=1` | Experimental; native codegen for runtime closures. Wins on compute loops but was falsified for the type-checker's CPS continuations — kept gated/off. See `design/jit-productionization-plan.md`. |

Every tier is differentially tested against the tree-walker and held at 134/0.
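
The AOT overlay's verified swap (source hash + kernel digest, silent fallback on mismatch) can be pictured with a minimal sketch. Everything here is an assumption for illustration: `OverlayManifest`, `digest`, `choose_tier`, and `Tier` are hypothetical names, not the signature of `Interp::install_overlay_if_match`, and `DefaultHasher` stands in for whatever hash the real manifest uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical manifest committed next to the generated native code.
pub struct OverlayManifest {
    pub source_hash: u64,   // hash of the `.shen` source the overlay was built from
    pub kernel_digest: u64, // digest of the kernel it was built against
}

/// Stand-in digest. `DefaultHasher` is not stable across Rust releases,
/// so a real committed manifest would need a stable hash instead.
pub fn digest(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

#[derive(Debug, PartialEq, Eq)]
pub enum Tier {
    Interpreted,
    NativeOverlay,
}

/// Swap to native code only when both fingerprints match what was loaded.
/// On any mismatch, fall back silently to the already-loaded definitions.
pub fn choose_tier(manifest: &OverlayManifest, loaded_source: &[u8], kernel: &[u8]) -> Tier {
    let source_ok = manifest.source_hash == digest(loaded_source);
    let kernel_ok = manifest.kernel_digest == digest(kernel);
    if source_ok && kernel_ok {
        Tier::NativeOverlay
    } else {
        Tier::Interpreted
    }
}
```

Because the check runs after a normal load, a stale overlay costs only the speedup: the interpreted definitions are already in place and remain correct.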

**Memory**: `Value` is a word-sized `Copy` tagged `u64` over a non-moving
mark-sweep GC heap. Collection is opt-in (`SHEN_RUST_GC=1`): the heap stays
grow-only by default (protects one-shot latency), and in GC mode a long-running
embedding gets a **bounded heap** — collection runs at interpreter safepoints
with hybrid roots (precise interpreter tables + a conservative native-stack
scan, aarch64 macOS/Linux). On a 20k-request served loop:
grow-only ≈ 482 MB and climbing; GC ≈ 26 MB flat, wall-time neutral
(`cargo bench --bench gc_boundedness`, machine-checked).
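
As a rough illustration of a word-sized `Copy` tagged `u64`, here is a minimal sketch. The bit layout (low 3 bits as tag), the two tags, and the method names are assumptions made for this example; the engine's actual representation is described in `ARCHITECTURE.md`.

```rust
/// Hypothetical layout: low 3 bits hold the tag, upper 61 bits the payload.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct Value(u64);

const TAG_BITS: u32 = 3;
const TAG_MASK: u64 = (1 << TAG_BITS) - 1;
const TAG_INT: u64 = 0; // payload is a small signed integer
const TAG_HEAP: u64 = 1; // payload is an index into the GC heap

impl Value {
    /// Pack a small integer (assumed to fit in 61 bits).
    pub fn from_int(n: i64) -> Value {
        Value(((n as u64) << TAG_BITS) | TAG_INT)
    }

    /// Pack a heap index. A non-moving heap means this index stays valid
    /// across collections, so copies of the word never need fixing up.
    pub fn from_heap_index(index: u32) -> Value {
        Value(((index as u64) << TAG_BITS) | TAG_HEAP)
    }

    pub fn as_int(self) -> Option<i64> {
        if self.0 & TAG_MASK == TAG_INT {
            // Arithmetic shift on i64 restores the sign.
            Some((self.0 as i64) >> TAG_BITS)
        } else {
            None
        }
    }

    pub fn as_heap_index(self) -> Option<u32> {
        if self.0 & TAG_MASK == TAG_HEAP {
            Some((self.0 >> TAG_BITS) as u32)
        } else {
            None
        }
    }
}
```

The point of the shape is that a value is one machine word and can be copied freely, while the tag check replaces a pointer dereference for small immediates.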

## Shen + Cedar

Cedar is AWS's authorization-policy language. `shen-rust` combines with it on
two levels:

**1. Cedar as first-class Shen values.** The engine embeds the `cedar-policy`
crate and exposes ~15 `cedar.*` primitives (`crates/shen-rust/src/cedar/`), so
Shen *programs* can parse, author, evaluate, and validate Cedar policies
directly — Cedar handles travel as ordinary Shen values:

```shen
(set p (cedar.parse-policy "permit(principal, action, resource);"))
(set ps (cedar.policy-set-add (cedar.empty-policy-set) (value p)))
(cedar.is-authorized (value ps) (cedar.empty-entities)
   (cedar.make-request (cedar.make-entity-uid "User" "alice")
                       (cedar.make-entity-uid "Action" "read")
                       (cedar.make-entity-uid "Doc" "d1") ()))
;; => Allow
```

**2. Worked integration patterns.** `examples/shen-cedar-authz` shows three ways
the engine (in served / VM mode) and Cedar combine on the Rust side, natively in
one process:

```sh
cargo run -p shen-cedar-authz --example gate      # Cedar gates Shen evaluation
cargo run -p shen-cedar-authz --example verify    # Shen reasons ABOUT Cedar policies
cargo run -p shen-cedar-authz --example generate  # Cedar generated FROM a Shen spec
```

- **gate** — each request `(principal, action, resource, shen-source)` is
  authorized by Cedar (schema-validated); only on `Allow` does the served VM
  evaluate the source.
- **verify** — Shen-authored, hierarchy-aware analysis of a Cedar `PolicySet`:
  flags dead (shadowed) permits and partial conflicts that Cedar's per-request
  evaluator won't surface, cross-checked against the live authorizer.
- **generate** — a committed Shen spec (`spec/authz.shen`) is the source of
  truth; the Shen engine computes the transitive role-grant closure; the host
  renders, strict-validates, and enforces the resulting Cedar policy.
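
The **gate** pattern reduces to "authorize first, evaluate only on `Allow`". The sketch below shows that control flow with plain closures standing in for Cedar and for the served VM. `Decision`, `Request`, and `gate` are hypothetical names for this illustration and are not the example crate's API.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Allow,
    Deny,
}

/// One request, mirroring `(principal, action, resource, shen-source)`.
pub struct Request<'a> {
    pub principal: &'a str,
    pub action: &'a str,
    pub resource: &'a str,
    pub shen_source: &'a str,
}

/// Run `evaluate` on the Shen source only if `authorize` says Allow.
/// `authorize` stands in for the Cedar authorizer, `evaluate` for the VM.
pub fn gate<A, E>(req: &Request<'_>, authorize: A, evaluate: E) -> Result<String, String>
where
    A: Fn(&str, &str, &str) -> Decision,
    E: FnOnce(&str) -> String,
{
    match authorize(req.principal, req.action, req.resource) {
        Decision::Allow => Ok(evaluate(req.shen_source)),
        Decision::Deny => Err(format!(
            "{} is not allowed to {} {}",
            req.principal, req.action, req.resource
        )),
    }
}
```

Taking the evaluator as `FnOnce` makes the ordering explicit: on `Deny` the source is never handed to the evaluator at all.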

See the crate's [README](examples/shen-cedar-authz/README.md) for details.
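
For the **generate** pattern, the README says the Shen engine computes the transitive role-grant closure from `spec/authz.shen`. The Rust below is not that code; it is a minimal sketch of what such a closure means, assuming a hypothetical map from each role to the roles it directly inherits.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Every role reachable from `start` by following direct inheritance edges,
/// including `start` itself. `inherits` is a hypothetical input shape.
pub fn role_closure(inherits: &BTreeMap<&str, Vec<&str>>, start: &str) -> BTreeSet<String> {
    let mut seen = BTreeSet::new();
    let mut stack = vec![start.to_string()];
    while let Some(role) = stack.pop() {
        if seen.contains(&role) {
            continue; // already expanded; also guards against cycles
        }
        if let Some(parents) = inherits.get(role.as_str()) {
            stack.extend(parents.iter().map(|p| p.to_string()));
        }
        seen.insert(role);
    }
    seen
}
```

With `admin` inheriting from `editor` and `editor` from `viewer`, the closure of `admin` contains all three roles, which is the set a generated policy would have to grant.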

## shengen — formal backpressure

`crates/shengen-rust` compiles Shen sequent-calculus specs (`specs/`) into Rust
**guard types**, so a spec change becomes a compile error rather than silent
drift — Shen as a formal-spec language for other code. The specs are also
re-type-checked by the engine itself on every CI run (`scripts/shen-check.sh`).
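
To make "guard type" concrete, here is a minimal hand-written sketch of the general idea, assuming a made-up spec rule "a quantity is a non-negative integer". The names `guards`, `Quantity`, and `reserve` are hypothetical, and the actual output of `shengen-rust` may look quite different.

```rust
mod guards {
    /// Hypothetical guard type for the rule "quantity >= 0".
    /// The field is private, so `new` is the only way to obtain a value.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct Quantity(i64);

    impl Quantity {
        pub fn new(n: i64) -> Result<Quantity, String> {
            if n >= 0 {
                Ok(Quantity(n))
            } else {
                Err(format!("{n} violates the rule: quantity >= 0"))
            }
        }

        pub fn get(self) -> i64 {
            self.0
        }
    }
}

/// Downstream code asks for the guard type, so it cannot be called
/// with a value that skipped the check.
pub fn reserve(q: guards::Quantity) -> i64 {
    q.get()
}
```

If the spec changed so that the constructor needed a second premise, every call site of `Quantity::new` would stop compiling, which is the backpressure the section describes.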

## Layout

```
crates/shen-rust/           the engine: KL runtime, kernel boot, tree-walker, VM, AOT, JIT
crates/klcompile/           build-time KL → Rust AOT compiler for the kernel
crates/shengen-rust/        Shen sequent-calc specs → Rust guard types (backpressure)
bin/shen-rust/              the REPL / CLI (`--served`, `--kernel-tests`)
examples/shen-cedar-authz/  Shen + Cedar integration (gate / verify / generate)
kernel/                     vendored ShenOSKernel-41.2 (klambda + conformance tests)
specs/                      backpressure specs in Shen sequent-calculus syntax
design/                     architecture + performance design notes
scripts/                    gates.sh (CI), benches, cross-port + warm benchmarks
```

## Performance

The reference target is the upstream `shen-cl` (SBCL) port. Two metrics, two
answers (paired interleaved runs, 2026-06-10, Apple M-series):

**One-shot** (`--kernel-tests`, boot + load + run + exit):

| Config | Wall | vs shen-cl |
|---|---:|---:|
| shen-cl (SBCL) | ≈ 1.0 s | 1× |
| shen-rust, bare | ≈ 3.0 s | **~3.0× off** |
| shen-rust + tc-cache (`SHEN_RUST_TC_CACHE=`, warm) | ≈ 1.0 s | **at parity / ahead** |

The bare gap is structural — the boxed-`Value` + interpreted-dispatch model,
not a single hot spot. A 2026-06-10 profiling round stripped the runtime's
own overheads (split-TLS heap access, direct-mapped call-target interning,
thin-LTO build) for ~18 % cumulative; what remains is the model, where each
local lever measures ≤ ~8 %. The tc-cache win is typecheck-verdict
memoization, not raw speed; it's off by default.
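
Typecheck-verdict memoization, in general terms, means remembering the outcome of an expensive check and replaying it when the same input comes back. The sketch below shows only that idea. `VerdictCache` and `Verdict` are hypothetical, and keying on the raw form text is a simplifying assumption: a real cache key would have to cover everything the verdict depends on.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Verdict {
    WellTyped(String),
    TypeError(String),
}

/// Hypothetical in-memory cache from source form to typecheck verdict.
pub struct VerdictCache {
    hits: u64,
    verdicts: HashMap<String, Verdict>,
}

impl VerdictCache {
    pub fn new() -> Self {
        VerdictCache { hits: 0, verdicts: HashMap::new() }
    }

    /// Return the cached verdict for `form`, or run `typecheck` once and store it.
    pub fn check<F>(&mut self, form: &str, typecheck: F) -> Verdict
    where
        F: FnOnce(&str) -> Verdict,
    {
        if let Some(cached) = self.verdicts.get(form) {
            let verdict = cached.clone();
            self.hits += 1;
            return verdict;
        }
        let verdict = typecheck(form);
        self.verdicts.insert(form.to_string(), verdict.clone());
        verdict
    }

    pub fn hits(&self) -> u64 {
        self.hits
    }
}
```

This is why the warm tc-cache row says nothing about raw interpreter speed: a hit skips the type checker instead of running it faster.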

**Served** (long-lived process: load once, serve many evaluations) — the
funded direction, where the tiers stack:

| Tier | vs tree-walk loaded | vs VM loaded |
|---|---:|---:|
| bytecode VM (`--served`) | ~2.3× | 1× |
| **AOT overlay** (committed `.shen` → native) | ~5.3–11.7× | **~1.9–3.2×** |

On served spec code the overlay leaves the interpreter entirely (the
SBCL-shaped answer for that niche): `benches/authz_served.rs` measures
~3.1× over the VM-loaded arm. For long-running served processes,
`SHEN_RUST_GC=1` bounds the heap (see "Execution engine" above). Full history
and the GC / value-representation / JIT ladder live in `PERFORMANCE.md`,
`BENCHMARKS.md`, and `design/perf-*.md`.

## Development

```sh
./scripts/gates.sh    # full CI: fmt+clippy, build, test, shen-check, audits, kernel-tests
```

### Two test tiers (port vs. canonical)

The suite is split into two distinct tiers, kept separate on purpose:

- **Port-authored tests** — `cargo test --workspace`. Rust unit tests plus the
  integration suites under `crates/shen-rust/tests/`. These are OURS: they
  exercise the evaluator, primitives, I/O, error catchability, the
  reader/eval robustness corpus, the stdlib, the bytecode-VM differential, and
  the CLI launcher. The port-authored suites that mirror the shen-go test
  layout, category for category, are:

  | file | covers |
  | --- | --- |
  | `cli_launcher.rs` | spawns the release binary: `eval -e`/`-l`, `script`, `--version`/`--help`, piped-EOF clean exit (timeout-guarded), the `-q`/`*hush*` file-write divergence, adversarial input without a backtrace |
  | `primitives_coverage.rs` | 50+ assertions over the native KL primitives via `Interp::eval` (arithmetic/float, cons/hd/tl, predicates, string ops, intern/value/set, absvector, hash, get-time) |
  | `io_coverage.rs` | open/close, write→read round trip, read-byte EOF = −1, read-file, get-time |
  | `error_robustness.rs` | the error-catchability contract on BOTH the tree-walker and the bytecode-VM path |
  | `reader_fuzz.rs` | a deterministic seeded corpus of malformed input through reader+eval; asserts no panic, only catchable errors (no `rand`, no cargo-fuzz) |
  | `library.rs` | booted-kernel stdlib functions (reverse/append/map/remove/element?/length/…) |

- **Canonical kernel suite** — the `kernel-tests` gate (`scripts/kernel-tests.sh`).
  This runs the VENDORED ShenOSKernel conformance suite (`kernel/tests/`,
  134 tests) end to end through the binary and is NOT modified by the port.
  It is the conformance bar; the port-authored tier is the regression net
  around the parts the kernel suite doesn't reach (CLI, adversarial inputs,
  the VM path, I/O edges).
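
A deterministic seeded corpus without `rand` or cargo-fuzz, as `reader_fuzz.rs` is described above, can be built from a tiny hand-rolled generator. The sketch below is an assumption about the general technique, not the contents of that file: `XorShift64`, `corpus`, `panicking_inputs`, and the alphabet are all hypothetical, and `read_eval` stands in for the real reader+eval entry point.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

/// xorshift64: a small deterministic PRNG, so the same seed gives the same corpus.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        XorShift64(seed.max(1)) // the state must never be zero
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Characters likely to produce unbalanced or malformed Shen input.
const ALPHABET: &[u8] = b"()[]\"\\ ;abc123+-.";

/// Generate `cases` strings of length 0..=max_len from the seed.
pub fn corpus(seed: u64, cases: usize, max_len: usize) -> Vec<String> {
    let mut rng = XorShift64::new(seed);
    (0..cases)
        .map(|_| {
            let len = (rng.next_u64() as usize) % (max_len + 1);
            (0..len)
                .map(|_| ALPHABET[(rng.next_u64() as usize) % ALPHABET.len()] as char)
                .collect::<String>()
        })
        .collect()
}

/// Inputs on which `read_eval` panicked; returning `Err` counts as a pass,
/// because a catchable error is the contract being tested.
pub fn panicking_inputs<F>(inputs: &[String], read_eval: F) -> Vec<String>
where
    F: Fn(&str) -> Result<(), String>,
{
    inputs
        .iter()
        .filter(|s| catch_unwind(AssertUnwindSafe(|| read_eval(s.as_str()))).is_err())
        .cloned()
        .collect()
}
```

A fixed seed keeps failures reproducible in CI: a failing input can be regenerated from the seed alone, with no stored corpus files.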

Coverage of the port-authored tier: `scripts/coverage.sh` (a `cargo-llvm-cov`
wrapper that skips gracefully if the tool isn't installed).

## License

[BSD-3-Clause](LICENSE) for the port code (© 2026 Reuben Brooks). The vendored
Shen kernel under `kernel/` is © 2010–2022 Mark Tarver and retains its own
BSD-3-Clause license (`kernel/LICENSE.txt`).