https://github.com/eth-act/wasrisc

Host: GitHub
URL: https://github.com/eth-act/wasrisc
Owner: eth-act
Created: 2025-11-06T17:25:30.000Z (7 months ago)
Default Branch: master
Last Pushed: 2026-01-13T12:48:58.000Z (5 months ago)
Last Synced: 2026-01-13T15:41:31.624Z (5 months ago)
Language: C
Size: 15.3 MB
Stars: 0
Watchers: 0
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md

# WASRISC

This repository demonstrates and benchmarks different compilation methods for translating high-level languages to RISCV64IM (the target architecture for RISCV zkVMs) using WASM-WASI as an intermediate representation.

This experiment measures the performance impact of using WASM-WASI as an intermediate step compared to direct compilation from high-level languages to RISCV64IM.

**Note:** While any language that compiles to WASM with WASI 0.1 (`wasip1`) support can use these pipelines, this project focuses primarily on Go and Rust.

## Pipeline Overview

All pipelines share a common first step: compiling high-level source code to WASM-WASI. Most modern language compilers support WASM as a target.

The transition from WASM to the zkVM target can be achieved through multiple approaches. This experiment explores three compilation methods:

1. **w2c2 + GCC**: Transpile WASM to C source code using the `w2c2` compiler, then compile the C code to the final target using `gcc` or a platform-specific compiler

2. **WAMR (LLVM)**: Compile WASM directly to the final target using WAMR's LLVM backend

3. **wasmtime/wasmer (Cranelift)**: Compile WASM to Linux (either host or RISCV64) using `wasmtime` or `wasmer`, both of which use Cranelift for code generation

For the third approach, we targeted Linux because it's supported out of the box—porting to bare-metal would require significant additional effort. For benchmarking the Ethereum state transition function, this difference shouldn't significantly affect results due to minimal OS interaction and the absence of floating-point operations in the benchmark code.

```mermaid
graph TD;

source_code["Source Code<br/>Go, Rust, C, Zig, etc."]
wasm["WebAssembly<br/>with WASI (wasip1)"]
c_source_code["C Source Code"]

subgraph targets[" "]
    zkvm_target_binary["Target Binary<br/>RISCV zkVM"]
    linux_target_binary["Target Binary<br/>RISC-V/AMD64 Linux"]
end

source_code -->|"Language-specific<br/>WASM compiler"| wasm
wasm -->|"w2c2 transpiler"| c_source_code
c_source_code -->|"GCC/platform-specific<br/>compiler"| zkvm_target_binary
wasm -->|"WAMR<br/>llvm backend"| zkvm_target_binary
wasm -->|"wasmtime<br/>cranelift backend"| linux_target_binary
wasm -->|"wasmer<br/>cranelift backend"| linux_target_binary

classDef subgraphStyle fill:none,stroke:none;
```

## Prerequisites

The benchmark environment is dockerized and includes:

- RISC-V GNU Toolchain with newlib (rv64ima)

- w2c2 WebAssembly-to-C transpiler

- QEMU with `libinsn` plugin

- WAMR

- wasmtime

- wasmer

> **Note:** The first time you run the Docker script, it will take some time as it rebuilds the RISC-V GNU toolchain from source.

In addition to Docker, install the following on your host system:

- Rust

- Rust wasip1 target:

  ```bash
  rustup target add wasm32-wasip1
  ```

- Rust RISC-V target:

  ```bash
  rustup target add riscv64gc-unknown-linux-gnu
  ```

## Quick Start

Run the `go_benchmark.sh` and `rust_benchmark.sh` scripts to compare different compilation methods for the Ethereum state transition function. These scripts will:

1. Compile Rust and Go implementations using various methods

2. Execute the compiled binaries under QEMU with the `libinsn` plugin to count instructions

3. Save instruction counts for each compilation method to `go_benchmark_results.txt` and `rust_benchmark_results.txt` (see "total insns" in those files)

See the scripts for implementation details.

## Benchmark Configurations

The following benchmarks were performed:

- **w2c2 -O0**: WASM transpiled to C with `w2c2`, then compiled with GCC using `-O0` optimization for Linux `rv64imad`

- **w2c2 optimized**: WASM transpiled to C with `w2c2`, then compiled with GCC using higher optimization levels for Linux `rv64imad`

- **directly**:

  - Rust: `cargo build --target riscv64gc-unknown-linux-gnu --release`

  - Go: `GOOS=linux GOARCH=riscv64 go build`

- **wasmtime**: WASM compiled with `wasmtime` using Cranelift backend to a `riscv64gc` precompiled ".cwasm" file, then executed using the `wasmtime` runtime on Linux

- **wasmer (cranelift)**: WASM compiled with `wasmer` using Cranelift backend to a `riscv64gc` precompiled ".wasmu" file, then executed using the `wasmer` runtime on Linux

- **wamr -O0**: WASM compiled with `wamr` using LLVM backend with `-O0` optimization for bare-metal `riscv64ima`

The following critical benchmarks could not yet be performed due to issues in `wasmer` and `wamr`:

- **wasmer (llvm)**: WASM compiled with `wasmer` using LLVM backend to a `riscv64gc` precompiled ".wasmu" file, then executed using the `wasmer` runtime on Linux

- **wamr -O3**: WASM compiled with `wamr` using LLVM backend with `-O3` optimization for bare-metal `riscv64ima`

Since these critical benchmarks could not be performed on RISC-V, they were performed on AArch64 with the expectation that those results would allow us to extrapolate potential RISC-V performance.

See the "Known Issues" section for details.

## Benchmark Results on RISCV

| Program | w2c2 -O0 | w2c2 optimized | wasmtime | wasmer (cranelift) | wasmer (llvm) | WAMR -O0 | WAMR -O3 | directly |
|---|---|---|---|---|---|---|---|---|
| `reva-client-eth` (Rust) | 7,887,190,279 | 1,419,050,123 (-O1) | 1,074,488,397 | doesn't work | ? | didn't check | ? | 388,564,723 |
| `stateless` (Go) | 12,866,052,519 | 2,118,257,727 (-O3) | 874,758,419 | 953,874,491 | ? | 5,427,433,654 | ? | 236,265,327 |

## Analysis

**Important:** The `reva-client-eth` and `stateless` numbers should not be compared directly against each other, as these implementations execute against different blocks using different block serialization frameworks.

Unfortunately, we were unable to benchmark the most promising approaches (`wasmer (llvm)` and `wamr -O3`) on RISCV due to outstanding issues. The following analysis is based on available results for RISCV only.

### Key Findings

- **Direct compilation is fastest**: As expected, compiling directly to the target architecture provides the best performance

- **Optimization level is critical for w2c2**: Using GCC optimization flags reduces the instruction count by roughly 5.6x (`reva-client-eth`, `-O1`) to 6.1x (`stateless`, `-O3`) compared to unoptimized `-O0` builds

- **Cranelift-based pipelines perform best**: Among the WASM-based approaches, pipelines using Cranelift for code generation show the best performance

- **Performance overhead of WASM intermediate step**: The ratio of instructions required when compiling via `wasmtime` versus direct compilation is:

  - 2.8x for `reva-client-eth` (Rust)

  - 3.7x for `stateless` (Go)

- **WASM quality comparison**: The relatively similar overhead ratios suggest that Go's WASM compiler generates code quality comparable to Rust's WASM compiler

- **WAMR -O0 performance**: For `stateless`, the instruction count currently falls between `w2c2 -O0` and `w2c2 optimized`, and is well above that of `wasmtime`
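The ratios quoted in the findings above can be recomputed from the results table. The following Go sketch does only that arithmetic; the instruction counts are copied from the table, while the `ratio` helper and the struct layout are illustrative.

```go
package main

import "fmt"

// ratio reports how many times more instructions a pipeline executes
// than the baseline it is compared against.
func ratio(pipeline, baseline uint64) float64 {
	return float64(pipeline) / float64(baseline)
}

func main() {
	// Instruction counts copied from the RISC-V results table.
	rows := []struct {
		name                              string
		w2c2O0, w2c2Opt, wasmtime, direct uint64
	}{
		{"reva-client-eth (Rust)", 7887190279, 1419050123, 1074488397, 388564723},
		{"stateless (Go)", 12866052519, 2118257727, 874758419, 236265327},
	}
	for _, r := range rows {
		fmt.Printf("%s: wasmtime/direct = %.1fx, w2c2 -O0/optimized = %.1fx\n",
			r.name, ratio(r.wasmtime, r.direct), ratio(r.w2c2O0, r.w2c2Opt))
	}
}
```

For the two programs this gives wasmtime overheads of about 2.8x and 3.7x, and w2c2 optimization gains of about 5.6x and 6.1x.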

### Binary Sizes

```
$ ls -lah build/bin/
827K fibonacci.riscv.O0.elf
686K fibonacci.riscv.O3.elf
823K hello_world.riscv.O0.elf
682K hello_world.riscv.O3.elf
23M  reva-client-eth.riscv.O0.elf
19M  reva-client-eth.riscv.O1.elf
74M  stateless.amd64.O0.elf
28M  stateless.amd64.O1.elf
29M  stateless.amd64.O3.elf
67M  stateless.riscv.O0.elf
58M  stateless.riscv.O1.elf
64M  stateless.riscv.O3.elf
```

## Supplementary Benchmark Results on AARCH64

The benchmark was performed for `stateless` (Go) only. `WAMR -O3` targets dynamically linked AArch64 Linux with musl. To reduce the impact of dynamic linker overhead and WAMR runtime setup, the business logic was run multiple times within a single execution. In the following tables, WAMR entries annotated with `--b-c=0` were compiled with the `--bounds-checks=0` option; all other WAMR entries were compiled with `--bounds-checks=1`.

| pipeline / number of runs | wasmtime | wasmer (cranelift) | wasmer (llvm) | WAMR -O3 | directly |
|---|---|---|---|---|---|
| 1x | 659,241,636 | 663,867,152 | 626,137,758 | 990,892,303 | 166,611,730 |
| 1x | (same) | (same) | (same) | 699,810,538 (`--b-c=0`) | (same) |
| 10x | 2,533,334,795 | 2,268,562,071 | 2,002,956,919 | 3,100,038,538 | 660,390,007 |
| 25x | 5,686,210,736 | 4,978,224,909 | 4,338,984,157 | 6,674,027,814 | 1,477,959,349 |
| 50x | 10,929,448,352 | - | - | 12,581,544,465 | 2,830,756,855 |
| 50x | (same) | - | - | 8,338,818,655 (`--b-c=0`) | (same) |

The following table presents the ratio between the number of instructions executed for a given compilation pipeline and the number of instructions executed for the directly compiled program.

| pipeline / number of runs | wasmtime | wasmer (cranelift) | wasmer (llvm) | WAMR -O3 | directly |
|---|---|---|---|---|---|
| 1x | 3.95 | 3.98 | 3.75 | 5.94 | 1.0 |
| 1x | (same) | (same) | (same) | 4.19 (`--b-c=0`) | (same) |
| 10x | 3.83 | 3.43 | 3.03 | 4.69 | 1.0 |
| 25x | 3.84 | 3.36 | 2.93 | 4.51 | 1.0 |
| 50x | 3.86 | - | - | 4.44 | 1.0 |
| 50x | (same) | - | - | 2.94 (`--b-c=0`) | (same) |

The `--bounds-checks=0` option appears to be critical for WAMR performance. Only with this option can WAMR outperform Cranelift-based frameworks in some scenarios. For a single run, `wasmtime`, `wasmer (cranelift)`, and `wasmer (llvm)` show similar performance. Single-run results for WAMR are difficult to interpret due to overhead from dynamic linking and Linux WAMR runtime setup, which would not be present on a zkVM bare-metal platform. For compute-intensive programs (50x), WAMR with `--bounds-checks=0` appears to have an edge over the other compilation pipelines. The best-performing WebAssembly approaches execute roughly 3 to 4 times as many instructions as the directly compiled `stateless` Go program.

These results have not been taken into account in the "Analysis" section.
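One way to separate fixed startup cost (dynamic linking, runtime setup) from the cost of the business logic is to look at the marginal cost per additional run between the 1x and 50x measurements. The Go sketch below does this under the assumption that the instruction count grows linearly with the number of runs; the counts are copied from the AArch64 table, and `marginalCost` is a hypothetical helper.

```go
package main

import "fmt"

// marginalCost estimates the instructions needed for one additional run,
// given two measurements taken with different run counts. Any fixed
// startup overhead cancels out in the subtraction.
func marginalCost(runsA, countA, runsB, countB uint64) float64 {
	return float64(countB-countA) / float64(runsB-runsA)
}

func main() {
	// 1x and 50x instruction counts copied from the AArch64 table.
	direct := marginalCost(1, 166611730, 50, 2830756855)
	pipelines := []struct {
		name       string
		one, fifty uint64
	}{
		{"wasmtime", 659241636, 10929448352},
		{"WAMR -O3 (--bounds-checks=1)", 990892303, 12581544465},
		{"WAMR -O3 (--bounds-checks=0)", 699810538, 8338818655},
	}
	for _, p := range pipelines {
		perRun := marginalCost(1, p.one, 50, p.fifty)
		fmt.Printf("%s: %.2fx direct per additional run\n", p.name, perRun/direct)
	}
}
```

Under the linearity assumption, these per-run ratios land close to the 50x ratios in the table, which suggests the 50x column is already dominated by the business logic rather than by startup overhead.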

## Known Issues

### WAMR -O3 Bug

Running WAMR with non-zero optimization levels on RISC-V currently fails with a relocation error.

Issue: https://github.com/bytecodealliance/wasm-micro-runtime/issues/4765

### Wasmer (LLVM) Bug

The wasmer team is actively working on fixing RISC-V target support.

Issues:

- https://github.com/wasmerio/wasmer/issues/5954

- https://github.com/wasmerio/wasmer/issues/5951

### GCC Bug

The `w2c2 optimized` pipeline for `reva-client-eth` uses the `-O1` optimization level. Higher optimization levels cause GCC to hang during compilation. This has been confirmed as a GCC bug based on the following observations:

- Clang successfully compiles the same sources

- When w2c2 is invoked with the `-f 100` option (which splits output into many source files), GCC hangs while compiling a single ~1000 LOC file

For reference, `reva-client-eth` compiled with Clang using `-O3` requires 1.2×10⁹ instructions to execute—not significantly fewer than when compiled with GCC using `-O1` (1.4×10⁹ instructions).

### Linking Problem

The `w2c2 optimized` pipeline for the `stateless` program fails to link when using non-zero optimization levels, producing the error:

```
guest.c:(.text.guestInitMemories+0x50): relocation truncated to fit: R_RISCV_JAL against `.L214'
collect2: error: ld returned 1 exit status
```

The issue stems from a single massive function, `guestInitMemories`, spanning over 100,000 lines of C code generated by w2c2 for `stateless`. GCC emits `R_RISCV_JAL` relocations for intra-function branches, and these support only ±1MB PC-relative jumps. GCC lacks a fallback mechanism that automatically switches to AUIPC+JALR when optimization pushes an intra-function jump out of range.

**Workaround:** Use the `-fno-reorder-blocks` flag to disable the optimization that creates large jumps. With this flag, `stateless` can be built with `-O3` optimization.

**Note:** This issue doesn't occur on x86 because that platform supports 32-bit relative jumps.
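The ±1MB limit follows from the RISC-V `JAL` encoding: the instruction carries a 20-bit signed immediate that is scaled by 2 bytes. The Go sketch below only checks whether a given byte offset fits that encoding; the constant and function names are illustrative and do not come from GCC or binutils.

```go
package main

import "fmt"

const (
	jalImmBits = 20                      // width of the J-type immediate field
	jalMin     = -(1 << jalImmBits)      // lowest reachable offset: -1 MiB
	jalMax     = (1 << jalImmBits) - 2   // highest reachable offset: 1 MiB - 2 bytes
)

// fitsJAL reports whether a PC-relative byte offset can be encoded in a
// single JAL instruction. Offsets must be even because the lowest bit of
// the offset is not stored in the instruction.
func fitsJAL(offset int64) bool {
	return offset%2 == 0 && offset >= jalMin && offset <= jalMax
}

func main() {
	for _, off := range []int64{jalMax, jalMax + 2, jalMin, jalMin - 2} {
		fmt.Printf("offset %d fits JAL: %t\n", off, fitsJAL(off))
	}
}
```

A branch inside `guestInitMemories` whose target ends up more than about 1 MiB away fails this check, which is what the linker reports as "relocation truncated to fit".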

### Compilation Times

For higher optimization levels (e.g., `-O3`), expect compilation times of up to 60 minutes for `reva-client-eth` and `stateless`.

## Advanced Usage

### Custom WASM Imports

You can call platform-specific functions from your WASM code using custom imports.

In Go, use `//go:wasmimport`:

```go
// examples/go/with_import/example.go
package main

import "fmt"

// The directives must sit directly above the declaration they apply to.
//
//go:wasmimport testmodule testfunc
//go:noescape
func testfunc(a, b uint32) uint32

func main() {
    result := testfunc(1, 2)
    fmt.Printf("testfunc(1, 2) = %d\n", result)
}
```

Implement the import in `platform/*/custom_imports.c`:

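```c
// platform/amd64/custom_imports.c
#include <stdio.h>

// U32 is the 32-bit unsigned type used by the w2c2-generated code.
// The symbol name joins the WASM module and function names with "__".
U32 testmodule__testfunc(void* p, U32 a, U32 b) {
    printf("testfunc called with %u, %u\n", a, b);
    return a + b;
}
```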

### Memory Limits

For embedded targets with limited memory, use `debug.SetMemoryLimit()`:

```go
package main

import "runtime/debug"

func main() {
    debug.SetMemoryLimit(400 * (1 << 20)) // 400 MiB soft limit
    // ...
}
```
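To confirm that a limit is in effect, the value can be read back: `debug.SetMemoryLimit` returns the previously set limit, and a negative argument queries the current limit without changing it. The sketch below wraps both calls in a hypothetical `applyLimit` helper; note that the Go runtime treats this as a soft limit, so it guides the garbage collector rather than enforcing a hard cap.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// applyLimit sets the runtime's soft memory limit (in bytes) and returns
// the limit that was in effect before, plus the limit now in effect.
func applyLimit(limit int64) (previous, current int64) {
	previous = debug.SetMemoryLimit(limit)
	// A negative input leaves the limit unchanged and returns it.
	current = debug.SetMemoryLimit(-1)
	return previous, current
}

func main() {
	const limit = 400 << 20 // 400 MiB
	previous, current := applyLimit(limit)
	fmt.Printf("previous limit: %d bytes, current limit: %d bytes\n", previous, current)
}
```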

## License

MIT + Apache