Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/remixlabs/wasicaml

Translate OCaml bytecode to WebAssembly
https://github.com/remixlabs/wasicaml

Last synced: 2 days ago
JSON representation

Translate OCaml bytecode to WebAssembly

Awesome Lists containing this project

README

        

# WASICaml

WASICaml is currently:
- a port of the OCaml bytecode interpreter to WebAssembly using the WASI
ABI (with very few ABI extensions)
- a postprocessor that takes a bytecode executable and translates it into
WebAssembly

The hope is that WASICaml evolves into a full compiler backend targeting
WebAssembly.

The motivation for WASICaml is to be able to run OCaml code in situations
where it is difficult or cumbersome to provide native binaries, for example
- run code in cross-platform environments, e.g. embed some OCaml in an
Electron app
- cross-platform plugins
- mobile OS
- user-generated code
- edge computing devices

Note that I left web browsers out in this list - of course, you can run
WebAssembly also in browsers, but as there are already solutions for
generating Javascript from OCaml, so this was not a driver for this
project.

Also be aware that there are many ways of running WebAssembly, and it is not
always Javascript that serves as a host environment:
- engine featuring JIT (e.g. all current browsers, node, wasmtime)
- engine not featuring JIT (e.g. wasm3)
- translation to other representations (e.g. translate to C with
wasm2c - note that you can then generate bitcode from OCaml)

Currently, WASICaml is, however, essentially restricted to a Javascript
host environment because we need exceptions, and the non-Javascript
environments generally do not yet implement exceptions (but it's a
coming WebAssembly feature, so this restriction will go away at some
point). At this point, it is assumed that there are a few functions
provided by the host environment that implement a throw/catch mechanism.

The WASI ABI is a very restricted Unix-like ABI allowing so far:
- access to environment variables
- access to date and time
- access to local files (you can mount local directories into a
virtual file tree)
- access to stdin, stdout, stderr

This is almost enough for running `ocamlc`. The only addition we need
is the `system` function for starting external processes, i.e. we
assume that `system` is provided by the host environment.

# Build

```
./configure
make
```

Optionally, do `make test` to run a couple of tests.

By default, the prefix is `~/.wasicaml`, but you can also specify a different
one: `./configure --prefix=/wherever`.

It is so far not recommended to use a shared prefix like `/usr/local`.
Also, do not install into the opam hierarchy.

Note that `make` installs files from the very beginning (i.e. we do not
cleanly distinguish between build and install phases yet).

You'll need `node` (14 or 16) and `npm`.

The build involves a git checkout of the OCaml sources (version 4.12.1),
and you'll get WebAssembly versions of the bytecode compiler. Also,
we download the WASI SDK, including a copy of `clang`. The latter is
currently only available for Linux and Mac as precompiled tarballs.
For other OS you are on your own.

# What you get

The installed files are:
- LLVM toolchain
+ `bin/wasi_cc`: wrapper for `clang` in WASI SDK
+ `bin/wasi_ld`: wrapper for `wasm-ld` in WASI SDK
+ `bin/wasi_ar`: wrapper for `ar` in WASI SDK
+ `bin/wasi_nm`: wrapper for `nm` in WASI SDK
+ `bin/wasi_objdump`: wrapper for `objdump` in WASI SDK
+ `bin/wasi_ranlib`: wrapper for `ranlib` in WASI SDK
+ `bin/wasi_strip`: wrapper for `strip` in WASI SDK
+ `bin/wasi_preamble`: see below
+ `lib/wasi-sdk`: the WASI SDK tree
+ `lib/initwasi.o`: this object is linked into all executables by
`wasi_cc` and performs global initializations
- WASI-enabled WebAssembly runtime
+ `bin/wasi_run`: starts a node process running a WebAssembly executable
+ `js/`: the Javascript code for this
- OCaml toolchain
+ `bin/ocaml`: WebAssembly port of the OCaml toploop (using bytecode)
+ `bin/ocamlc`: WebAssembly port of the OCaml compiler (using bytecode)
+ `bin/ocamlrun`: WebAssembly port of the OCaml bytecode engine
+ a few other standard OCaml tools
- WASICaml toolchain
+ `bin/wasicaml`: translates OCaml bytecode to WebAssembly
+ `lib/initruntime.o`: linked into executables produced by
`bin/wasicaml` to initialize the OCaml runtime

## The LLVM toolchain

This toolchain is a wrapper around `wasi-sdk`. You get:
- `wasi_cc`
- `wasi_ld`
- `wasi_ar`
- `wasi_nm`
- `wasi_objdump`
- `wasi_ranlib`
- `wasi_strip`
- `wasi_run`

Here, `wasi_cc` is a specially configured Clang compiler that uses
a special version of Musl as `libc`. The .o files and the final executables
are WebAssembly modules.

Our wrapper also prepends a preamble to the executables so that
the files can be directly run on the host system. This preamble is
equivalent to these two lines of Shell code (but actually a real
executable, `wasi_preamble`):

```
#! /bin/sh
exec /wasicaml/bin/wasi_run "$0" "$0" "$@"
```

These two lines are then followed by the "wasm" code. If the name of the
executable has the suffix ".wasm", however, this starter is omitted, and
the file consists ONLY of the WebAssembly module. So, e.g.

```
$ wasi_cc -o hello hello.c
```

would create `hello` as WebAssembly code where the preamble is prepended,
and

```
$ wasi_cc -o hello.wasm hello.c
```

would create `hello.wasm` as only the WebAssembly code.

Note that the same convention also applies to `ocamlc` and `wasicaml`:
```
$ ocamlc -o hello ... # with preamble
$ ocamlc -o hello.wasm ... # pure WebAssembly
$ wasicaml -o hello ... # with preamble
$ wasicaml -o hello.wasm ... # pure WebAssembly
```

## Additions to WASI

We build so far for `wasi-snapshot-preview1`, which is an early pre-release
of WASI mainly featuring access to the file system. Besides that, there
are environment variables, and you can read the clock. That's it!
Anything else is unavailable (e.g. subprocesses, pipes, network, signals, ...).

Also, an important feature is so far missing from both WASI and WebAssembly:
setjmp/longjmp, or (equivalently) exceptions. There is a proposal to add
exceptions to WebAssembly, but the work is not yet fully done
(it is available in `nodejs`, and WasiCaml supports it since release
1.1 with the option `-enable-exception-handling`).

All in all this means that pure WASI is currently not sufficient to run
OCaml code. We "added" a few things as host functions:
- a try/catch mechanism (unless `enable-exception-handling`)
- the `system()` function to start external processes
- a working `rename()` function (as the one in the current `wasi-sdk` is
broken)

Note that adding try/catch is kind of problematic, as it exploits the
way WebAssembly is implemented in Javascript (where you can use JS
exceptions to jump out of WebAssembly context), and the same trick might
not be available in other WebAssembly implementations (in particular,
I don't see how to do the same in the Rust-based ones). This can first be
improved when WebAssembly-level exceptions become widely available.

We also link `initwasi.o` into every executable. Besides some functions that
are missing in `libc` there is an initializer that sets the current working
directory to the contents of the environment variable `WASICAML_CWD`.
Without this initializer, the working directory would be set to `/` inside
the WASI sandbox when the executable is started. This trick ensures that
the working directory inside the sandbox is the same as outside.

So far, we do not support to restrict the filesystem in the sandbox,
i.e. `/` in the sandbox is the real root of the host machine.
(This will be configurable again at some point in the future.)

## The OCaml toolchain

So far, the bytecode interpreter can be built, and we can do a bootstrap
cycle of the compiler.

This is supported:
- Complete language
- Standard library, with a few exceptions in `Sys`
- The library `str`
- The library `unix` (but many functions are mssing)
- `ocamlc -custom`

This does not work:
- Dynamic loading of C libraries ( - there is no `dlopen` in WASI)
- No threads
- No debugger
- No ocamldoc

Note that WebAssembly is so far always a 32 bit environment.

## The WASICaml postprocessor

`wasicaml` takes a bytecode executable as input, and generates
WebAssembly as output, e.g.

```
$ ocamlc -o hello hello.ml
$ wasicaml -o hello_wasm hello
```

The resulting code is somewhat faster than the bytecode, and sometimes
a lot faster (e.g. small loops on arrays and strings). Occasionally,
it can also happen that WebAssembly is slower, in particular, when
many exceptions are thrown.

It is recommended to pass the `-g` switch to `ocamlc`. This way the
OCaml function names can be encoded in the symbol names, which is
results in better stack traces.

There are a couple of restrictions:
- Tail calls are by default not available. There is some emulation
for self calls, and when one function of a `let rec` definition
calls another function of the same `let rec`. (Note, however,
that there is a tail call proposal for WebAssembly, and WasiCaml
supports it now with the `-enable-tail-call` command-line switch.)
- You cannot obtain backtraces from the running OCaml program.
You can do from Javascript, though.
- You cannot change the limit of the OCaml stack from inside the
running OCaml program. The stack is preallocated at the beginning,
and the translation scheme prevents that it can be moved to
another address. (Attempting to change the limit with `Gc.set`
currently crashes the program.)

You can link in C libraries, e.g.
```
$ wasicaml -o hello_wasm hello -cclib ~/.wasicaml/lib/ocaml/libunix.a
```

Of course, these libraries must be compiled for WebAssembly, too.

# Troubleshooting

## WASI

During the development I figured out that the current WASI SDK
is quite buggy. In particular, WASI files are sometimes not cleanly
mapped to physical files (random characters are inserted into the
names). The `rename()` function was completely unusable.

# How it works

The reason for taking the bytecode as starting point for `wasicaml` is
that it allows us an easy start of the project. It is a good starting
point because it is a stack machine - close to what we need, but there
are also downsides, in particular information is missing that would
be very helpful (e.g. types, or whether the function to call is a
certain static function). But anyway, we don't have to create a full
new toolchain, but can just append to an existing one.

For the following, keep in mind that WebAssembly code is normally not
directly interpreted. There is a JIT pass, and even non-JIT
interpreters do not directly operate on the code as it is, but rather
on a preprocessed version that is optimized for execution. This
normally includes some mapping from logical stores to physical ones,
and often also advanced techniques like function inlining.

## The outer structure

The bytecode is decoded into instructions, and a control flow graph
(CFG) is created.

WebAssembly doesn't allow arbitrary branches, but just those that
are allowed in structured programming (there is `if/then/else`, and
a way to break out of blocks, and a jump back to the start of a loop -
but there is no way to jump to an arbitrary label). Because of this,
the code has to be restructured so that we can emit it in one of
the permitted forms. This is done by a pass over the CFG.

Roughly, one OCaml function is translated to one WebAssembly function.
However, this is not fully true, in particular all functions of a `let
rec` block go into the same WebAssembly function. This allows us
to translate tail calls from one function to the other with a simple
branch statement. The further functions a WebAssembly function implements
are called "subfunctions".

Another complication are exceptions. As there is not yet direct support
for exceptions in WebAssembly, we assume that there is a host mechanism
where a special function `wasicaml_try4(f, arg1, arg2, arg3, arg4)`
calls the function like `f(arg1, arg2, arg3, arg4)` and traps when
the called function invokes `wasicaml_throw()`. Because of this mechanism
the code inside the "try" block must be a function of its own.
In our translation scheme, this is ensured by allocating another
subfunction index for "try" blocks.

Summarized, a WebAssembly function includes the code of all the
OCaml functions of the same `let rec` definition where each function
is identified by a subfunction index. Also, the code inside
`try` blocks are also translated to subfunctions.

The subfunction index is encoded in code pointers.

## Functions

A WebAssembly function implementing an OCaml function takes four parameters:
- `envptr`: a pointer to the stack location where the closure value is stored
- `extra_args`: the number of further arguments on the stack (i.e. in total
there are `extra_args+1` arguments)
- `codeptr`: the code pointer (initially taken from `*envptr[0]` but
can be overridden here)
- `fp`: the address of the first argument

The OCaml arguments are always passed via the stack, and not via the
WebAssembly parameters. (It is unclear whether/when this would be
an advantage - still need to be figured out.)

There are several deviations from the scheme the bytecode interpreter uses.
First, the closure value (called `env` in the interpreter sources) is
not directly passed, but indirectly. This way we don't have to save it
when calling functions that could run the GC.

Second, `extra_args` is not a global counter, but passed on from function to
function. This also has the nice effect that we don't need to save it
when calling other functions.

Third, the code pointer is not an address. As WebAssembly is a Harvard
architecture, code generally doesn't have addresses. Instead, there is
an indirection mechanism using so-called tables. Essentially, this
mechanism boils down to enumerating the indirectly callable functions
with small integers. We reserve now a part of the 32 bits of the code
pointer for these function IDs. Another part is used for encoding the
subfunction index. One bit indicates whether a function is restarted
(as if it went through the `RESTART` instruction). The LSB of code
pointers is always 1, so that the GC doesn't touch code pointers.

Fourth, there is no global stack pointer. Instead, each function is
called with a frame pointer, which is the address of the first
argument (and further arguments may follow at higher stack positions).
The function can now use more stack space, so that the non-negative
offsets are arguments, and the negative ones are local values.
(Note that WebAssembly doesn't support negative offsets directly,
and as a workaround we compute another pointer `bp` which is just
`fp` shifted by some value so that negative offsets do not occur
anymore.)

A function returns one 32 bit value, which is normally the result
value. If, however, a `NULL` is returned, this has to be treated by
the caller as if an exception was raised (as a quicker alternative
to calling `wasicaml_throw()`).

OCaml functions can be called with too few arguments. This case is
tested for at runtime (`extra_args` is too small), and if true, a
closure is created that saves the existing arguments, and prepares for
re-execution when more arguments become available. The code pointer of
this closure sets the `RESTART` bit - but is otherwise the same code
pointer. The `RESTART` bit causes that the saved arguments are pushed
back to the stack when the closure is finally called. The translation
of this scheme involves just two additional cheap tests that are
always done when a function is called. Unfortunately, quite some
additional code needs to be generated for the case that one of the
tests yields a true result.

The opposite edge case is that an OCaml function is called with too
many arguments. The additional arguments can be "consumed" when the
function returns another function. For this reason, it has to be
checked on function return whether there are arguments left and a
closure to call. If so, this is done.

## Inside functions

Now, the question is how the OCaml bytecode instructions are translated
into WebAssembly.

The main difficulty is the garbage collector (GC). WebAssembly knows
"local variables" which could be seen as an open number of registers of a
classical CPU. However, there is no way to access the variables from
outside, and hence the GC has no chance of scanning them for addresses.
Because of this, we generally do not put boxed values into variables
(actually, there is one exception: the `accu` variable - leading to
special treatment here and there).

We inherit the stack of the OCaml bytecode interpreter and use it as
"shadow stack" for OCaml functions. All OCaml values are allowed on
the shadow stack because it is scannable.

We have to take care that the shadow stack is always fully initialized
with values (invalid addresses confuse the non-conservative GC).

The OCaml bytecode assumes a stack machine, and uses push/pop for
manipulating the stack. This is non-optimal for us. It is better to
access the stack positions relative to the beginning of the frame.
For this reason, the stack code is transformed to that style
(which is kind of a conversion from stack machine to register machine).

Instructions that give integer results normally do not write the
integers to the stack, but first to a local variable. Only if needed
the variable is later put onto the stack (e.g. for calling other
functions). This gives the WebAssembly engine some more freedom to
generate optimal code in the JIT phase, and chances are that
the lowest-level arithmetic is done in CPU registers.

The translation scheme is still fairly simple, and there are a number
of ideas how to improve it (see the Github issues for this project).

## Emitted code

The `Wc_emit` module returns an S-expression representing the code
in WAT format. We also write this code to a file (which is very
readable).

Unfortunately, we cannot directly feed the `.wat` file to an assembler
for further processing. We had to go another route. LLVM features an
integrated assembler, and it turned out it also understands a form
of textual WebAssembly. The S-expression is converted to this format,
resulting into an `.s` file.