https://github.com/jserv/amacc

Small C compiler generating ELF executables for the Arm architecture, supporting JIT execution
# AMaCC = Arguably Minimalist Arm C Compiler

## Introduction
AMaCC is a 32-bit Arm architecture compiler built from scratch.
It compiles a stripped-down version of C and is designed as a pedagogical tool for
learning about compilers, linkers, and loaders.

AMaCC implements two execution modes:
* Just-in-Time (JIT) compilation for the Arm backend.
* Generation of valid GNU/Linux executables using the Executable and Linkable Format (ELF).
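
To give a feel for the ELF side, here is a minimal sketch that fills in an ELF32 file header for a little-endian Arm executable using the definitions from glibc's `<elf.h>`. The helper name `fill_arm_ehdr()` and the `SKETCH_FLOAT_HARD` macro are made up for this illustration and are not taken from AMaCC; the real `elf32()` routine also has to emit program headers and the data needed for dynamic linking.

```c
#include <elf.h>
#include <string.h>

/* Hypothetical name; the value is assumed to match EF_ARM_ABI_FLOAT_HARD. */
#define SKETCH_FLOAT_HARD 0x400

/* Fill in an ELF32 file header for a little-endian Arm executable.
 * `entry` and `phnum` are chosen by the caller; no sections are described. */
void fill_arm_ehdr(Elf32_Ehdr *h, Elf32_Addr entry, Elf32_Half phnum)
{
    memset(h, 0, sizeof(*h));
    memcpy(h->e_ident, ELFMAG, SELFMAG);   /* "\177ELF" magic bytes */
    h->e_ident[EI_CLASS] = ELFCLASS32;     /* 32-bit object */
    h->e_ident[EI_DATA] = ELFDATA2LSB;     /* little-endian */
    h->e_ident[EI_VERSION] = EV_CURRENT;
    h->e_ident[EI_OSABI] = ELFOSABI_SYSV;
    h->e_type = ET_EXEC;                   /* executable file */
    h->e_machine = EM_ARM;
    h->e_version = EV_CURRENT;
    h->e_entry = entry;                    /* address of the first instruction */
    h->e_phoff = sizeof(Elf32_Ehdr);       /* program headers follow the header */
    h->e_flags = EF_ARM_EABI_VER5 | SKETCH_FLOAT_HARD;
    h->e_ehsize = sizeof(Elf32_Ehdr);
    h->e_phentsize = sizeof(Elf32_Phdr);
    h->e_phnum = phnum;
}
```

Writing the filled header to a file, followed by the program headers and the generated code, is what turns an in-memory code buffer into something the Linux loader accepts.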

AMaCC is designed to compile the subset of C needed to self-host in both
execution modes. For instance, it supports global variables, particularly
global arrays.

A simple stack-based Abstract Syntax Tree (AST) is generated through cooperative
`stmt()` and `expr()` parsing functions, both fed by a token-generating function.
The `expr()` function performs some literal constant optimizations. The AST is
transformed into a stack-based VM Intermediate Representation (IR) using the
`gen()` function. The IR can be examined via a command-line option. Finally, the
`codegen()` function generates Arm32 instructions from the IR, which can be
executed via either `jit()` or `elf32()` executable generation.
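
As a rough illustration of what a stack-based IR looks like, the sketch below runs a flat opcode stream on an operand stack. The opcode names (`OP_PUSH`, `OP_ADD`, and so on), `run_ir()`, and `sample_ir` are hypothetical and do not reflect AMaCC's actual IR, which is described in docs/IR.md.

```c
#include <stddef.h>

/* Hypothetical opcodes for illustration only. */
enum op { OP_PUSH, OP_ADD, OP_SUB, OP_MUL, OP_HALT };

/* Run a flat opcode stream on an operand stack and return the top value.
 * There is no bounds checking: the stream is assumed to be well formed. */
int run_ir(const int *code)
{
    int stack[64];
    size_t sp = 0; /* number of values currently on the stack */

    for (;;) {
        switch (*code++) {
        case OP_PUSH:
            stack[sp++] = *code++; /* the operand follows the opcode */
            break;
        case OP_ADD:
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_SUB:
            sp--;
            stack[sp - 1] -= stack[sp];
            break;
        case OP_MUL:
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
        default:
            return sp ? stack[sp - 1] : 0;
        }
    }
}

/* IR for the expression 2 + 3 * 4, in postfix order. */
const int sample_ir[] = {
    OP_PUSH, 2, OP_PUSH, 3, OP_PUSH, 4, OP_MUL, OP_ADD, OP_HALT
};
```

Calling `run_ir(sample_ir)` evaluates the multiplication first and then the addition. A code generator walks the same stream but emits machine instructions for each opcode instead of executing it.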

AMaCC combines classical recursive descent and operator precedence parsing. For
expressions, an operator precedence parser is considerably faster than a pure
recursive descent parser (RDP), which defines each precedence level as a grammar
production and turns every production into a separate function.
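
The combination can be sketched as follows: recursive descent handles primaries (literals and parentheses), while a single loop driven by a precedence table handles all binary operators. This is a simplified evaluator written for this README, not code from `amacc.c`; the names `eval_expr()`, `parse_expr()`, `parse_primary()`, and `prec()` are made up for the example.

```c
#include <ctype.h>

static const char *p; /* cursor into the expression being parsed */

static void skip_ws(void)
{
    while (isspace((unsigned char) *p))
        p++;
}

/* Binding power of a binary operator, or 0 if c is not an operator. */
static int prec(char c)
{
    switch (c) {
    case '+': case '-': return 1;
    case '*': case '/': return 2;
    default: return 0;
    }
}

static int parse_expr(int min_prec);

/* Recursive descent part: integer literals and parenthesized expressions. */
static int parse_primary(void)
{
    int v = 0;
    skip_ws();
    if (*p == '(') {
        p++;
        v = parse_expr(1);
        skip_ws();
        if (*p == ')')
            p++;
        return v;
    }
    while (isdigit((unsigned char) *p))
        v = v * 10 + (*p++ - '0');
    return v;
}

/* Operator precedence part: one loop instead of one function per level. */
static int parse_expr(int min_prec)
{
    int lhs = parse_primary();
    for (;;) {
        skip_ws();
        char op = *p;
        int pr = prec(op);
        if (pr == 0 || pr < min_prec)
            return lhs;
        p++;
        int rhs = parse_expr(pr + 1); /* pr + 1 makes operators left-associative */
        switch (op) {
        case '+': lhs += rhs; break;
        case '-': lhs -= rhs; break;
        case '*': lhs *= rhs; break;
        case '/': lhs = rhs ? lhs / rhs : 0; break; /* sketch: x / 0 yields 0 */
        }
    }
}

/* Evaluate an integer expression with + - * / and parentheses. */
int eval_expr(const char *src)
{
    p = src;
    return parse_expr(1);
}
```

Adding a new binary operator only requires a new entry in `prec()` and a new `case` in the loop, whereas a pure recursive descent parser would need an extra function for each new precedence level.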

## Compatibility
AMaCC is capable of compiling C source files written in the following
syntax:

* support for all C89 statements except typedef.
* support for all C89 expression operators.
* data types: char, int, enum, struct, union, and multi-level pointers
  - type modifiers, qualifiers, and storage class specifiers are
    currently unsupported, though many keywords of this nature
    are not routinely used, and can be easily worked around with
    simple alternative constructs.
  - struct/union assignments are not supported at the language level
    in AMaCC, e.g. `s1 = s2`. This also applies to function return
    values and parameters. Passing and returning pointers is recommended.
    Use `memcpy` if you want to copy a full struct, e.g.
    `memcpy(&s1, &s2, sizeof(struct xxx));`
* global/local variable initializations for supported data types
  - e.g., `int i = [expr]`
  - New variables are allowed to be declared within functions anywhere.
  - item-by-item array initialization is supported
  - but aggregate array declaration and initialization is yet to be supported,
    e.g., `int foo[2][2] = { { 1, 0 }, { 0, 1 } };`
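
The workarounds mentioned above can be sketched like this. The snippet is written to stay inside the subset described in this section (no qualifiers, no struct assignment, no aggregate initializers), but it has not been checked against AMaCC itself; `struct point`, `copy_point()`, and `init_identity()` are names invented for the example.

```c
#include <string.h>

struct point {
    int x;
    int y;
};

/* Copy a struct through pointers, because `*dst = *src` is not supported. */
int copy_point(struct point *dst, struct point *src)
{
    memcpy(dst, src, sizeof(struct point));
    return 0;
}

/* Fill a 2x2 identity matrix (stored as 4 ints) element by element,
 * instead of using an aggregate initializer such as { { 1, 0 }, { 0, 1 } }. */
int init_identity(int *m)
{
    m[0] = 1;
    m[1] = 0;
    m[2] = 0;
    m[3] = 1;
    return 0;
}
```

Both helpers take pointers rather than struct or array values, which matches the recommendation to pass and return pointers.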

AMaCC targets armv7hf with the Linux ABI and has been verified on
Raspberry Pi 2/3/4 running GNU/Linux.

## Prerequisites
* The code generator in AMaCC relies on several GNU/Linux behaviors, so
  Arm/Linux must be installed in your build environment.
* Install [GNU Toolchain for the A-profile Architecture](https://developer.arm.com/tools-and-software/open-source-software/developer-tools/gnu-toolchain/gnu-a/downloads)
  - Select `arm-none-linux-gnueabihf` (AArch32 target with hard float)

* Install QEMU for Arm user emulation
```shell
sudo apt-get install qemu-user
```

## Running AMaCC
Run `make check` and you should see this:
```
[ C to IR translation ] Passed
[ JIT compilation + execution ] Passed
[ ELF generation ] Passed
[ nested/self compilation ] Passed
[ Compatibility with GCC/Arm ] ........................................
----------------------------------------------------------------------
Ran 52 tests in 8.842s

OK
```

Check the messages generated by `make help` to learn more.

## Benchmark
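AMaCC generates machine code quickly: in the measurement below it compiles its own source in 0.0217 s, against 0.4884 s for `gcc -O0 -c`. The generated code provides 70% of the performance of `gcc -O0`.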

Test environment:
* Raspberry Pi 4B (SoC: bcm2711, ARMv8-A architecture)
* Raspbian GNU/Linux, kernel 5.10.17-v7l+, gcc 8.3.0 (armv7l userland)

Input source file: `amacc.c`

| compiler driver | binary size (KiB) | compile time (s) |
| ---------------------------------- | ----------------- | ---------------- |
| gcc with `-O0 -ldl` (compile+link) | 56 | 0.5683 |
| gcc with `-O0 -c` (compile only) | 56 | 0.4884 |
| AMaCC | 100 | 0.0217 |

## Internals
Check [Intermediate Representation (IR) for AMaCC Compilation](docs/IR.md).

## Acknowledgements
AMaCC is based on the infrastructure of [c4](https://github.com/rswier/c4).

## Related Materials
* [Curated list of awesome resources on Compilers, Interpreters and Runtimes](http://aalhour.com/awesome-compilers/)
* [Hacker News discussions](https://news.ycombinator.com/item?id=11411124)
* [A Compiler Writing Journey](https://github.com/DoctorWkt/acwj) by Warren Toomey.