https://github.com/nebulastream/nautilus
Nautilus is a lightweight tracing JIT compiler for C++
cpp jit-compiler llvm mlir
- Host: GitHub
- URL: https://github.com/nebulastream/nautilus
- Owner: nebulastream
- License: MIT
- Created: 2024-02-05T19:46:49.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2025-12-25T17:07:44.000Z (3 months ago)
- Last Synced: 2025-12-26T08:20:28.812Z (3 months ago)
- Topics: cpp, jit-compiler, llvm, mlir
- Language: C++
- Homepage: https://nebula.stream
- Size: 5.85 MB
- Stars: 30
- Watchers: 9
- Forks: 9
- Open Issues: 10
Metadata Files:
- Readme: README.md
- License: LICENSE
# Nautilus: A tracing JIT compiler for C++
Nautilus is a lightweight and adaptable just-in-time (JIT) compiler for C++ projects.
It offers:
1. A high-level code generation API that accommodates C++ control flows.
2. A tracing JIT compiler that produces a lightweight intermediate representation (IR) from imperative code fragments.
3. Multiple code-generation backends, allowing users to balance compilation latency and code quality at runtime (see [benchmarks](https://nebulastream.github.io/nautilus/dev/bench/)).
Nautilus is used for the query compiler of NebulaStream, a data management system from the DIMA group at TU Berlin.
Learn more about NebulaStream at https://www.nebula.stream
### Example
The example below demonstrates Nautilus with a simplified aggregation operator,
`ConditionalSum`. This function aggregates integer values based on a boolean mask.
Nautilus introduces `val<>` objects to capture all executed operations in an intermediate representation during tracing.
Depending on the execution context, it can use a bytecode interpreter or generate efficient MLIR or C++ code.
This enables Nautilus to trade off performance characteristics and to optimize the generated code for the target
hardware.
```c++
#include <cstdint>
#include <iostream>
// Nautilus headers and the matching using-directive are omitted here.

val<int32_t> conditionalSum(val<int32_t> size, val<bool*> mask, val<int32_t*> array) {
    val<int32_t> sum = 0;
    for (val<int32_t> i = 0; i < size; i++) {
        // check mask
        if (mask[i]) {
            // load value from array at position i
            val<int32_t> value = array[i];
            // add value to sum
            sum += value;
        }
    }
    return sum;
}

int main(int, char*[]) {
    engine::Options options;
    // select the C++ code-generation backend
    options.setOption("engine.backend", "cpp");
    // options.setOption("engine.Compilation", false);
    auto engine = engine::NautilusEngine(options);
    // trace and compile conditionalSum into a callable function
    auto function = engine.registerFunction(conditionalSum);

    auto mask = new bool[4] {true, true, false, true};
    auto array = new int32_t[4] {1, 2, 3, 4};
    auto result = function(4, mask, array);
    std::cout << "Result: " << result << std::endl;

    // release the input buffers
    delete[] mask;
    delete[] array;
    return 0;
}
```
### Build:
To build Nautilus from source, use CMake:
```sh
mkdir build
cd build
cmake ..
cmake --build . --target nautilus
```
### Components:
The codebase is structured in the following components:
| Component | Description |
|-----------------------------------|-----------------------------------------------------------------------------------------------------------|
| [include](nautilus/include) | Contains the public API of Nautilus, e.g., `val` objects. |
| [tracing](nautilus/src/tracing) | Hosts core functionality for tracing generic C++ code. |
| [compiler](nautilus/src/compiler) | Implements the Nautilus compiler, including its IR, optimization passes, and various generation backends. |
### Publication:
This paper discusses Nautilus's architecture and its usage in the NebulaStream query compiler.
Note that it references an earlier version of the code-generation API, which has changed.
```BibTeX
@article{10.1145/3654968,
author = {Grulich, Philipp M. and Lepping, Aljoscha P. and Nugroho, Dwi P. A. and Pandey, Varun and Del Monte, Bonaventura and Zeuch, Steffen and Markl, Volker},
title = {Query Compilation Without Regrets},
year = {2024},
issue_date = {June 2024},
volume = {2},
number = {3},
url = {https://doi.org/10.1145/3654968},
doi = {10.1145/3654968},
journal = {Proc. ACM Manag. Data},
articleno = {165},
numpages = {28},
}
```
### Related Work:
The following work is related to Nautilus and influenced our design decisions.
* [Tidy Tuples and Flying Start](https://db.in.tum.de/~kersten/Tidy%20Tuples%20and%20Flying%20Start%20Fast%20Compilation%20and%20Fast%20Execution%20of%20Relational%20Queries%20in%20Umbra.pdf):
This paper describes the low-latency query compilation approach of [Umbra](https://umbra-db.com/).
This work was one of the main motivations for the creation of the Nautilus project and its use in NebulaStream.
* [Flounder](https://vldb.org/pvldb/vol14/p2691-funke.pdf):
Flounder is a simple, low-latency JIT compiler based on [AsmJit](https://asmjit.com/) and designed for query
compilation.
* [BuildIt](https://buildit.so/):
BuildIt is a framework for developing domain-specific languages in C++.
It pioneered the capability of extracting control-flow information from imperative C++ code.
* [GraalVM](https://www.graalvm.org/):
The GraalVM project provides a framework to implement AST interpreters that can be turned into high-performance code
through partial evaluation.
* [MLIR](https://mlir.llvm.org/):
The MLIR project provides a novel approach to building reusable and extensible compiler infrastructure.
Nautilus leverages it as a foundation for its high-performance compilation backend.
* [MIR](https://github.com/vnmakarov/mir):
The MIR project provides a lightweight JIT compiler that targets low compilation latency.
Nautilus leverages MIR as a low-latency compilation backend.