https://github.com/siboehm/lleaves
Compiler for LightGBM gradient-boosted trees, based on LLVM. Speeds up prediction by ≥10x.
- Host: GitHub
- URL: https://github.com/siboehm/lleaves
- Owner: siboehm
- License: MIT
- Created: 2021-04-27T06:35:23.000Z (about 4 years ago)
- Default Branch: master
- Last Pushed: 2025-05-05T23:52:48.000Z (26 days ago)
- Last Synced: 2025-05-14T11:11:05.241Z (17 days ago)
- Topics: decision-trees, gradient-boosting, lightgbm, llvm, machine-learning, python
- Language: Python
- Homepage: https://lleaves.readthedocs.io/en/latest/
- Size: 4.75 MB
- Stars: 416
- Watchers: 9
- Forks: 32
- Open Issues: 22
Metadata Files:
- Readme: README.md
- License: LICENSE
- Citation: CITATION.cff
README
# lleaves 🍃

[Documentation](https://lleaves.readthedocs.io/en/latest/?badge=latest)
An LLVM-based compiler for LightGBM decision trees.

`lleaves` converts trained LightGBM models to optimized machine code, speeding up prediction by ≥10x.
## Example
```python
import lightgbm
import lleaves

# df: a pandas DataFrame holding the model's input features
lgbm_model = lightgbm.Booster(model_file="NYC_taxi/model.txt")
%timeit lgbm_model.predict(df)
# 12.77s

llvm_model = lleaves.Model(model_file="NYC_taxi/model.txt")
llvm_model.compile()
%timeit llvm_model.predict(df)
# 0.90s
```

## Why lleaves?
- Speed: Supports both low-latency single-row prediction and high-throughput batch prediction.
- Drop-in replacement: The interface of `lleaves.Model` is a subset of `LightGBM.Booster` (see the sketch after this list).
- Dependencies: `llvmlite` and `numpy`. LLVM comes statically linked.
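A minimal sketch of the drop-in usage, assuming that `lleaves.Model.predict` accepts a 2D array of shape `(n_rows, n_features)` just like `LightGBM.Booster.predict`; the model path and feature count are placeholders:

```python
import numpy as np
import lleaves

# Placeholder model path; compile once, then reuse the model for all calls.
llvm_model = lleaves.Model(model_file="NYC_taxi/model.txt")
llvm_model.compile()

# Low-latency single-row prediction: one row, n_features columns.
n_features = 18  # placeholder; use your model's actual feature count
row = np.random.rand(1, n_features)
single_pred = llvm_model.predict(row)

# High-throughput batch prediction: many rows in one call.
batch = np.random.rand(10_000, n_features)
batch_preds = llvm_model.predict(batch)
```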
## Installation

`conda install -c conda-forge lleaves` or `pip install lleaves` (Linux and macOS only).

## Benchmarks
Benchmarks were run on a dedicated Intel i7-4770 (Haswell, 4 cores).
The stated runtime is the minimum over 20,000 runs.

### Dataset: [NYC-taxi](https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page)

Mostly numerical features.
|batchsize | 1 | 10| 100 |
|---|---:|---:|---:|
|LightGBM | 52.31μs | 84.46μs | 441.15μs |
|ONNX Runtime| 11.00μs | 36.74μs | 190.87μs |
|Treelite | 28.03μs | 40.81μs | 94.14μs |
|``lleaves`` | 9.61μs | 14.06μs | 31.88μs |

### Dataset: [MTPL2](https://www.openml.org/d/41214)

Mix of categorical and numerical features.
|batchsize | 10,000 | 100,000 | 678,000 |
|---|---:|---:|---:|
|LightGBM | 95.14ms | 992.47ms | 7034.65ms |
|ONNX Runtime | 38.83ms | 381.40ms | 2849.42ms |
|Treelite | 38.15ms | 414.15ms | 2854.10ms |
|``lleaves`` | 5.90ms | 56.96ms | 388.88ms |

## Advanced Usage
To avoid expensive recompilation, you can call `lleaves.Model.compile()` and pass a `cache=` argument.
This will store an ELF (Linux) / Mach-O (macOS) file at the given path when the method is first called.
Subsequent calls of `compile(cache=)` will skip compilation and load the stored binary file instead.
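A minimal sketch of the caching workflow; `NYC_taxi/model_cache.o` is a hypothetical cache path:

```python
import lleaves

llvm_model = lleaves.Model(model_file="NYC_taxi/model.txt")

# First call: compiles the model and stores the native binary
# (ELF on Linux, Mach-O on macOS) at the given path.
llvm_model.compile(cache="NYC_taxi/model_cache.o")

# In a later process: compilation is skipped and the cached binary is loaded instead.
llvm_model = lleaves.Model(model_file="NYC_taxi/model.txt")
llvm_model.compile(cache="NYC_taxi/model_cache.o")
```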
For more info, see the [docs](https://lleaves.readthedocs.io/en/latest/).

To eliminate any Python overhead during inference, you can link against this generated binary.
For an example of how to do this, see `benchmarks/c_bench/`.
The function signature might change between major versions.

## Development
For a high-level explanation of the inner workings of the lleaves compiler, see [this article](https://siboehm.com/articles/21/lleaves).
```bash
mamba env create
conda activate lleaves
pip install -e .
pre-commit install
./benchmarks/data/setup_data.sh
pytest -k "not benchmark"
```

## Cite
If you're using lleaves for your research, I'd appreciate it if you could cite it. Use:
```
@software{Boehm_lleaves,
  author = {Boehm, Simon},
  title = {lleaves},
  url = {https://github.com/siboehm/lleaves},
  license = {MIT},
}
```