https://github.com/perpetual-ml/perpetual

A self-generalizing gradient boosting machine that doesn't need hyperparameter optimization
gbdt gbm gradient-boosted-trees gradient-boosting gradient-boosting-decision-trees kaggle machine-learning python rust

Host: GitHub
URL: https://github.com/perpetual-ml/perpetual
Owner: perpetual-ml
License: gpl-3.0
Created: 2024-05-20T14:14:55.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-06-15T10:26:33.000Z (4 months ago)
Last Synced: 2025-06-15T11:27:58.152Z (4 months ago)
Topics: gbdt, gbm, gradient-boosted-trees, gradient-boosting, gradient-boosting-decision-trees, kaggle, machine-learning, python, rust
Language: Rust
Homepage: https://perpetual-ml.com/
Size: 780 KB
Stars: 506
Watchers: 8
Forks: 26
Open Issues: 7
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE

Awesome Lists containing this project

awesome-rust - perpetual-ml/perpetual - A self-generalizing gradient boosting machine which doesn't need hyperparameter optimization. (Libraries / Artificial Intelligence)
fucking-awesome-rust - perpetual-ml/perpetual - A self-generalizing gradient boosting machine which doesn't need hyperparameter optimization. (Libraries / Artificial Intelligence)
fucking-awesome-datascience - PerpetualBooster
awesome-datascience - PerpetualBooster

README

[![Python Versions](https://img.shields.io/pypi/pyversions/perpetual.svg?logo=python&logoColor=white)](https://pypi.org/project/perpetual)

[![PyPI Version](https://img.shields.io/pypi/v/perpetual.svg?logo=pypi&logoColor=white)](https://pypi.org/project/perpetual)

[![Crates.io Version](https://img.shields.io/crates/v/perpetual?logo=rust&logoColor=white)](https://crates.io/crates/perpetual)

[![Static Badge](https://img.shields.io/badge/join-discord-blue?logo=discord)](https://discord.gg/AyUK7rr6wy)

![PyPI - Downloads](https://img.shields.io/pypi/dm/perpetual)



# Perpetual

PerpetualBooster is a gradient boosting machine (GBM) algorithm that, unlike other GBM algorithms, doesn't need hyperparameter optimization. Similar to AutoML libraries, it has a `budget` parameter. Increasing `budget` increases the predictive power of the algorithm and gives better results on unseen data. Start with a small budget (e.g. 0.5) and increase it (e.g. to 1.0) once you are confident in your features. If increasing `budget` further brings no improvement, you are already extracting the most predictive power from your data.
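The "start small, then increase" advice can be sketched as a simple loop. This is a minimal illustration, not part of the library: `pick_budget`, `fit_and_score`, `min_improvement`, and the toy scoring function are all hypothetical names introduced here. In practice `fit_and_score` would train `PerpetualBooster(objective=..., budget=budget)` on your training split and return a validation loss.

```python
from typing import Callable, Iterable, Optional, Tuple


def pick_budget(
    fit_and_score: Callable[[float], float],
    budgets: Iterable[float] = (0.5, 1.0, 1.5, 2.0),
    min_improvement: float = 1e-3,
) -> Tuple[Optional[float], float]:
    """Raise the budget until the validation loss stops improving.

    `fit_and_score` is a hypothetical user-supplied callable: it trains a
    model at the given budget and returns a validation loss (lower is better).
    """
    best_budget, best_loss = None, float("inf")
    for budget in budgets:
        loss = fit_and_score(budget)
        if best_loss - loss < min_improvement:
            break  # the larger budget brought no meaningful gain
        best_budget, best_loss = budget, loss
    return best_budget, best_loss


# Toy stand-in for "train and evaluate": the loss plateaus at budget 1.5.
def toy_fit_and_score(budget: float) -> float:
    return max(0.185, 0.200 - 0.01 * budget)


chosen_budget, chosen_loss = pick_budget(toy_fit_and_score)
print(chosen_budget, round(chosen_loss, 3))
```

With the toy function the loop stops at the first budget that no longer lowers the loss. The `min_improvement` threshold is an arbitrary choice and should be set relative to the noise in your own validation metric.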

## Usage

Use the algorithm as in the example below. See the examples folders for both Rust and Python.

```python
from perpetual import PerpetualBooster

# X is the feature matrix and y the target values from your own training data.
model = PerpetualBooster(objective="SquaredLoss", budget=0.5)
model.fit(X, y)
```

## Documentation

Documentation for the Python API can be found [here](https://perpetual-ml.github.io/perpetual) and for the Rust API [here](https://docs.rs/perpetual/latest/perpetual/).

## Benchmark

### PerpetualBooster vs. Optuna + LightGBM

Hyperparameter optimization usually takes 100 iterations with plain GBM algorithms. PerpetualBooster achieves the same accuracy in a single run, which gives up to a 100x speed-up at the same accuracy across different `budget` levels and datasets.
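The arithmetic behind the 100x figure can be sketched as follows. The symbols are illustrative assumptions, not quantities from the benchmark scripts: $N$ is the number of tuning trials, $\bar{t}_{\text{trial}}$ is the average duration of one baseline trial, and $t_{\text{perpetual}}$ is the duration of a single PerpetualBooster run.

```latex
\text{speed-up} = \frac{N \cdot \bar{t}_{\text{trial}}}{t_{\text{perpetual}}},
\qquad
N = 100,\quad t_{\text{perpetual}} = \bar{t}_{\text{trial}}
\;\Longrightarrow\; \text{speed-up} = 100
```

Under this reading, 100x corresponds to a PerpetualBooster run that costs about as much as one baseline trial. A run that takes longer than one trial gives a proportionally smaller speed-up.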

The following table summarizes the results for the [California Housing](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_california_housing.html) dataset (regression):

| Perpetual budget | LightGBM n_estimators | Perpetual MSE | LightGBM MSE | Speed-up wall time | Speed-up CPU time |
| ---------------- | --------------------- | ------------- | ------------ | ------------------ | ----------------- |
| 1.0              | 100                   | 0.192         | 0.192        | 54x                | 56x               |
| 1.5              | 300                   | 0.188         | 0.188        | 59x                | 58x               |
| 2.1              | 1000                  | 0.185         | 0.186        | 42x                | 41x               |

The following table summarizes the results for the [Cover Types](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_covtype.html) dataset (classification):

| Perpetual budget | LightGBM n_estimators | Perpetual log loss | LightGBM log loss | Speed-up wall time | Speed-up CPU time |
| ---------------- | --------------------- | ------------------ | ----------------- | ------------------ | ----------------- |
| 0.9              | 100                   | 0.091              | 0.084             | 72x                | 78x               |

The results can be reproduced using the scripts in the [examples](./python-package/examples) folder.

### PerpetualBooster vs. AutoGluon

PerpetualBooster is a GBM but behaves like AutoML, so it is also benchmarked against AutoGluon (v1.2, best quality preset), the current leader in the [AutoML benchmark](https://automlbenchmark.streamlit.app/cd_diagram). For each of the regression and classification tasks, the 10 datasets with the most rows were selected from [OpenML datasets](https://www.openml.org/).

The results are summarized in the following table for regression tasks:

| OpenML Task                                              | Perpetual Training Duration | Perpetual Inference Duration | Perpetual RMSE | AutoGluon Training Duration | AutoGluon Inference Duration | AutoGluon RMSE |
| -------------------------------------------------------- | ----- | ----- | ------ | ----- | ----- | ------ |
| [Airlines_DepDelay_10M](https://www.openml.org/t/359929) | 518   | 11.3  | 29.0   | 520   | 30.9  | 28.8   |
| [bates_regr_100](https://www.openml.org/t/361940)        | 3421  | 15.1  | 1.084  | OOM   | OOM   | OOM    |
| [BNG(libras_move)](https://www.openml.org/t/7327)        | 1956  | 4.2   | 2.51   | 1922  | 97.6  | 2.53   |
| [BNG(satellite_image)](https://www.openml.org/t/7326)    | 334   | 1.6   | 0.731  | 337   | 10.0  | 0.721  |
| [COMET_MC](https://www.openml.org/t/14949)               | 44    | 1.0   | 0.0615 | 47    | 5.0   | 0.0662 |
| [friedman1](https://www.openml.org/t/361939)             | 275   | 4.2   | 1.047  | 278   | 5.1   | 1.487  |
| [poker](https://www.openml.org/t/10102)                  | 38    | 0.6   | 0.256  | 41    | 1.2   | 0.722  |
| [subset_higgs](https://www.openml.org/t/361955)          | 868   | 10.6  | 0.420  | 870   | 24.5  | 0.421  |
| [BNG(autoHorse)](https://www.openml.org/t/7319)          | 107   | 1.1   | 19.0   | 107   | 3.2   | 20.5   |
| [BNG(pbc)](https://www.openml.org/t/7318)                | 48    | 0.6   | 836.5  | 51    | 0.2   | 957.1  |
| average                                                  | 465   | 3.9   | -      | 464   | 19.7  | -      |

PerpetualBooster outperformed AutoGluon on 8 out of 10 regression tasks, training equally fast and inferring 5.1x faster. The averages cover the 9 tasks that both frameworks completed.
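The summary figures can be re-derived from the regression table with a few lines of Python. This is a minimal check using only the numbers printed above. It rests on two assumptions of this sketch: the averages skip the task where AutoGluon ran out of memory, and an out-of-memory run counts as a PerpetualBooster win.

```python
# (perpetual_inference, autogluon_inference, perpetual_rmse, autogluon_rmse),
# copied from the regression table; None marks an AutoGluon out-of-memory run.
results = {
    "Airlines_DepDelay_10M": (11.3, 30.9, 29.0, 28.8),
    "bates_regr_100": (15.1, None, 1.084, None),
    "BNG(libras_move)": (4.2, 97.6, 2.51, 2.53),
    "BNG(satellite_image)": (1.6, 10.0, 0.731, 0.721),
    "COMET_MC": (1.0, 5.0, 0.0615, 0.0662),
    "friedman1": (4.2, 5.1, 1.047, 1.487),
    "poker": (0.6, 1.2, 0.256, 0.722),
    "subset_higgs": (10.6, 24.5, 0.420, 0.421),
    "BNG(autoHorse)": (1.1, 3.2, 19.0, 20.5),
    "BNG(pbc)": (0.6, 0.2, 836.5, 957.1),
}

# Average only over the tasks that both frameworks finished.
both = [r for r in results.values() if r[1] is not None]
perpetual_avg = round(sum(r[0] for r in both) / len(both), 1)
autogluon_avg = round(sum(r[1] for r in both) / len(both), 1)
speedup = autogluon_avg / perpetual_avg

# Assumption: a lower RMSE, or an AutoGluon out-of-memory run, is a Perpetual win.
wins = sum(1 for r in results.values() if r[3] is None or r[2] < r[3])

print(f"avg inference: {perpetual_avg} vs {autogluon_avg}, "
      f"speed-up {speedup:.1f}x, wins {wins}/{len(results)}")
```

The 5.1x figure is the ratio of the rounded averages (19.7 / 3.9). Dividing the unrounded averages gives a value closer to 5.05.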

The results are summarized in the following table for classification tasks:

| OpenML Task                                              | Perpetual Training Duration | Perpetual Inference Duration | Perpetual AUC | AutoGluon Training Duration | AutoGluon Inference Duration | AutoGluon AUC |
| -------------------------------------------------------- | ------- | ----- | ----- | ------- | ----- | ----- |
| [BNG(spambase)](https://www.openml.org/t/146163)         | 70.1    | 2.1   | 0.671 | 73.1    | 3.7   | 0.669 |
| [BNG(trains)](https://www.openml.org/t/208)              | 89.5    | 1.7   | 0.996 | 106.4   | 2.4   | 0.994 |
| [breast](https://www.openml.org/t/361942)                | 13699.3 | 97.7  | 0.991 | 13330.7 | 79.7  | 0.949 |
| [Click_prediction_small](https://www.openml.org/t/7291)  | 89.1    | 1.0   | 0.749 | 101.0   | 2.8   | 0.703 |
| [colon](https://www.openml.org/t/361938)                 | 12435.2 | 126.7 | 0.997 | 12356.2 | 152.3 | 0.997 |
| [Higgs](https://www.openml.org/t/362113)                 | 3485.3  | 40.9  | 0.843 | 3501.4  | 67.9  | 0.816 |
| [SEA(50000)](https://www.openml.org/t/230)               | 21.9    | 0.2   | 0.936 | 25.6    | 0.5   | 0.935 |
| [sf-police-incidents](https://www.openml.org/t/359994)   | 85.8    | 1.5   | 0.687 | 99.4    | 2.8   | 0.659 |
| [bates_classif_100](https://www.openml.org/t/361941)     | 11152.8 | 50.0  | 0.864 | OOM     | OOM   | OOM   |
| [prostate](https://www.openml.org/t/361945)              | 13699.9 | 79.8  | 0.987 | OOM     | OOM   | OOM   |
| average                                                  | 3747.0  | 34.0  | -     | 3699.2  | 39.0  | -     |

PerpetualBooster outperformed AutoGluon on 10 out of 10 classification tasks, training equally fast and inferring 1.1x faster. The averages cover the 8 tasks that both frameworks completed.

PerpetualBooster also proved more robust than AutoGluon: it trained successfully on all 20 tasks, whereas AutoGluon ran out of memory (OOM) on 3 of them.

The results can be reproduced using the automlbenchmark fork [here](https://github.com/deadsoul44/automlbenchmark).

## Installation

The package can be installed directly from [PyPI](https://pypi.org/project/perpetual):

```shell
pip install perpetual
```

Using [conda-forge](https://anaconda.org/conda-forge/perpetual):

```shell
conda install conda-forge::perpetual
```

To use it in a Rust project, get the package from [crates.io](https://crates.io/crates/perpetual):

```shell
cargo add perpetual
```

## Contribution

Contributions are welcome. See CONTRIBUTING.md for the guidelines.

## Paper

PerpetualBooster prevents overfitting with a generalization algorithm. A paper explaining how the algorithm works is in progress. See our [blog post](https://perpetual-ml.com/blog/how-perpetual-works) for a high-level introduction to the algorithm.
