https://github.com/rapidsai/legate-boost
GBM implementation on Legate
- Host: GitHub
- URL: https://github.com/rapidsai/legate-boost
- Owner: rapidsai
- License: apache-2.0
- Created: 2023-06-21T14:23:26.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2025-10-02T21:54:52.000Z (3 months ago)
- Last Synced: 2025-10-02T23:36:12.301Z (3 months ago)
- Language: Python
- Homepage: https://rapidsai.github.io/legate-boost/
- Size: 1.98 MB
- Stars: 13
- Watchers: 6
- Forks: 11
- Open Issues: 22
Metadata Files:
- Readme: README.md
- Contributing: contributing.md
- License: LICENSE
README
# legate-boost
GBM implementation on Legate. The primary goal of `legate-boost` is to provide a state-of-the-art distributed GBM implementation capable of running on CPUs or GPUs at supercomputer scale.
[API Documentation](https://rapidsai.github.io/legate-boost)
For developers, see [contributing](contributing.md)
## Installation
Install using `conda`:
```shell
# stable release
conda install -c legate -c conda-forge -c nvidia legate-boost
# nightly release
conda install -c legate/label/experimental -c legate -c conda-forge -c nvidia legate-boost
```
On systems without a GPU, the CPU-only package should be installed automatically; on systems with a GPU and a compatible CUDA version, the GPU package should be installed instead. To force `conda` to prefer one, pass the build string `*_cpu*` or `*_gpu*`, for example:
```shell
# nightly release (CPU-only)
conda install --dry-run -c legate/label/experimental -c legate -c conda-forge -c nvidia \
'legate-boost=*=*_cpu*'
```
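The GPU package can be selected the same way with the `*_gpu*` build string:
```shell
# nightly release (GPU)
conda install --dry-run -c legate/label/experimental -c legate -c conda-forge -c nvidia \
'legate-boost=*=*_gpu*'
```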
For more details on building from source and setting up a development environment, see [`contributing.md`](./contributing.md).
## Simple example
Run with the `legate` launcher:
```bash
legate example_script.py
```
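The launcher also controls the compute resources the script runs with. A minimal sketch, assuming the standard Legate driver flags (verify with `legate --help`):
```bash
# request two GPUs from the Legate driver (flag name assumed, not part of legate-boost)
legate --gpus 2 example_script.py
```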
```python
>>> import cupynumeric as cn
>>> import legateboost as lb
>>> X = cn.random.random((1000, 10))
>>> y = cn.random.random(X.shape[0])
>>> model = lb.LBRegressor().fit(X, y)
```
## Features
### Model ensembling
`legate-boost` can create models from linear combinations of other models. Ensembling is as easy as:
```python
>>> import cupynumeric as cn
>>> import legateboost as lb
>>> X = cn.random.random((1000, 10))
>>> X_train_a = X[:500]
>>> X_train_b = X[500:]
>>> y = cn.random.random(X.shape[0])
>>> y_train_a = y[:500]
>>> y_train_b = y[500:]
>>> model_a = lb.LBRegressor().fit(X_train_a, y_train_a)
>>> len(model_a)
100
>>> model_b = lb.LBRegressor().fit(X_train_b, y_train_b)
>>> len(model_b)
100
>>> model_c = (model_a + model_b) * 0.5
>>> len(model_c)
200
```
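The combined model behaves like any other estimator; because models combine linearly, the 0.5 weighting means `model_c`'s output should match the average of the two constituent models' predictions:
```python
>>> pred_c = model_c.predict(X)  # average of model_a's and model_b's predictions
```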
### Probabilistic regression
`legate-boost` can learn distributions for continuous data. This is useful when simply predicting the mean does not carry enough information about the training data.
The above example can be found here: [examples/probabilistic_regression](https://github.com/rapidsai/legate-boost/tree/main/examples/probabalistic_regression).
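A minimal sketch of what this looks like, assuming the `objective="normal"` option; the exact objective name and the layout of the returned distribution parameters should be checked against the linked example:
```python
>>> import cupynumeric as cn
>>> import legateboost as lb
>>> X = cn.random.random((1000, 1))
>>> y = cn.random.random(X.shape[0])
>>> # objective="normal" is an assumption: fit a per-example normal
>>> # distribution instead of a point estimate of the mean
>>> model = lb.LBRegressor(objective="normal").fit(X, y)
>>> pred = model.predict(X)  # distribution parameters, one row per example
```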
### Batch training
`legate-boost` can train on datasets that do not fit into memory by splitting the dataset into batches and training the model with `partial_fit`.
```python
>>> import cupynumeric as cn
>>> import legateboost as lb
>>> from sklearn.utils import gen_even_slices
>>> X = cn.random.random((1000, 10))
>>> y = cn.random.random(X.shape[0])
>>> total_estimators = 100
>>> estimators_per_batch = 10
>>> n_batches = total_estimators // estimators_per_batch
>>> train_batches = [(X[i], y[i]) for i in gen_even_slices(X.shape[0], n_batches)]
>>> model = lb.LBRegressor(n_estimators=estimators_per_batch)
>>> for i in range(total_estimators // estimators_per_batch):
... X_batch, y_batch = train_batches[i % n_batches]
... model = model.partial_fit(X_batch, y_batch)
```
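Once all batches have been consumed, the model is used exactly like one trained with a single `fit` call:
```python
>>> pred = model.predict(X)
```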

The above example can be found here: [examples/batch_training](https://github.com/rapidsai/legate-boost/tree/main/examples/batch_training).
### Different model types
`legate-boost` supports tree models, linear models, kernel ridge regression models, custom user-defined models, and any combination of these.
The following example shows a model combining linear and decision tree base learners on a synthetic dataset.
```python
params = {}  # stands in for any extra LBRegressor arguments
model = lb.LBRegressor(
    base_models=(lb.models.Linear(), lb.models.Tree(max_depth=1)), **params
).fit(X, y)
```

The second example shows a model combining kernel ridge regression and decision tree base learners on the wine quality dataset.
```python
model = lb.LBRegressor(
    base_models=(lb.models.KRR(sigma=0.5), lb.models.Tree(max_depth=5)), **params
).fit(X, y)
```