https://github.com/NVlabs/gbrl

Gradient Boosting Reinforcement Learning (GBRL)

Host: GitHub
URL: https://github.com/NVlabs/gbrl
Owner: NVlabs
License: mit
Created: 2024-06-02T16:20:40.000Z (almost 2 years ago)
Default Branch: master
Last Pushed: 2025-11-09T18:38:22.000Z (5 months ago)
Last Synced: 2026-01-14T12:18:48.669Z (3 months ago)
Language: C++
Homepage: https://nvlabs.github.io/gbrl/
Size: 8.14 MB
Stars: 133
Watchers: 7
Forks: 6
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Cla: CLA.md


# Gradient Boosting Reinforcement Learning (GBRL)

GBRL is a Python-based Gradient Boosting Trees (GBT) library, similar to popular packages such as [XGBoost](https://xgboost.readthedocs.io/en/stable/) and [CatBoost](https://catboost.ai/), but specifically designed and optimized for reinforcement learning (RL). GBRL is implemented in C++/CUDA and aims to integrate seamlessly with popular RL libraries.

[![License](https://img.shields.io/badge/license-MIT-green.svg)](https://github.com/NVlabs/gbrl/blob/master/LICENSE)

[![PyPI version](https://badge.fury.io/py/gbrl.svg)](https://badge.fury.io/py/gbrl)

## Overview

GBRL adapts the power of Gradient Boosting Trees to the unique challenges of RL environments, including non-stationarity and the absence of predefined targets. The following diagram illustrates how GBRL uses gradient boosting trees in RL:

![GBRL Diagram](https://github.com/NVlabs/gbrl/raw/master/docs/images/gbrl_diagram.png)

GBRL features a shared tree-based structure for policy and value functions, significantly reducing memory and computational overhead, enabling it to tackle complex, high-dimensional RL problems.
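
To make the shared-structure idea concrete, here is a minimal pure-Python sketch. It is not GBRL's API: `SharedStump` and `actor_critic_forward` are hypothetical names, and the leaf layout (policy logits followed by one value estimate) is an assumption chosen for illustration. The point is only that one traversal of each tree yields both the policy and the value outputs.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SharedStump:
    """Hypothetical depth-1 tree whose leaves hold policy logits AND a value."""
    feature: int             # index of the feature tested at the root
    threshold: float         # go right when x[feature] > threshold
    left_leaf: List[float]   # assumed layout: [logit_0, ..., logit_{A-1}, value]
    right_leaf: List[float]  # same layout as left_leaf

    def predict(self, x: List[float]) -> List[float]:
        return self.right_leaf if x[self.feature] > self.threshold else self.left_leaf


def actor_critic_forward(
    trees: List[SharedStump], x: List[float], n_actions: int
) -> Tuple[List[float], float]:
    """Sum leaf vectors over the ensemble, then split into logits and value."""
    out = [0.0] * (n_actions + 1)
    for tree in trees:
        leaf = tree.predict(x)  # one traversal serves both actor and critic
        out = [o + v for o, v in zip(out, leaf)]
    return out[:n_actions], out[n_actions]


trees = [
    SharedStump(0, 0.5, [0.2, -0.2, 1.0], [-0.1, 0.1, 3.0]),
    SharedStump(1, 0.0, [0.0, 0.3, -0.5], [0.4, 0.0, 0.5]),
]
logits, value = actor_critic_forward(trees, x=[0.9, -1.0], n_actions=2)
print(logits, value)
```

With separate ensembles, the split structure would be stored and traversed twice; sharing it means only the leaf vectors grow with the number of outputs.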

## Key Features

- GBT Tailored for RL: GBRL adapts the power of Gradient Boosting Trees to the unique challenges of RL environments, including non-stationarity and the absence of predefined targets.

- Optimized Actor-Critic Architecture: GBRL features a shared tree-based structure for policy and value functions. This significantly reduces memory and computational overhead, enabling it to tackle complex, high-dimensional RL problems.

- Hardware Acceleration: GBRL leverages CUDA for hardware-accelerated computation, ensuring efficiency and speed.

- Seamless Integration: GBRL is designed for easy integration with popular RL libraries. GBT-based actor-critic algorithms (A2C, PPO, and AWR) are implemented on top of stable_baselines3 in the [GBRL_SB3](https://github.com/NVlabs/gbrl_sb3) repository.
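
The "absence of predefined targets" point can be illustrated with a generic functional gradient boosting step: instead of fixed labels, each new tree is fit to the negative gradient of the current loss with respect to the ensemble's outputs, so the regression targets change at every iteration. The sketch below is plain Python under stated assumptions, not GBRL code: `fit_stump` and `boost_step` are hypothetical helpers, and the squared-error gradient stands in for the gradient of an RL objective such as PPO.

```python
from typing import Callable, List, Tuple

Stump = Tuple[float, float, float]  # (threshold, left_value, right_value)


def fit_stump(xs: List[float], targets: List[float]) -> Stump:
    """Best single-threshold split of 1-D inputs under squared error."""
    mean = sum(targets) / len(targets)
    best = (float("inf"), xs[0], mean, mean)  # fallback: no valid split
    for t in sorted(set(xs))[:-1]:
        left = [g for x, g in zip(xs, targets) if x <= t]
        right = [g for x, g in zip(xs, targets) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((g - lm) ** 2 for g in left) + sum((g - rm) ** 2 for g in right)
        if sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1], best[2], best[3]


def boost_step(
    xs: List[float],
    preds: List[float],
    grad_fn: Callable[[List[float]], List[float]],
    lr: float,
) -> List[float]:
    """Fit one stump to the negative gradient and add it to the predictions."""
    grads = grad_fn(preds)  # dLoss/dPrediction, recomputed every iteration
    t, lv, rv = fit_stump(xs, [-g for g in grads])
    return [p + lr * (lv if x <= t else rv) for x, p in zip(xs, preds)]


# Stand-in objective: 0.5 * (pred - ret)^2 per sample, so gradient = pred - ret.
xs = [0.0, 1.0, 2.0, 3.0]
returns = [1.0, 1.0, 3.0, 3.0]
grad_fn = lambda preds: [p - r for p, r in zip(preds, returns)]

preds = [0.0] * len(xs)
for _ in range(10):
    preds = boost_step(xs, preds, grad_fn, lr=0.5)
print(preds)
```

In an RL setting the gradient function would depend on freshly collected rollouts, which is where the non-stationarity comes from.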

## Performance

The following results, obtained using the `GBRL_SB3` repository, demonstrate the performance of PPO with GBRL compared to neural networks across various scenarios and environments:

![PPO GBRL results in stable_baselines3](https://github.com/NVlabs/gbrl/raw/master/docs/images/relative_ppo_performance.png)

## Getting started

### Dependencies

- Python 3.9 or higher

### Installation

GBRL provides pre-compiled binaries for easy installation. Choose **one** of the following options:

**CPU-only installation** (default):  

```pip install gbrl```

**GPU-enabled installation** (requires CUDA 12 runtime libraries):  

```pip install gbrl-gpu```

For further installation details and dependencies, see the [documentation](https://nvlabs.github.io/gbrl/).
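
Since only one of the two packages should be installed, a quick check of which distribution is present can be useful. The snippet below uses only the standard library (`importlib.metadata`) and the distribution names from the `pip` commands above; `installed_version` is a hypothetical helper, not part of GBRL.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


for name in ("gbrl", "gbrl-gpu"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```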

### Usage Example

For a detailed usage example, see `tutorial.ipynb`.

## Current Supported Features

### Tree Fitting

- Greedy (Depth-wise) tree building - (CPU/GPU)  

- Oblivious (Symmetric) tree building - (CPU/GPU)  

- L2 split score - (CPU/GPU)  

- Cosine split score - (CPU/GPU) 

- Uniform based candidate generation - (CPU/GPU)

- Quantile based candidate generation - (CPU/GPU)

- Supervised learning fitting / Multi-iteration fitting - (CPU/GPU)

    - MultiRMSE loss (only)

- Categorical inputs

- Input feature weights - (CPU/GPU)
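
To illustrate the difference between the two candidate-generation strategies and how an L2-style split score ranks thresholds, here is a small pure-Python sketch. It reflects common textbook formulations rather than GBRL's exact implementation: the function names are hypothetical, the quantile rule is a simple nearest-rank choice, and the score is the standard squared-error gain term `sum(left)^2 / n_left + sum(right)^2 / n_right` for scalar gradients.

```python
from typing import List


def uniform_candidates(values: List[float], n: int) -> List[float]:
    """n thresholds evenly spaced strictly between min and max."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / (n + 1)
    return [lo + step * (i + 1) for i in range(n)]


def quantile_candidates(values: List[float], n: int) -> List[float]:
    """n thresholds taken at evenly spaced ranks of the sorted values."""
    s = sorted(values)
    return [s[min(len(s) - 1, (i + 1) * len(s) // (n + 1))] for i in range(n)]


def l2_split_score(xs: List[float], grads: List[float], threshold: float) -> float:
    """Squared-error gain term for splitting at x <= threshold (higher is better)."""
    left = [g for x, g in zip(xs, grads) if x <= threshold]
    right = [g for x, g in zip(xs, grads) if x > threshold]
    if not left or not right:
        return float("-inf")  # a split must put samples on both sides
    return sum(left) ** 2 / len(left) + sum(right) ** 2 / len(right)


# One outlier stretches the range, so uniform thresholds all land in empty space.
xs = [0.0, 1.0, 2.0, 3.0, 100.0]
grads = [1.0, 1.0, -1.0, -1.0, -1.0]

uniform = uniform_candidates(xs, 3)
quantile = quantile_candidates(xs, 3)
best = max(quantile, key=lambda t: l2_split_score(xs, grads, t))
print(uniform, quantile, best)
```

In this toy example every uniform candidate isolates only the outlier, while the quantile candidates follow the data density and recover the split between the positive and negative gradients.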

### GBT Inference

- SGD optimizer - (CPU/GPU)

- ADAM optimizer - (CPU only)

- Control Variates (gradient variance reduction technique) - (CPU only)

- Shared Tree for policy and value function - (CPU/GPU)

- Linear and constant learning rate schedulers - (CPU: both, GPU: constant only)

- Support for up to two different optimizers (e.g., policy/value) - **(CPU/GPU if both are SGD)**

- SHAP value calculation
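
As a rough illustration of the scheduler and two-optimizer items above, the sketch below applies separate SGD-style learning rates to the policy and value parts of a new tree's leaf vector, with a linear schedule on the policy side and a constant rate on the value side. This is an assumption-laden sketch, not GBRL's implementation: `linear_lr` and `scale_leaf` are hypothetical helpers, and the leaf layout (policy outputs first, value last) is assumed.

```python
from typing import List


def linear_lr(init_lr: float, final_lr: float, progress: float) -> float:
    """Interpolate the learning rate; progress is 0.0 at start, 1.0 at end."""
    progress = min(max(progress, 0.0), 1.0)
    return init_lr + (final_lr - init_lr) * progress


def scale_leaf(
    leaf: List[float], n_policy: int, policy_lr: float, value_lr: float
) -> List[float]:
    """Scale the first n_policy entries by policy_lr and the rest by value_lr."""
    return [v * (policy_lr if i < n_policy else value_lr) for i, v in enumerate(leaf)]


policy_lr = linear_lr(0.1, 0.0, progress=0.5)  # halfway through training
value_lr = 0.01                                # constant schedule
scaled = scale_leaf([0.2, -0.2, 1.0], n_policy=2, policy_lr=policy_lr, value_lr=value_lr)
print(policy_lr, scaled)
```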

# Documentation 

For comprehensive documentation, visit the [GBRL documentation](https://nvlabs.github.io/gbrl/).

# Contributing

To contribute to GBRL, please review and sign the Contributor License Agreement (CLA) available at: [https://github.com/NVlabs/gbrl/blob/master/CLA.md](https://github.com/NVlabs/gbrl/blob/master/CLA.md)

# Citation

```
@inproceedings{fuhrer2025gradient,
  title={Gradient Boosting Reinforcement Learning},
  author={Benjamin Fuhrer and Chen Tessler and Gal Dalal},
  booktitle={Forty-second International Conference on Machine Learning},
  year={2025},
  url={https://arxiv.org/abs/2407.08250}
}
```

# Licenses

Copyright © 2024-2025, NVIDIA Corporation. All rights reserved.

This work is made available under the MIT License. Click [here](https://github.com/NVlabs/gbrl/blob/master/LICENSE) to view a copy of this license.
