https://github.com/zouzias/microgbt
microGBT is a minimalistic Gradient Boosting Trees implementation
- Host: GitHub
- URL: https://github.com/zouzias/microgbt
- Owner: zouzias
- License: apache-2.0
- Created: 2019-06-09T18:13:11.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2024-11-17T00:35:42.000Z (over 1 year ago)
- Last Synced: 2024-11-17T01:21:29.942Z (over 1 year ago)
- Topics: cpp11, gradient-boosting, gradient-boosting-decision-trees, xgboost-algorithm
- Language: C++
- Homepage:
- Size: 678 KB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE
# microGBT
microGBT is a minimalistic ([606 LOC](NOTES.md)) Gradient Boosting Trees implementation in C++11 following [xgboost's paper](https://arxiv.org/abs/1603.02754), i.e., the tree building process is based on the gradient and Hessian vectors (Newton-Raphson method).
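To make the Newton-style update concrete, here is a minimal sketch of the two quantities from the xgboost paper that drive tree building: the optimal leaf weight and the gain of a candidate split, both expressed through sums of gradients (`G`) and Hessians (`H`). The function names are illustrative and are not part of microGBT's API; `lambda_` and `gamma` stand for the paper's regularization terms.

```python
def leaf_weight(grad_sum, hess_sum, lambda_):
    """Optimal leaf weight w* = -G / (H + lambda)."""
    return -grad_sum / (hess_sum + lambda_)


def leaf_score(grad_sum, hess_sum, lambda_):
    """Structure-score contribution G^2 / (H + lambda) of a single leaf."""
    return grad_sum ** 2 / (hess_sum + lambda_)


def split_gain(g_left, h_left, g_right, h_right, lambda_, gamma):
    """Loss reduction obtained by splitting a node into two children."""
    parent = leaf_score(g_left + g_right, h_left + h_right, lambda_)
    children = (leaf_score(g_left, h_left, lambda_)
                + leaf_score(g_right, h_right, lambda_))
    return 0.5 * (children - parent) - gamma


# Toy numbers (illustrative only): gradients of opposite sign on each side
# make the split worthwhile.
gain = split_gain(g_left=-4.0, h_left=3.0, g_right=5.0, h_right=3.0,
                  lambda_=1.0, gamma=0.1)
print(gain)
```

A split is typically kept only when its gain is positive, which is how `gamma` acts as a pruning threshold.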
A minimal Python API is available via [pybind11](https://github.com/pybind/pybind11). Example usage:
```python
import microgbtpy

params = {
    "gamma": 0.1,
    "lambda": 1.0,
    "max_depth": 4.0,
    "shrinkage_rate": 1.0,
    "min_split_gain": 0.1,
    "learning_rate": 0.1,
    "min_tree_size": 3,
    "num_boosting_rounds": 100.0,
    "metric": 0.0,  # 0.0: logistic loss, 1.0: RMSE
}
gbt = microgbtpy.GBT(params)

# Training: X_train, y_train, X_valid, y_valid, num_iters and
# early_stopping_rounds must be defined by the caller.
gbt.train(X_train, y_train, X_valid, y_valid, num_iters, early_stopping_rounds)

# Predict using the best iteration found during training
y_pred = gbt.predict(x, gbt.best_iteration())
```
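The roles of `early_stopping_rounds` and `best_iteration()` can be illustrated with a generic early-stopping loop. This is a sketch of the common technique, not microGBT's actual implementation; `best_iteration_from_losses` is a hypothetical helper and the validation losses are made-up numbers.

```python
def best_iteration_from_losses(valid_losses, early_stopping_rounds):
    """Return the round with the lowest validation loss, stopping once no
    improvement has been seen for `early_stopping_rounds` rounds."""
    best_iter, best_loss = 0, float("inf")
    for i, loss in enumerate(valid_losses):
        if loss < best_loss:
            best_iter, best_loss = i, loss
        elif i - best_iter >= early_stopping_rounds:
            break  # validation loss has stalled; stop boosting
    return best_iter


# Made-up per-round validation losses: improvement stalls after round 2.
losses = [0.69, 0.61, 0.58, 0.59, 0.60, 0.57]
best = best_iteration_from_losses(losses, early_stopping_rounds=2)
print(best)
```

With a patience of 2, training stops before the late improvement at round 5 is ever seen, which is the usual trade-off between training time and finding the global best round.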
## Goals
The main goal of the project is to be educational and provide a minimalistic codebase that allows experimentation with Gradient Boosting Trees.
## Features
Currently, the following loss functions are supported:
* Logistic loss for binary classification, `logloss.h`
* Root Mean Squared Error (RMSE) for regression, `rmse.h`
Set the parameter `metric` to 0.0 for logistic loss or to 1.0 for RMSE.
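Since tree building relies on gradients and Hessians, each loss only has to supply those two quantities per sample. The sketch below shows the standard textbook forms for the two losses; it is not a transcription of `logloss.h` or `rmse.h`, the function names are illustrative, and the `0.5` scaling of the squared error is an assumption (other scalings change the gradient and Hessian by a constant factor).

```python
import math


def logloss_grad_hess(y_true, raw_score):
    """Gradient and Hessian of the logistic loss w.r.t. the raw score."""
    p = 1.0 / (1.0 + math.exp(-raw_score))  # predicted probability
    return p - y_true, p * (1.0 - p)


def squared_error_grad_hess(y_true, prediction):
    """Gradient and Hessian of 0.5 * (prediction - y_true)^2 (assumed scaling)."""
    return prediction - y_true, 1.0


print(logloss_grad_hess(1.0, 0.0))        # raw score 0 gives p = 0.5 -> (-0.5, 0.25)
print(squared_error_grad_hess(3.0, 2.5))  # -> (-0.5, 1.0)
```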
## Installation
To install locally, run:
```bash
pip install git+https://github.com/zouzias/microgbt.git
```
Then follow the instructions in the Binary Classification (Titanic) section to run the Titanic classification example.
## Development (docker)
```bash
git clone https://github.com/zouzias/microgbt.git
cd microgbt
docker-compose build microgbt
docker-compose run microgbt
./runBuild
```
### Binary Classification (Titanic)
A binary classification example using the [Titanic dataset](https://www.kaggle.com/naresh31/titanic-machine-learning-from-disaster). Run
```bash
cd examples/
./test-titanic.py
```
The output should include:
```
              precision    recall  f1-score   support

           0       0.75      0.96      0.84        78
           1       0.91      0.55      0.69        56

   micro avg       0.79      0.79      0.79       134
   macro avg       0.83      0.76      0.77       134
weighted avg       0.82      0.79      0.78       134
```
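For readers unfamiliar with the columns of this report, the sketch below computes precision, recall, and F1 for one class from two label lists. The labels are toy data, not the Titanic predictions, and `precision_recall_f1` is an illustrative helper rather than anything shipped with microGBT.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Per-class precision, recall, and F1 from two equal-length label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels (not the Titanic data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive=1)
print(precision, recall, f1)
```

The `support` column is simply the number of true samples of each class.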
### Regression Example (LightGBM)
To run the LightGBM regression [example](https://github.com/microsoft/LightGBM/tree/master/examples/regression), type
```bash
cd examples/
./test-lightgbm-example.py
```
The output should end with:
```
2019-05-19 22:54:04,825 - __main__ - INFO - *************[Testing]*************
2019-05-19 22:54:04,825 - __main__ - INFO - ******************************
2019-05-19 22:54:04,825 - __main__ - INFO - * [Testing]RMSE=0.447120
2019-05-19 22:54:04,826 - __main__ - INFO - * [Testing]R^2-Score=0.194094
2019-05-19 22:54:04,826 - __main__ - INFO - ******************************
```
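The two numbers reported in this log, RMSE and the R^2 score, can be computed as in the following sketch. It uses the standard definitions with toy values; it is not taken from `test-lightgbm-example.py`, and the function names are illustrative.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy targets and predictions (illustrative only).
y_true = [0.0, 1.0, 1.0, 0.0]
y_pred = [0.5, 0.5, 1.0, 0.0]
print(rmse(y_true, y_pred), r2_score(y_true, y_pred))
```

An R^2 of 0 corresponds to always predicting the mean of the targets, so values only slightly above 0 indicate a modest improvement over that baseline.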