https://github.com/btcross26/genestboost
General boosting framework for any regression estimator
data-science gradient-boosting machine-learning python3
- Host: GitHub
- URL: https://github.com/btcross26/genestboost
- Owner: btcross26
- License: bsd-3-clause
- Created: 2019-02-25T11:10:40.000Z (almost 6 years ago)
- Default Branch: main
- Last Pushed: 2022-02-28T02:34:48.000Z (almost 3 years ago)
- Last Synced: 2024-10-23T02:19:20.847Z (2 months ago)
- Topics: data-science, gradient-boosting, machine-learning, python3
- Language: Python
- Homepage: https://btcross26.github.io/genestboost/build/html/index.html
- Size: 13.3 MB
- Stars: 2
- Watchers: 3
- Forks: 0
- Open Issues: 9
Metadata Files:
- Readme: README.rst
- Changelog: changelog.rst
- License: LICENSE.txt
.. README.rst
.. image:: https://img.shields.io/badge/python-3.7-green.svg
    :target: https://www.python.org
.. image:: https://img.shields.io/badge/python-3.8-green.svg
    :target: https://www.python.org
.. image:: https://img.shields.io/badge/python-3.9-green.svg
    :target: https://www.python.org
.. image:: https://img.shields.io/badge/code%20style-black-000000.svg
    :target: https://github.com/psf/black
.. image:: https://github.com/btcross26/genestboost/workflows/build_tests/badge.svg
    :target: https://github.com/btcross26/genestboost/actions/build_tests
.. image:: https://img.shields.io/badge/License-BSD%203--Clause-blue.svg
    :target: https://opensource.org/licenses/BSD-3-Clause
.. image:: https://badge.fury.io/py/genestboost.svg
    :target: https://pypi.python.org/pypi/genestboost
.. image:: https://img.shields.io/conda/vn/conda-forge/genestboost.svg
    :target: https://anaconda.org/conda-forge/genestboost

|

.. image:: https://user-images.githubusercontent.com/7505706/120132584-968dd780-c198-11eb-8843-55bc23310657.png

`Documentation Home `__ | `Quick Coding Example`_ | `Additional Examples`_ | `Limitations`_ | `Installation`_ | `Changelog `__

:code:`genestboost` is an ML boosting library that separates the modeling algorithm from the boosting algorithm, so you can boost any generic regression model, not just trees. Build a forward-thinking (forward-propagating) neural network, or an ensemble of support vector machines, if you wish. Mix and match link and loss functions at will.
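To make the separation concrete, here is a minimal sketch of the general idea, written with NumPy only. It is not the library's implementation or API: the names ``boost``, ``decision_function``, and ``LeastSquares`` are hypothetical, and the sketch assumes log loss under a logit link with a fixed learning rate. The boosting loop only requires a weak learner that exposes ``fit`` and ``predict``, so any regressor with that contract could be swapped in.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def log_loss(y, p, eps=1e-12):
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))


class LeastSquares:
    """Hypothetical stand-in weak learner: least squares with an intercept."""

    def fit(self, X, y):
        A = np.column_stack([np.ones(len(X)), X])
        self.coef_ = np.linalg.lstsq(A, y, rcond=None)[0]
        return self

    def predict(self, X):
        A = np.column_stack([np.ones(len(X)), X])
        return A @ self.coef_


def boost(X, y, make_model, n_rounds=25, learning_rate=0.5):
    """Boost any fit/predict regressor on log loss under a logit link."""
    # Start every score at the logit of the base rate.
    base = np.log(y.mean() / (1.0 - y.mean()))
    scores = np.full(len(y), base)
    models = []
    losses = [log_loss(y, sigmoid(scores))]
    for _ in range(n_rounds):
        # Negative gradient of the per-observation log loss
        # with respect to the logit score is y - p.
        residuals = y - sigmoid(scores)
        model = make_model().fit(X, residuals)
        scores = scores + learning_rate * model.predict(X)
        models.append(model)
        losses.append(log_loss(y, sigmoid(scores)))
    return base, models, losses


def decision_function(X, base, models, learning_rate=0.5):
    """Sum the scaled weak-learner predictions on the link (logit) scale."""
    return base + learning_rate * sum(m.predict(X) for m in models)


# Synthetic binary target for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
y = (rng.random(400) < sigmoid(X @ np.array([1.5, -2.0, 0.5]))).astype(float)

base, models, losses = boost(X, y, LeastSquares)
probabilities = sigmoid(decision_function(X, base, models))
```

Passing a different class or factory as ``make_model`` changes the modeling algorithm without touching the boosting loop, which is the design point the paragraph above describes.
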
Quick Coding Example
--------------------

Boost simple neural networks to predict a binary target:

.. code-block:: python

    from sklearn.neural_network import MLPRegressor
    from sklearn.datasets import make_classification
    import matplotlib.pyplot as plt

    from genestboost import BoostedModel
    from genestboost.loss_functions import LogLoss
    from genestboost.link_functions import LogitLink

    # generate a dummy dataset - the library expects numpy arrays of dtype float
    X, y = make_classification(
        n_samples=10000,
        n_features=50,
        n_informative=30,
        weights=[0.90, 0.10],
        random_state=17,
    )

    # create a boosted model instance
    model = BoostedModel(
        link=LogitLink(),                  # link function to use
        loss=LogLoss(),                    # loss function to use
        model_callback=MLPRegressor,       # callback creates model with fit, predict
        model_callback_kwargs={            # keyword arguments to the callback
            "hidden_layer_sizes": (16,),
            "max_iter": 1000,
            "alpha": 0.2,
        },
        weights="newton",                  # newton = scale gradients with second derivatives
        alpha=1.0,                         # initial learning rate to try
        step_type="decaying",              # learning rate type
        step_decay_factor=0.50,            # learning rate decay factor
        validation_fraction=0.20,          # fraction of training set to use for holdout
        validation_iter_stop=5,            # stopping criteria
        validation_stratify=True,          # stratify the holdout set by the target (classification)
    )

    # fit the model
    model.fit(X, y, min_iterations=10, iterations=100)

    # evaluate the model
    print(model.get_iterations())
    predictions = model.predict(X)         # predicted y's (probabilities in this case)
    scores = model.decision_function(X)    # predicted links (logits in this case)
    plt.plot(model.get_loss_history(), label=["Training", "Holdout"])
    plt.legend(loc="best")

Additional Examples
-------------------
- `Quantile Regression with Different Modeling Algorithms `_
- `Binary Target Boosting with Custom Model Callback Wrapper `_
- `BoostedLinearModel with SimplePLS Algorithm `_
- `Alternative Fitting Procedure with Surrogate Loss Function `_
- `Forward Propagating Neural Network `_

Limitations
-----------

Separating the boosting and modeling algorithms may not yield optimal training and prediction speeds. The tool is also written in pure Python, for now. Thus, in its current state, the library is primarily intended for research and development. In particular, the library classes can be easily extended to handle custom loss functions and custom link functions. The library can also serve as a foundation for more specialized boosting algorithms when the need to optimize for performance arises.

In the future, the library will be restructured slightly under the hood, and there are plans to parallelize ensemble prediction and move some performance bottlenecks to Nim (i.e., C-extensions). Support for boosting of multivariate targets will be added when time permits.
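As a rough illustration of what a custom loss and link pair has to supply, the sketch below defines a log link and a squared-error loss as stand-alone classes and combines their derivatives with the chain rule. The class and method names (``LogLink``, ``SquaredError``, ``inverse_derivative``, ``score_gradient``) are hypothetical and are not the library's base-class interface; consult the documentation for the actual extension points.

```python
import numpy as np


class LogLink:
    """Hypothetical log link: eta = log(mu), so mu = exp(eta)."""

    def link(self, mu):
        return np.log(mu)

    def inverse(self, eta):
        return np.exp(eta)

    def inverse_derivative(self, eta):
        # d(mu)/d(eta) for mu = exp(eta)
        return np.exp(eta)


class SquaredError:
    """Hypothetical loss: 0.5 * (mu - y) ** 2 per observation."""

    def value(self, y, mu):
        return 0.5 * (mu - y) ** 2

    def derivative(self, y, mu):
        # dL/d(mu)
        return mu - y


def score_gradient(loss, link, y, eta):
    """Chain rule: dL/d(eta) = dL/d(mu) * d(mu)/d(eta)."""
    mu = link.inverse(eta)
    return loss.derivative(y, mu) * link.inverse_derivative(eta)


loss, link = SquaredError(), LogLink()
y = np.array([1.0, 2.0, 4.0])
eta = np.array([0.2, 0.5, 1.0])

analytic = score_gradient(loss, link, y, eta)

# Central finite difference on the link scale as an independent check.
h = 1e-6
numeric = (
    loss.value(y, link.inverse(eta + h)) - loss.value(y, link.inverse(eta - h))
) / (2 * h)
```

The gradient on the link scale is the quantity a weak learner would be fit to at each boosting round, so checking it against a finite difference is a cheap way to validate a new loss or link before using it.
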
Installation
------------

Create a virtual environment with Python >=3.7,<=3.9, and install from git:

.. code-block:: bash

    $ pip install git+https://github.com/btcross26/genestboost.git

Alternatively, you can install directly from PyPI:

.. code-block:: bash

    $ pip install genestboost

Or from conda-forge:

.. code-block:: bash

    $ conda install -c conda-forge genestboost

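After installing, a short check along the following lines can confirm that the interpreter falls within the supported range and that the package is importable. This is a sketch using only the standard library; the helper names ``python_supported`` and ``is_installed`` are hypothetical and not part of :code:`genestboost`.

```python
import sys
from importlib import util


def python_supported(version_info=None, low=(3, 7), high=(3, 9)):
    """Return True if the (major, minor) version lies within [low, high]."""
    if version_info is None:
        version_info = sys.version_info
    return low <= tuple(version_info[:2]) <= high


def is_installed(package):
    """Return True if a top-level package can be located for import."""
    return util.find_spec(package) is not None


if __name__ == "__main__":
    print("Python version supported:", python_supported())
    print("genestboost importable:", is_installed("genestboost"))
```
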
Documentation
-------------

Documentation is a work in progress. The most recent documentation is available on `GitHub Pages `_.

Bugs / Requests
---------------

Please use the `GitHub issue tracker `_ to submit bugs or request features.

Changelog
---------

Consult the `Changelog `_ for the latest release information.

Contributing
------------

If you would like to contribute, please fork this repository, create a branch off of :code:`main` for your contribution, and submit a PR to the :code:`dev_staging` branch. Also, please create an issue describing the nature of the contribution if it has not already been done.