{"id":22283584,"url":"https://github.com/statmixedml/py-boostlss","last_synced_at":"2026-03-09T13:36:43.642Z","repository":{"id":64719475,"uuid":"572881958","full_name":"StatMixedML/Py-BoostLSS","owner":"StatMixedML","description":"An extension of Py-Boost to probabilistic modelling","archived":false,"fork":false,"pushed_at":"2023-01-19T11:15:36.000Z","size":5319,"stargazers_count":20,"open_issues_count":0,"forks_count":0,"subscribers_count":4,"default_branch":"main","last_synced_at":"2024-05-14T00:07:16.953Z","etag":null,"topics":["distributional-regression","gamlss","gradient-boosting-machine","machine-learning","multi-target-regression","probabilistic-modeling","quantile-estimation","uncertainty-estimation"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/StatMixedML.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-12-01T08:28:36.000Z","updated_at":"2024-05-04T12:51:36.000Z","dependencies_parsed_at":"2023-01-29T11:32:44.789Z","dependency_job_id":null,"html_url":"https://github.com/StatMixedML/Py-BoostLSS","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/StatMixedML%2FPy-BoostLSS","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/StatMixedML%2FPy-BoostLSS/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/StatMixedML%2FPy-BoostLSS/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/StatMixedML%2FPy-BoostLSS/manifests","owner_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/owners/StatMixedML","download_url":"https://codeload.github.com/StatMixedML/Py-BoostLSS/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":227959920,"owners_count":17847692,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["distributional-regression","gamlss","gradient-boosting-machine","machine-learning","multi-target-regression","probabilistic-modeling","quantile-estimation","uncertainty-estimation"],"created_at":"2024-12-03T16:41:11.301Z","updated_at":"2026-03-09T13:36:43.576Z","avatar_url":"https://github.com/StatMixedML.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Py-BoostLSS: An extension of Py-Boost to probabilistic modelling\n\nWe present a probabilistic extension of the recently introduced [Py-Boost](https://github.com/sb-ai-lab/Py-Boost) approach and model all moments of a parametric multivariate distribution as functions of covariates. This allows us to create probabilistic predictions from which intervals and quantiles of interest can be derived. \n\n## Motivation\n\nExisting implementations of Gradient Boosting Machines, such as [XGBoost](https://github.com/dmlc/xgboost) and [LightGBM](https://github.com/microsoft/LightGBM), are mostly designed for single-target regression tasks. While efficient for low to medium target-dimensions, the computational cost of estimating them becomes prohibitive in high-dimensional settings. 
\n\nAs an example, consider modelling a multivariate Gaussian distribution with `D=100` target variables, where the covariance matrix is approximated using the Cholesky decomposition. Modelling all conditional moments (i.e., `D` means, `D` standard deviations and `D(D - 1)/2` pairwise correlations) requires estimation of `D(D + 3)/2 = 5,150` parameters. Because most GBM implementations are based on a *one vs. all estimation strategy*, where a separate tree is grown for each parameter, estimating this many parameters for a large dataset can become extremely expensive computationally. \n\nThe recently introduced [Py-Boost](https://github.com/sb-ai-lab/Py-Boost) approach provides a more runtime-efficient GBM implementation, making it a good candidate for estimating high-dimensional target variables in a probabilistic setting. Borrowing from the original paper [SketchBoost: Fast Gradient Boosted Decision Tree for Multioutput Problems](https://openreview.net/forum?id=WSxarC8t-T), the following figure illustrates the runtime efficiency of the Py-Boost model.\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"https://user-images.githubusercontent.com/41187941/205011855-0e06247f-609f-4c12-9c53-9e00df91b2d9.png\" width=\"350\" height=\"200\" /\u003e\n\u003c/p\u003e\n\nEven though the original implementation of Py-Boost also supports estimation of univariate responses, Py-BoostLSS focuses on multi-target probabilistic regression settings. For univariate probabilistic GBMs, we refer to our implementations of [XGBoostLSS](https://github.com/StatMixedML/XGBoostLSS) and [LightGBMLSS](https://github.com/StatMixedML/LightGBMLSS).\n\n## Installation\n\nSince Py-BoostLSS is entirely GPU-based, we first need to install the corresponding PyTorch and CuPy packages. If you are on Windows, it is preferable to install CuPy via conda. On all other operating systems, use pip. 
You can check your CUDA version with `nvcc --version`.\n\n```bash\n# CuPy (replace with your CUDA version)\n  # Windows only\n  conda install -c conda-forge cupy cudatoolkit=11.x \n  # Others\n  pip install cupy-cuda11x\n\n# PyTorch (replace with your CUDA version)\npip3 install torch --extra-index-url https://download.pytorch.org/whl/cu11x\n```\n\nNext, you can install Py-BoostLSS.\n\n```bash\npip install git+https://github.com/StatMixedML/Py-BoostLSS.git \n```\n\n## How to use\nSee the [examples section](https://github.com/StatMixedML/Py-BoostLSS/tree/main/examples) for example notebooks.\n\n## Available Distributions\nPy-BoostLSS currently supports the following distributions. More distributions will follow soon.\n\n| Distribution                                                 |  Usage          |Type                               | Support                   |\n| :----------------------------------------------------------: |:--------------: |:--------------------------------: | :-----------------------: | \n| Multivariate Normal \u003cbr /\u003e (Cholesky)                        | `MVN()`         | Continuous \u003cbr /\u003e (Multivariate)  | $y \\in (-\\infty,\\infty)$  | \n| Multivariate Normal \u003cbr /\u003e (Low-Rank Approximation)          | `MVN_LRA()`     | Continuous \u003cbr /\u003e (Multivariate)  | $y \\in (-\\infty,\\infty)$  | \n| Multivariate Student-T \u003cbr /\u003e (Cholesky)                     | `MVT()`         | Continuous \u003cbr /\u003e (Multivariate)  | $y \\in (-\\infty,\\infty)$  | \n| Dirichlet                                                    | `DIRICHLET()`   | Continuous \u003cbr /\u003e (Multivariate)  | $y \\in [0,1]$             | \n\n\n\u003c!---\n| Distribution                               |  Usage  |Type                               | Support                   | Location                   | Scale                      | Shape | Correlation          |\n| :----------------------------------------: 
|:-------:|:--------------------------------: | :-----------------------: | :------------------------: | :------------------------: | :---: | :-------------------:| \n| Multivariate Normal \u003cbr /\u003e (Cholesky)      | `MVN()` | Continuous \u003cbr /\u003e (Multivariate)  | $y \\in (-\\infty,\\infty)$  | $\\mu \\in (-\\infty,\\infty)$ | $\\sigma \\in (0,\\infty)$    | None  | $\\rho \\in [-1,1]$    |\n|     ...                                    |   ...   |      ...                          |    ...                    |     ...                    |       ...                  |  ...  |  ...                 |\n---\u003e\n\n\n\n\n## Feedback\nPlease open a new issue or use the discussion section to provide feedback on how to improve Py-BoostLSS or to request additional distributions.\n\n\n## Acknowledgements\n\nThe implementation of Py-BoostLSS relies on the following resources:\n\n- [Py-boost: a research tool for exploring GBDTs](https://github.com/sb-ai-lab/Py-Boost)\n- [SketchBoost: Fast Gradient Boosted Decision Tree for Multioutput Problems](https://openreview.net/forum?id=WSxarC8t-T)\n\nWe genuinely thank the original authors [Anton Vakhrushev](https://www.kaggle.com/btbpanda) and [Leonid Iosipoi](http://iosipoi.com/) for making their work publicly available. \n\n## Reference Paper\n[![Arxiv link](https://img.shields.io/badge/arXiv-Multi%20Target%20XGBoostLSS%20Regression-color=brightgreen)](https://arxiv.org/abs/2210.06831) \u003cbr/\u003e\n[![Arxiv link](https://img.shields.io/badge/arXiv-Distributional%20Gradient%20Boosting%20Machines-color=brightgreen)](https://arxiv.org/abs/2204.00778) \u003cbr/\u003e\n[![Arxiv link](https://img.shields.io/badge/arXiv-XGBoostLSS%3A%20An%20extension%20of%20XGBoost%20to%20probabilistic%20forecasting-color=brightgreen)](https://arxiv.org/abs/1907.03178) \u003cbr/\u003e\n\n\u003c!---\nMärz, Alexander (2022) [*Multi-Target XGBoostLSS Regression*](https://arxiv.org/abs/2210.06831). 
\u003cbr/\u003e\n\u003cbr /\u003e\nMärz, A. and Kneib, T. (2022) [*\"Distributional Gradient Boosting Machines\"*](https://arxiv.org/abs/2204.00778). \u003cbr/\u003e\n\u003cbr /\u003e\nMärz, Alexander (2019) [*XGBoostLSS - An extension of XGBoost to probabilistic forecasting*](https://arxiv.org/abs/1907.03178).\n---\u003e\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstatmixedml%2Fpy-boostlss","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstatmixedml%2Fpy-boostlss","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstatmixedml%2Fpy-boostlss/lists"}