{"id":13857278,"url":"https://github.com/Blunde1/agtboost","last_synced_at":"2025-07-13T21:32:09.637Z","repository":{"id":41121001,"uuid":"206168752","full_name":"Blunde1/agtboost","owner":"Blunde1","description":"Adaptive and automatic gradient boosting computations.","archived":false,"fork":false,"pushed_at":"2022-08-20T15:52:38.000Z","size":6409,"stargazers_count":66,"open_issues_count":28,"forks_count":12,"subscribers_count":4,"default_branch":"master","last_synced_at":"2024-05-19T23:37:15.449Z","etag":null,"topics":["adaptive-learning","gradient-boosting","information-theory","machine-learning"],"latest_commit_sha":null,"homepage":"","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Blunde1.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-09-03T20:42:51.000Z","updated_at":"2024-01-04T16:37:24.000Z","dependencies_parsed_at":"2022-07-13T12:20:30.107Z","dependency_job_id":null,"html_url":"https://github.com/Blunde1/agtboost","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Blunde1%2Fagtboost","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Blunde1%2Fagtboost/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Blunde1%2Fagtboost/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Blunde1%2Fagtboost/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Blunde1","download_url":"https://codeload.github.com/Blunde1/agtboost/tar.gz/refs/heads/master","host":{"name":"GitHub","url"
:"https://github.com","kind":"github","repositories_count":213988340,"owners_count":15666966,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["adaptive-learning","gradient-boosting","information-theory","machine-learning"],"created_at":"2024-08-05T03:01:32.278Z","updated_at":"2024-08-05T03:03:04.518Z","avatar_url":"https://github.com/Blunde1.png","language":"R","funding_links":[],"categories":["R"],"sub_categories":[],"readme":"\u003c!-- badges: start --\u003e\n[![Travis build status](https://travis-ci.org/Blunde1/agtboost.svg?branch=master)](https://travis-ci.org/Blunde1/agtboost)\n[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://www.tidyverse.org/lifecycle/#experimental)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![CRAN RStudio mirror downloads](https://cranlogs.r-pkg.org/badges/grand-total/agtboost?color=blue)](https://r-pkg.org/pkg/agtboost)\n---------\n\n# aGTBoost\n\n**Adaptive and automatic gradient tree boosting computations**\n\naGTBoost is a lightning-fast gradient boosting library designed to **avoid manual tuning** and **cross-validation** by utilizing an information-theoretic approach.\nThis makes the algorithm **adaptive** to the dataset at hand; it is **completely automatic**, with **minimal worries of overfitting**.\nConsequently, speed-ups relative to state-of-the-art implementations are in the thousands, while the mathematical and technical knowledge required of the user is minimized.\n\n*Note: Currently for academic purposes: 
Implementing and testing new innovations w.r.t. information-theoretic choices of GTB-complexity. See below for the to-do research list.*\n\n\n## Installation\n\n**R**: Finally on CRAN! Install the stable version with\n```r\ninstall.packages(\"agtboost\")\n```\nor install the development version from GitHub\n```r\ndevtools::install_github(\"Blunde1/agtboost/R-package\")\n```\nUsers experiencing errors after warnings during installation may be helped by running the following command prior to installation:\n\n```r\nSys.setenv(R_REMOTES_NO_ERRORS_FROM_WARNINGS=\"true\")\n```\n\n## Example code and documentation\n\n`agtboost` essentially has two functions: a train function `gbt.train` and a predict function `predict`.\nThe code below shows how to train an aGTBoost model using a design matrix `x` and a response vector `y`; write `?gbt.train` in the console for detailed documentation.\n```r\nlibrary(agtboost)\n\n# -- Load data --\ndata(caravan.train, package = \"agtboost\")\ndata(caravan.test, package = \"agtboost\")\ntrain \u003c- caravan.train\ntest \u003c- caravan.test\n\n# -- Model building --\nmod \u003c- gbt.train(train$y, train$x, loss_function = \"logloss\", verbose=10)\n\n# -- Predictions --\nprob \u003c- predict(mod, test$x) # Score after logistic transformation: Probabilities\n```\n`agtboost` also contains functions for model inspection and validation. \n\n- Feature importance: `gbt.importance` generates a typical feature importance plot. \nTechniques like inserting noise-features are redundant due to computations w.r.t. approximate generalization (test) loss.\n- Convergence: `gbt.convergence` computes the loss over the path of boosting iterations. Check visually for convergence on test loss.\n- Model validation: `gbt.ksval` transforms observations to standard uniformly distributed random variables, if the model is specified \ncorrectly. 
Performs a formal Kolmogorov-Smirnov test and plots transformed observations for visual inspection.\n```r\n# -- Feature importance --\ngbt.importance(feature_names=colnames(caravan.train$x), object=mod)\n\n# -- Model validation --\ngbt.ksval(object=mod, y=caravan.test$y, x=caravan.test$x)\n```\nThe functions `gbt.ksval` and `gbt.importance` create the following plots:\n\u003cimg src=\"docs/img/agtboost_validation.png\" width=\"700\" height=\"300\" /\u003e\n\nFurthermore, an aGTBoost model (see example code)\n\n- is highly robust to dimensions: [Comparisons to (penalized) linear regression in (very) high dimensions](R-package/demo/high-dimensions.R)\n- has minimal worries of overfitting: [Stock market classification](R-package/demo/stock-market-classification.R)\n- and can train further given previous models: [Boosting from a regularized linear model](R-package/demo/boost-from-predictions.R)\n\n\n\n## Dependencies\n\n- [My research](https://berentlunde.netlify.com/) \n- [Eigen](http://eigen.tuxfamily.org/index.php?title=Main_Page) for linear algebra\n- [Rcpp](https://github.com/RcppCore/Rcpp) for the R-package\n\n## Scheduled updates\n\n- [x] Adaptive and automatic deterministic frequentist gradient tree boosting.\n- [ ] Information criterion for fast histogram algorithm (non-exact search) (Fall 2020, planned)\n- [ ] Adaptive L2-penalized gradient tree boosting. (Fall 2020, planned)\n- [ ] Automatic stochastic gradient tree boosting. 
(Fall 2020/Spring 2021, planned)\n\n## Hopeful updates\n\n- Optimal stochastic gradient tree boosting.\n\n## References\n- [An information criterion for automatic gradient tree boosting](https://arxiv.org/abs/2008.05926)\n- [agtboost: Adaptive and Automatic Gradient Tree Boosting Computations](https://arxiv.org/abs/2008.12625)\n\n## Contribute\n\nAny help on the following subjects is especially welcome:\n\n- Utilizing sparsity (possibly Eigen sparsity).\n- Parallelization (CPU and/or GPU).\n- Distribution (Python, Java, Scala, ...).\n- Good ideas and coding best practices in general.\n\nPlease note that the priority is to work on and push the above-mentioned scheduled updates. Patience is a virtue. :)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FBlunde1%2Fagtboost","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FBlunde1%2Fagtboost","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FBlunde1%2Fagtboost/lists"}