{"id":13665869,"url":"https://github.com/ecpolley/SuperLearner","last_synced_at":"2025-04-26T08:33:32.129Z","repository":{"id":56934962,"uuid":"1622048","full_name":"ecpolley/SuperLearner","owner":"ecpolley","description":"Current version of the SuperLearner R package","archived":false,"fork":false,"pushed_at":"2024-02-19T19:35:42.000Z","size":881,"stargazers_count":275,"open_issues_count":18,"forks_count":73,"subscribers_count":17,"default_branch":"master","last_synced_at":"2025-04-10T12:40:16.624Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ecpolley.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2011-04-16T05:18:51.000Z","updated_at":"2025-03-28T15:26:31.000Z","dependencies_parsed_at":"2022-08-21T05:50:08.772Z","dependency_job_id":"1df12d09-3b80-4d8a-a4de-9971e30f5597","html_url":"https://github.com/ecpolley/SuperLearner","commit_stats":{"total_commits":403,"total_committers":12,"mean_commits":"33.583333333333336","dds":0.575682382133995,"last_synced_commit":"801aa6039460648d4dfd87c1fad77e5f29391cb7"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ecpolley%2FSuperLearner","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ecpolley%2FSuperLearner/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ecpolley%2FSuperLearner/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repos
itories/ecpolley%2FSuperLearner/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ecpolley","download_url":"https://codeload.github.com/ecpolley/SuperLearner/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250961152,"owners_count":21514591,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-02T06:00:52.577Z","updated_at":"2025-04-26T08:33:27.119Z","avatar_url":"https://github.com/ecpolley.png","language":"R","readme":"# SuperLearner: Prediction model ensembling method\n\n[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/SuperLearner)](http://cran.r-project.org/web/packages/SuperLearner)\n[![Downloads](http://cranlogs.r-pkg.org/badges/SuperLearner)](http://cran.rstudio.com/package=SuperLearner)\n[![codecov](https://codecov.io/gh/ecpolley/SuperLearner/branch/master/graph/badge.svg)](https://codecov.io/gh/ecpolley/SuperLearner)\n\nThis is the current version of the SuperLearner R package (version 2.*).\n\n**Features**\n\n* Automatic optimal predictor ensembling via cross-validation with one line of code.\n* Dozens of algorithms: XGBoost, Random Forest, GBM, Lasso, SVM, BART, KNN, Decision Trees, Neural Networks, and more.\n* Integrates with [caret](http://github.com/topepo/caret) to support even more algorithms.\n* Includes framework to quickly add custom algorithms to the ensemble.\n* Visualize the performance of each algorithm using built-in plotting.\n* Easily check multiple hyperparameter configurations for each algorithm in the ensemble.\n* Add new 
algorithms or change the default parameters for existing ones.\n* Screen variables (feature selection) based on univariate association, Random Forest, Elastic Net, and others, or with custom screening algorithms.\n* Multicore and multinode parallelization for scalability.\n* External cross-validation to estimate the performance of the ensembling predictor.\n* Ensemble can optimize for any target metric: mean-squared error, AUC, log likelihood, etc.\n* Includes framework to provide custom loss functions and stacking algorithms.\n\n### Install the development version from GitHub:\n\n```r\n# install.packages(\"remotes\")\nremotes::install_github(\"ecpolley/SuperLearner\")\n```\n\n### Install the current release from CRAN:\n```r\ninstall.packages(\"SuperLearner\")\n```\n\n[devtools]: https://github.com/hadley/devtools\n[remotes]: https://cran.r-project.org/web/packages/remotes/index.html\n[CRAN]: https://cran.r-project.org/web/packages/SuperLearner/index.html\n\n## Examples\n\nSuperLearner makes it trivial to run many algorithms and use the best one or an ensemble.\n\n```r\nlibrary(SuperLearner)\n\ndata(Boston, package = \"MASS\")\n\nset.seed(1)\n\nsl_lib = c(\"SL.xgboost\", \"SL.randomForest\", \"SL.glmnet\", \"SL.nnet\", \"SL.ksvm\",\n           \"SL.bartMachine\", \"SL.kernelKnn\", \"SL.rpartPrune\", \"SL.lm\", \"SL.mean\")\n\n# Fit XGBoost, RF, Lasso, Neural Net, SVM, BART, K-nearest neighbors, Decision Tree,\n# OLS, and simple mean; create automatic ensemble.\nresult = SuperLearner(Y = Boston$medv, X = Boston[, -14], SL.library = sl_lib)\n\n# Review performance of each algorithm and ensemble weights.\nresult\n\n# Use external (aka nested) cross-validation to estimate ensemble accuracy.\n# This will take a while to run.\nresult2 = CV.SuperLearner(Y = Boston$medv, X = Boston[, -14], SL.library = sl_lib)\n\n# Plot performance of individual algorithms and compare to the ensemble.\n# theme_minimal() comes from ggplot2, so qualify it unless ggplot2 is attached.\nplot(result2) + ggplot2::theme_minimal()\n\n# Hyperparameter optimization --\n# Fit elastic net with 5 different alphas: 0, 0.25, 
0.5, 0.75, 1.0.\n# 0 corresponds to ridge and 1 to lasso.\nenet = create.Learner(\"SL.glmnet\", detailed_names = TRUE,\n                      tune = list(alpha = seq(0, 1, length.out = 5)))\n\nsl_lib2 = c(\"SL.mean\", \"SL.lm\", enet$names)\n\nenet_sl = SuperLearner(Y = Boston$medv, X = Boston[, -14], SL.library = sl_lib2)\n\n# Identify the best-performing alpha value or use the automatic ensemble.\nenet_sl\n```\n\nFor more detailed examples, please review the vignette:\n\n```r\nvignette(package = \"SuperLearner\")\n```\n\n## References\n\nPolley EC, van der Laan MJ (2010) Super Learner in Prediction. U.C. Berkeley Division of Biostatistics Working Paper Series. Paper 266. \u003chttp://biostats.bepress.com/ucbbiostat/paper266/\u003e\n\nvan der Laan, M. J., Polley, E. C. and Hubbard, A. E. (2007) Super Learner. Statistical Applications in Genetics and Molecular Biology, 6, article 25. \u003chttp://www.degruyter.com/view/j/sagmb.2007.6.issue-1/sagmb.2007.6.1.1309/sagmb.2007.6.1.1309.xml\u003e\n\nvan der Laan, M. J., \u0026 Rose, S. (2011). Targeted learning: causal inference for observational and experimental data. Springer Science \u0026 Business Media.\n","funding_links":[],"categories":["R","Machine Learning","[](https://github.com/josephmisiti/awesome-machine-learning/blob/master/README.md#r)R"],"sub_categories":["General-Purpose Machine Learning"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fecpolley%2FSuperLearner","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fecpolley%2FSuperLearner","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fecpolley%2FSuperLearner/lists"}