{"id":23100663,"url":"https://github.com/benkeser/cvma","last_synced_at":"2025-08-16T14:31:55.919Z","repository":{"id":129606438,"uuid":"104618760","full_name":"benkeser/cvma","owner":"benkeser","description":"Cross-validation-based maximal associations ","archived":false,"fork":false,"pushed_at":"2019-04-03T15:08:56.000Z","size":174,"stargazers_count":2,"open_issues_count":1,"forks_count":3,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-04-04T12:52:45.732Z","etag":null,"topics":["canonical-correlation-analysis","cross-validation","machine-learning","multivariate-analysis","stacked-ensembles"],"latest_commit_sha":null,"homepage":null,"language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/benkeser.png","metadata":{"files":{"readme":"README.Rmd","changelog":"NEWS.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-09-24T04:30:59.000Z","updated_at":"2023-12-19T17:41:05.000Z","dependencies_parsed_at":"2023-03-23T16:33:32.183Z","dependency_job_id":null,"html_url":"https://github.com/benkeser/cvma","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/benkeser/cvma","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benkeser%2Fcvma","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benkeser%2Fcvma/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benkeser%2Fcvma/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benkeser%2Fcvma/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts
/GitHub/owners/benkeser","download_url":"https://codeload.github.com/benkeser/cvma/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benkeser%2Fcvma/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270723211,"owners_count":24634339,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-16T02:00:11.002Z","response_time":91,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["canonical-correlation-analysis","cross-validation","machine-learning","multivariate-analysis","stacked-ensembles"],"created_at":"2024-12-16T23:33:33.734Z","updated_at":"2025-08-16T14:31:55.900Z","avatar_url":"https://github.com/benkeser.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\noutput: \n    github_document\n---\n\u003c!-- README.md is generated from README.Rmd. 
Please edit that file --\u003e\n\n```{r, echo = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"README-\"\n)\n```\n\n# R/`cvma`\n\n[![Travis-CI Build Status](https://travis-ci.org/benkeser/cvma.svg?branch=master)](https://travis-ci.org/benkeser/cvma)\n[![AppVeyor Build  Status](https://ci.appveyor.com/api/projects/status/github/benkeser/cvma?branch=master\u0026svg=true)](https://ci.appveyor.com/project/benkeser/cvma)\n[![Coverage Status](https://img.shields.io/codecov/c/github/benkeser/cvma/master.svg)](https://codecov.io/github/benkeser/cvma?branch=master)\n[![MIT license](http://img.shields.io/badge/license-MIT-brightgreen.svg)](http://opensource.org/licenses/MIT)\n\n\u003e Machine learning-based summary of association with multivariate outcomes\n\n__Authors:__ [David Benkeser](https://www.benkeserstatistics.com/) and [Ivana Malenica](https://github.com/podTockom)\n\n## Introduction\n\nThis package provides a method for summarizing the strength of association between a set of variables and a multivariate outcome. In particular, cross-validation is combined with stacked regression (aka super learning) to estimate the convex combination of a multivariate outcome that maximizes cross-validated R-squared of a super learner-based prediction. The method is particularly well suited for situations with high-dimensional covariates and/or complex relationships between covariates and outcomes. 
\n\n## Installation\n\nYou can install a stable release of `cvma` from GitHub via\n[`devtools`](https://www.rstudio.com/products/rpackages/devtools/) with:\n\n```{r gh-installation, eval = FALSE}\ndevtools::install_github(\"benkeser/cvma\")\n```\n\nIn the future, the package will be available from [CRAN](https://cran.r-project.org/) via\n\n```{r cran-installation, eval = FALSE}\ninstall.packages(\"cvma\")\n```\n\n## Issues\n\nIf you encounter any bugs or have any specific feature requests, please [file an issue](https://github.com/benkeser/cvma/issues).\n\n## Example\n\nThis minimal example shows how to use `cvma` with a very simple, simulated data set. For more examples and detailed explanations, we refer the user to the vignette. To start with, we use the nonparametric R^2 to evaluate the strength of association between a set of variables and a multivariate outcome:\n\n```{r, echo=FALSE}\noptions(warn=-1)\n```\n\n```{r nonparametric R^2}\nsuppressMessages(library(cvma))\nset.seed(1234)\n\n#Simulate data:\nX \u003c- data.frame(x1=runif(n=100,0,5), x2=runif(n=100,0,5))\nY1 \u003c- rnorm(100, X$x1 + X$x2, 1)\nY2 \u003c- rnorm(100, X$x1 + X$x2, 3)\nY \u003c- data.frame(Y1 = Y1, Y2 = Y2)\n\n#cvma with nonparametric R^2:\nfit \u003c- cvma(Y = Y, X = X, V = 10, \n                learners = c(\"SL.glm\",\"SL.mean\"))\nfit\n```\n\nThe following example evaluates the strength of association using AUC:\n\n```{r AUC}\n\n#Simulate data:\nX \u003c- data.frame(x1=runif(n=100,0,5), x2=runif(n=100,0,5))\nY1 \u003c- rbinom(100, 1, plogis(-2 + 0.1*X$x1 + 0.2*X$x2))\nY2 \u003c- rbinom(100, 1, plogis(-2 + 0.1*X$x1))\nY \u003c- data.frame(Y1 = Y1, Y2 = Y2)\n\n#cvma with AUC:\nfit \u003c- cvma(Y = Y, X = X, V = 5, \n                learners = c(\"SL.glm\",\"SL.mean\"),\n                sl_control = list(ensemble_fn = \"ensemble_linear\",\n                                   optim_risk_fn = \"optim_risk_sl_nloglik\",\n                                   weight_fn = \"weight_sl_convex\",\n   
                                cv_risk_fn = \"cv_risk_sl_auc\",\n                                   family = binomial(),\n                                   alpha = 0.05),\n                y_weight_control = list(ensemble_fn = \"ensemble_linear\",\n                                  weight_fn = \"weight_y_01\",\n                                  optim_risk_fn = \"optim_risk_y_auc\",\n                                  cv_risk_fn = \"cv_risk_y_auc\",\n                                  alpha = 0.05))\nfit\n```\n\n## Variable importance\n\nThe cross-validated performance of two fits can be compared using the `compare_cvma` function. This can be used to define a variable importance measure for a set of variables. \n\n```{r}\n\n#Simulate data:\nX \u003c- data.frame(x1=runif(n=100,0,5), x2=runif(n=100,0,5))\nY1 \u003c- rnorm(100, X$x1 + X$x2, 1)\nY2 \u003c- rnorm(100, X$x1 + X$x2, 3)\nY \u003c- data.frame(Y1 = Y1, Y2 = Y2)\n\n# fit data with full X\nfit1 \u003c- cvma(Y = Y, X = X, V = 10, \n                learners = c(\"SL.glm\",\"SL.mean\"))\n# fit data with only x1\nfit2 \u003c- cvma(Y = Y, X = X[, -2, drop = FALSE], V = 10, \n                learners = c(\"SL.glm\",\"SL.mean\"))\n# difference in cross-validated R^2 for the two fits\ncompare_cvma(fit1, fit2)\n```\n\n## License\n\u0026copy; 2017 [David C. Benkeser](http://www.benkeserstatistics.com)\n\nThe contents of this repository are distributed under the MIT license. See\nbelow for details:\n```\nThe MIT License (MIT)\n\nCopyright (c) 2016-2017 David C. 
Benkeser\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbenkeser%2Fcvma","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbenkeser%2Fcvma","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbenkeser%2Fcvma/lists"}