An open API service indexing awesome lists of open source software.

https://github.com/aaronpeikert/iv

Independend Validation (IV) with 'rsample'
https://github.com/aaronpeikert/iv

Last synced: about 2 months ago
JSON representation

Independend Validation (IV) with 'rsample'

Awesome Lists containing this project

README

        

---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```

# Independend Validation (IV)

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![Ask Me Anything
\!](https://img.shields.io/badge/Ask%20me-anything-1abc9c.svg)](https://github.com/aaronpeikert/iv/issues/new)

Independend Validation is a procedure proposed by von Oertzen (in prep), which produces independend assessment sets. This property is assumed when performing most statistical tests on the performance measures associated with the assesment sets. Importantly classical resampling procedures (like cross validation or bootstrapping) do violate this assumption, because even when the original sample are independend, the resulting assesment and holdout sets are not.

## Installation

You can install the development version from [GitHub](https://github.com/) with:

``` r
# install.packages("devtools")
devtools::install_github("aaronpeikert/iv")
```

## Examples

```{r data}
# install.packages("modeldata")
library(modeldata)
data("attrition")
# downsample
attrition <- attrition[sample(seq_len(nrow(attrition)), 100), ]
```

```{r iv}
library(iv)
library(rsample)
iv_obj <- iv(attrition, m = 20)
iv_obj
```

```{r lm_func}
mod_form <- as.formula(Attrition ~ JobSatisfaction + Gender + MonthlyIncome)
## splits will be the `rsplit` object
holdout_results <- function(splits, ...) {
# Fit the model to the 90%
mod <- glm(..., data = analysis(splits), family = binomial)
# Save the 10%
holdout <- assessment(splits)
# `augment` will save the predictions with the holdout data set
res <- broom::augment(mod, newdata = holdout)
# Class predictions on the assessment set from class probs
lvls <- levels(holdout$Attrition)
predictions <- factor(ifelse(res$.fitted > 0, lvls[2], lvls[1]),
levels = lvls)
# Calculate whether the prediction was correct
res$correct <- predictions == holdout$Attrition
# Return the assessment data set with the additional columns
res
}
```

```{r model_purrr, warning=FALSE}
library(purrr)
iv_obj$results <- map(iv_obj$splits,
holdout_results,
mod_form)
iv_obj$accuracy <- map_dbl(iv_obj$results, function(x) mean(x$correct))
summary(iv_obj$accuracy)
```

by [Aaron Peikert![ORCID
iD](https://orcid.org/sites/default/files/images/orcid_16x16.png)](https://orcid.org/0000-0001-7813-818X)
and [Andreas Brandmaier![ORCID
iD](https://orcid.org/sites/default/files/images/orcid_16x16.png)](http://orcid.org/0000-0001-8765-6982).