https://github.com/aaronpeikert/iv
Independent Validation (IV) with 'rsample'
- Host: GitHub
- URL: https://github.com/aaronpeikert/iv
- Owner: aaronpeikert
- License: other
- Created: 2019-10-03T20:26:35.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2023-10-05T06:19:15.000Z (over 1 year ago)
- Last Synced: 2025-01-29T05:42:42.345Z (4 months ago)
- Language: R
- Size: 27.3 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.Rmd
- Contributing: .github/CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
README
---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# Independent Validation (IV)
[MIT License](https://opensource.org/licenses/MIT) [Open an issue](https://github.com/aaronpeikert/iv/issues/new)

Independent Validation is a procedure proposed by von Oertzen (in prep) that produces independent assessment sets. This independence is assumed by most statistical tests applied to the performance measures computed on those assessment sets. Classical resampling procedures (such as cross-validation or the bootstrap) violate this assumption: even when the observations in the original sample are independent, the resulting assessment and holdout sets are not.
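To see one source of this dependence, here is a small sketch (added for illustration, not part of the package) using `rsample`'s `vfold_cv()`: the analysis sets of two different cross-validation folds share most of their rows, so the performance estimates computed from them are not independent.

``` r
library(rsample)

# Toy data with an explicit row id (illustration only)
dat <- data.frame(id = seq_len(100), x = rnorm(100))
cv <- vfold_cv(dat, v = 10)

# Rows used for fitting in fold 1 vs. fold 2
ids_fold1 <- analysis(cv$splits[[1]])$id
ids_fold2 <- analysis(cv$splits[[2]])$id

# Roughly 8 out of 9 analysis rows are shared between the two folds
mean(ids_fold1 %in% ids_fold2)
```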
## Installation
You can install the development version from [GitHub](https://github.com/) with:
``` r
# install.packages("devtools")
devtools::install_github("aaronpeikert/iv")
```

## Examples
```{r data}
# install.packages("modeldata")
library(modeldata)
data("attrition")
# downsample
attrition <- attrition[sample(seq_len(nrow(attrition)), 100), ]
```

```{r iv}
library(iv)
library(rsample)
iv_obj <- iv(attrition, m = 20)
iv_obj
```
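Assuming the elements of `iv_obj$splits` behave like `rsample` `rsplit` objects (as the code below relies on), a single split can be inspected with the usual accessors; this aside is illustrative and not part of the original example.

``` r
# Illustration only: look at the first split with rsample's accessors
first_split <- iv_obj$splits[[1]]
dim(analysis(first_split))    # rows available for model fitting
dim(assessment(first_split))  # rows held out for evaluation
```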
```{r lm_func}
mod_form <- as.formula(Attrition ~ JobSatisfaction + Gender + MonthlyIncome)
## `splits` will be an `rsplit` object
holdout_results <- function(splits, ...) {
  # Fit the model to the analysis set
  mod <- glm(..., data = analysis(splits), family = binomial)
  # Save the assessment (holdout) set
  holdout <- assessment(splits)
  # `augment` will save the predictions with the holdout data set
  res <- broom::augment(mod, newdata = holdout)
  # Class predictions on the assessment set from class probs
  lvls <- levels(holdout$Attrition)
  predictions <- factor(ifelse(res$.fitted > 0, lvls[2], lvls[1]),
                        levels = lvls)
  # Calculate whether the prediction was correct
  res$correct <- predictions == holdout$Attrition
  # Return the assessment data set with the additional columns
  res
}
```
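Before mapping over all splits, the helper can be tried on a single split (an illustrative aside; `iv_obj` and `mod_form` are defined above).

``` r
# Illustration only: apply the helper to the first split
single_res <- holdout_results(iv_obj$splits[[1]], mod_form)
mean(single_res$correct)  # accuracy on one assessment set
```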
```{r model_purrr, warning=FALSE}
library(purrr)
iv_obj$results <- map(iv_obj$splits,
                      holdout_results,
                      mod_form)
iv_obj$accuracy <- map_dbl(iv_obj$results, function(x) mean(x$correct))
summary(iv_obj$accuracy)
```
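Because the assessment sets are independent by construction, standard tests can be applied to the per-split accuracies. The sketch below is an added illustration, not part of the original example; the majority-class rate serves as an arbitrary baseline.

``` r
# Illustration only: one-sample t-test of the per-split accuracies
# against the majority-class baseline
baseline <- max(prop.table(table(attrition$Attrition)))
t.test(iv_obj$accuracy, mu = baseline)
```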
by [Aaron Peikert](https://orcid.org/0000-0001-7813-818X)
and [Andreas Brandmaier](http://orcid.org/0000-0001-8765-6982).