https://github.com/tidyverse/modelr

Helper functions for modelling
https://github.com/tidyverse/modelr

modelling r

Last synced: about 2 months ago
JSON representation

Helper functions for modelling

Host: GitHub
URL: https://github.com/tidyverse/modelr
Owner: tidyverse
License: gpl-3.0
Created: 2016-05-06T14:25:25.000Z (about 9 years ago)
Default Branch: main
Last Pushed: 2023-10-31T17:21:39.000Z (over 1 year ago)
Last Synced: 2025-04-13T09:05:40.485Z (2 months ago)
Topics: modelling, r
Language: R
Homepage: https://modelr.tidyverse.org
Size: 5.8 MB
Stars: 400
Watchers: 25
Forks: 65
Open Issues: 2
Metadata Files:
- Readme: README.Rmd
- Contributing: .github/CONTRIBUTING.md
- License: LICENSE.md
- Code of conduct: .github/CODE_OF_CONDUCT.md
- Codeowners: .github/CODEOWNERS
- Support: .github/SUPPORT.md

Awesome Lists containing this project

jimsghstars - tidyverse/modelr - Helper functions for modelling (R)

README

        ---

output: github_document

---

```{r, echo = FALSE}

knitr::opts_chunk$set(collapse = TRUE, comment = "#>")

set.seed(1014)

```

# modelr 

[![Lifecycle: superseded](https://img.shields.io/badge/lifecycle-superseded-blue.svg)](https://lifecycle.r-lib.org/articles/stages.html#superseded)

[![R-CMD-check](https://github.com/tidyverse/modelr/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/tidyverse/modelr/actions/workflows/R-CMD-check.yaml)

[![Codecov test coverage](https://codecov.io/gh/tidyverse/modelr/branch/main/graph/badge.svg)](https://app.codecov.io/gh/tidyverse/modelr?branch=main)

## Overview

The modelr package provides functions that help you create elegant pipelines when modelling. 

It was designed primarily to support teaching the basics of modelling for the 1st edition of [R for Data Science](https://r4ds.had.co.nz/model-basics.html). 

We no longer recommend it and instead suggest  for a more comprehensive framework for modelling within the tidyverse.

## Installation

```{r, eval = FALSE}

# The easiest way to get modelr is to install the whole tidyverse:

install.packages("tidyverse")

# Alternatively, install just modelr:

install.packages("modelr")

```

## Getting started

```{r}

library(modelr)

```

### Partitioning and sampling

The `resample` class stores a "reference" to the original dataset and a vector of row indices. A resample can be turned into a dataframe by calling `as.data.frame()`. The indices can be extracted using `as.integer()`:

```{r}

# a subsample of the first ten rows in the data frame

rs <- resample(mtcars, 1:10)

as.data.frame(rs)

as.integer(rs)

```

The class can be utilized in generating an exclusive partitioning of a data frame:

```{r}

# generate a 30% testing partition and a 70% training partition

ex <- resample_partition(mtcars, c(test = 0.3, train = 0.7))

lapply(ex, dim)

```

modelr offers several resampling methods that result in a list of `resample` objects (organized in a data frame):

```{r}

# bootstrap

boot <- bootstrap(mtcars, 100)

# k-fold cross-validation

cv1 <- crossv_kfold(mtcars, 5)

# Monte Carlo cross-validation

cv2 <- crossv_mc(mtcars, 100)

dim(boot$strap[[1]])

dim(cv1$train[[1]])

dim(cv1$test[[1]])

dim(cv2$train[[1]])

dim(cv2$test[[1]])

```

### Model quality metrics

modelr includes several often-used model quality metrics:

```{r}

mod <- lm(mpg ~ wt, data = mtcars)

rmse(mod, mtcars)

rsquare(mod, mtcars)

mae(mod, mtcars)

qae(mod, mtcars)

```

### Interacting with models

A set of functions let you seamlessly add predictions and residuals as additional columns to an existing data frame:

```{r}

set.seed(1014)

df <- tibble::tibble(

  x = sort(runif(100)),

  y = 5 * x + 0.5 * x ^ 2 + 3 + rnorm(length(x))

)

mod <- lm(y ~ x, data = df)

df %>% add_predictions(mod)

df %>% add_residuals(mod)

```

For visualization purposes it is often useful to use an evenly spaced grid of points from the data:

```{r}

data_grid(mtcars, wt = seq_range(wt, 10), cyl, vs)

# For continuous variables, seq_range is useful

mtcars_mod <- lm(mpg ~ wt + cyl + vs, data = mtcars)

data_grid(mtcars, wt = seq_range(wt, 10), cyl, vs) %>% add_predictions(mtcars_mod)

```

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/tidyverse/modelr

Awesome Lists containing this project

README