Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/tidyverse/modelr
Helper functions for modelling
- Host: GitHub
- URL: https://github.com/tidyverse/modelr
- Owner: tidyverse
- License: gpl-3.0
- Created: 2016-05-06T14:25:25.000Z (over 8 years ago)
- Default Branch: main
- Last Pushed: 2023-10-31T17:21:39.000Z (about 1 year ago)
- Last Synced: 2024-08-02T06:03:33.089Z (5 months ago)
- Topics: modelling, r
- Language: R
- Homepage: https://modelr.tidyverse.org
- Size: 5.8 MB
- Stars: 400
- Watchers: 28
- Forks: 66
- Open Issues: 1
Metadata Files:
- Readme: README.Rmd
- Contributing: .github/CONTRIBUTING.md
- License: LICENSE.md
- Code of conduct: .github/CODE_OF_CONDUCT.md
- Codeowners: .github/CODEOWNERS
- Support: .github/SUPPORT.md
Awesome Lists containing this project
- jimsghstars - tidyverse/modelr - Helper functions for modelling (R)
README
---
output: github_document
---

```{r, echo = FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
set.seed(1014)
```

# modelr

[![Lifecycle: superseded](https://img.shields.io/badge/lifecycle-superseded-blue.svg)](https://lifecycle.r-lib.org/articles/stages.html#superseded)
[![R-CMD-check](https://github.com/tidyverse/modelr/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/tidyverse/modelr/actions/workflows/R-CMD-check.yaml)
[![Codecov test coverage](https://codecov.io/gh/tidyverse/modelr/branch/main/graph/badge.svg)](https://app.codecov.io/gh/tidyverse/modelr?branch=main)

## Overview

The modelr package provides functions that help you create elegant pipelines when modelling.
It was designed primarily to support teaching the basics of modelling for the 1st edition of [R for Data Science](https://r4ds.had.co.nz/model-basics.html).

We no longer recommend it and instead suggest [tidymodels](https://www.tidymodels.org/) for a more comprehensive framework for modelling within the tidyverse.

## Installation
```{r, eval = FALSE}
# The easiest way to get modelr is to install the whole tidyverse:
install.packages("tidyverse")

# Alternatively, install just modelr:
install.packages("modelr")
```

## Getting started

```{r}
library(modelr)
```

### Partitioning and sampling

The `resample` class stores a "reference" to the original dataset and a vector of row indices. A resample can be turned into a data frame by calling `as.data.frame()`, and its row indices can be extracted using `as.integer()`:

```{r}
# a subsample of the first ten rows in the data frame
rs <- resample(mtcars, 1:10)
as.data.frame(rs)
as.integer(rs)
```

The class can be used to generate an exclusive partitioning of a data frame:

```{r}
# generate a 30% testing partition and a 70% training partition
ex <- resample_partition(mtcars, c(test = 0.3, train = 0.7))
lapply(ex, dim)
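
# Sketch (not in the original README): the partitions are exclusive, so the
# row indices of the test and train resamples should not overlap
shared_rows <- intersect(as.integer(ex$test), as.integer(ex$train))
length(shared_rows)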
```

modelr offers several resampling methods that result in a list of `resample` objects (organized in a data frame):

```{r}
# bootstrap
boot <- bootstrap(mtcars, 100)
# k-fold cross-validation
cv1 <- crossv_kfold(mtcars, 5)
# Monte Carlo cross-validation
cv2 <- crossv_mc(mtcars, 100)

dim(boot$strap[[1]])
dim(cv1$train[[1]])
dim(cv1$test[[1]])
dim(cv2$train[[1]])
dim(cv2$test[[1]])
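
# Sketch (not in the original README): fit one model per training fold with
# base R's lapply(), then score each model on its held-out fold with rmse().
# `fold_models` and `fold_rmse` are illustrative names.
fold_models <- lapply(cv1$train, function(d) lm(mpg ~ wt, data = d))
fold_rmse <- mapply(rmse, fold_models, cv1$test)
fold_rmse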
```

### Model quality metrics

modelr includes several often-used model quality metrics:
```{r}
mod <- lm(mpg ~ wt, data = mtcars)
rmse(mod, mtcars)
rsquare(mod, mtcars)
mae(mod, mtcars)
qae(mod, mtcars)
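
# Sketch (not in the original README): RMSE is the root mean squared residual,
# so computing it by hand with base R should agree with rmse() above,
# because the model is scored on the same data it was fitted to
manual_rmse <- sqrt(mean(residuals(mod)^2))
manual_rmse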
```

### Interacting with models

A set of functions lets you seamlessly add predictions and residuals as additional columns to an existing data frame:

```{r}
set.seed(1014)
df <- tibble::tibble(
  x = sort(runif(100)),
  y = 5 * x + 0.5 * x ^ 2 + 3 + rnorm(length(x))
)

mod <- lm(y ~ x, data = df)
df %>% add_predictions(mod)
df %>% add_residuals(mod)
```

For visualization purposes it is often useful to use an evenly spaced grid of points from the data:

```{r}
data_grid(mtcars, wt = seq_range(wt, 10), cyl, vs)

# For continuous variables, seq_range is useful
mtcars_mod <- lm(mpg ~ wt + cyl + vs, data = mtcars)
data_grid(mtcars, wt = seq_range(wt, 10), cyl, vs) %>% add_predictions(mtcars_mod)
```
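
To make the role of `seq_range()` in the grid above more concrete, the following is a minimal sketch that compares it with a grid built by hand using base R's `seq()`. The names `grid_wt` and `manual_wt` are illustrative and not part of modelr. The comparison assumes the default arguments of `seq_range()` (no trimming, expansion, or pretty breaks).

```{r}
library(modelr)

# Ten evenly spaced values spanning the observed range of wt
grid_wt <- seq_range(mtcars$wt, 10)

# The same kind of grid built by hand with base R
manual_wt <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 10)

# Compare the two grids
all.equal(grid_wt, manual_wt)
```

Inside `data_grid()`, `seq_range(wt, 10)` can refer to `wt` directly because the expression is evaluated in the context of the data frame, whereas the standalone call here needs `mtcars$wt`.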