{"id":13665953,"url":"https://github.com/tidyverse/modelr","last_synced_at":"2025-04-26T09:31:53.240Z","repository":{"id":8386250,"uuid":"58212672","full_name":"tidyverse/modelr","owner":"tidyverse","description":"Helper functions for modelling","archived":false,"fork":false,"pushed_at":"2023-10-31T17:21:39.000Z","size":6082,"stargazers_count":400,"open_issues_count":2,"forks_count":65,"subscribers_count":25,"default_branch":"main","last_synced_at":"2025-04-13T09:05:40.485Z","etag":null,"topics":["modelling","r"],"latest_commit_sha":null,"homepage":"https://modelr.tidyverse.org","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tidyverse.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":".github/CONTRIBUTING.md","funding":null,"license":"LICENSE.md","code_of_conduct":".github/CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":".github/SUPPORT.md","governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2016-05-06T14:25:25.000Z","updated_at":"2025-03-22T08:14:48.000Z","dependencies_parsed_at":"2023-01-11T18:46:25.780Z","dependency_job_id":"20a79406-fbc3-4e8a-acb9-43bbf89e4eea","html_url":"https://github.com/tidyverse/modelr","commit_stats":{"total_commits":202,"total_committers":19,"mean_commits":"10.631578947368421","dds":"0.18811881188118806","last_synced_commit":"5da567415b024d14d2bcaad6df8d90e4acbae3f2"},"previous_names":[],"tags_count":11,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tidyverse%2Fmodelr","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tidyverse%2Fmodelr/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tidyverse%2Fmodelr/releases","manifests
_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tidyverse%2Fmodelr/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tidyverse","download_url":"https://codeload.github.com/tidyverse/modelr/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250221401,"owners_count":21394699,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["modelling","r"],"created_at":"2024-08-02T06:00:54.685Z","updated_at":"2025-04-26T09:31:52.874Z","avatar_url":"https://github.com/tidyverse.png","language":"R","funding_links":[],"categories":["R"],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n```{r, echo = FALSE}\nknitr::opts_chunk$set(collapse = TRUE, comment = \"#\u003e\")\nset.seed(1014)\n```\n\n# modelr \u003cimg src=\"man/figures/logo.png\" align=\"right\" /\u003e\n\n\u003c!-- badges: start --\u003e\n[![Lifecycle: superseded](https://img.shields.io/badge/lifecycle-superseded-blue.svg)](https://lifecycle.r-lib.org/articles/stages.html#superseded)\n[![R-CMD-check](https://github.com/tidyverse/modelr/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/tidyverse/modelr/actions/workflows/R-CMD-check.yaml)\n[![Codecov test coverage](https://codecov.io/gh/tidyverse/modelr/branch/main/graph/badge.svg)](https://app.codecov.io/gh/tidyverse/modelr?branch=main)\n\u003c!-- badges: end --\u003e\n\n## Overview\n\nThe modelr package provides functions that help you create elegant pipelines when modelling. 
\nIt was designed primarily to support teaching the basics of modelling for the 1st edition of [R for Data Science](https://r4ds.had.co.nz/model-basics.html). \n\nWe no longer recommend it and instead suggest \u003chttps://www.tidymodels.org/\u003e for a more comprehensive framework for modelling within the tidyverse.\n\n## Installation\n\n```{r, eval = FALSE}\n# The easiest way to get modelr is to install the whole tidyverse:\ninstall.packages(\"tidyverse\")\n\n# Alternatively, install just modelr:\ninstall.packages(\"modelr\")\n```\n\n## Getting started\n\n```{r}\nlibrary(modelr)\n```\n\n### Partitioning and sampling\n\nThe `resample` class stores a \"reference\" to the original dataset and a vector of row indices. A resample can be turned into a dataframe by calling `as.data.frame()`. The indices can be extracted using `as.integer()`:\n\n```{r}\n# a subsample of the first ten rows in the data frame\nrs \u003c- resample(mtcars, 1:10)\nas.data.frame(rs)\nas.integer(rs)\n```\n\nThe class can be utilized in generating an exclusive partitioning of a data frame:\n\n```{r}\n# generate a 30% testing partition and a 70% training partition\nex \u003c- resample_partition(mtcars, c(test = 0.3, train = 0.7))\nlapply(ex, dim)\n```\n\nmodelr offers several resampling methods that result in a list of `resample` objects (organized in a data frame):\n\n```{r}\n# bootstrap\nboot \u003c- bootstrap(mtcars, 100)\n# k-fold cross-validation\ncv1 \u003c- crossv_kfold(mtcars, 5)\n# Monte Carlo cross-validation\ncv2 \u003c- crossv_mc(mtcars, 100)\n\ndim(boot$strap[[1]])\ndim(cv1$train[[1]])\ndim(cv1$test[[1]])\ndim(cv2$train[[1]])\ndim(cv2$test[[1]])\n```\n\n### Model quality metrics\n\nmodelr includes several often-used model quality metrics:\n\n```{r}\nmod \u003c- lm(mpg ~ wt, data = mtcars)\nrmse(mod, mtcars)\nrsquare(mod, mtcars)\nmae(mod, mtcars)\nqae(mod, mtcars)\n```\n\n### Interacting with models\n\nA set of functions let you seamlessly add predictions and residuals as additional 
columns to an existing data frame:\n\n```{r}\nset.seed(1014)\ndf \u003c- tibble::tibble(\n  x = sort(runif(100)),\n  y = 5 * x + 0.5 * x ^ 2 + 3 + rnorm(length(x))\n)\n\nmod \u003c- lm(y ~ x, data = df)\ndf %\u003e% add_predictions(mod)\ndf %\u003e% add_residuals(mod)\n```\n\nFor visualization, it is often useful to generate an evenly spaced grid of points from the data:\n\n```{r}\n# For continuous variables, seq_range is useful\ndata_grid(mtcars, wt = seq_range(wt, 10), cyl, vs)\n\nmtcars_mod \u003c- lm(mpg ~ wt + cyl + vs, data = mtcars)\ndata_grid(mtcars, wt = seq_range(wt, 10), cyl, vs) %\u003e% add_predictions(mtcars_mod)\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftidyverse%2Fmodelr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftidyverse%2Fmodelr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftidyverse%2Fmodelr/lists"}