https://github.com/tidymodels/parsnip

A tidy unified interface to models

Host: GitHub
URL: https://github.com/tidymodels/parsnip
Owner: tidymodels
License: other
Created: 2017-12-10T22:48:42.000Z (over 7 years ago)
Default Branch: main
Last Pushed: 2025-04-22T18:13:01.000Z (2 months ago)
Last Synced: 2025-04-28T12:09:46.979Z (2 months ago)
Language: R
Homepage: https://parsnip.tidymodels.org
Size: 30.3 MB
Stars: 619
Watchers: 27
Forks: 91
Open Issues: 89
Metadata Files:
- Readme: README.Rmd
- Changelog: NEWS.md
- Contributing: .github/CONTRIBUTING.md
- License: LICENSE
- Code of conduct: .github/CODE_OF_CONDUCT.md

Awesome Lists containing this project

jimsghstars - tidymodels/parsnip - A tidy unified interface to models (R)

README

---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# parsnip 

[![R-CMD-check](https://github.com/tidymodels/parsnip/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/tidymodels/parsnip/actions/workflows/R-CMD-check.yaml)
[![Codecov test coverage](https://codecov.io/gh/tidymodels/parsnip/branch/main/graph/badge.svg)](https://app.codecov.io/gh/tidymodels/parsnip?branch=main)
[![CRAN status](https://www.r-pkg.org/badges/version/parsnip)](https://CRAN.R-project.org/package=parsnip)
[![Downloads](https://cranlogs.r-pkg.org/badges/parsnip)](https://CRAN.R-project.org/package=parsnip)
[![lifecycle](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html)

## Introduction

The goal of parsnip is to provide a tidy, unified interface to models that can be used to try a range of models without getting bogged down in the syntactical minutiae of the underlying packages. 

## Installation

```{r, eval = FALSE}
# The easiest way to get parsnip is to install all of tidymodels:
install.packages("tidymodels")

# Alternatively, install just parsnip:
install.packages("parsnip")

# Or the development version from GitHub:
# install.packages("pak")
pak::pak("tidymodels/parsnip")
```

## Getting started

One challenge with different modeling functions available in R _that do the same thing_ is that they can have different interfaces and arguments. For example, to fit a random forest regression model, we might have:

```{r eval = FALSE}
# From randomForest
rf_1 <- randomForest(
  y ~ .,
  data = dat,
  mtry = 10,
  ntree = 2000,
  importance = TRUE
)

# From ranger
rf_2 <- ranger(
  y ~ .,
  data = dat,
  mtry = 10,
  num.trees = 2000,
  importance = "impurity"
)

# From sparklyr
rf_3 <- ml_random_forest(
  dat,
  intercept = FALSE,
  response = "y",
  features = names(dat)[names(dat) != "y"],
  col.sample.rate = 10,
  num.trees = 2000
)
```

Note that the model syntax can be very different and that the argument names (and formats) are also different. This is a pain if you switch between implementations. 

In this example: 

* the **type** of model is "random forest", 

* the **mode** of the model is "regression" (as opposed to classification, etc.), and

* the computational **engine** is the name of the R package. 
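As a minimal sketch of how these three pieces surface in a parsnip specification, the snippet below builds one with `rand_forest()`, `set_engine()`, and `set_mode()` (the functions used later in this README). Reading the `mode` and `engine` fields directly off the object is an assumption made for illustration, not a documented access pattern.

```r
library(parsnip)

# Model type: chosen by the constructor function (here, a random forest).
spec <- rand_forest() %>%
  set_engine("ranger") %>%   # engine: the package that will do the computation
  set_mode("regression")     # mode: the kind of prediction problem

class(spec)   # the model type is recorded in the object's class
spec$mode     # the mode stored on the specification
spec$engine   # the engine stored on the specification
```

Nothing is estimated at this point; the object only describes the model that would be fit.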

The goals of parsnip are to:

* Separate the definition of a model from its evaluation.

* Decouple the model specification from the implementation (whether the implementation is in R, Spark, or something else). For example, the user would call `rand_forest()` instead of `ranger::ranger()` or another package-specific function.

* Harmonize argument names (e.g. `n.trees`, `ntrees`, `trees`) so that users only need to remember a single name. This helps _across_ model types too: `trees` is the same argument for random forests as well as for boosting or bagging.
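One way to see the harmonized names at work is parsnip's `translate()` function, which shows the call template an engine would receive. The sketch below defines a single specification and translates it for two engines. Inspecting `$method$fit$args` relies on the internal layout of the translated object, so treat that part as an assumption that may change between versions.

```r
library(parsnip)

# One engine-independent specification, using parsnip's harmonized `trees` name.
spec <- rand_forest(trees = 2000) %>%
  set_mode("regression")

# translate() fills in the engine-specific call template for each engine.
ranger_call <- spec %>% set_engine("ranger") %>% translate()
rf_call     <- spec %>% set_engine("randomForest") %>% translate()

# Argument names as each engine would receive them
# (assumes the internal `$method$fit$args` structure).
names(ranger_call$method$fit$args)
names(rf_call$method$fit$args)
```

The same `trees` value appears under the name each underlying package expects, which is the renaming the user no longer has to remember.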

Using the example above, the parsnip approach would be:

```{r}
library(parsnip)

rand_forest(mtry = 10, trees = 2000) %>%
  set_engine("ranger", importance = "impurity") %>%
  set_mode("regression")
```

The engine can be changed easily. To use Spark instead, only the `set_engine()` call changes:

```{r}
rand_forest(mtry = 10, trees = 2000) %>%
  set_engine("spark") %>%
  set_mode("regression")
```

Either one of these model specifications can be fit in the same way:

```{r}
set.seed(192)

rand_forest(mtry = 10, trees = 2000) %>%
  set_engine("ranger", importance = "impurity") %>%
  set_mode("regression") %>%
  fit(mpg ~ ., data = mtcars)
```
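A fitted specification can then be used for prediction through the same interface regardless of engine. The following is a small sketch, assuming the ranger package is installed; the object names `rf_fit` and `preds` and the choice of the first five rows of `mtcars` as new data are illustrative only.

```r
library(parsnip)

set.seed(192)

# Same specification and fit as above, stored so it can be reused.
rf_fit <- rand_forest(mtry = 10, trees = 2000) %>%
  set_engine("ranger", importance = "impurity") %>%
  set_mode("regression") %>%
  fit(mpg ~ ., data = mtcars)

# predict() on a parsnip fit takes `new_data` and returns a tibble
# with one row per row of `new_data`.
preds <- predict(rf_fit, new_data = mtcars[1:5, ])
preds
```

For a regression model the predictions come back in a column named `.pred`, so downstream code does not need to know which engine produced them.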

A list of all parsnip models across different CRAN packages can be found at https://www.tidymodels.org/find/parsnip.
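For a quick check from within R, parsnip's `show_engines()` lists the engines registered for a model type. This sketch only reflects engines known to the packages currently loaded, so it is a narrower view than the website; the variable name `engines` is illustrative.

```r
library(parsnip)

# Engines (and the modes they support) registered for random forests.
engines <- show_engines("rand_forest")
engines
```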

## Contributing

This project is released with a [Contributor Code of Conduct](https://contributor-covenant.org/version/2/0/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms.

- For questions and discussions about tidymodels packages, modeling, and machine learning, please [post on Posit Community](https://forum.posit.co/new-topic?category_id=15&tags=tidymodels,question).

- If you think you have encountered a bug, please [submit an issue](https://github.com/tidymodels/parsnip/issues).

- Either way, learn how to create and share a [reprex](https://reprex.tidyverse.org/articles/articles/learn-reprex.html) (a minimal, reproducible example), to clearly communicate about your code.

- Check out further details on [contributing guidelines for tidymodels packages](https://www.tidymodels.org/contribute/) and [how to get help](https://www.tidymodels.org/help/).
