Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/alexpghayes/safepredict

Consistent prediction following tidymodels principles
https://github.com/alexpghayes/safepredict

Last synced: 28 days ago
JSON representation

Consistent prediction following tidymodels principles

Awesome Lists containing this project

README

        

---
output: github_document
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
out.width = "100%"
)

options(tibble.print_min = 5, tibble.print_max = 5)

set.seed(27)
```

# safepredict
[![Travis build status](https://travis-ci.org/alexpghayes/safepredict.svg?branch=master)](https://travis-ci.org/alexpghayes/safepredict)
[![Coverage status](https://codecov.io/gh/alexpghayes/safepredict/branch/master/graph/badge.svg)](https://codecov.io/github/alexpghayes/safepredict?branch=master)
[![lifecycle](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://www.tidyverse.org/lifecycle/#experimental)

`safepredict` has two goals: to provide a consistent interface to prediction via the `safe_predict()` generic, and to accurately quantify prediction uncertainty.

`safe_predict()`:

- always returns a tibble.
- never drops rows with missing data.
- always uses the same arguments in the same order.

`safepredict` follows the [tidymodels prediction specification][pred_spec].

[pred_spec]: https://tidymodels.github.io/model-implementation-principles/model-predictions.html

## Installation

`safepredict` is currently in the beginning stages of development and is available only on Github. You can install it with:

```r
# install.packages("devtools")
devtools::install_github("alexpghayes/safepredict")
```

## Arguments

The three main arguments to `safe_predict()` are always the same:

- `object`: A model object you would like to get predictions from.

- `new_data`: **Required**. Data in the same format as required for the
`predict()` method.

- `type`: What kind of predictions you would like. Options are:

| type | application |
|-------------- |--------------------------------------------- |
| `response` | numeric predictions |
| `class` | hard class predictions |
| `prob` | class probabilities, survivor probabilities |
| `link` | `glm` linear predictor |
| `conf_int` | confidence intervals |
| `pred_int` | prediction intervals |

## Examples

Suppose you fit a logistic regression using `glm`:

```{r}
library(tibble)

data <- tibble(
y = as.factor(rep(c("A", "B"), each = 50)),
x = c(rnorm(50, 1), rnorm(50, 3))
)

fit <- glm(y ~ x, data, family = binomial)
```

You can predict class probabilities:

```{r}
library(safepredict)

test <- tibble(x = rnorm(10, 2))

safe_predict(fit, new_data = test, type = "prob")
```

or can jump straight to hard class decisions

```{r}
safe_predict(fit, new_data = test, type = "class")
```

We can also get predictions on the link scale:

```{r}
safe_predict(fit, new_data = test, type = "link")
```

or we can get confidence intervals on the response scale

```{r}
safe_predict(fit, new_data = test, type = "conf_int")
```

## Related work

- [`parsnip`](https://tidymodels.github.io/parsnip/) provides a consistent interface to supervised models. Eventually `safepredict` will act as the prediction backend for `parsnip`.

- [`broom`](https://broom.tidyverse.org/) summarizes key information about models in tidy tibbles.

- The many ongoing [`tidymodels`](https://github.com/tidymodels) projects.

- The [`prediction`](https://github.com/leeper/prediction) package by Thomas Leeper. `prediction` supports a much wider variety of models than `safepredict` at the moment, but makes fewer consistency guarantees.