Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/jonocarroll/formulize

generate formula and recipe wrappers for modelling functions
https://github.com/jonocarroll/formulize

Last synced: 8 days ago
JSON representation

generate formula and recipe wrappers for modelling functions

Awesome Lists containing this project

README

        

---
output: github_document
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```

# formulize

[![Travis build status](https://travis-ci.org/alexpghayes/formulize.svg?branch=master)](https://travis-ci.org/alexpghayes/formulize)
[![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/alexpghayes/formulize?branch=master&svg=true)](https://ci.appveyor.com/project/alexpghayes/formulize)
[![Coverage status](https://codecov.io/gh/alexpghayes/formulize/branch/master/graph/badge.svg)](https://codecov.io/github/alexpghayes/formulize?branch=master)
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/formulize)](https://cran.rstudio.com/web/packages/formulize/index.html)
![Downloads](http://cranlogs.r-pkg.org/badges/formulize)

If you:

- like using formulas, recipes and data frames to specify design matrices
- develop nervous ticks when you come across modelling packages that only offer matrix/vector interfaces
- don't have the time or motivation to write a formula wrapper around these interfaces
- like untested and hacky software written by amateurs

then `formulize` may be for you. Formulize is very new, but you can install it from CRAN if you have R 3.4+ with:

```{r, eval = FALSE}
install.packages("formulize")
```

You can get the development version (recommended) with:

```{r, eval = FALSE}
# install.packages("devtools")
devtools::install_github("alexpghayes/formulize")
```

## Adding a formula or recipe interface

Suppose you want to add a formula interface to an existing modelling function, say `cv.glmnet`. Then you could do the following

```{r, message = FALSE, warning = FALSE}
library(recipes)
library(glmnet)
library(formulize)

glmnet_cv <- formulize(cv.glmnet)

glmnet_model <- glmnet_cv(mpg ~ drat + hp - 1, mtcars)
predict(glmnet_model, head(mtcars))
```

Similarly `glmnet_cv` works with recipe objects like so

```{r}
rec <- recipe(mpg ~ drat + hp, data = mtcars)

glmnet_model2 <- glmnet_cv(rec, mtcars)
predict(glmnet_model2, head(mtcars))
```

You may also be interested in the more ~~dangerous~~ exciting version `genericize`, which you should call for its side effects.

```{r}
genericize(cv.glmnet)

form <- mpg ~ drat + hp - 1
X <- model.matrix(form, mtcars)
y <- mtcars$mpg

set.seed(27)
mat_model <- cv.glmnet(X, y, intercept = TRUE)

set.seed(27)
frm_model <- cv.glmnet(form, mtcars, intercept = TRUE)

set.seed(27)
rec_model <- cv.glmnet(rec, mtcars, intercept = TRUE)

predict(mat_model, head(X))
predict(frm_model, head(mtcars))
predict(rec_model, head(mtcars))
```

This creates a new S3 generic `cv.glmnet`, sets the provided function as the default method (`cv.glmnet.default`), and adds methods `cv.glmnet.formula` and `cv.glmnet.recipe` using `formulize`.

This will mask `cv.glmnet` and features no safety checks because safety isn't fun.

## Caveats

- `formulize` doesn't do anything special with intercepts. This means that you need to careful with functions that require you to specify intercepts in non-standard ways, such as `cv.glmnet` above.
- If the original modelling function doesn't return a list, `formulize` will probably break.
- If you're just looking for a formula interface to `glmnet`, take a look at [glmnetUtils](https://github.com/Hong-Revo/glmnetUtils).