---
output:
  md_document:
    variant: markdown_github
---

```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
suppressWarnings(suppressPackageStartupMessages(library(dplyr)))
```
# chaint
### "chain with a tee" - tee functions with magrittr/dplyr chains

## Installation:

```
devtools::install_github("jonocarroll/chaint")
```

## Motivation:

`dplyr` chains are undoubtedly more readable than nested function calls, which makes sequential processing steps much easier to follow. When a chain is written with line breaks, such as

```{r}
mtcars %>%
  group_by(cyl) %>%
  summarise(meanMPG = mean(mpg))
```

individual processing steps can be skipped over by commenting out a single line

```{r}
mtcars %>%
  # group_by(cyl) %>%
  summarise(meanMPG = mean(mpg))
```

The output can then be viewed, and processing resumed.

Often we are interested in what is happening in the middle of a chain, or would like to do something conditionally. The `%T>%` operator from `magrittr` provides a tee-like function which essentially boils down to

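```{r, eval=FALSE}
# Simplified view of the tee operator; the name must be backtick-quoted to be assigned
`%T>%` <- function(lhs, rhs) {
  rhs(lhs)     # call rhs for its side effect; its return value is discarded
  return(lhs)  # pass the original data on to the next step
}
```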
This is useful for passing data to an `rhs` function that is called for its side effects and whose return value is to be discarded. However, `%T>%` doesn't take arguments, and the `rhs` function is required to do something with the incoming data.
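The pass-through behaviour described above can be illustrated in base R alone. This is a minimal sketch, not `magrittr`'s code: `tee_op`, `record_rows` and `seen` are hypothetical names introduced only for this illustration.

```r
# Hypothetical stand-in for a tee operator, written in base R only.
tee_op <- function(lhs, rhs) {
  rhs(lhs)  # run rhs for its side effect; ignore whatever it returns
  lhs       # hand the untouched input back to the caller
}

seen <- NULL
record_rows <- function(d) {
  seen <<- nrow(d)  # side effect: remember the row count
  "discarded"       # this return value never reaches the caller
}

result <- tee_op(mtcars, record_rows)
```

After the call, `result` is the original `mtcars` data and `seen` holds the row count, so the side effect happened without altering what flows on to the next step.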

`chaint` provides a similar function, `tee`, a more sophisticated version of `%T>%` that

- still takes the data as input to the first argument, but
- doesn't require the data to be used in any way,
- provides a mechanism for adding messages within a chain,
- can be conditional on the data,
- can take arguments to be passed on to the `rhs` function.
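The properties listed above could be combined roughly as follows. This is a sketch under stated assumptions, not `chaint`'s actual implementation or signature: `tee_sketch` and its `when` argument are hypothetical, and the example uses the native pipe `|>`, which assumes R 4.1 or later.

```r
# Hypothetical sketch; not chaint's actual implementation or signature.
tee_sketch <- function(data, msg = NULL, fn = NULL, ...,
                       when = function(d) TRUE) {
  if (isTRUE(when(data))) {                 # condition evaluated on the data
    if (!is.null(msg)) message(msg)         # optional message to the console
    if (!is.null(fn)) print(fn(data, ...))  # apply fn with extra arguments, show result
  }
  invisible(data)                           # always pass the input through unchanged
}

out <- mtcars |>
  tee_sketch("first rows", head, n = 3) |>                         # message + head(n = 3)
  tee_sketch("large input!", when = function(d) nrow(d) > 100) |>  # silent: only 32 rows
  subset(cyl < 6)
```

Because the data is returned invisibly and unchanged, the checkpoints can be added or removed without affecting the final value of `out`.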

`tee` is primarily intended for interactive use, not for programming with, as it is used for the side effect of printing to the console.

## Example usage:

To add a simple message (maybe a checkpoint flag or a notification about the processing), use `say` with a quoted message:

```{r}
mtcars %>%
  group_by(cyl) %>%
  chaint::say("caution, grouped data!") %>%
  summarise(meanMPG = mean(mpg))
```

The `tee` function prints a message and applies a function to the incoming data (with optional arguments), then transparently returns the incoming data so that the chain can continue. This allows inspection of the intermediate data without commenting out any lines.

```{r}
mtcars %>%
  chaint::tee("checkpoint 1", head, n = 10) %>%
  filter(cyl < 6) %>%
  chaint::tee("checkpoint 2", head, n = 3) %>%
  summarise(meanMPG = mean(mpg))
```