An open API service indexing awesome lists of open source software.

https://github.com/r-world-devs/cohortBuilder


https://github.com/r-world-devs/cohortBuilder

Last synced: 4 months ago
JSON representation

Awesome Lists containing this project

README

        

---
output: github_document
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
eval = TRUE,
echo = TRUE,
message = TRUE,
warning = TRUE,
fig.width = 8,
fig.height = 6,
dpi = 200,
fig.align = "center",
fig.path = "man/figures/README-"
)
knitr::opts_chunk$set()
library(magrittr)
library(cohortBuilder)
set.seed(123)
options(tibble.width = Inf)
iris <- tibble::as.tibble(iris)
options("tibble.print_max" = 5)
options("tibble.print_min" = 5)
pkg_version <- read.dcf("DESCRIPTION", fields = "Version")[1, 1]
```

# cohortBuilder

[![version](https://img.shields.io/static/v1.svg?label=github.com&message=v.`r I(pkg_version)`&color=ff69b4)](https://r-world-devs.github.io/cohortBuilder/)
[![lifecycle](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)

## Overview

`cohortBuilder` provides common API for creating cohorts on multiple data sources,
such as local data frame, database schema or external data api.

With only two steps:

1. Configuring data source with `set_source`.
2. Initializing cohort with `cohort`.

You can operate on data using common methods, such as:

- `filter` - to define and `run` to apply filtering rules,
- `step` - to perform multi-stage filtering,
- `get_data`, `stat`, `attrition`, `plot_data` - to extract, sum up or visualize your cohort data.

With `cohortBuilder` you can share the cohort easier with useful methods:

- `code` - to get reproducible cohort creation code,
- `get_state` - to get cohort state (e.g. in JSON) that can be then easily restored with `restore`.

Or modify the cohort configuration with:

- `add_filter`, `rm_filter`, `update_filter` - to manage filters definition
- `add_step`, `rm_step` - to manage filtering steps,
- `update_source` - to manage the cohort source.

## Data sources and extensions

The goal of `cohortBuilder` is to provide common API for creating data cohorts,
but also to be easily extendable for working on different data sources (and interactive dashboards).

`cohortBuilder` allows to operate on local data frames (or list of data frames),
yet you may easily switch to a database source by loading `cohortBuilder.db` layer.

As a standalone R package, you use `cohortBuilder` to perform all the operations in non-interactive
R script, but its shiny layer `shinyCohortBuilder` package helps you to easily switch to intuitive gui mode.
More to that you may integrate `cohortBuilder` with your custom Shiny application.

If you want to learn how to write custom source extension, please check `vignette("custom-extensions")`.

## Installation

```{r, eval = FALSE}
# CRAN version
install.packages("cohortBuilder")

# Latest development version
remotes::install_github("https://github.com/r-world-devs/cohortBuilder")
```

## Usage

```{r}
librarian_source <- set_source(
as.tblist(librarian)
)

coh <- librarian_source %>%
cohort(
filter(
"discrete", id = "author", dataset = "books",
variable = "author", value = "Dan Brown"
),
filter(
"range", id = "copies", dataset = "books",
variable = "copies", range = c(5, 10)
),
filter(
"date_range", id = "registered", dataset = "borrowers",
variable = "registered", range = c(as.Date("2010-01-01"), Inf)
)
) %>%
run()

get_data(coh)
```

```{r}
coh <- librarian_source %>%
cohort() %->%
step(
filter(
"discrete", id = "author", dataset = "books",
variable = "author", value = "Dan Brown"
),
filter(
"date_range", id = "registered", dataset = "borrowers",
variable = "registered", range = c(as.Date("2010-01-01"), Inf)
)
) %->%
step(
filter(
"range", id = "copies", dataset = "books",
variable = "copies", range = c(5, 10)
)
) %>%
run()
```

```{r}
get_data(coh, step_id = 1)
```

```{r}
get_data(coh, step_id = 2)
```

```{r}
update_filter(
coh, step_id = 1, filter_id = "author",
range = c(5, 6)
)
run(coh)

get_data(coh, step_id = 2)
```

```{r}
code(coh)
```

```{r}
attrition(coh, dataset = "books")
```

```{r}
get_state(coh, json = TRUE)
```

## Acknowledgement

Special thanks to:

- [Kamil Wais](mailto:[email protected]) for highlighting the need for the package and its relevance to real-world applications.
- [Adam Foryś](mailto:[email protected]) for technical support, numerous suggestions for the current and future implementation of the package.
- [Paweł Kawski](mailto:[email protected]) for indication of initial assumptions about the package based on real-world medical data.

## Getting help

In a case you found any bugs, have feature request or general question please file an issue at the package [Github](https://github.com/r-world-devs/cohortBuilder/issues).
You may also contact the package author directly via email at [[email protected]](mailto:[email protected]).