An open API service indexing awesome lists of open source software.

https://github.com/marksendak/constellation

Time series joins to identify sequence of related events
https://github.com/marksendak/constellation

electronic-health-record electronic-health-records healthcare patients r timeseries

Last synced: about 2 months ago
JSON representation

Time series joins to identify sequence of related events

Awesome Lists containing this project

README

          

---
output:
github_document
---

```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "README-"
)
```

# constellation

[![Build Status](https://travis-ci.org/marksendak/constellation.svg?branch=master)](https://travis-ci.org/marksendak/constellation)
[![Windows build status](https://ci.appveyor.com/api/projects/status/github/marksendak/constellation?branch=master&svg=true)](https://ci.appveyor.com/project/marksendak/constellation)

## Overview

Constellation contains a set of functions for applying multidimensional, time window based logic to time series data frames of arbitrary length. Constellation was developed to enable rapid and flexible identification of series of events that occur in hospitalized patients. The functions have been abstracted for general purpose use with time series data. Constellation extends and provides a friendly API to rolling joins and overlap joins implemented in [data.table](https://CRAN.R-project.org/package=data.table). Three datasets (labs, vitals, and orders) with randomly synthesized time series data for a cohort of 100 patients are included to facilitate testing of functions.

There are five functions included in constellation to build complex features from time series data:

* `value_change()` identify increases or decreases in a value within a given time window
* `constellate()` identify time stamps when a series of events occurs within a given time window
* `constellate_criteria()` identify which events occur within a given time window for every measurement time stamp
* `bundle()` identify which events occur within a given time window of a given event
* `incidents()` identify distinct, incident episodes that must be separated in time by a minimum of a given time window

The `constellate_criteria()` and `bundle()` function are similar, but the `bundle()` function is anchored around a specific event table. The `bundle()` function identifies events that occur within a given time window of a **specific** event data frame that is supplied to the function. On the other hand, the `constellate_criteria()` function identifies events that occur within a given time window of **any** event data frame that is supplied to the function. The first data frame passed to the `bundle()` function is used as an anchor to search through the subsequent data frames passed to the function. The order of data frames is significant and passing different data frames as the first argument will generate different results. On the other hand, the order in which you pass data frames to the `constellate_criteria()` function is insignificant. Passing data frames in different orders will generate equivalent results.

Constellation can be used to build point-based scores for time series data (via `constellate_criteria()`), identify particular sequences of events that occur near each other (via `constellate()`), identify when specific changes occur for a given parameter (via `value_change()`), identify individual events that occur around a specified time stamp (via `bundle()`), and distinguish between eveents that are separated by a specified time window (via `incidents()`).

If you are new to constellation, the best place to start is the `vignette("constellation", "identify_sepsis")`. You can also view the sepsis vignette on [CRAN](https://cran.r-project.org/package=constellation/vignettes/identify_sepsis.html).

## Installation

You can install constellation from CRAN with:
```{r cran_install, eval = FALSE, message = FALSE}
install.packages("constellation")
library(constellation)
```

You can install the development version of constellation from github with:

```{r gh-installation, message = FALSE}
devtools::install_github("marksendak/constellation")
```

If you have any questions, comments, or feedback, please email mark.sendak@gmail.com.

## Example

Below are several variations of finding systolic blood pressure drops of 40 over a 6 hour period.

Examine systolic blood pressure data:
```{r example, message = FALSE}
library(constellation)

systolic_bp <- vitals[VARIABLE == "SYSTOLIC_BP"]
systolic_bp[, RECORDED_TIME := as.POSIXct(RECORDED_TIME, format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")]
head(systolic_bp)
```

Identify the first systolic blood pressure drop per patient:
```{r first_drop, message = FALSE}
systolic_bp_drop <- value_change(systolic_bp, value = 40, direction = "down",
window_hours = 6, join_key = "PAT_ID", time_var = "RECORDED_TIME",
value_var = "VALUE", mult = "first")
head(systolic_bp_drop)
```

Identify the last systolic blood pressure drop per patient:
```{r last_drop, message = FALSE}
systolic_bp_drop <- value_change(systolic_bp, value = 40, direction = "down",
window_hours = 6, join_key = "PAT_ID", time_var = "RECORDED_TIME",
value_var = "VALUE", mult = "last")
head(systolic_bp_drop)
```

Identify all systolic blood pressure drops per patient:
```{r all_drops, message = FALSE}
systolic_bp_drop <- value_change(systolic_bp, value = 40, direction = "down",
window_hours = 6, join_key = "PAT_ID", time_var = "RECORDED_TIME",
value_var = "VALUE", mult = "all")
head(systolic_bp_drop)
```

## Why constellation?

In clinical medicine, there are a subset of conditions that are defined by a sequence of related events that unfold over time. These conditions are described as a "*constellation of signs and symptoms*."

Another piece of medical jargon that made it into the package is the concept of a treatment bundle. The `bundle()` function was originally designed to calculate the time stamp at which a group of treatments is delivered for every patient within a specified amount of time of developing a condition.

## Duke Institute for Health Innovation

constellation was originally developed to support a machine learning project at the [Duke Institute for Health Innovation](http://www.dihi.org/) to predict sepsis.