Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/hendersontrent/theftdlc

Analyse and interpret time-series features calculated from the theft R package
https://github.com/hendersontrent/theftdlc

data-science data-visualization machine-learning r statistics time-series

Last synced: 2 months ago
JSON representation

Analyse and interpret time-series features calculated from the theft R package

Awesome Lists containing this project

README

        

---
output: rmarkdown::github_document
---

# theftdlc
[![CRAN version](https://www.r-pkg.org/badges/version/theftdlc)](https://www.r-pkg.org/pkg/theftdlc)
[![CRAN RStudio mirror downloads](https://cranlogs.r-pkg.org/badges/theftdlc)](https://www.r-pkg.org/pkg/theftdlc)

Analyse and Interpret Time Series Features

```{r, include = FALSE}
knitr::opts_chunk$set(
comment = NA, fig.width = 14, fig.height = 10, cache = FALSE)
```

## Installation

You can install the stable version of `theftdlc` from CRAN:

```{r eval = FALSE}
install.packages("theftdlc")
```

You can install the development version of `theftdlc` from GitHub using the following:

```{r eval = FALSE}
devtools::install_github("hendersontrent/theftdlc")
```

## General purpose

The [`theft`](https://hendersontrent.github.io/theft/) package for R facilitates user-friendly access to a structured analytical workflow for the extraction of time-series features from six different feature sets (or a set of user-supplied features): `"catch22"`, `"feasts"`, `"Kats"`, `"tsfeatures"`, `"tsfresh"`, and `"TSFEL"`.

`theftdlc` extends this feature-based ecosystem by providing a suite of functions for analysing, interpreting, and visualising time-series features calculated using `theft`. Functionality including data quality assessments and normalisation methods, low dimensional projections (linear and nonlinear), data matrix and feature distribution visualisations, time-series classification machine learning procedures, statistical hypothesis testing, and various other statistical and graphical tools.

Hex stickers of the theft and theftdlc packages for R

A high-level overview of how the `theft` ecosystem for R is typically accessed by users is shown below. Many more functions and options for customisation are available within the packages.

Schematic of the theft ecosystem in R

### What's in a name?

`theftdlc` means 'downloadable content' (DLC) for `theft`---just like you get [DLCs and expansions](https://en.bandainamcoent.eu/elden-ring/elden-ring/shadow-of-the-erdtree) for video games.

## Quick tour

`theft` and `theftdlc` combine to create an intuitive and efficient tidy feature-based workflow. Here is an example of a single code chunk that calculates features using [`catch22`](https://github.com/hendersontrent/Rcatch22) and a custom set of mean and standard deviation, and projects the feature space into an interpretable two-dimensional space using principal components analysis:

```{r, message = FALSE, warning = FALSE, fig.height=6, fig.width=6}
library(dplyr)
library(theft)
library(theftdlc)

calculate_features(data = theft::simData,
group_var = "process",
feature_set = "catch22",
features = list("mean" = mean, "sd" = sd)) %>%
project(norm_method = "RobustSigmoid",
unit_int = TRUE,
low_dim_method = "PCA") %>%
plot()
```

In that example, `calculate_features` comes from `theft`, while `project` and the `plot` generic come from `theftdlc`.

Similarly, we can perform time-series classification using a similar simple workflow to compare the performance of `catch22` against our custom set of the first two moments of the distribution:

```{r, message = FALSE, warning = FALSE}
calculate_features(data = theft::simData,
group_var = "process",
feature_set = "catch22",
features = list("mean" = mean, "sd" = sd)) %>%
classify(by_set = TRUE,
n_resamples = 5,
use_null = TRUE) %>%
compare_features(by_set = TRUE,
hypothesis = "null") %>%
head()
```

In this example, `classify` and `compare_features` come from `theftdlc`.

Please see the vignette for more information and the full functionality of both packages.

## Citation

If you use `theft` or `theftdlc` in your own work, please cite both the paper:

T. Henderson and Ben D. Fulcher. [Feature-Based Time-Series Analysis in R using the theft Package](https://arxiv.org/abs/2208.06146). arXiv, (2022).

and the software:

```{r, echo = FALSE}
citation("theft")
citation("theftdlc")
```