https://github.com/mlr-org/mlr3fda

Functional Data Analysis for mlr3

---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)

lgr::get_logger("mlr3")$set_threshold("warn")
set.seed(1L)
options(datatable.print.class = FALSE, datatable.print.keys = FALSE)
library(mlr3fda)
library(mlr3misc)
```

# mlr3fda

Package website: [release](https://mlr3fda.mlr-org.com/) \| [dev](https://mlr3fda.mlr-org.com/dev/)

Extending mlr3 to functional data.

[![RCMD Check](https://github.com/mlr-org/mlr3fda/actions/workflows/rcmdcheck.yaml/badge.svg)](https://github.com/mlr-org/mlr3fda/actions/workflows/rcmdcheck.yaml)
[![CRAN status](https://www.r-pkg.org/badges/version/mlr3fda)](https://CRAN.R-project.org/package=mlr3fda)
[![StackOverflow](https://img.shields.io/badge/stackoverflow-mlr3-orange.svg)](https://stackoverflow.com/questions/tagged/mlr3)
[![Mattermost](https://img.shields.io/badge/chat-mattermost-orange.svg)](https://lmmisld-lmu-stats-slds.srv.mwn.de/mlr_invite/)

## Installation

Install the latest release from [CRAN](https://CRAN.R-project.org):

```{r, eval = FALSE}
install.packages("mlr3fda")
```

Install the development version from [GitHub](https://github.com/):

```{r, eval = FALSE}
# install.packages("pak")
pak::pak("mlr-org/mlr3fda")
```

## What is mlr3fda?

The goal of `mlr3fda` is to extend `mlr3` to [functional data](https://en.wikipedia.org/wiki/Functional_data_analysis).
This is achieved by adding support for functional feature types and providing preprocessing `PipeOp`s that operate on functional columns.
For representing functional data, the `tfd_reg` and `tfd_irreg` datatypes from the [tf](https://github.com/tidyfun/tf) package are used and are available after loading `mlr3fda`:

```{r task_feature_types}
library(mlr3fda)
mlr_reflections$task_feature_types[c("tfr", "tfi")]
```

These datatypes represent regular and irregular functional data, respectively.
Currently, `Learner`s that directly operate on functional data are not available, so it is necessary to first extract scalar features from the functional columns.
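
To make the distinction between regular and irregular functional data concrete, here is a minimal sketch that builds one vector of each kind with `tf::tfd()`. The toy values are invented for illustration, and the sketch assumes that `tfd()` accepts a matrix with a shared `arg` grid as well as a long-format data frame with `id`, `arg` and `value` columns in that order.

```r
library(tf)

# Regular data: three curves, all observed on the same grid of five points.
grid = seq(0, 1, length.out = 5)
values = rbind(grid, grid^2, sin(grid))
f_reg = tfd(values, arg = grid)

# Irregular data: each curve is observed at its own argument values.
# Long format, columns in the order id, arg, value (assumed defaults of tfd()).
long = data.frame(
  id = c(1, 1, 1, 2, 2),
  arg = c(0, 0.5, 1, 0.25, 0.75),
  value = c(0.1, 0.4, 0.9, 0.2, 0.8)
)
f_irreg = tfd(long)

# Check which representation each vector uses.
inherits(f_reg, "tfd_reg")
inherits(f_irreg, "tfd_irreg")
```

Columns of either class can then be used as features in an mlr3 `Task` once `mlr3fda` is loaded.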

## Quickstart

We start with the predefined `dti` (Diffusion Tensor Imaging) task; see `tsk("dti")$help()` for more details.
Besides scalar columns, this task also contains two functional columns, `cca` and `rcst`.

```{r data, dpi = 300}
task = tsk("dti")
task
```

To train a model on this task we first need to extract scalar features from the functions.
We illustrate this below by extracting the mean value.

```{r fda.extract, fig.width = 5, fig.height = 3}
po_fmean = po("fda.extract", features = "mean")

task_fmean = po_fmean$train(list(task))[[1L]]
task_fmean$head()
```
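
The same `PipeOp` can extract more than one summary per curve. The following sketch assumes that `features` accepts a character vector of predefined feature names and that `"min"` and `"max"` are among them; check `?mlr_pipeops_fda.extract` for the list supported by your installed version.

```r
library(mlr3)
library(mlr3pipelines)
library(mlr3fda)

task = tsk("dti")

# Extract several summaries per functional column.
# drop = TRUE removes the original functional columns afterwards.
po_multi = po("fda.extract", features = c("mean", "min", "max"), drop = TRUE)
task_multi = po_multi$train(list(task))[[1L]]

# Inspect the resulting feature types: no functional columns should remain.
task_multi$feature_types
```

Dropping the functional columns leaves a task with scalar features only, which is what a standard `Learner` expects.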

This can be combined with a `Learner` into a `GraphLearner` that first extracts features and then trains a model.

```{r graph}
# split data into train and test set
ids = partition(task)

# define a Graph and convert it to a GraphLearner
graph = po("fda.extract", features = "mean", drop = TRUE) %>>%
  po("learner", learner = lrn("regr.rpart"))

glrn = as_learner(graph)

# train the graph learner on the train set
glrn$train(task, row_ids = ids$train)

# make predictions on the test set
glrn$predict(task, row_ids = ids$test)
```
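
Because a `GraphLearner` behaves like any other mlr3 `Learner`, it can also be evaluated with the usual resampling tools. The sketch below is one possible way to do this; the 3-fold cross-validation and the RMSE measure are arbitrary choices made for illustration, not recommendations from the package authors.

```r
library(mlr3)
library(mlr3pipelines)
library(mlr3fda)

set.seed(1L)
task = tsk("dti")

# Same pipeline as above: extract the mean of each curve, then fit a tree.
graph = po("fda.extract", features = "mean", drop = TRUE) %>>%
  po("learner", learner = lrn("regr.rpart"))
glrn = as_learner(graph)

# Feature extraction is repeated inside every fold, on the training rows only.
rr = resample(task, glrn, rsmp("cv", folds = 3L))

# Aggregate the root mean squared error across the folds.
rmse = rr$aggregate(msr("regr.rmse"))
rmse
```

Running the extraction inside the resampling loop keeps preprocessing and model fitting in one object, so the same steps are applied consistently at train and predict time.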

## Implemented PipeOps

```{r, echo = FALSE}
content = as.data.table(mlr_pipeops, objects = TRUE)
content = content[map_lgl(tags, function(t) "fda" %in% t), .(key, label, packages, tags)]
content[, packages := map(packages, function(x) setdiff(x, c("mlr3pipelines", "mlr3fda")))]
content[, `:=`(
  key = sprintf("[%1$s](https://mlr3fda.mlr-org.com/reference/mlr_pipeops_%1$s)", key),
  packages = map_chr(packages, function(pkg) {
    toString(ifelse(
      pkg %in% c("stats", "graphics", "datasets"), pkg, sprintf("[%1$s](https://cran.r-project.org/package=%1$s)", pkg)
    ))
  }),
  tags = map_chr(tags, toString)
)]
knitr::kable(content, format = "markdown", col.names = tools::toTitleCase(names(content)))
```

## Bugs, Questions, Feedback

*mlr3fda* is a free and open source software project that
encourages participation and feedback. If you have any issues,
questions, suggestions or feedback, please do not hesitate to open an
“issue” about it on the GitHub page!

In case of problems / bugs, it is often helpful if you provide a
“minimum working example” that showcases the behaviour (but don’t
worry about this if the bug is obvious).

Please understand that the resources of the project are limited:
response may sometimes be delayed by a few days, and some feature
suggestions may be rejected if they are deemed too tangential to the
vision behind the project.

## Acknowledgements

The development of this R package was supported by Roche Diagnostics R&D.