https://github.com/mlr-org/mlr3fda

Functional Data Analysis for mlr3
https://github.com/mlr-org/mlr3fda

data-analysis data-analysis-in-r data-science functional-data machine-learning mlr3 r r-package

Last synced: 3 months ago

Host: GitHub
URL: https://github.com/mlr-org/mlr3fda
Owner: mlr-org
License: lgpl-3.0
Created: 2018-11-16T10:33:56.000Z (about 6 years ago)
Default Branch: main
Last Pushed: 2024-04-24T21:57:08.000Z (9 months ago)
Last Synced: 2024-05-01T09:37:21.957Z (9 months ago)
Topics: data-analysis, data-analysis-in-r, data-science, functional-data, machine-learning, mlr3, r, r-package
Language: R
Homepage: https://mlr3fda.mlr-org.com/
Size: 12.1 MB
Stars: 3
Watchers: 12
Forks: 2
Open Issues: 5
Metadata Files:
- Readme: README.Rmd
- License: LICENSE

README

---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
lgr::get_logger("mlr3")$set_threshold("warn")
set.seed(1L)
options(datatable.print.class = FALSE, datatable.print.keys = FALSE)
library(mlr3fda)
library(mlr3misc)
```

# mlr3fda

Package website: [release](https://mlr3fda.mlr-org.com/) \| [dev](https://mlr3fda.mlr-org.com/dev/)

Extending mlr3 to functional data.

[![RCMD Check](https://github.com/mlr-org/mlr3fda/actions/workflows/rcmdcheck.yaml/badge.svg)](https://github.com/mlr-org/mlr3fda/actions/workflows/rcmdcheck.yaml)

[![CRAN status](https://www.r-pkg.org/badges/version/mlr3fda)](https://CRAN.R-project.org/package=mlr3fda)

[![StackOverflow](https://img.shields.io/badge/stackoverflow-mlr3-orange.svg)](https://stackoverflow.com/questions/tagged/mlr3)

[![Mattermost](https://img.shields.io/badge/chat-mattermost-orange.svg)](https://lmmisld-lmu-stats-slds.srv.mwn.de/mlr_invite/)

## Installation

Install the latest release from [CRAN](https://CRAN.R-project.org):

```{r, eval = FALSE}
install.packages("mlr3fda")
```

Install the development version from [GitHub](https://github.com/):

```{r, eval = FALSE}
# install.packages("pak")
pak::pak("mlr-org/mlr3fda")
```

## What is mlr3fda?

The goal of `mlr3fda` is to extend `mlr3` to [functional data](https://en.wikipedia.org/wiki/Functional_data_analysis).

This is achieved by adding support for functional feature types and providing preprocessing `PipeOp`s that operate on functional columns.

For representing functional data, the `tfd_reg` and `tfd_irreg` datatypes from the [tf](https://github.com/tidyfun/tf) package are used and are available after loading `mlr3fda`:

```{r task_feature_types}
library(mlr3fda)
mlr_reflections$task_feature_types[c("tfr", "tfi")]
```

These datatypes can be used to represent regular and irregular functional data respectively.
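As a rough illustration of the two types, the sketch below builds one regular and one irregular functional vector directly with the `tf` package. It assumes `tf` is installed. The toy values and the column names `id`, `arg` and `value` are made up for this example, and the irregular case relies on the data frame method of `tfd()` reading its columns in the order id, argument, value.

```r
library(tf)

# Regular data: every curve is observed on the same grid.
# Each row of the matrix is one curve, each column one grid point.
grid = seq(0, 1, length.out = 5)
values = rbind(sin(2 * pi * grid), cos(2 * pi * grid))
f_reg = tfd(values, arg = grid)

# Irregular data: every curve has its own grid.
# Long format with one row per evaluation (curve id, argument, value).
long = data.frame(
  id = c(1, 1, 1, 2, 2),
  arg = c(0, 0.5, 1, 0.2, 0.8),
  value = c(1, 2, 3, 4, 5)
)
f_irreg = tfd(long)

# inspect which of the two classes each vector received
class(f_reg)
class(f_irreg)
```

Vectors like these can be stored as columns of a `data.frame` or `data.table` and then turned into a task in the usual `mlr3` way.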

Currently, `Learner`s that directly operate on functional data are not available, so it is necessary to first extract scalar features from the functional columns.

## Quickstart

We start with the predefined `dti` (Diffusion Tensor Imaging) task; see `tsk("dti")$help()` for more details.

Besides scalar columns, this task also contains two functional columns `cca` and `rcst`.

```{r data, dpi = 300}
task = tsk("dti")
task
```

To train a model on this task we first need to extract scalar features from the functions.

We illustrate this below by extracting the mean value.

```{r fda.extract, fig.width = 5, fig.height = 3}
po_fmean = po("fda.extract", features = "mean")
task_fmean = po_fmean$train(list(task))[[1L]]
task_fmean$head()
```
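Conceptually, extracting the mean collapses each curve into a single number. The base R sketch below mimics this idea on made-up toy curves stored as plain lists. The `curves` object and the `curve_mean()` helper are hypothetical and are not part of `mlr3fda`, whose `fda.extract` implementation may differ in detail.

```r
# Toy functional observations: each curve is a set of (arg, value) pairs.
# Curves may be observed on different grids and may contain missing values.
curves = list(
  list(arg = c(0, 0.5, 1), value = c(1, 2, 3)),
  list(arg = c(0, 0.25, 0.75, 1), value = c(4, NA, 6, 8))
)

# Hypothetical helper: reduce one curve to the mean of its observed values.
curve_mean = function(curve) mean(curve$value, na.rm = TRUE)

# One scalar feature per curve, usable by any standard learner.
feature_mean = vapply(curves, curve_mean, numeric(1))
feature_mean
#> [1] 2 6
```

The result has one entry per curve, which is why the extracted feature can sit next to the ordinary scalar columns of the task.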

This can be combined with a `Learner` into a `GraphLearner` that first extracts features and then trains a model.

```{r graph}
# split data into train and test set
ids = partition(task)

# define a Graph and convert it to a GraphLearner
graph = po("fda.extract", features = "mean", drop = TRUE) %>>%
  po("learner", learner = lrn("regr.rpart"))
glrn = as_learner(graph)

# train the graph learner on the train set
glrn$train(task, row_ids = ids$train)

# make predictions on the test set
glrn$predict(task, row_ids = ids$test)
```
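To judge how well such a `GraphLearner` performs, its predictions can be scored like those of any other `mlr3` learner. The following is a minimal sketch that rebuilds the same pipeline and computes the root mean squared error on the held-out rows. The choice of `regr.rmse` is an assumption made for illustration; any regression measure could be used instead.

```r
library(mlr3)
library(mlr3pipelines)
library(mlr3fda)

set.seed(1L)
task = tsk("dti")
ids = partition(task)

# mean extraction followed by a regression tree
graph = po("fda.extract", features = "mean", drop = TRUE) %>>%
  po("learner", learner = lrn("regr.rpart"))
glrn = as_learner(graph)

# fit on the training rows, predict on the held-out rows
glrn$train(task, row_ids = ids$train)
prediction = glrn$predict(task, row_ids = ids$test)

# root mean squared error on the held-out rows
rmse = prediction$score(msr("regr.rmse"))
rmse
```

Because the feature extraction is part of the learner, it is refitted on the training rows only, so the test rows do not leak into the preprocessing step.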

## Implemented PipeOps

```{r, echo = FALSE}
content = as.data.table(mlr_pipeops, objects = TRUE)
content = content[map_lgl(tags, function(t) "fda" %in% t), .(key, label, packages, tags)]
content[, packages := map(packages, function(x) setdiff(x, c("mlr3pipelines", "mlr3fda")))]
content[, `:=`(
  key = sprintf("[%1$s](https://mlr3fda.mlr-org.com/reference/mlr_pipeops_%1$s)", key),
  packages = map_chr(packages, function(pkg) {
    toString(ifelse(
      pkg %in% c("stats", "graphics", "datasets"), pkg, sprintf("[%1$s](https://cran.r-project.org/package=%1$s)", pkg)
    ))
  }),
  tags = map_chr(tags, toString)
)]
knitr::kable(content, format = "markdown", col.names = tools::toTitleCase(names(content)))
```

## Bugs, Questions, Feedback

*mlr3fda* is a free and open source software project that encourages participation and feedback. If you have any issues, questions, suggestions or feedback, please do not hesitate to open an “issue” about it on the GitHub page!

In case of problems / bugs, it is often helpful if you provide a “minimum working example” that showcases the behaviour (but don’t worry about this if the bug is obvious).

Please understand that the resources of the project are limited: response may sometimes be delayed by a few days, and some feature suggestions may be rejected if they are deemed too tangential to the vision behind the project.

## Acknowledgements

The development of this R package was supported by Roche Diagnostics R&D.