Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/vanderschaarlab/temporai
TemporAI: ML-centric Toolkit for Medical Time Series
https://github.com/vanderschaarlab/temporai
automl machine-learning medicine time-series
Last synced: 3 days ago
JSON representation
TemporAI: ML-centric Toolkit for Medical Time Series
- Host: GitHub
- URL: https://github.com/vanderschaarlab/temporai
- Owner: vanderschaarlab
- License: apache-2.0
- Created: 2022-12-30T18:16:02.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2023-12-14T16:01:53.000Z (about 1 year ago)
- Last Synced: 2025-01-10T01:12:36.240Z (10 days ago)
- Topics: automl, machine-learning, medicine, time-series
- Language: Python
- Homepage: https://www.temporai.vanderschaar-lab.com/
- Size: 4.53 MB
- Stars: 103
- Watchers: 8
- Forks: 22
- Open Issues: 26
-
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE.txt
- Citation: CITATION.cff
- Codeowners: .github/CODEOWNERS
- Authors: AUTHORS.md
Awesome Lists containing this project
README
[![Test In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vanderschaarlab/temporai/blob/main/tutorials/usage/tutorial04_prediction.ipynb)
[![Documentation Status](https://readthedocs.org/projects/temporai/badge/?version=latest)](https://temporai.readthedocs.io/en/latest/?badge=latest)[![Python 3.7+](https://img.shields.io/badge/python-3.7+-blue.svg)](https://www.python.org/downloads/release/python-370/)
[![PyPI-Server](https://img.shields.io/pypi/v/temporai?color=blue)](https://pypi.org/project/temporai/)
[![Downloads](https://static.pepy.tech/badge/temporai)](https://pepy.tech/project/temporai)
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](./LICENSE.txt)[![Tests](https://github.com/vanderschaarlab/temporai/actions/workflows/test.yml/badge.svg)](https://github.com/vanderschaarlab/temporai/actions/workflows/test.yml)
[![Tests](https://github.com/vanderschaarlab/temporai/actions/workflows/test_full.yml/badge.svg)](https://github.com/vanderschaarlab/temporai/actions/workflows/test.yml)
[![codecov](https://codecov.io/gh/vanderschaarlab/temporai/branch/main/graph/badge.svg?token=FCKO12SND7)](https://codecov.io/gh/vanderschaarlab/temporai)[![arXiv](https://img.shields.io/badge/arXiv-2301.12260-b31b1b.svg)](https://arxiv.org/abs/2301.12260)
[![slack](https://img.shields.io/badge/chat-on%20slack-purple?logo=slack)](https://join.slack.com/t/vanderschaarlab/shared_invite/zt-1u2rmhw06-sHS5nQDMN3Ka2Zer6sAU6Q)
[![about](https://img.shields.io/badge/about-The%20van%20der%20Schaar%20Lab-blue)](https://www.vanderschaar-lab.com/)# TemporAI
> **⚗️ Status:** This project is still in *alpha*, and the API may change without warning.
## 📃 Overview
*TemporAI* is a Machine Learning-centric time-series library for medicine. The tasks that are currently of focus in TemporAI are: time-to-event (survival) analysis with time-series data, treatment effects (causal inference) over time, and time-series prediction. Data preprocessing methods, including missing value imputation for static and temporal covariates, are provided. AutoML tools for hyperparameter tuning and pipeline selection are also available.
### How is TemporAI unique?
* **🏥 Medicine-first:** Focused on use cases for medicine and healthcare, such as temporal treatment effects, survival analysis over time, imputation methods, models with built-in and post-hoc interpretability, ... See [methods](./#-methods).
* **🏗️ Fast prototyping:** A plugin design allowing for on-the-fly integration of new methods by the users.
* **🚀 From research to practice:** Relevant novel models from research community adapted for practical use.
* **🌍 A healthcare ecosystem vision:** A range of interactive demonstration apps, new medical problem settings, interpretability tools, data-centric tools etc. are planned.### Key concepts
## 🚀 Installation
### Instal with `pip`
From [the Python Package Index (PyPI)](https://pypi.org/):
```bash
$ pip install temporai
```Or from source:
```bash
$ git clone https://github.com/vanderschaarlab/temporai.git
$ cd temporai
$ pip install .
```### Install in a [conda](https://docs.conda.io/en/latest/) environment
While have not yet published TemporAI on `conda-forge`, you can still install TemporAI in your conda environment using `pip` as follows:
Create and activate conda environment as normal:
```bash
$ conda create -n
$ conda activate
```Then install inside your `conda` environment with pip:
```bash
$ pip install temporai
```## 💥 Sample Usage
(▶️ Expand to view the sections below.)
List the available plugins
```python
from tempor import plugin_loaderprint(plugin_loader.list())
```Use a time-to-event (survival) analysis model
```python
from tempor import plugin_loader# Load a time-to-event dataset:
dataset = plugin_loader.get("time_to_event.pbc", plugin_type="datasource").load()# Initialize the model:
model = plugin_loader.get("time_to_event.dynamic_deephit")# Train:
model.fit(dataset)# Make risk predictions:
prediction = model.predict(dataset, horizons=[0.25, 0.50, 0.75])
```Use a temporal treatment effects model
```python
import numpy as npfrom tempor import plugin_loader
# Load a dataset with temporal treatments and outcomes:
dataset = plugin_loader.get(
"treatments.temporal.dummy_treatments",
plugin_type="datasource",
temporal_covariates_missing_prob=0.0,
temporal_treatments_n_features=1,
temporal_treatments_n_categories=2,
).load()# Initialize the model:
model = plugin_loader.get("treatments.temporal.regression.crn_regressor", epochs=20)# Train:
model.fit(dataset)# Define target variable horizons for each sample:
horizons = [
tc.time_indexes()[0][len(tc.time_indexes()[0]) // 2 :] for tc in dataset.time_series
]# Define treatment scenarios for each sample:
treatment_scenarios = [
[np.asarray([1] * len(h)), np.asarray([0] * len(h))] for h in horizons
]# Predict counterfactuals:
counterfactuals = model.predict_counterfactuals(
dataset,
horizons=horizons,
treatment_scenarios=treatment_scenarios,
)
```Use a missing data imputer
```python
from tempor import plugin_loaderdataset = plugin_loader.get(
"prediction.one_off.sine", plugin_type="datasource", with_missing=True
).load()
static_data_n_missing = dataset.static.dataframe().isna().sum().sum()
temporal_data_n_missing = dataset.time_series.dataframe().isna().sum().sum()print(static_data_n_missing, temporal_data_n_missing)
assert static_data_n_missing > 0
assert temporal_data_n_missing > 0# Initialize the model:
model = plugin_loader.get("preprocessing.imputation.temporal.bfill")# Train:
model.fit(dataset)# Impute:
imputed = model.transform(dataset)
temporal_data_n_missing = imputed.time_series.dataframe().isna().sum().sum()print(static_data_n_missing, temporal_data_n_missing)
assert temporal_data_n_missing == 0
```Use a one-off classifier (prediction)
```python
from tempor import plugin_loaderdataset = plugin_loader.get("prediction.one_off.sine", plugin_type="datasource").load()
# Initialize the model:
model = plugin_loader.get("prediction.one_off.classification.nn_classifier", n_iter=50)# Train:
model.fit(dataset)# Predict:
prediction = model.predict(dataset)
```Use a temporal regressor (forecasting)
```python
from tempor import plugin_loader# Load a dataset with temporal targets.
dataset = plugin_loader.get(
"prediction.temporal.dummy_prediction",
plugin_type="datasource",
temporal_covariates_missing_prob=0.0,
).load()# Initialize the model:
model = plugin_loader.get("prediction.temporal.regression.seq2seq_regressor", epochs=10)# Train:
model.fit(dataset)# Predict:
prediction = model.predict(dataset, n_future_steps=5)
```Benchmark models, time-to-event task
```python
from tempor.benchmarks import benchmark_models
from tempor import plugin_loader
from tempor.methods.pipeline import pipelinetestcases = [
(
"pipeline1",
pipeline(
[
"preprocessing.scaling.temporal.ts_minmax_scaler",
"time_to_event.dynamic_deephit",
]
)({"ts_coxph": {"n_iter": 100}}),
),
(
"plugin1",
plugin_loader.get("time_to_event.dynamic_deephit", n_iter=100),
),
(
"plugin2",
plugin_loader.get("time_to_event.ts_coxph", n_iter=100),
),
]
dataset = plugin_loader.get("time_to_event.pbc", plugin_type="datasource").load()aggr_score, per_test_score = benchmark_models(
task_type="time_to_event",
tests=testcases,
data=dataset,
n_splits=2,
random_state=0,
horizons=[2.0, 4.0, 6.0],
)print(aggr_score)
```Serialization
```python
from tempor.utils.serialization import load, save
from tempor import plugin_loader# Initialize the model:
model = plugin_loader.get("prediction.one_off.classification.nn_classifier", n_iter=50)buff = save(model) # Save model to bytes.
reloaded = load(buff) # Reload model.# `save_to_file`, `load_from_file` also available in the serialization module.
```AutoML - search for the best pipeline for your task
```python
from tempor.automl.seeker import PipelineSeekerdataset = plugin_loader.get("prediction.one_off.sine", plugin_type="datasource").load()
# Specify the AutoML pipeline seeker for the task of your choice, providing candidate methods,
# metric, preprocessing steps etc.
seeker = PipelineSeeker(
study_name="my_automl_study",
task_type="prediction.one_off.classification",
estimator_names=[
"cde_classifier",
"ode_classifier",
"nn_classifier",
],
metric="aucroc",
dataset=dataset,
return_top_k=3,
num_iter=100,
tuner_type="bayesian",
static_imputers=["static_tabular_imputer"],
static_scalers=[],
temporal_imputers=["ffill", "bfill"],
temporal_scalers=["ts_minmax_scaler"],
)# The search will return the best pipelines.
best_pipelines, best_scores = seeker.search() # doctest: +SKIP
```## 📖 Tutorials
### Data
- [![Test In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vanderschaarlab/temporai/blob/main/tutorials/data/tutorial01_data_format.ipynb) - [Data Format](./tutorials/data/tutorial01_data_format.ipynb)
- [![Test In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vanderschaarlab/temporai/blob/main/tutorials/data/tutorial02_datasets.ipynb) - [Datasets](./tutorials/data/tutorial02_datasets.ipynb)
- [![Test In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vanderschaarlab/temporai/blob/main/tutorials/data/tutorial03_datasources.ipynb) - [Data Loaders](./tutorials/data/tutorial03_datasources.ipynb)
- [![Test In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vanderschaarlab/temporai/blob/main/tutorials/data/tutorial04_data_splitting.ipynb) - [Data Splitting](./tutorials/data/tutorial04_data_splitting.ipynb)
- [![Test In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vanderschaarlab/temporai/blob/main/tutorials/data/tutorial05_other_data_formats.ipynb) - [Other Data Formats](./tutorials/data/tutorial05_other_data_formats.ipynb)
- [![Test In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vanderschaarlab/temporai/blob/main/tutorials/data/tutorial06_mimic_use_case.ipynb) - [MIMIC Use Case](./tutorials/data/tutorial06_mimic_use_case.ipynb)### User Guide
- [![Test In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vanderschaarlab/temporai/blob/main/tutorials/usage/tutorial01_plugins.ipynb) - [Plugins](./tutorials/usage/tutorial01_plugins.ipynb)
- [![Test In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vanderschaarlab/temporai/blob/main/tutorials/usage/tutorial02_imputation.ipynb) - [Imputation](./tutorials/usage/tutorial02_imputation.ipynb)
- [![Test In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vanderschaarlab/temporai/blob/main/tutorials/usage/tutorial03_scaling.ipynb) - [Scaling](./tutorials/usage/tutorial03_scaling.ipynb)
- [![Test In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vanderschaarlab/temporai/blob/main/tutorials/usage/tutorial04_prediction.ipynb) - [Prediction](./tutorials/usage/tutorial04_prediction.ipynb)
- [![Test In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vanderschaarlab/temporai/blob/main/tutorials/usage/tutorial05_time_to_event.ipynb) - [Time-to-event Analysis](./tutorials/usage/tutorial05_time_to_event.ipynb)
- [![Test In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vanderschaarlab/temporai/blob/main/tutorials/usage/tutorial06_treatments.ipynb) - [Treatment Effects](./tutorials/usage/tutorial06_treatments.ipynb)
- [![Test In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vanderschaarlab/temporai/blob/main/tutorials/usage/tutorial07_pipeline.ipynb) - [Pipeline](./tutorials/usage/tutorial07_pipeline.ipynb)
- [![Test In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vanderschaarlab/temporai/blob/main/tutorials/usage/tutorial08_benchmarks.ipynb) - [Benchmarks](./tutorials/usage/tutorial08_benchmarks.ipynb)
- [![Test In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vanderschaarlab/temporai/blob/main/tutorials/usage/tutorial09_automl.ipynb) - [AutoML](./tutorials/usage/tutorial09_automl.ipynb)### Extending TemporAI
- [![Test In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vanderschaarlab/temporai/blob/main/tutorials/extending/tutorial01_custom_method.ipynb) - [Writing a Custom Method Plugin](./tutorials/extending/tutorial01_custom_method.ipynb)
- [![Test In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vanderschaarlab/temporai/blob/main/tutorials/extending/tutorial02_testing_custom_method.ipynb) - [Testing a Custom Method Plugin](./tutorials/extending/tutorial02_testing_custom_method.ipynb)
- [![Test In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vanderschaarlab/temporai/blob/main/tutorials/extending/tutorial03_custom_datasource.ipynb) - [Writing a Custom Data Source Plugin](./tutorials/extending/tutorial03_custom_datasource.ipynb)
- [![Test In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vanderschaarlab/temporai/blob/main/tutorials/extending/tutorial04_custom_metric.ipynb) - [Writing a Custom Metric Plugin](./tutorials/extending/tutorial04_custom_metric.ipynb)
- [![Test In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vanderschaarlab/temporai/blob/main/tutorials/extending/tutorial05_custom_dataformat.ipynb) - [Writing a Custom Data Format](./tutorials/extending/tutorial05_custom_dataformat.ipynb)## 📘 Documentation
See the full project documentation [here](https://temporai.readthedocs.io/en/latest/).
#### Note on documentation versions:
- If you have installed TemporAI from PyPI, you should refer to the *stable* documentation.
- If you have installed TemporAI from source, you should refer to the *latest* documentation.See the [**Instal with `pip`**](https://github.com/vanderschaarlab/temporai#instal-with-pip) section for reference.
[van der Schaar Lab]: https://www.vanderschaar-lab.com/
[docs]: https://temporai.readthedocs.io/en/latest/[docs/user_guide]: https://temporai.readthedocs.io/en/latest/user_guide/index.html
## 🌍 TemporAI Ecosystem (*Experimental*)
We provide additional tools in the TemporAI ecosystem, which are in active development, and are currently (very) experimental. Suggestions and contributions are welcome!
These include:
- [`temporai-clinic`](https://github.com/vanderschaarlab/temporai-clinic): A web app tool for interacting and visualising TemporAI models, data, and predictions.
- [`temporai-mivdp`](https://github.com/vanderschaarlab/temporai-mivdp): A [MIMIC-IV-Data-Pipeline](https://github.com/healthylaife/MIMIC-IV-Data-Pipeline) adaptation for TemporAI.## 🔑 Methods
(▶️ Expand to view the sections below.)
Time-to-Event (survival) analysis over time
Risk estimation given event data (category: `time_to_event`)
| Name | Description| Reference |
| --- | --- | --- |
| `dynamic_deephit` | Dynamic-DeepHit incorporates the available longitudinal data comprising various repeated measurements (rather than only the last available measurements) in order to issue dynamically updated survival predictions | [Paper](https://pubmed.ncbi.nlm.nih.gov/30951460/) |
| `ts_coxph` | Create embeddings from the time series and use a CoxPH model for predicting the survival function| --- |
| `ts_xgb` | Create embeddings from the time series and use a SurvivalXGBoost model for predicting the survival function| --- |Treatment effects
#### One-off
Treatment effects estimation where treatments are a one-off event.* Regression on the outcomes (category: `treatments.one_off.regression`)
| Name | Description| Reference |
| --- | --- | --- |
| `synctwin_regressor` | SyncTwin is a treatment effect estimation method tailored for observational studies with longitudinal data, applied to the LIP setting: Longitudinal, Irregular and Point treatment. | [Paper](https://proceedings.neurips.cc/paper/2021/hash/19485224d128528da1602ca47383f078-Abstract.html) |#### Temporal
Treatment effects estimation where treatments are temporal (time series).* Classification on the outcomes (category: `treatments.temporal.classification`)
| Name | Description| Reference |
| --- | --- | --- |
| `crn_classifier` | The Counterfactual Recurrent Network (CRN), a sequence-to-sequence model that leverages the available patient observational data to estimate treatment effects over time. | [Paper](https://arxiv.org/abs/2002.04083) |* Regression on the outcomes (category: `treatments.temporal.regression`)
| Name | Description| Reference |
| --- | --- | --- |
| `crn_regressor` | The Counterfactual Recurrent Network (CRN), a sequence-to-sequence model that leverages the available patient observational data to estimate treatment effects over time. | [Paper](https://arxiv.org/abs/2002.04083) |Prediction
#### One-off
Prediction where targets are static.* Classification (category: `prediction.one_off.classification`)
| Name | Description| Reference |
| --- | --- | --- |
| `nn_classifier` | Neural-net based classifier. Supports multiple recurrent models, like RNN, LSTM, Transformer etc. | --- |
| `ode_classifier` | Classifier based on ordinary differential equation (ODE) solvers. | --- |
| `cde_classifier` | Classifier based Neural Controlled Differential Equations for Irregular Time Series. | [Paper](https://arxiv.org/abs/2005.08926) |
| `laplace_ode_classifier` | Classifier based Inverse Laplace Transform (ILT) algorithms implemented in PyTorch. | [Paper](https://arxiv.org/abs/2206.04843) |* Regression (category: `prediction.one_off.regression`)
| Name | Description| Reference |
| --- | --- | --- |
| `nn_regressor` | Neural-net based regressor. Supports multiple recurrent models, like RNN, LSTM, Transformer etc. | --- |
| `ode_regressor` | Regressor based on ordinary differential equation (ODE) solvers. | --- |
| `cde_regressor` | Regressor based Neural Controlled Differential Equations for Irregular Time Series. | [Paper](https://arxiv.org/abs/2005.08926)
| `laplace_ode_regressor` | Regressor based Inverse Laplace Transform (ILT) algorithms implemented in PyTorch. | [Paper](https://arxiv.org/abs/2206.04843) |#### Temporal
Prediction where targets are temporal (time series).* Classification (category: `prediction.temporal.classification`)
| Name | Description| Reference |
| --- | --- | --- |
| `seq2seq_classifier` | Seq2Seq prediction, classification | --- |* Regression (category: `prediction.temporal.regression`)
| Name | Description| Reference |
| --- | --- | --- |
| `seq2seq_regressor` | Seq2Seq prediction, regression | --- |Preprocessing
#### Feature Encoding
* Static data (category: `preprocessing.encoding.static`)
| Name | Description| Reference |
| --- | --- | --- |
| `static_onehot_encoder` | One-hot encode categorical static features | --- |* Temporal data (category: `preprocessing.encoding.temporal`)
| Name | Description| Reference |
| --- | --- | --- |
| `ts_onehot_encoder` | One-hot encode categorical time series features | --- |#### Imputation
* Static data (category: `preprocessing.imputation.static`)
| Name | Description| Reference |
| --- | --- | --- |
| `static_tabular_imputer` | Use any method from [HyperImpute](https://github.com/vanderschaarlab/hyperimpute) (HyperImpute, Mean, Median, Most-frequent, MissForest, ICE, MICE, SoftImpute, EM, Sinkhorn, GAIN, MIRACLE, MIWAE) to impute the static data | [Paper](https://arxiv.org/abs/2206.07769) |* Temporal data (category: `preprocessing.imputation.temporal`)
| Name | Description| Reference |
| --- | --- | --- |
| `ffill` | Propagate last valid observation forward to next valid | --- |
| `bfill` | Use next valid observation to fill gap | --- |
| `ts_tabular_imputer` | Use any method from [HyperImpute](https://github.com/vanderschaarlab/hyperimpute) (HyperImpute, Mean, Median, Most-frequent, MissForest, ICE, MICE, SoftImpute, EM, Sinkhorn, GAIN, MIRACLE, MIWAE) to impute the time series data | [Paper](https://arxiv.org/abs/2206.07769) |#### Scaling
* Static data (category: `preprocessing.scaling.static`)
| Name | Description| Reference |
| --- | --- | --- |
| `static_standard_scaler` | Scale the static features using a StandardScaler | --- |
| `static_minmax_scaler` | Scale the static features using a MinMaxScaler | --- |* Temporal data (category: `preprocessing.scaling.temporal`)
| Name | Description| Reference |
| --- | --- | --- |
| `ts_standard_scaler` | Scale the temporal features using a StandardScaler | --- |
| `ts_minmax_scaler` | Scale the temporal features using a MinMaxScaler | --- |## 🔨 Tests and Development
Install the testing dependencies using:
```bash
pip install .[testing]
```
The tests can be executed using:
```bash
pytest -vsx
```For local development, we recommend that you should install the `[dev]` extra, which includes `[testing]` and some additional dependencies:
```bash
pip install .[dev]
```For development and contribution to TemporAI, see:
* 📓 [Extending TemporAI tutorials](./tutorials/extending/)
* 📃 [Contribution guide](./CONTRIBUTING.md)
* 👩💻 [Developer's guide](./docs/dev_guide.md)## ✍️ Citing
If you use this code, please cite the associated paper:
```
@article{saveliev2023temporai,
title={TemporAI: Facilitating Machine Learning Innovation in Time Domain Tasks for Medicine},
author={Saveliev, Evgeny S and van der Schaar, Mihaela},
journal={arXiv preprint arXiv:2301.12260},
year={2023}
}
```