{"id":13571106,"url":"https://github.com/alexandervnikitin/tsgm","last_synced_at":"2025-04-06T05:18:13.415Z","repository":{"id":38614241,"uuid":"499438152","full_name":"AlexanderVNikitin/tsgm","owner":"AlexanderVNikitin","description":"Generation and evaluation of synthetic time series datasets (also, augmentations, visualizations, a collection of popular datasets) NeurIPS'24","archived":false,"fork":false,"pushed_at":"2024-08-16T18:48:55.000Z","size":10291,"stargazers_count":159,"open_issues_count":23,"forks_count":17,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-04-06T05:18:07.767Z","etag":null,"topics":["augmentations","data-augmentation","data-science","datasets","deep-learning","generative-model","keras","machine-learning","python","synthetic-data","synthetic-time-series","tensorflow2","time-series","vae"],"latest_commit_sha":null,"homepage":"https://tsgm.readthedocs.io","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AlexanderVNikitin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-06-03T08:35:31.000Z","updated_at":"2025-04-02T02:05:56.000Z","dependencies_parsed_at":"2023-09-23T19:52:36.099Z","dependency_job_id":"eba6e5b4-b6ef-410e-a713-dd0506dfa1a9","html_url":"https://github.com/AlexanderVNikitin/tsgm","commit_stats":{"total_commits":134,"total_committers":5,"mean_commits":26.8,"dds":"0.10447761194029848","last_synced_commit":"e5d479c064760c48d9d30c30b85275b5900ae905"},"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"repository
_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AlexanderVNikitin%2Ftsgm","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AlexanderVNikitin%2Ftsgm/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AlexanderVNikitin%2Ftsgm/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AlexanderVNikitin%2Ftsgm/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AlexanderVNikitin","download_url":"https://codeload.github.com/AlexanderVNikitin/tsgm/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247436594,"owners_count":20938595,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["augmentations","data-augmentation","data-science","datasets","deep-learning","generative-model","keras","machine-learning","python","synthetic-data","synthetic-time-series","tensorflow2","time-series","vae"],"created_at":"2024-08-01T14:00:58.575Z","updated_at":"2025-04-06T05:18:13.306Z","avatar_url":"https://github.com/AlexanderVNikitin.png","language":"Python","funding_links":[],"categories":["📦 Packages"],"sub_categories":["Python"],"readme":"\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"https://github.com/AlexanderVNikitin/tsgm/raw/main/docs/_static/logo.png\"\u003e\n\u003c/div\u003e\n\n\u003ch3 align=\"center\"\u003e\nTime Series Generative Modeling (TSGM)\n\u003c/h3\u003e\n\n\u003cp align=\"center\"\u003e\nCreate and evaluate synthetic time series datasets effortlessly\n\u003c/p\u003e\n\n\u003cdiv 
align=\"center\"\u003e\n\n[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1l2VB6eUwvrxyu8iB30faGiQM5AKthc82?usp=sharing)\n[![Pypi version](https://img.shields.io/pypi/v/tsgm)](https://pypi.org/project/tsgm/)\n[![unit-tests](https://github.com/AlexanderVNikitin/tsgm/actions/workflows/test.yml/badge.svg?event=push)](https://github.com/AlexanderVNikitin/tsgm/actions?query=workflow%3ATests+branch%3Amain)\n[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/release/python-380/)\n[![codecov](https://codecov.io/gh/AlexanderVNikitin/tsgm/branch/main/graph/badge.svg?token=UD38ANZ0M1)](https://codecov.io/gh/AlexanderVNikitin/tsgm)\n[![arXiv](https://img.shields.io/badge/arXiv-2305.11567-b31b1b.svg)](https://arxiv.org/abs/2305.11567)\n\n\u003c/div\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"#jigsaw-get-started\"\u003eGet Started\u003c/a\u003e •\n  \u003ca href=\"#anchor-tutorials\"\u003eTutorials\u003c/a\u003e •\n  \u003ca href=\"#art-augmentations\"\u003eAugmentations\u003c/a\u003e •\n  \u003ca href=\"#hammer-generators\"\u003eGenerators\u003c/a\u003e •\n  \u003ca href=\"#chart_with_upwards_trend-metrics\"\u003eMetrics\u003c/a\u003e •\n  \u003ca href=\"#floppy_disk-datasets\"\u003eDatasets\u003c/a\u003e •\n  \u003ca href=\"#hammer_and_wrench-contributing\"\u003eContributing\u003c/a\u003e •\n  \u003ca href=\"#mag-citing\"\u003eCiting\u003c/a\u003e \n\u003c/p\u003e\n\n\n## :jigsaw: Get Started\n\nTSGM is an open-source framework for synthetic time series dataset generation and evaluation. 
\n\n\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"./docs/_static/generation_process.gif\"\u003e\n\u003c/div\u003e\n\nThe framework can be used for creating synthetic datasets (see \u003ca href=\"#hammer-generators\"\u003e:hammer: Generators \u003c/a\u003e), augmenting time series data (see \u003ca href=\"#art-augmentations\"\u003e:art: Augmentations \u003c/a\u003e), evaluating synthetic data with respect to consistency, privacy, downstream performance, and more (see \u003ca href=\"#chart_with_upwards_trend-metrics\"\u003e:chart_with_upwards_trend: Metrics \u003c/a\u003e), using common time series datasets (TSGM provides easy access to more than 140 datasets, see \u003ca href=\"#floppy_disk-datasets\"\u003e:floppy_disk: Datasets \u003c/a\u003e).\n\nWe provide:\n* [Documentation](https://tsgm.readthedocs.io/en/latest/) with a complete overview of the implemented methods,\n* [Tutorials](https://github.com/AlexanderVNikitin/tsgm/tree/main/tutorials) that describe practical use-cases of the framework.\n\n\n### Install TSGM\n```bash\npip install tsgm\n```\n\n#### M1 and M2 chips:\nTo install `tsgm` on Apple M1 and M2 chips:\n```bash\n# Install tensorflow\nconda install -c conda-forge tensorflow=2.9.1\n\n# Install tsgm without dependencies\npip install tsgm --no-deps\n\n# Install rest of the dependencies (separately here for clarity)\nconda install tensorflow-probability scipy antropy statsmodels dtaidistance networkx optuna prettytable seaborn scikit-learn yfinance tqdm\n```\n\n\n### Train your generative model\n```python\nimport tsgm\n\n# ... 
Define hyperparameters ...\n# dataset is a tensor of shape n_samples x seq_len x feature_dim\n\n# Zoo contains several prebuilt architectures: we choose a conditional GAN architecture\narchitecture = tsgm.models.architectures.zoo[\"cgan_base_c4_l1\"](\n    seq_len=seq_len, feat_dim=feature_dim,\n    latent_dim=latent_dim, output_dim=0)\ndiscriminator, generator = architecture.discriminator, architecture.generator\n\n# Initialize GAN object with selected discriminator and generator\ngan = tsgm.models.cgan.GAN(\n    discriminator=discriminator, generator=generator, latent_dim=latent_dim\n)\ngan.compile(\n    d_optimizer=keras.optimizers.Adam(learning_rate=0.0003),\n    g_optimizer=keras.optimizers.Adam(learning_rate=0.0003),\n    loss_fn=keras.losses.BinaryCrossentropy(from_logits=True),\n)\ngan.fit(dataset, epochs=N_EPOCHS)\n\n# Generate 100 synthetic samples\nresult = gan.generate(100)\n```\n\n## :anchor: Tutorials\n\n- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1l2VB6eUwvrxyu8iB30faGiQM5AKthc82?usp=sharing) Introductory Tutorial [Getting started with TSGM](https://github.com/AlexanderVNikitin/tsgm/blob/main/tutorials/GANs/cGAN.ipynb)\n- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1frSnJQSsuPS3asgIkmcrNtX4Y7TIQI56?usp=sharing) Tutorial [Datasets in TSGM](https://github.com/AlexanderVNikitin/tsgm/blob/main/tutorials/Datasets.ipynb)\n- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1Vw9t4TlI1Nek_t6bMPyKcPPPqCiXfOK3?usp=sharing) Tutorial [Time Series Augmentations](https://github.com/AlexanderVNikitin/tsgm/blob/main/tutorials/augmentations.ipynb)\n- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1_jpGrPcwoSpB8eii8XW-spaikczdPqIQ?usp=sharing) Tutorial [Time Series Generation with 
VAEs](https://github.com/AlexanderVNikitin/tsgm/blob/main/tutorials/VAEs/VAE.ipynb)\n- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1rcN3pr8uglBEEOo4bTa1fvXgSMx1vKq9?usp=sharing) Tutorial [Conditional Time Series Generation with GANs](https://github.com/AlexanderVNikitin/tsgm/blob/main/tutorials/GANs/cGAN.ipynb)\n- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1hubtddSX94KyLzuCTwmU6pAFBgBeiEB-?usp=sharing) Tutorial [Evaluation of Synthetic Time Series Data](https://github.com/AlexanderVNikitin/tsgm/blob/main/tutorials/evaluation.ipynb)\n- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1SKD9hRi-ic27Wts9Qzkssjfe1z7o1NU4?usp=sharing) Tutorial [Model Selection](https://github.com/AlexanderVNikitin/tsgm/blob/main/tutorials/Model%20Selection.ipynb)\n- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1wpf9WeNVj5TkUcPF6EavVx-hUCOfyvUd?usp=sharing) Tutorial [Multiple GPUs or TPU with TSGM](https://github.com/AlexanderVNikitin/tsgm/blob/main/tutorials/Using%20Multiple%20GPUs%20or%20TPU.ipynb)\n\nFor more examples, see [our tutorials](./tutorials).\n\n## :art: Augmentations\nTSGM provides a number of time series augmentations.\n\n| Augmentation  | Class in TSGM | Reference     |\n| ------------- | ------------- | ------------- |\n| Gaussian Noise / Jittering  | `tsgm.augmentations.GaussianNoise` | -  |        \n| Slice-And-Shuffle  | `tsgm.augmentations.SliceAndShuffle` | - |\n| Shuffle Features  | `tsgm.augmentations.Shuffle` | - |\n| Magnitude Warping  | `tsgm.augmentations.MagnitudeWarping` | [Data Augmentation of Wearable Sensor Data for Parkinson’s Disease Monitoring using Convolutional Neural Networks](https://dl.acm.org/doi/pdf/10.1145/3136755.3136817) |\n| Window Warping  | 
`tsgm.augmentations.WindowWarping` | [Data Augmentation for Time Series Classification using Convolutional Neural Networks](https://shs.hal.science/halshs-01357973/document) |\n| DTW Barycentric Averaging  | `tsgm.augmentations.DTWBarycentricAveraging` | [A global averaging method for dynamic time warping, with applications to clustering.](https://www.sciencedirect.com/science/article/pii/S003132031000453X) |\n\n## :hammer: Generators\nTSGM implements several generative models for synthetic time series data.\n\n| Method  | Link to docs | Type | Notes |\n| ------------- | ------------- | ------------- | ------------- |\n| Structural Time Series  | [sts.STS](https://tsgm.readthedocs.io/en/latest/modules/root.html#tsgm.models.sts.STS) | Data-driven | Great for modeling time series when prior knowledge is available (e.g., trend or seasonality).  |        \n| GAN  | [GAN](https://tsgm.readthedocs.io/en/latest/modules/root.html#tsgm.models.cgan.GAN) | Data-driven | A generic implementation of GAN for time series generation. It can be customized with architectures for generators and discriminators. |\n| WaveGAN  | [GAN](https://tsgm.readthedocs.io/en/latest/modules/root.html#tsgm.models.cgan.GAN) | Data-driven | WaveGAN is the model for audio synthesis proposed in [Adversarial Audio Synthesis](https://arxiv.org/abs/1802.04208). To use WaveGAN, set `use_wgan=True` when initializing the GAN class and use the `zoo[\"wavegan\"]` architecture from the model zoo. |\n| ConditionalGAN | [ConditionalGAN](https://tsgm.readthedocs.io/en/latest/modules/root.html#tsgm.models.cgan.ConditionalGAN) | Data-driven | A generic implementation of conditional GAN. It supports scalar conditioning as well as temporal one. |\n| BetaVAE  | [BetaVAE](https://tsgm.readthedocs.io/en/latest/modules/root.html#tsgm.models.cvae.BetaVAE) | Data-driven | A generic implementation of Beta VAE for TS. The loss function is customized to work well with multi-dimensional time series. 
|\n| cBetaVAE  | [cBetaVAE](https://tsgm.readthedocs.io/en/latest/modules/root.html#tsgm.models.cvae.cBetaVAE) | Data-driven | Conditional version of BetaVAE. It supports temporal and scalar conditioning. |\n| TimeGAN | [TimeGAN](https://tsgm.readthedocs.io/en/latest/modules/root.html#tsgm.models.timegan.TimeGAN) | Data-driven | TSGM implementation of TimeGAN from [paper](https://papers.nips.cc/paper_files/paper/2019/hash/c9efe5f26cd17ba6216bbe2a7d26d490-Abstract.html) |\n| SineConstSimulator | [SineConstSimulator](https://tsgm.readthedocs.io/en/latest/modules/root.html#tsgm.simulator.SineConstSimulator) | Simulator-based | Simulator-based synthetic signal that switches between constant and periodic functions. |\n| Lotka Volterra | [LotkaVolterraSimulator](https://tsgm.readthedocs.io/en/latest/modules/root.html#tsgm.simulator.LotkaVolterraSimulator) | Simulator-based | Simulator-based synthetic signal generated from the Lotka-Volterra (predator-prey) equations. |\n| PdM Simulator | [PdMSimulator](https://tsgm.readthedocs.io/en/latest/modules/root.html#tsgm.simulator.PredictiveMaintenanceSimulator) | Simulator-based | Simulator of predictive maintenance with multiple pieces of equipment from [paper](https://arxiv.org/pdf/2206.11574) |\n\n## :chart_with_upwards_trend: Metrics\nTSGM implements many metrics for synthetic time series evaluation. See Section 3 of [our paper](https://arxiv.org/pdf/2305.11567) for more detail on the evaluation of synthetic time series.\n\n| Metric  | Link to docs | Type | Notes |\n| ------------- | ------------- | ------------- | ------------- |\n| Distance in the space of summary statistics  | [tsgm.metrics.DistanceMetric](https://tsgm.readthedocs.io/en/latest/autoapi/tsgm/metrics/index.html#tsgm.metrics.DistanceMetric) | Distance | Calculates a set of summary statistics in the original and synthetic data, and measures the distance between those.  
|        \n| Maximum Mean Discrepancy (MMD)  | [tsgm.metrics.MMDMetric](https://tsgm.readthedocs.io/en/latest/autoapi/tsgm/metrics/index.html#tsgm.metrics.MMDMetric) | Distance | Calculates the MMD between real and synthetic samples. |\n| Discriminative Score | [tsgm.metrics.DiscriminativeMetric](https://tsgm.readthedocs.io/en/latest/autoapi/tsgm/metrics/index.html#tsgm.metrics.DiscriminativeMetric) | Distance | The DiscriminativeMetric measures the discriminative performance of a model in distinguishing between synthetic and real datasets. |\n| Demographic Parity Score | [tsgm.metrics.DemographicParityMetric](https://tsgm.readthedocs.io/en/latest/autoapi/tsgm/metrics/index.html#tsgm.metrics.DemographicParityMetric) | Fairness |  This metric assesses the difference in the distributions of a target variable among different groups in two datasets. Refer to [this paper](https://fairware.cs.umass.edu/papers/Verma.pdf) to learn more. |\n| Predictive Parity Score | [tsgm.metrics.PredictiveParityMetric](https://tsgm.readthedocs.io/en/latest/autoapi/tsgm/metrics/index.html#tsgm.metrics.PredictiveParityMetric) | Fairness | This metric assesses the discrepancy in the predictive performance of a model among different groups in two datasets. Refer to [this paper](https://fairware.cs.umass.edu/papers/Verma.pdf) to learn more. |\n| Privacy Membership Inference Attack Score  | [tsgm.metrics.PrivacyMembershipInferenceMetric](https://tsgm.readthedocs.io/en/latest/autoapi/tsgm/metrics/index.html#tsgm.metrics.PrivacyMembershipInferenceMetric) | Privacy | The metric measures the possibility of membership inference attacks. |\n| Spectral Entropy  | [tsgm.metrics.EntropyMetric](https://tsgm.readthedocs.io/en/latest/autoapi/tsgm/metrics/index.html#tsgm.metrics.EntropyMetric) | Diversity | Calculates the spectral entropy of a dataset or tensor as a sum of individual entropies. 
|\n| Shannon Entropy  | [tsgm.metrics.ShannonEntropyMetric](https://tsgm.readthedocs.io/en/latest/autoapi/tsgm/metrics/index.html#tsgm.metrics.ShannonEntropyMetric) | Diversity | Shannon Entropy calculated over the labels of a dataset. |\n| Pairwise Distance  | [tsgm.metrics.PairwiseDistanceMetric](https://tsgm.readthedocs.io/en/latest/autoapi/tsgm/metrics/index.html#tsgm.metrics.PairwiseDistanceMetric) | Diversity | Measures pairwise distances in a set of time series. |\n| Downstream Effectiveness | [tsgm.metrics.DownstreamPerformanceMetric](https://tsgm.readthedocs.io/en/latest/autoapi/tsgm/metrics/index.html#tsgm.metrics.DownstreamPerformanceMetric) | Downstream Effectiveness | The downstream performance metric evaluates the performance of a model on a downstream task. It returns performance gains achieved with the addition of synthetic data. |\n| Qualitative Evaluation | [tsgm.utils.visualization](https://tsgm.readthedocs.io/en/latest/modules/root.html#module-tsgm.utils.visualization) | Qualitative | Various tools for visual assessment of a generated dataset. 
|\n\n\n## :floppy_disk: Datasets\n| Dataset | API                                               | Description     |\n| - |---------------------------------------------------| ------------- |\n| UCR Dataset | `tsgm.utils.UCRDataManager`                       | https://www.cs.ucr.edu/%7Eeamonn/time_series_data_2018/  |\n| Mauna Loa | `tsgm.utils.get_mauna_loa()`                      | https://gml.noaa.gov/ccgg/trends/data.html |\n| EEG \u0026 Eye state | `tsgm.utils.get_eeg()`                            | https://archive.ics.uci.edu/ml/datasets/EEG+Eye+State  |\n| Power consumption dataset | `tsgm.utils.get_power_consumption()`              | https://archive.ics.uci.edu/ml/datasets/individual+household+electric+power+consumption  |\n| Stock data | `tsgm.utils.get_stock_data(ticker_name)`          | Gets historical stock data from YFinance  |\n| COVID-19 over the US | `tsgm.utils.get_covid_19()`                       | Covid-19 distribution over the US  |\n| Energy Data (UCI) | `tsgm.utils.get_energy_data()`                    | https://archive.ics.uci.edu/ml/datasets/Appliances+energy+prediction  |\n| MNIST as time series | `tsgm.utils.get_mnist_data()`                     | https://en.wikipedia.org/wiki/MNIST_database  |\n| Samples from GPs | `tsgm.utils.get_gp_samples_data()`                | https://en.wikipedia.org/wiki/Gaussian_process |\n| Physionet 2012 | `tsgm.utils.get_physionet2012()`                  | https://archive.physionet.org/pn3/challenge/2012/ |\n| Synchronized Brainwave Dataset  | `tsgm.utils.get_synchronized_brainwave_dataset()` | https://www.kaggle.com/datasets/berkeley-biosense/synchronized-brainwave-dataset |\n\nTSGM provides an API for convenient use of many time-series datasets (currently more than 140 datasets). The comprehensive list of datasets is available in the [documentation](https://tsgm.readthedocs.io/en/latest/guides/datasets.html).\n\n\n## :hammer_and_wrench: Contributing\nWe appreciate all contributions. 
To learn more, please check [CONTRIBUTING.md](CONTRIBUTING.md).\n\n#### For contributors\n```bash\ngit clone https://github.com/AlexanderVNikitin/tsgm\ncd tsgm\npip install -e .\n```\n\nRun tests:\n```bash\npython -m pytest\n```\n\nTo check static typing:\n```bash\nmypy\n```\n\n## :computer: CLI\nWe provide two CLIs for convenient synthetic data generation:\n- `tsgm-gd` generates data by a stored sample,\n- `tsgm-eval` evaluates the generated time series.\n\nUse `tsgm-gd --help` or `tsgm-eval --help` for documentation.\n\n## :mag: Citing\nIf you find this repo useful, please consider citing our paper:\n```\n@article{\n  nikitin2023tsgm,\n  title={TSGM: A Flexible Framework for Generative Modeling of Synthetic Time Series},\n  author={Nikitin, Alexander and Iannucci, Letizia and Kaski, Samuel},\n  journal={arXiv preprint arXiv:2305.11567},\n  year={2023}\n}\n```\n\n## License\n[Apache License 2.0](LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falexandervnikitin%2Ftsgm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falexandervnikitin%2Ftsgm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falexandervnikitin%2Ftsgm/lists"}