{"id":13415590,"url":"https://github.com/blue-yonder/tsfresh","last_synced_at":"2025-05-13T11:09:36.544Z","repository":{"id":41055405,"uuid":"71996613","full_name":"blue-yonder/tsfresh","owner":"blue-yonder","description":"Automatic extraction of relevant features from time series:","archived":false,"fork":false,"pushed_at":"2025-02-16T16:08:19.000Z","size":8356,"stargazers_count":8751,"open_issues_count":69,"forks_count":1238,"subscribers_count":171,"default_branch":"main","last_synced_at":"2025-05-13T11:09:24.486Z","etag":null,"topics":["data-science","feature-extraction","time-series"],"latest_commit_sha":null,"homepage":"http://tsfresh.readthedocs.io","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/blue-yonder.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGES.rst","contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS.rst","dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2016-10-26T11:29:17.000Z","updated_at":"2025-05-13T10:53:45.000Z","dependencies_parsed_at":"2025-04-22T21:42:17.722Z","dependency_job_id":null,"html_url":"https://github.com/blue-yonder/tsfresh","commit_stats":{"total_commits":518,"total_committers":91,"mean_commits":"5.6923076923076925","dds":0.7084942084942085,"last_synced_commit":"2e4961482dc06183f79fcd32b0ad26539107e677"},"previous_names":[],"tags_count":34,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blue-yonder%2Ftsfresh","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blue-yonder%2Ftsfresh/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/
repositories/blue-yonder%2Ftsfresh/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blue-yonder%2Ftsfresh/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/blue-yonder","download_url":"https://codeload.github.com/blue-yonder/tsfresh/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253929367,"owners_count":21985802,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-science","feature-extraction","time-series"],"created_at":"2024-07-30T21:00:50.555Z","updated_at":"2025-05-13T11:09:36.520Z","avatar_url":"https://github.com/blue-yonder.png","language":"Jupyter Notebook","readme":"\u003cdiv align=\"center\"\u003e\n  \u003cimg width=\"70%\" src=\"./docs/images/tsfresh_logo.svg\"\u003e\n\u003c/div\u003e\n\n-----------------\n\n# tsfresh\n\n[![Documentation Status](https://readthedocs.org/projects/tsfresh/badge/?version=latest)](https://tsfresh.readthedocs.io/en/latest/?badge=latest)\n[![Build 
Status](https://github.com/blue-yonder/tsfresh/workflows/Test%20Default%20Branch/badge.svg)](https://github.com/blue-yonder/tsfresh/actions)\n[![codecov](https://codecov.io/gh/blue-yonder/tsfresh/branch/main/graph/badge.svg)](https://codecov.io/gh/blue-yonder/tsfresh)\n[![license](https://img.shields.io/github/license/mashape/apistatus.svg)](https://github.com/blue-yonder/tsfresh/blob/main/LICENSE.txt)\n[![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/blue-yonder/tsfresh/main?filepath=notebooks)\n[![Downloads](https://pepy.tech/badge/tsfresh)](https://pepy.tech/project/tsfresh)\n\nThis repository contains the *TSFRESH* Python package. The abbreviation stands for\n\n*\"Time Series Feature extraction based on scalable hypothesis tests\"*.\n\nThe package provides systematic time-series feature extraction by combining established algorithms from statistics, time-series analysis, signal processing, and nonlinear dynamics with a robust feature selection algorithm. In this context, the term *time-series* is interpreted in the broadest possible sense, such that any type of sampled data or even event sequences can be characterised.\n\n## Spend less time on feature engineering\n\nData scientists often spend most of their time either cleaning data or building features.\nWhile we cannot change the first thing, the second can be automated.\n*TSFRESH* frees up the time you spend on building features by extracting them automatically.\nHence, you have more time to study the newest deep learning paper, read Hacker News or build better models.\n\n\n## Automatic extraction of 100s of features\n\n*TSFRESH* automatically extracts 100s of features from time series.\nThese features describe basic characteristics of the time series, such as the number of peaks or the average or maximal value, as well as more complex ones such as the time reversal symmetry statistic.\n\n![The features extracted from an exemplary time series](docs/images/introduction_ts_exa_features.png)\n\nThe set 
of features can then be used to construct statistical or machine learning models on the time series, for example for regression or\nclassification tasks.\n\n## Forget irrelevant features\n\nTime series often contain noise, redundancies or irrelevant information.\nAs a result, most of the extracted features will not be useful for the machine learning task at hand.\n\nTo filter out irrelevant features, the *TSFRESH* package has a built-in filtering procedure.\nThis filtering procedure evaluates the explanatory power and importance of each characteristic for the regression or classification task at hand.\n\nIt is based on the well-developed theory of hypothesis testing and uses a multiple test procedure.\nAs a result, the filtering process mathematically controls the percentage of irrelevant extracted features.\n\nThe *TSFRESH* package is described in the following open access paper:\n\n* Christ, M., Braun, N., Neuffer, J., and Kempa-Liehr, A.W. (2018).\n   _Time Series FeatuRe Extraction on basis of Scalable Hypothesis tests (tsfresh -- A Python package)._\n   Neurocomputing 307, p. 72-77, [doi: 10.1016/j.neucom.2018.03.067](https://doi.org/10.1016/j.neucom.2018.03.067).\n\nThe FRESH algorithm is described in the following whitepaper:\n\n* Christ, M., Kempa-Liehr, A.W., and Feindt, M. (2017).\n    _Distributed and parallel time series feature extraction for industrial big data applications._\n    ArXiv e-print 1610.07717, [https://arxiv.org/abs/1610.07717](https://arxiv.org/abs/1610.07717).\n\nSystematic time-series feature extraction even works for unsupervised problems:\n\n* Teh, H.Y., Wang, K.I-K., Kempa-Liehr, A.W. (2021).\n    _Expect the Unexpected: Unsupervised feature selection for automated sensor anomaly detection._\n    IEEE Sensors Journal 15.16, p. 
18033-18046, [doi: 10.1109/JSEN.2021.3084970](https://doi.org/10.1109/JSEN.2021.3084970).\n\nBecause tsfresh provides time-series feature extraction essentially for free, you can now concentrate on engineering new time series,\nsuch as differences of signals from synchronous measurements, which provide even better time-series features:\n\n* Kempa-Liehr, A.W., Oram, J., Wong, A., Finch, M., Besier, T. (2020).\n    _Feature engineering workflow for activity recognition from synchronized inertial measurement units._\n    In: Pattern Recognition. ACPR 2019. Ed. by M. Cree et al. Vol. 1180. Communications in Computer and Information Science (CCIS).\n    Singapore: Springer, p. 223–231. [doi: 10.1007/978-981-15-3651-9_20](https://doi.org/10.1007/978-981-15-3651-9_20).\n\n* Simmons, S., Jarvis, L., Dempsey, D., Kempa-Liehr, A.W. (2021).\n    _Data Mining on Extremely Long Time-Series._\n    In: 2021 International Conference on Data Mining Workshops (ICDMW). Ed. by B. Xue et al.\n    Los Alamitos: IEEE, p. 1057-1066. [doi: 10.1109/ICDMW53433.2021.00137](https://doi.org/10.1109/ICDMW53433.2021.00137).\n\nSystematic time-series feature engineering makes it possible to work with time-series samples of different lengths, because every time series is projected\ninto a well-defined feature space. This approach allows the design of robust machine learning algorithms in applications with missing data.\n\n* Kennedy, A., Gemma, N., Rattenbury, N., Kempa-Liehr, A.W. (2021).\n    _Modelling the projected separation of microlensing events using systematic time-series feature engineering._\n    Astronomy and Computing 35.100460, p. 1–14, [doi: 10.1016/j.ascom.2021.100460](https://doi.org/10.1016/j.ascom.2021.100460)\n\nIs your time-series classification problem imbalanced? There is a good chance that undersampling of time-series feature matrices\nwill solve your problem:\n\n* Dempsey, D.E., Cronin, S.J., Mei, S., Kempa-Liehr, A.W. 
(2020).\n    _Automatic precursor recognition and real-time forecasting of sudden explosive volcanic eruptions at Whakaari, New Zealand_.\n    Nature Communications 11.3562, p. 1-8, [doi: 10.1038/s41467-020-17375-2](https://doi.org/10.1038/s41467-020-17375-2).\n\nNatural language processing of written texts is an example of applying systematic time-series feature engineering to event sequences,\nwhich is described in the following open access paper:\n\n* Tang, Y., Blincoe, K., Kempa-Liehr, A.W. (2020).\n    _Enriching Feature Engineering for Short Text Samples by Language Time Series Analysis._\n    EPJ Data Science 9.26, p. 1–59. [doi: 10.1140/epjds/s13688-020-00244-9](https://doi.org/10.1140/epjds/s13688-020-00244-9)\n\n\n\n## Advantages of tsfresh\n\n*TSFRESH* has several selling points, for example:\n\n1. it is field tested\n2. it is unit tested\n3. the filtering process is statistically/mathematically correct\n4. it has comprehensive documentation\n5. it is compatible with sklearn, pandas and numpy\n6. it allows anyone to easily add their favorite features\n7. 
it runs both on your local machine and on a cluster\n\n## Next steps\n\nIf you are interested in the technical workings, see our comprehensive Read-The-Docs documentation at [http://tsfresh.readthedocs.io](http://tsfresh.readthedocs.io).\n\nThe algorithm, especially the filtering part, is also described in the paper mentioned above.\n\nWe appreciate any contributions. If you are interested in helping us make *TSFRESH* the biggest archive of feature extraction methods in Python, just head over to our [How-To-Contribute](http://tsfresh.readthedocs.io/en/latest/text/how_to_contribute.html) instructions.\n\nIf you want to try out `tsfresh` quickly or if you want to integrate it into your workflow, we also have a Docker image available:\n\n    docker pull nbraun/tsfresh\n\n\n## Backwards compatibility\n\nIf you need to reproduce or update time-series features that were computed with the `matrixprofile` feature calculators, you need to create a Python 3.8 environment:\n\n    conda create --name tsfresh__py_3.8 python=3.8\n    conda activate tsfresh__py_3.8\n    pip install tsfresh[matrixprofile]\n\n## Acknowledgements\n\nThe research and development of *TSFRESH* was funded in part by the German Federal Ministry of Education and Research under grant number 01IS14004 (project iPRODICT).\n","funding_links":[],"categories":["Jupyter Notebook","Tools and Algorithms","TimeSeries Analysis",":open_hands: Contributing","Libraries","Analytic tools","📦 Packages","data-science","Feature Engineering","Time Series","Others","\u003cspan id=\"head7\"\u003e2.2. 
(Deep Learning based) Time Series Analysis\u003c/span\u003e","Python","Metrics","Data Processing","AutoML","时间序列","Feature Extraction","Curated List","Feature Engineering Automation","Uncategorized","⏳ Time Series Analysis"],"sub_categories":["Cryptocurrencies","TimeSeries Analysis","Python","General","Time Series","Data Pre-processing \u0026 Loading","网络服务_其他","时间序列分析","Time Series Analysis","Uncategorized","Tools"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fblue-yonder%2Ftsfresh","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fblue-yonder%2Ftsfresh","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fblue-yonder%2Ftsfresh/lists"}