{"id":13415336,"url":"https://github.com/khundman/telemanom","last_synced_at":"2025-05-14T18:05:13.326Z","repository":{"id":38272620,"uuid":"135760903","full_name":"khundman/telemanom","owner":"khundman","description":"A framework for using LSTMs to detect anomalies in multivariate time series data. Includes spacecraft anomaly data and experiments from the Mars Science Laboratory and SMAP missions.","archived":false,"fork":false,"pushed_at":"2025-01-17T13:42:45.000Z","size":9760,"stargazers_count":1082,"open_issues_count":20,"forks_count":255,"subscribers_count":36,"default_branch":"master","last_synced_at":"2025-04-14T08:00:26.880Z","etag":null,"topics":["anomaly-detection","deep-learning","kdd","kdd2018","keras","lstm","rnn","tensorflow","time-series"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/1802.04431","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/khundman.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-06-01T20:33:20.000Z","updated_at":"2025-04-08T14:22:03.000Z","dependencies_parsed_at":"2025-02-09T20:00:33.822Z","dependency_job_id":null,"html_url":"https://github.com/khundman/telemanom","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/khundman%2Ftelemanom","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/khundman%2Ftelemanom/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/khundman%2Ftelem
anom/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/khundman%2Ftelemanom/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/khundman","download_url":"https://codeload.github.com/khundman/telemanom/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254198514,"owners_count":22030965,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anomaly-detection","deep-learning","kdd","kdd2018","keras","lstm","rnn","tensorflow","time-series"],"created_at":"2024-07-30T21:00:47.309Z","updated_at":"2025-05-14T18:05:08.317Z","avatar_url":"https://github.com/khundman.png","language":"Jupyter Notebook","funding_links":[],"categories":["Anomaly Detection Software","[Soil Moisture Active Passive](https://nsidc.org/data/smap/data) (SMAP) and [Mars Science Laboratory](https://pds-atmospheres.nmsu.edu/data_and_services/atmospheres_data/Mars/Mars.html) (MSL)","异常检测包","工具箱与数据集","2018"],"sub_categories":["3.2 时间序列异常检测"],"readme":"# Telemanom (v2.0)\n\n**v2.0** updates:\n- Vectorized operations via numpy\n- Object-oriented restructure, improved organization\n- Merge branches into single branch for both processing modes (with/without labels) \n- Update requirements.txt and Dockerfile\n- Updated result output for both modes\n- PEP8 cleanup\n\n## Anomaly Detection in Time Series Data Using LSTMs and Automatic Thresholding\n\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)\n\nTelemanom employs vanilla LSTMs using 
[Keras](https://github.com/keras-team/keras)/[TensorFlow](https://github.com/tensorflow/tensorflow) to identify anomalies in multivariate sensor data. LSTMs are trained to learn normal system behaviors using encoded command information and prior telemetry values. Predictions are generated at each time step and the errors in predictions represent deviations from expected behavior. Telemanom then uses a novel nonparametric, unsupervised approach for thresholding these errors and identifying anomalous sequences of errors.\n\nThis repo along with the linked data can be used to re-create the experiments in our 2018 KDD paper, \"[Detecting Spacecraft Anomalies Using LSTMs and Nonparametric Dynamic Thresholding](https://arxiv.org/abs/1802.04431)\", which describes the background, methodologies, and experiments in more detail. While the system was originally deployed to monitor spacecraft telemetry, it can be easily adapted to similar problems.\n\n# Getting Started\n\nClone the repo (only available from source currently):\n\n```sh\ngit clone https://github.com/khundman/telemanom.git \u0026\u0026 cd telemanom\n```\n\nConfigure system/modeling parameters in the `config.yaml` file (to recreate the experiment from the paper, leave as is). For example:\n- `train: True`  If `True`, a new model will be trained for each input stream. If `False` (default), an existing trained model will be loaded and used to generate predictions\n- `predict: True`  Generate new predictions using models. 
If `False` (default), use existing saved predictions in evaluation (useful for tuning error thresholding and skipping prior processing steps)\n- `l_s: 250` Determines the number of previous timesteps input to the model at each timestep `t` (used to generate predictions)  \n\n#### To run via **Docker**:\n\n```shell script\ndocker build -t telemanom .\n\n# rerun experiment detailed in paper or run with your own set of labeled anomalies in 'labeled_anomalies.csv'\ndocker run telemanom -l labeled_anomalies.csv\n\n# run without labeled anomalies\ndocker run telemanom\n```\n\n#### To run with local or virtual environment\n\nFrom root of repo, download and unzip the data:\n\n```sh\npip install kaggle \n\n# make sure you have a Kaggle API key set up, then: \nkaggle datasets download -d patrickfleith/nasa-anomaly-detection-dataset-smap-msl \u0026\u0026 mv nasa-anomaly-detection-dataset-smap-msl.zip data.zip \u0026\u0026 unzip -o data.zip \u0026\u0026 rm data.zip \u0026\u0026 mv data/data tmp \u0026\u0026 rm -r data \u0026\u0026 mv tmp data\n``` \n\nInstall dependencies using **python 3.6+** (recommend using a virtualenv):\n\n```sh\npip install -r requirements.txt\n```\n\nBegin processing (from root of repo):\n\n```sh\n# rerun experiment detailed in paper or run with your own set of labeled anomalies\npython example.py -l labeled_anomalies.csv\n\n# run without labeled anomalies\npython example.py\n```\n\nA Jupyter notebook for evaluating results for a run is at `telemanom/result-viewer.ipynb`. To launch the notebook:\n\n```sh\njupyter notebook telemanom/result-viewer.ipynb\n``` \n\nPlotly is used to generate interactive inline plots, e.g.:\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"https://s3-us-west-2.amazonaws.com/telemanom/result-viewer.png\" alt=\"drawing2\" height=\"350\"/\u003e\n\u003c/p\u003e\n\n# Data\n\n## Using your own data\n\nPre-split training and test sets must be placed in directories named `data/train/` and `data/test/`. 
One `.npy` file should be generated for each channel or stream (for both train and test) with shape (`n_timesteps`, `n_inputs`). The filename should be a unique channel name or ID. The telemetry values being predicted in the test data *must* be the first feature in the input. \n\nFor example, a channel `T-1` should have train/test sets named `T-1.npy` with shapes akin to `(4900,61)` and `(3925, 61)`, where the number of input dimensions matches (`61`). The actual telemetry values should occupy the first feature column, i.e. slices of shape `(4900,1)` and `(3925,1)`. \n\n\n## Raw experiment data\n\nThe raw data available for download represents real spacecraft telemetry data and anomalies from the Soil Moisture Active Passive satellite (SMAP) and the Curiosity Rover on Mars (MSL). All data has been anonymized with regard to time and all telemetry values are pre-scaled between `(-1,1)` according to the min/max in the test set. Channel IDs are also anonymized, but the first letter indicates the type of channel (`P` = power, `R` = radiation, etc.). Model input data also includes one-hot encoded information about commands that were sent or received by specific spacecraft modules in a given time window. No identifying information related to the timing or nature of commands is included in the data. For example:\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"https://s3-us-west-2.amazonaws.com/telemanom/example-combined.png\" alt=\"drawing\" height=\"570\"/\u003e\n\u003c/p\u003e\n\nThis data also includes pre-split test and training data, pre-trained models, predictions, and smoothed errors generated using the default settings in `config.yaml`. When getting familiar with the repo, running the `result-viewer.ipynb` notebook to visualize results is useful for developing intuition. The included data is also useful for isolating portions of the system. 
For example, if you wish to see the effects of changes to the thresholding parameters without having to train new models, you can set `train` and `predict` to `False` in `config.yaml` to use previously generated predictions from prior models. \n\n## Anomaly labels and metadata\n\nThe anomaly labels and metadata are available in `labeled_anomalies.csv`, which includes:\n\n- `channel id`: anonymized channel id - first letter represents nature of channel (P = power, R = radiation, etc.)\n- `spacecraft`: spacecraft that generated telemetry stream\n- `anomaly_sequences`: start and end indices of true anomalies in stream\n- `class`: the class of anomaly (see paper for discussion)\n- `num values`: number of telemetry values in each stream\n\nTo provide your own labels, use the `labeled_anomalies.csv` file as a template. The only required fields/columns are `channel_id` and `anomaly_sequences`. `anomaly_sequences` is a list of lists that contain start and end indices of anomalous regions in the test dataset for a channel.\n\n## Dataset and performance statistics:\n\n#### Data\n|\t\t\t\t\t\t\t\t  | SMAP \t  | MSL\t\t | Total   |\n| ------------------------------- |\t:-------: |\t:------: | :------:|\t\t\t\t  \n| Total anomaly sequences \t\t  | 69        | 36\t\t | 105\t   |\n| *Point* anomalies (% tot.)\t  | 43 (62%)  | 19 (53%) | 62 (59%)|\n| *Contextual* anomalies (% tot.) 
| 26 (38%)  | 17 (47%) | 43 (41%)|\n| Unique telemetry channels\t\t  | 55        | 27\t\t | 82\t   |\n| Unique ISAs\t\t\t\t\t  | 28\t\t  | 19\t\t | 47\t   |\n| Telemetry values evaluated\t  | 429,735\t  | 66,709   | 496,444 |\n\n#### Performance (with default params specified in paper)\n| Spacecraft\t\t| Precision | Recall   | F_0.5 Score |\n| ----------------- | :-------: | :------: | :------: |\t\t\t\t\t  \n| SMAP \t\t  \t\t| 85.5%     | 85.5%\t   | 0.71\t  |\t\n| Curiosity (MSL)\t| 92.6%  \t| 69.4%    | 0.69     |\n| Total \t\t\t| 87.5% \t| 80.0%\t   | 0.71     |\n\n# Processing\n\nEach time the system is started, a unique datetime ID (e.g. `2018-05-17_16.28.00`) is used to create the following:\n- a **results** file (in `results/`) that extends `labeled_anomalies.csv` to include identified anomalous sequences and related info \n- a **data subdirectory** containing data files for created models, predictions, and smoothed errors for each channel. A file called `params.log` is also created that contains parameter settings and logging output during processing. 
\n\nAs mentioned, the Jupyter notebook `telemanom/result-viewer.ipynb` can be used to visualize results for each stream.\n\n# Citation\n\nIf you use this work, please cite: \n\n```\n@article{hundman2018detecting,\n  title={Detecting Spacecraft Anomalies Using LSTMs and Nonparametric Dynamic Thresholding},\n  author={Hundman, Kyle and Constantinou, Valentino and Laporte, Christopher and Colwell, Ian and Soderstrom, Tom},\n  journal={arXiv preprint arXiv:1802.04431},\n  year={2018}\n}\n```\n\n# License \n\nTelemanom is distributed under [Apache 2.0 license](http://www.apache.org/licenses/LICENSE-2.0).\n\nContact: Kyle Hundman (khundman@gmail.com)\n\n# Contributors\n- Kyle Hundman (NASA JPL)\n- [Valentinos Constantinou](https://github.com/vc1492a) (NASA JPL)\n- Chris Laporte (NASA JPL)\n- [Ian Colwell](https://github.com/iancolwell) (NASA JPL)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkhundman%2Ftelemanom","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkhundman%2Ftelemanom","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkhundman%2Ftelemanom/lists"}