https://github.com/xuhongzuo/couta
a time series anomaly detection method based on the calibrated one-class classifier
anomaly-detection one-class-classification outlier-detection self-supervised-learning time-series uncertainty-modeling
- Host: GitHub
- URL: https://github.com/xuhongzuo/couta
- Owner: xuhongzuo
- License: apache-2.0
- Created: 2022-07-23T02:52:13.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2024-02-05T14:34:13.000Z (over 2 years ago)
- Last Synced: 2024-02-05T15:56:33.973Z (over 2 years ago)
- Topics: anomaly-detection, one-class-classification, outlier-detection, self-supervised-learning, time-series, uncertainty-modeling
- Language: Python
- Homepage:
- Size: 291 KB
- Stars: 45
- Watchers: 1
- Forks: 17
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# COUTA - time series anomaly detection
Implementation of **"Calibrated One-class classification-based Unsupervised Time series Anomaly
detection"** (COUTA for short).
The full paper is available at [link](https://arxiv.org/abs/2207.12201).
Please consider citing our paper if you use this repository. :wink:
```
@article{xu2024calibrated,
title={Calibrated one-class classification for unsupervised time series anomaly detection},
author={Xu, Hongzuo and Wang, Yijie and Jian, Songlei and Liao, Qing and Wang, Yongjun and Pang, Guansong},
journal={IEEE Transactions on Knowledge and Data Engineering},
volume={36},
number={11},
pages={5723--5736},
year={2024},
publisher={IEEE}
}
```
## Environment
Main packages:
```
torch==1.10.1+cu113
numpy==1.20.3
pandas==1.3.3
scipy==1.4.1
scikit-learn==1.1.1
```
We provide a `requirements.txt` in our repository.
## Takeaways
### APIs
COUTA provides easy-to-use APIs in a sklearn/[pyod](https://github.com/yzhao062/Pyod) style. First, instantiate the model class with its parameters:
```python
from src.algorithms.couta_algo import COUTA
model_configs = {'sequence_length': 50, 'stride': 1}
model = COUTA(**model_configs)
```
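The README does not spell out what `sequence_length` and `stride` do. Assuming they carry their usual sliding-window meaning (window size and step between consecutive windows), the following sketch shows how a multivariate series would be cut into sub-sequences. The helper `sliding_windows` is hypothetical and is not part of the COUTA code base.

```python
import numpy as np


def sliding_windows(values: np.ndarray, sequence_length: int, stride: int) -> np.ndarray:
    """Cut a (time, features) array into overlapping windows.

    Hypothetical helper for illustration only; COUTA's own windowing may differ.
    """
    n_steps = values.shape[0]
    if n_steps < sequence_length:
        raise ValueError("series is shorter than sequence_length")
    # Start index of every window that fits entirely inside the series.
    starts = range(0, n_steps - sequence_length + 1, stride)
    return np.stack([values[s:s + sequence_length] for s in starts])


# Toy series: 100 time steps, 2 features.
series = np.arange(200, dtype=float).reshape(100, 2)
windows = sliding_windows(series, sequence_length=50, stride=1)
print(windows.shape)  # (51, 50, 2)
```

Under this assumption, a larger `stride` yields fewer, less overlapping windows, which trades training data volume for speed.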
The instantiated model can then be used to fit and predict. Please pass pandas DataFrames as input data:
```python
model.fit(train_df)
score_dic = model.predict(test_df)
score = score_dic['score_t']
```
We use a dictionary as the prediction output for consistency with an evaluation study of time series anomaly detection ([link](https://github.com/astha-chem/mvts-ano-eval)).
`score_t` is a vector of anomaly scores, one for each time observation in the testing dataframe; a higher value indicates a higher likelihood of being an anomaly.
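Since `score_t` holds continuous scores, a threshold is needed to obtain binary anomaly labels. The sketch below uses a simple quantile rule; this is an assumption for illustration, not the evaluation protocol of the paper. The function `flag_anomalies`, the `contamination` parameter, and the toy scores are all hypothetical stand-ins for a real `score_dic['score_t']`.

```python
import numpy as np


def flag_anomalies(score_t, contamination=0.05):
    """Label the top `contamination` fraction of scores as anomalies (1), the rest as normal (0).

    Hypothetical post-processing step; not part of the COUTA API.
    """
    score_t = np.asarray(score_t, dtype=float)
    threshold = np.quantile(score_t, 1.0 - contamination)
    return (score_t > threshold).astype(int), threshold


# Toy scores standing in for score_dic['score_t']: two observations get clearly higher scores.
rng = np.random.default_rng(0)
toy_scores = rng.random(200)
toy_scores[[20, 120]] += 5.0

labels, threshold = flag_anomalies(toy_scores, contamination=0.01)
print(np.flatnonzero(labels))  # indices flagged as anomalous
```

The quantile rule requires a guess of the anomaly fraction; threshold-free metrics computed on the raw scores avoid that choice.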
### Model save and load
Set the `save_model_path` parameter before training, and the model will be saved to this path:
```python
from src.algorithms.couta_algo import COUTA
path = 'saved_models/couta.pth'
model_configs = {'sequence_length': 50, 'stride': 1, 'save_model_path': path}
model = COUTA(**model_configs)
model.fit(train_df)
```
Then, COUTA can be loaded from this path and used without fitting:
```python
from src.algorithms.couta_algo import COUTA
path = 'saved_models/couta.pth'
model_configs = {'load_model_path': path}
model = COUTA(**model_configs)
model.predict(test_df)
```
## Datasets used in our paper
*Due to the license issues of these datasets, we provide download links here. We also offer the preprocessing script in `data_preprocessing.ipynb`. By downloading the original data and running this notebook, you can easily generate processed datasets that can be fed directly into our pipeline.*
The used datasets can be downloaded from:
- ASD https://github.com/zhhlee/InterFusion
- SMD https://github.com/NetManAIOps/OmniAnomaly
- SWaT https://itrust.sutd.edu.sg/itrust-labs_datasets
- WaQ https://www.spotseven.de/gecco/gecco-challenge
- DSADS https://github.com/zhangyuxin621/AMSL
- Epilepsy https://github.com/boschresearch/NeuTraL-AD/
## Reproduction of experiment results
### Experiments of the effectiveness (4.2)
After preparing the datasets, you can use `main.py` to run COUTA on different time series datasets. We use six datasets in our paper, and `--data` can be chosen from `[ASD, SMD, SWaT, WaQ, Epilepsy, DSADS]`.
For example, run COUTA on the ASD dataset by
```shell
python main.py --data ASD --algo COUTA
```
Alternatively, you can run `script_effectiveness.sh` directly.
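To run all six datasets from Python rather than the shell, a loop such as the following could be used. This is a minimal sketch that only relies on the `--data` and `--algo` arguments shown above; it does not reproduce the contents of the provided shell script, and the helper `build_command` is hypothetical.

```python
import shlex

# Dataset names as accepted by the --data argument.
DATASETS = ["ASD", "SMD", "SWaT", "WaQ", "Epilepsy", "DSADS"]


def build_command(data, algo="COUTA"):
    """Build the argument list for one run of main.py (hypothetical helper)."""
    return ["python", "main.py", "--data", data, "--algo", algo]


commands = [build_command(d) for d in DATASETS]
for cmd in commands:
    print(shlex.join(cmd))
    # To actually launch the run from the repository root, uncomment:
    # import subprocess; subprocess.run(cmd, check=True)
```

Passing a different `algo` value to `build_command` would cover other detectors accepted by `main.py` in the same way.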
### Generalization test (4.3)
We include the synthetic datasets used in this test in `data_processed/`.
```shell
python main_showcase.py --type point
python main_showcase.py --type pattern
```
Two anomaly score `npy` files are generated. You can use `experiment_generalization_ability.ipynb` to visualize the data and our results.
### Robustness (4.4)
Use `src/experiments/data_contaminated_generator_dsads.py` and `src/experiments/data_contaminated_generator_ep.py` to generate datasets with various contamination ratios.
Then use `main.py` to run COUTA on these datasets, or directly execute `script_robustness.sh`.
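To make the notion of a contamination ratio concrete, the sketch below mixes anomalous segments into a normal training set so that anomalies make up a given fraction of the result. It is an illustrative assumption about what contamination means here and is not the logic of the generator scripts above; `n_anomalies_for_ratio`, `contaminate`, and the toy segments are hypothetical.

```python
import numpy as np


def n_anomalies_for_ratio(n_normal, ratio):
    """Number of anomalous segments k such that k / (n_normal + k) is approximately `ratio`."""
    if not 0.0 <= ratio < 1.0:
        raise ValueError("ratio must be in [0, 1)")
    return int(round(ratio * n_normal / (1.0 - ratio)))


def contaminate(normal_segments, anomalous_segments, ratio, seed=0):
    """Return a shuffled list of segments in which roughly `ratio` of them are anomalous."""
    rng = np.random.default_rng(seed)
    k = n_anomalies_for_ratio(len(normal_segments), ratio)
    if k > len(anomalous_segments):
        raise ValueError("not enough anomalous segments for this ratio")
    # Sample k anomalous segments without replacement and mix them in.
    picked = rng.choice(len(anomalous_segments), size=k, replace=False)
    mixed = list(normal_segments) + [anomalous_segments[i] for i in picked]
    order = rng.permutation(len(mixed))
    return [mixed[i] for i in order]


# Toy data: 90 normal segments (all zeros) and 20 anomalous segments (all ones).
normal = [np.zeros((10, 3)) for _ in range(90)]
anomalous = [np.ones((10, 3)) for _ in range(20)]
train = contaminate(normal, anomalous, ratio=0.1)
print(len(train), sum(int(seg.max() == 1.0) for seg in train))  # 100 10
```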
### Ablation study (4.5)
Change the `--algo` argument to `COUTA_wto_umc`, `COUTA_wto_nac`, or `Canonical`, e.g.,
```shell
python main.py --algo COUTA_wto_umc --data ASD
```
Running `script_effectiveness.sh` also produces detection results for the ablated variants.
### Others
As for the sensitivity test (4.6), please adjust the parameters in the YAML file.
As for the scalability test (4.7), the produced result files also contain execution time.
## Competing methods
All of the anomaly detectors in our paper are implemented in Python. We list their publicly available implementations below.
- `OCSVM` and `ECOD`: we directly use [pyod](https://github.com/yzhao062/Pyod) (a Python library of anomaly detection approaches);
- `GOAD`: https://github.com/lironber/GOAD
- `DSVDD`: https://github.com/lukasruff/Deep-SVDD-PyTorch
- `USAD`: https://github.com/hoo2257/USAD-Anomaly-Detecting-Algorithm
- `GDN`: https://github.com/d-ailin/GDN
- `NeuTraL`: https://github.com/boschresearch/NeuTraL-AD
- `TranAD`: https://github.com/imperial-qore/TranAD
- `LSTM-ED`, `Tcn-ED`, `MSCRED` and `Omni`: https://github.com/astha-chem/mvts-ano-eval/