https://github.com/xuhongzuo/couta

a time series anomaly detection method based on the calibrated one-class classifier
anomaly-detection one-class-classification outlier-detection self-supervised-learning time-series uncertainty-modeling


Host: GitHub
URL: https://github.com/xuhongzuo/couta
Owner: xuhongzuo
License: apache-2.0
Created: 2022-07-23T02:52:13.000Z (almost 4 years ago)
Default Branch: main
Last Pushed: 2024-02-05T14:34:13.000Z (over 2 years ago)
Last Synced: 2024-02-05T15:56:33.973Z (over 2 years ago)
Topics: anomaly-detection, one-class-classification, outlier-detection, self-supervised-learning, time-series, uncertainty-modeling
Language: Python
Homepage:
Size: 291 KB
Stars: 45
Watchers: 1
Forks: 17
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# COUTA - time series anomaly detection

Implementation of **"Calibrated One-class classification-based Unsupervised Time series Anomaly detection"** (COUTA for short).  

The full paper is available at [link](https://arxiv.org/abs/2207.12201).  

Please consider citing our paper if you use this repository. :wink:

```
@article{xu2024calibrated,
  title={Calibrated one-class classification for unsupervised time series anomaly detection},
  author={Xu, Hongzuo and Wang, Yijie and Jian, Songlei and Liao, Qing and Wang, Yongjun and Pang, Guansong},
  journal={IEEE Transactions on Knowledge and Data Engineering},
  volume={36},
  number={11},
  pages={5723--5736},
  year={2024},
  publisher={IEEE}
}
```

  

## Environment  

Main packages:

```
torch==1.10.1+cu113
numpy==1.20.3
pandas==1.3.3
scipy==1.4.1
scikit-learn==1.1.1
```

We provide a `requirements.txt` in our repository.

  

  

  

## Takeaways

### APIs

COUTA provides easy-to-use APIs in the style of sklearn/[pyod](https://github.com/yzhao062/Pyod). First, instantiate the model class with the desired parameters:

```python
from src.algorithms.couta_algo import COUTA
model_configs = {'sequence_length': 50, 'stride': 1}
model = COUTA(**model_configs)
```
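The `sequence_length` and `stride` parameters read like standard sliding-window settings. As a minimal sketch under that assumption (this is not COUTA's internal code, and `sliding_windows` is a hypothetical helper), the snippet below shows how a multivariate series would be cut into overlapping windows for those two values:

```python
import numpy as np


def sliding_windows(series: np.ndarray, sequence_length: int, stride: int) -> np.ndarray:
    """Cut a (n_timestamps, n_features) array into overlapping windows.

    Returns an array of shape (n_windows, sequence_length, n_features).
    """
    n_timestamps = series.shape[0]
    if n_timestamps < sequence_length:
        raise ValueError("series is shorter than sequence_length")
    # Window start positions: 0, stride, 2*stride, ... while a full window still fits.
    starts = range(0, n_timestamps - sequence_length + 1, stride)
    return np.stack([series[s:s + sequence_length] for s in starts])


# Toy series: 100 timestamps, 3 features.
toy = np.arange(300, dtype=float).reshape(100, 3)
windows = sliding_windows(toy, sequence_length=50, stride=1)
print(windows.shape)
```

Under this reading, a larger `stride` yields fewer, less overlapping windows, while `sequence_length` sets how much temporal context each window carries.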

The instantiated model can then be used to fit and predict data. Please use pandas DataFrames as input:

```python
model.fit(train_df)
score_dic = model.predict(test_df)
score = score_dic['score_t']
```
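For a quick trial without the benchmark datasets, `train_df` and `test_df` can be built from synthetic data. The layout below (one row per timestamp, one numeric column per variable) and the column names are assumptions made for illustration; the README does not specify them.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Assumed layout: one row per timestamp, one numeric column per variable.
n_train, n_test, n_features = 500, 200, 3
columns = [f"sensor_{i}" for i in range(n_features)]  # hypothetical column names

train_df = pd.DataFrame(rng.normal(size=(n_train, n_features)), columns=columns)
test_df = pd.DataFrame(rng.normal(size=(n_test, n_features)), columns=columns)

# Inject an obvious level shift into the test data so there is something to detect.
test_df.iloc[120:130] += 8.0

print(train_df.shape, test_df.shape)
```

Whether COUTA expects additional columns (such as labels or timestamps) is not stated here, so check `data_preprocessing.ipynb` for the exact format used in the paper.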

We use a dictionary as the prediction output for consistency with an evaluation work on time series anomaly detection ([link](https://github.com/astha-chem/mvts-ano-eval)).  

`score_t` is a vector of anomaly scores, one per time observation in the testing dataframe; a higher value indicates a higher likelihood of being an anomaly.
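Since `score_t` holds continuous scores, a threshold is needed to obtain binary anomaly labels. The following is a minimal sketch of one common choice, flagging the highest-scoring fraction of observations. The `flag_anomalies` helper and its `contamination` parameter are hypothetical and not part of COUTA; the scores are a synthetic stand-in for `score_dic['score_t']`.

```python
import numpy as np


def flag_anomalies(scores: np.ndarray, contamination: float = 0.05) -> np.ndarray:
    """Return a 0/1 vector marking scores above the (1 - contamination) quantile."""
    threshold = np.quantile(scores, 1.0 - contamination)
    return (scores > threshold).astype(int)


# Stand-in for score_dic['score_t']: mostly low scores followed by a high-scoring burst.
scores = np.concatenate([np.full(95, 0.1), np.full(5, 0.9)])
labels = flag_anomalies(scores, contamination=0.05)
print(labels.sum(), "observations flagged")
```

The quantile rule is only one option; the appropriate threshold depends on the evaluation protocol in use.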

  

### Model save and load

If the `save_model_path` parameter is set, the trained model is saved to that path:

```python
from src.algorithms.couta_algo import COUTA
path = 'saved_models/couta.pth'
model_configs = {'sequence_length': 50, 'stride': 1, 'save_model_path': path}
model = COUTA(**model_configs)
model.fit(train_df)
```

COUTA can then be used without fitting by setting `load_model_path`:

```python
from src.algorithms.couta_algo import COUTA
path = 'saved_models/couta.pth'
model_configs = {'load_model_path': path}
model = COUTA(**model_configs)
model.predict(test_df)
```

  

## Datasets used in our paper

*Due to the license issues of these datasets, we only provide download links here. We also offer a preprocessing script in `data_preprocessing.ipynb`: download the original data and run this notebook to generate processed datasets that can be fed directly into our pipeline.*  

The datasets can be downloaded from:  

- ASD   https://github.com/zhhlee/InterFusion  

- SMD   https://github.com/NetManAIOps/OmniAnomaly  

- SWaT  https://itrust.sutd.edu.sg/itrust-labs_datasets  

- WaQ   https://www.spotseven.de/gecco/gecco-challenge  

- DSADS https://github.com/zhangyuxin621/AMSL  

- Epilepsy https://github.com/boschresearch/NeuTraL-AD/  

  

  

  

## Reproduction of experiment results

### Experiments of the effectiveness (4.2)

After preparing the datasets, use `main.py` to run COUTA on different time series datasets. We use six datasets in our paper, and `--data` can be chosen from `[ASD, SMD, SWaT, WaQ, Epilepsy, DSADS]`.

For example, perform COUTA on the ASD dataset by

```shell
python main.py --data ASD --algo COUTA
```

Alternatively, run `script_effectiveness.sh` directly.  

### Generalization test (4.3)

We include the synthetic datasets used in this test in `data_processed/`.

```shell
python main_showcase.py --type point
python main_showcase.py --type pattern
```

Two anomaly score `npy` files are generated. Use `experiment_generalization_ability.ipynb` to visualize the data and our results.

### Robustness (4.4)

Use `src/experiments/data_contaminated_generator_dsads.py` and `src/experiments/data_contaminated_generator_ep.py` to generate datasets with various contamination ratios.  

Then use `main.py` to run COUTA on these datasets, or directly execute `script_robustness.sh`.
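To make the notion of a contamination ratio concrete, here is a generic sketch of mixing anomalous samples into an otherwise normal training set. It is an illustration only, not the logic of the repository's generator scripts; the `contaminate` helper is hypothetical, and each entry along the first axis is treated as one sample (for example, one whole segment).

```python
import numpy as np


def contaminate(normal: np.ndarray, anomalies: np.ndarray, ratio: float, seed: int = 0):
    """Mix anomalous samples into a normal training set.

    `ratio` is the target fraction of anomalies in the returned set.
    Returns the shuffled data and a 0/1 label vector (1 = anomaly).
    """
    if not 0.0 <= ratio < 1.0:
        raise ValueError("ratio must be in [0, 1)")
    rng = np.random.default_rng(seed)
    # Solve n_anom / (n_normal + n_anom) = ratio for n_anom.
    n_anom = int(round(ratio * len(normal) / (1.0 - ratio)))
    n_anom = min(n_anom, len(anomalies))  # cannot draw more anomalies than available
    picked = anomalies[rng.choice(len(anomalies), size=n_anom, replace=False)]
    data = np.concatenate([normal, picked])
    labels = np.concatenate([np.zeros(len(normal), dtype=int), np.ones(n_anom, dtype=int)])
    order = rng.permutation(len(data))
    return data[order], labels[order]


# Toy example: 90 normal samples, a pool of 50 anomalies, 10% contamination.
normal = np.zeros((90, 4))
anomalies = np.ones((50, 4))
data, labels = contaminate(normal, anomalies, ratio=0.1)
print(data.shape, labels.mean())
```

Sweeping `ratio` over several values would give a family of training sets comparable in spirit to the "various contamination ratios" mentioned above.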

### Ablation study (4.5)

Change the `--algo` argument to `COUTA_wto_umc`, `COUTA_wto_nac`, or `Canonical`, e.g.,

```shell
python main.py --algo COUTA_wto_umc --data ASD
```

Running `script_effectiveness.sh` also produces detection results for the ablated variants.  
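If a custom subset of runs is preferred over the provided shell script, the commands can also be generated programmatically. The sketch below only uses the `--data` and `--algo` values listed in this README; the `build_commands` and `run_all` helpers are hypothetical, and the script defaults to a dry run that prints the commands without executing them. It does not reproduce the contents of `script_effectiveness.sh`.

```python
import itertools
import shlex
import subprocess

DATASETS = ["ASD", "SMD", "SWaT", "WaQ", "Epilepsy", "DSADS"]
ALGOS = ["COUTA", "COUTA_wto_umc", "COUTA_wto_nac", "Canonical"]


def build_commands(datasets=DATASETS, algos=ALGOS):
    """Return one `python main.py` command (as an argv list) per (algo, dataset) pair."""
    return [
        ["python", "main.py", "--data", data, "--algo", algo]
        for algo, data in itertools.product(algos, datasets)
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


# Dry run: print the COUTA and ablation commands for the ASD dataset.
run_all(build_commands(datasets=["ASD"]))
```

Passing `dry_run=False` from the repository root would execute each run in sequence and stop at the first failure.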

### Others

As for the sensitivity test (4.6), please adjust the parameters in the yaml file.  

As for the scalability test (4.7), the produced result files also contain execution time.  

  

  

  

## Competing methods

All of the anomaly detectors in our paper are implemented in Python. We list their publicly available implementations below. 

- `OCSVM` and `ECOD` :  we directly use [pyod](https://github.com/yzhao062/Pyod) (python library of anomaly detection approaches); 

- `GOAD`: https://github.com/lironber/GOAD 

- `DSVDD`: https://github.com/lukasruff/Deep-SVDD-PyTorch 

- `USAD`: https://github.com/hoo2257/USAD-Anomaly-Detecting-Algorithm

- `GDN`: https://github.com/d-ailin/GDN

- `NeuTraL`: https://github.com/boschresearch/NeuTraL-AD

- `TranAD`: https://github.com/imperial-qore/TranAD

- `LSTM-ED`, `Tcn-ED`, `MSCRED` and `Omni`: https://github.com/astha-chem/mvts-ano-eval/
