https://github.com/dongtsi/OWAD
Anomaly Detection in the Open World: Normality Shift Detection, Explanation, and Adaptation (NDSS'23).
- Host: GitHub
- URL: https://github.com/dongtsi/OWAD
- Owner: dongtsi
- License: mit
- Created: 2023-01-12T12:20:23.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2023-10-07T10:37:17.000Z (about 1 year ago)
- Last Synced: 2024-08-01T16:32:09.686Z (3 months ago)
- Language: Python
- Size: 35.4 MB
- Stars: 57
- Watchers: 1
- Forks: 10
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
> *Anomaly Detection in the Open World: Normality Shift Detection, Explanation, and Adaptation. Accepted by [NDSS'23](https://www.ndss-symposium.org/ndss2023/).*
---
![](https://img.shields.io/badge/license-MIT-green.svg)
![](https://img.shields.io/badge/language-python-blue.svg)
![](https://img.shields.io/badge/framework-pytorch-red.svg)

**OWAD** aims to tackle distribution shift for deep-learning-based anomaly detection. In practical security applications, the data distribution can differ from the training data and evolve over time. Such distribution shift can cause dramatic degradation in model performance if it is not detected and properly adapted to. This is an important reason why learning-based methods often perform well on validation or test sets but poorly in real-world deployments.
In this context, **OWAD** is a novel framework to *detect*, *explain* and *adapt* to distribution shift. The major advantage of **OWAD** is to **reduce the overhead of manual labeling** during investigation and adaptation of shift. The overview of **OWAD** is shown in the following figure. It includes four key steps:
- Output Calibration: calibrate the model outputs so that they are meaningful and ready for the distributional test in the following step;
- Shift Detection: detect shift via a distributional test on the calibrated outputs;
- Shift Explanation: select important samples inducing the distribution shift;
- Shift Adaptation: subtly retrain the model with the important samples to adapt to the shift.

![overview](media/overview.png)
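
To make the first two steps more concrete, here is a minimal sketch of calibrating anomaly scores and then running a distributional test on them. It is not the OWAD implementation: the empirical-CDF calibrator, the histogram distance, the permutation test, and all function names (`calibrate`, `histogram_distance`, `permutation_test`) are assumptions chosen for illustration, and the scores are synthetic. See the paper and the demos for the actual method.

```python
import numpy as np


def calibrate(scores, reference):
    """Map raw scores to [0, 1] via the empirical CDF of reference scores.

    This is a stand-in calibrator, assumed here for illustration only.
    """
    ref_sorted = np.sort(reference)
    return np.searchsorted(ref_sorted, scores, side="right") / len(ref_sorted)


def histogram_distance(a, b, bins=10):
    """Total variation distance between histograms of two samples on [0, 1]."""
    edges = np.linspace(0.0, 1.0, bins + 1)
    pa = np.histogram(a, bins=edges)[0] / len(a)
    pb = np.histogram(b, bins=edges)[0] / len(b)
    return 0.5 * np.abs(pa - pb).sum()


def permutation_test(a, b, n_perm=500, seed=0):
    """Return a p-value for the hypothesis that a and b share one distribution."""
    rng = np.random.default_rng(seed)
    observed = histogram_distance(a, b)
    pooled = np.concatenate([a, b])
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        # Count random splits that look at least as different as the real one.
        if histogram_distance(perm[: len(a)], perm[len(a):]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Synthetic anomaly scores: one batch from the old distribution, one shifted.
rng = np.random.default_rng(42)
train_scores = rng.normal(0.0, 1.0, 2000)
same_scores = rng.normal(0.0, 1.0, 500)
shifted_scores = rng.normal(1.0, 1.0, 500)

control = calibrate(train_scores, train_scores)
p_same = permutation_test(control, calibrate(same_scores, train_scores))
p_shift = permutation_test(control, calibrate(shifted_scores, train_scores))

print(f"p-value without shift: {p_same:.3f}")
print(f"p-value with shift:    {p_shift:.3f}")
```

A small p-value suggests that the new batch no longer follows the distribution the model was trained on, which is the signal that would trigger the explanation and adaptation steps.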
(More details can be found in our paper.)

# Environment Setup
Use the following command to install the packages:
> pip install -r requirement.txt
# Toy Example
OWAD is technically involved, but its underlying principle is straightforward. So that the technical details do not get in the way, we provide a [TOY EXAMPLE](demo/Toy_example_AE_synthetic.ipynb) to help you understand the basic idea of OWAD.
# Use Cases
We provide several demos showing how to use **OWAD** in different use cases (different applications, models, and datasets):
- [(Normal-Confidence Anomaly Detection) LSTM, Log Anomaly Detection, BGL](demo/LogAD_LSTM_BGL.ipynb)
- [(Abnormal-Confidence Anomaly Detection) Auto-Encoder, NIDS, Kyoto-2006](demo/NIDS_KitNET_Kyoto.ipynb)

# Citation
Please cite the paper if you use the code. Here is the BibTeX:
```bibtex
@inproceedings{han2023anomaly,
title={Anomaly Detection in the Open World: Normality Shift Detection, Explanation, and Adaptation},
author={Han, Dongqi and Wang, Zhiliang and Chen, Wenqi and Wang, Kai and Yu, Rui and Wang, Su and Zhang, Han and Wang, Zhihua and Jin, Minghui and Yang, Jiahai and others},
booktitle={30th Annual Network and Distributed System Security Symposium (NDSS)},
year={2023}
}
```
The paper is available [here](https://www.ndss-symposium.org/wp-content/uploads/2023/02/ndss2023_f830_paper.pdf).

# More Info and Notes
- Regarding the baselines and open datasets used in our experiments in Section V ("EVALUATION"), the available links are as follows:
  - (Datasets for NID) AnoShift: https://github.com/bit-ml/AnoShift; Kyoto-2006: https://www.takakura.com/Kyoto_data/
  - (Datasets for LogAD) BGL: https://doi.org/10.5281/zenodo.1144100
  - (Datasets for APT) LANL-CMSCSE: https://csr.lanl.gov/data/cyber1/
  - (Baseline) CADE: https://github.com/whyisyoung/CADE
  - (Baseline) TRANS: https://s2lab.cs.ucl.ac.uk/projects/transcend/
  - The code of Transcend/TRANS and UNLEARN was obtained offline from the authors.
- Regarding our real-world deployment in Section VI ("REAL-WORLD TEST ON SCADA"), the DeepLog model we used is included in the `admodels/` folder. The data for this part cannot be released (at least for now) for compliance reasons, as it involves sensitive information and privacy issues in SCADA.