https://github.com/aicenter/GenerativeAD.jl
Generative models for anomaly detection
https://github.com/aicenter/GenerativeAD.jl
Last synced: 2 months ago
JSON representation
Generative models for anomaly detection
- Host: GitHub
- URL: https://github.com/aicenter/GenerativeAD.jl
- Owner: aicenter
- License: mit
- Created: 2020-08-27T11:09:23.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2022-12-16T09:24:00.000Z (over 3 years ago)
- Last Synced: 2026-02-06T22:31:29.186Z (4 months ago)
- Language: Julia
- Size: 1.39 MB
- Stars: 4
- Watchers: 3
- Forks: 2
- Open Issues: 9
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-julia-security - GenerativeAD.jl - Generative models for anomaly detection using deep learning approaches. (Machine Learning for Security / Anomaly and Outlier Detection)
README
# GenerativeAD.jl
Generative models for anomaly detection. This Julia package contains code for the paper "Comparison of Anomaly Detectors: Context Matters" [arXiv preprint](https://arxiv.org/abs/2012.06260).
## Installation
1. Clone this repo somewhere.
2. Run Julia in the cloned dir.
```bash
cd path/to/repo/GenerativeAD.jl
julia --project
```
3. Install all the packages using `]instantiate` and compile the package.
```julia
(@julia) pkg> instantiate
(@julia) using GenerativeAD
```
Some of the bash scripts are calling `julia` without `--project` flag and uses `@quickactivate` macro to activate the environment, however this fails, unless `DrWatson` is installed in the base julia environment. In order to avoid these problems install `DrWatson` in your base environment.
```bash
cd ~
julia -e 'using Pkg; Pkg.add("DrWatson");'
```
### Python instalation
Some models (PIDforest, scikit-learn, PyOD) are available only through PyCall with appropriate environment active. With upcoming bayesian optimisation from `scikit-optimize` every model will require an active environment, which can be setup in following way using python's `venv` module. (Most of the scripts have hardcoded path to this environment, though this can be easily changed).
```bash
cd ~
python -m venv sklearn-env
source ${HOME}/sklearn-env/bin/activate
export PYTHON="${HOME}/sklearn-env/bin/python"
```
Then install requirements inside this repository
```bash
cd path/to/repo/GenerativeAD.jl
pip install -r requirements.txt
pip install git+https://github.com/janfrancu/pidforest.git # not registerd anywhere
julia --project -e 'using Pkg; Pkg.build("PyCall");' # rebuilds PyCall.jl to point to the current environment
```
## Running experiments on the RCI cluster
0. First, load Julia and Python modules.
```bash
ml Julia
ml Python
```
1. Install the package somewhere on the RCI cluster.
2. Then the experiments can be run via `slurm`. This will run 20 experiments with the basic VAE model, each with 5 crossvalidation repetitions on all datasets in the text file with 10 parallel processes for each dataset. All data will be saved in `GenerativeAD.jl/data/experiments/tabular`
```bash
cd GenerativeAD.jl/scripts/experiments_tabular
./run_parallel.sh vae 20 5 10 datasets_tabular.txt
```
## Data:
Only UCI datasets are available upon installation via the `UCI` package. Remaining tabular and image datasets are downloaded upon first request (e.g. via the `GenerativeAD.Datasets.load_data(dataset)` function). First download requires user input to accept download terms for individual datasets. If you want to avoid this, do
```bash
export DATADEPS_ALWAYS_ACCEPT=true
```
before running Julia.