https://github.com/graph-com/gad-ebm
[NeurIPS 2023 : GLFRONTIERS Workshop] GAD-EBM : Graph Anomaly Detection using Energy-Based Models
energy-based-model glfrontiers graph-anomaly-detection neurips-2023
- Host: GitHub
- URL: https://github.com/graph-com/gad-ebm
- Owner: Graph-COM
- Created: 2023-10-29T18:44:58.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-09-15T20:12:03.000Z (about 1 year ago)
- Last Synced: 2024-12-03T01:12:41.944Z (10 months ago)
- Topics: energy-based-model, glfrontiers, graph-anomaly-detection, neurips-2023
- Language: Jupyter Notebook
- Homepage: https://openreview.net/forum?id=I5hf3opvgK
- Size: 1.38 MB
- Stars: 4
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# GAD-EBM: Graph Anomaly Detection using Energy-Based Models

This repository contains the PyTorch implementation of the NeurIPS 2023 New Frontiers in Graph Learning (GLFrontiers) workshop paper "GAD-EBM: Graph Anomaly Detection using Energy-Based Models" by [Amit Roy](https://amitroy7781.github.io/), Juan Shu, Olivier Elshocht, Jeroen Smeets, Ruqi Zhang and Pan Li.
## Abstract
Graph Anomaly Detection (GAD) is essential in fields ranging from network security and bioinformatics to finance. Previous works often adopt auto-encoders and use reconstruction errors for anomaly detection, on the premise that anomalies are hard to reconstruct. In this work, we revisit the first principle for anomaly detection, i.e., the Neyman-Pearson rule, where the optimal anomaly detector is based on the likelihood of a data point under the distribution of normal data. However, in practice, this distribution is often unknown, and estimating the distribution of graph-structured data may be hard. Moreover, computing the likelihood of a graph-structured data point may be challenging as well. In this paper, we propose a novel approach, GAD-EBM, that can estimate the distribution of graphs and compute likelihoods efficiently by using Energy-Based Models (EBMs) over graphs. GAD-EBM approximates the likelihood of a rooted subgraph of node $v$ and leverages this likelihood to identify whether node $v$ is anomalous. Traditional score matching for training EBMs may not apply to EBMs that model the distribution of graphs, because of the complicated discreteness and multi-modality of graph data. We propose a Subgraph Score Matching (SSM) approach, which is specifically designed for graph data and is based on a novel framework of neighborhood state-space graphs. Experiments conducted on six real-world datasets validate the effectiveness and efficiency of GAD-EBM, and the source code for GAD-EBM is openly available.
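For readers new to EBMs, the link between energy and likelihood that the abstract relies on can be written in generic EBM notation. This is a sketch of the standard formulation, not the paper's exact objective; the symbols $E_\theta$, $Z_\theta$, $G_v$ (the rooted subgraph of node $v$) and the threshold $\tau$ are assumptions introduced here for illustration.

$$
p_\theta(G_v) = \frac{\exp\bigl(-E_\theta(G_v)\bigr)}{Z_\theta},
\qquad
\log p_\theta(G_v) = -E_\theta(G_v) - \log Z_\theta
$$

Because $\log Z_\theta$ does not depend on $G_v$, thresholding the likelihood is equivalent to thresholding the energy: under this formulation, node $v$ would be flagged as anomalous when $E_\theta(G_v) > \tau$, without ever computing $Z_\theta$.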
## Neighborhood State-Space Graph
*Figure: Example of a neighborhood state-space graph.*

## Main Parameters
```
--dataset Anomaly detection dataset (default: disney)
--perturb_percent Percentages of edges to be added/deleted (default: 0.05)
--seed Random Number Seed (default: 42)
--nb_epochs Number of epochs (default: 200)
--hidden_dim Hidden Dimension Size (default: 16)
--lr Learning Rate (default: 0.01)
--l2_coef Regularization coefficient (default: 0.01)
--drop_edge Drop Edge Flag (default: True)
--add_edge Add Edge Flag (default: False)
--self_loop Self-loop flag (default: True)
--preprocess_feat Preprocess Features (default: True)
--GNN_name GNN Encoder (default: GCN)
--num_neigh Number of Neighbors in the State-Space Graph (default: 1)
```

## Environment Setup
Create a Conda environment:
```
conda create --name GAD-EBM
conda activate GAD-EBM
```

Install PyTorch:
```
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
```

Install PyTorch Geometric:
```
pip install pyg-lib torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric -f https://data.pyg.org/whl/torch-1.13.0+cu117.html
```

Install the packages listed in `requirements.txt`:
```
conda install --file requirements.txt
```

## Basic Usage
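Before opening the notebook, it can help to confirm that the packages from the setup steps are actually installed. The snippet below is a minimal sketch using only the standard library; the `REQUIRED` list is an assumed subset of the packages named in the install commands, and `installed_versions` is a hypothetical helper, not part of this repository. Distribution names may differ slightly depending on how a package was installed.

```python
from importlib import metadata

# Assumed subset of the packages named in the Environment Setup commands.
REQUIRED = ["torch", "torch-geometric", "torch-scatter", "torch-sparse"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as `NOT INSTALLED` points back to the corresponding install command in Environment Setup.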
Run the Jupyter notebook, adjusting the parameters listed under Main Parameters as needed.
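As an illustration of what "parameter changes" can look like, the sketch below collects the flags and defaults from the Main Parameters list into an `argparse` parser. The flag names and defaults come from that list; `build_parser` and `str2bool` are hypothetical helpers, and the notebook itself may define or parse these values differently.

```python
import argparse


def str2bool(value):
    """Parse common true/false spellings so boolean flags accept explicit values."""
    if isinstance(value, bool):
        return value
    lowered = value.lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Build a parser mirroring the Main Parameters list (illustrative only)."""
    parser = argparse.ArgumentParser(description="GAD-EBM parameters (sketch)")
    parser.add_argument("--dataset", type=str, default="disney")
    parser.add_argument("--perturb_percent", type=float, default=0.05)
    parser.add_argument("--seed", type=int, default=42)
    parser.add_argument("--nb_epochs", type=int, default=200)
    parser.add_argument("--hidden_dim", type=int, default=16)
    parser.add_argument("--lr", type=float, default=0.01)
    parser.add_argument("--l2_coef", type=float, default=0.01)
    parser.add_argument("--drop_edge", type=str2bool, default=True)
    parser.add_argument("--add_edge", type=str2bool, default=False)
    parser.add_argument("--self_loop", type=str2bool, default=True)
    parser.add_argument("--preprocess_feat", type=str2bool, default=True)
    parser.add_argument("--GNN_name", type=str, default="GCN")
    parser.add_argument("--num_neigh", type=int, default=1)
    return parser


# In a notebook, pass an explicit list so argparse does not read Jupyter's own argv.
args = build_parser().parse_args(["--dataset", "disney", "--nb_epochs", "100"])
print(vars(args))
```

Passing an explicit argument list keeps the cell reproducible: every override is visible in the notebook rather than hidden in the kernel's launch arguments.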
To run **GAD-EBM** on the DGraph dataset, download the DGraphFin dataset file `DGraphFin.zip` from https://dgraph.xinye.com/introduction and place it under the directory `./dataset/raw`.
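A quick way to confirm the archive is in the expected place before running the notebook is sketched below. The path follows the directory named above; `dgraphfin_archive` is a hypothetical helper and the check assumes the notebook is launched from the repository root.

```python
from pathlib import Path


def dgraphfin_archive(root="./dataset"):
    """Return the expected location of DGraphFin.zip under <root>/raw."""
    return Path(root) / "raw" / "DGraphFin.zip"


archive = dgraphfin_archive()
if archive.is_file():
    print(f"Found {archive}")
else:
    print(f"Missing {archive}: download DGraphFin.zip and place it under {archive.parent}")
```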
## Experimental Results
**Dataset Description**
**Benchmark Anomaly Detection Results**
**Likelihood comparison**
**Running Time Comparison**
## Cite
If you find our paper and repo useful, please cite our paper:
```bibtex
@inproceedings{roy2023gad,
title={GAD-EBM: Graph Anomaly Detection using Energy-Based Models},
author={Roy, Amit and Shu, Juan and Elshocht, Olivier and Smeets, Jeroen and Zhang, Ruqi and Li, Pan},
booktitle={NeurIPS 2023 Workshop: New Frontiers in Graph Learning},
year={2023}
}
```

## Star History

[Star History Chart](https://star-history.com/#Graph-COM/GAD-EBM&Date)