# 🎩 MAGI (coMmunity-Aware Graph clusterIng)

Revisiting Modularity Maximization for Graph Clustering: A Contrastive Learning Perspective

arXiv: [arXiv:2406.14288](https://arxiv.org/abs/2406.14288)



#### TL;DR
* **(Modularity maximization == contrastive learning)** We establish the connection between *modularity maximization* and *graph contrastive learning* (a minimal sketch follows this list)
* **(MAGI framework)** We propose MAGI, a community-aware graph contrastive learning framework that uses modularity maximization as its pretext task
* **(Performance and scalability)** MAGI achieves state-of-the-art performance in graph clustering and demonstrates excellent scalability on industrial-scale graphs
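
The first bullet can be made concrete with the standard (relaxed) modularity objective. The snippet below is only a minimal sketch of that quantity, not the MAGI loss itself: it writes modularity maximization as a loss over node embeddings `H`, which is the objective the paper reinterprets from a contrastive-learning perspective. A dense adjacency matrix is used purely to keep the sketch short; it would not scale to the larger graphs listed below.

```python
# Illustrative sketch only -- NOT the exact MAGI loss.
# Relaxed modularity written as a loss over node embeddings H.
import torch

def relaxed_modularity_loss(H: torch.Tensor, A: torch.Tensor) -> torch.Tensor:
    """Return -Tr(H^T B H) / (2m), where B = A - d d^T / (2m) is the modularity matrix.

    H: (N, K) soft cluster assignments or normalized node embeddings.
    A: (N, N) adjacency matrix (dense only for brevity).
    """
    d = A.sum(dim=1)                       # node degrees
    two_m = d.sum()                        # 2m = total degree
    B = A - torch.outer(d, d) / two_m      # modularity matrix
    Q = torch.trace(H.T @ B @ H) / two_m   # relaxed modularity
    return -Q                              # maximizing Q == minimizing -Q
```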

# Requirements
> [!NOTE]
> Higher versions should also be compatible.

* PyTorch
* PyTorch Geometric
* PyTorch Cluster
* PyTorch Scatter
* PyTorch Sparse
* SciPy
* scikit-learn
* scikit-learn-intelex
* OGB

```bash
pip install -r requirements.txt
```
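
If anything fails to import after installation, a quick sanity check along the following lines (illustrative, not part of this repository) can help identify which dependency is missing:

```python
# Illustrative environment check -- not part of this repository.
import importlib

# pip package name -> importable module name
packages = {
    "torch": "torch",
    "torch_geometric": "torch_geometric",
    "torch_cluster": "torch_cluster",
    "torch_scatter": "torch_scatter",
    "torch_sparse": "torch_sparse",
    "scipy": "scipy",
    "scikit-learn": "sklearn",
    "scikit-learn-intelex": "sklearnex",
    "ogb": "ogb",
}

for pip_name, module_name in packages.items():
    try:
        module = importlib.import_module(module_name)
        print(f"{pip_name}: {getattr(module, '__version__', 'installed')}")
    except ImportError:
        print(f"{pip_name}: MISSING")
```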

# Model
![framework](imgs/framework.png)
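
As a rough, hypothetical illustration of the kind of GNN encoder configured by the `--hidden` flag in the runs below (e.g. `'512'` or `'1024,512'`), a multi-layer PyTorch Geometric GCN could look like the following; the actual encoder in `train_gcn.py` may differ:

```python
# Hypothetical encoder sketch -- the encoder in train_gcn.py may differ.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class GCNEncoder(torch.nn.Module):
    def __init__(self, in_dim, hidden_dims):
        super().__init__()
        dims = [in_dim] + list(hidden_dims)          # e.g. hidden_dims = [1024, 512]
        self.convs = torch.nn.ModuleList(
            GCNConv(d_in, d_out) for d_in, d_out in zip(dims[:-1], dims[1:])
        )

    def forward(self, x, edge_index):
        for i, conv in enumerate(self.convs):
            x = conv(x, edge_index)
            if i < len(self.convs) - 1:              # no activation on the final layer
                x = F.relu(x)
        return x                                     # node embeddings used for clustering
```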

# Reproduction

* Cora
```bash
python train_gcn.py --runs 10 --dataset 'Cora' --hidden '512' --wt 100 --wl 2 --tau 0.3 --ns 0.5 --lr 0.0005 --epochs 400 --wd 1e-3
```
* CiteSeer
```bash
python train_gcn.py --runs 10 --dataset 'Citeseer' --hidden '1024,512' --wt 100 --wl 3 --tau 0.9 --ns 0.5 --lr 0.0001 --epochs 400 --wd 5e-4
```
* Amazon-photo
```bash
python train_gcn.py --runs 10 --dataset 'Photo' --hidden '512' --wt 100 --wl 3 --tau 0.5 --ns 0.5 --lr 0.0005 --epochs 400 --wd 1e-3
```
* Amazon-computers
```bash
python train_gcn.py --runs 10 --dataset 'Computers' --hidden '1024,512' --wt 100 --wl 3 --tau 0.9 --ns 0.1 --lr 0.0005 --epochs 400 --wd 1e-3
```
* ogbn-arxiv
```bash
python train_sage.py --runs 10 --dataset 'ogbn-arxiv' --batchsize 2048 --max_duration 60 --kmeans_device 'cpu' --kmeans_batch -1 --hidden '1024,256' --size '10,10' --wt 20 --wl 5 --tau 0.9 --ns 0.1 --lr 0.01 --epochs 400 --wd 0 --dropout 0
```
* Reddit
```bash
python train_sage.py --runs 10 --dataset 'Reddit' --batchsize 2048 --max_duration 60 --kmeans_device 'cpu' --kmeans_batch -1 --hidden '1024,256' --size '10,10' --wt 20 --wl 5 --tau 0.5 --ns 0.5 --lr 0.01 --epochs 400 --wd 0 --dropout 0
```
* ogbn-products
```bash
python train_sage.py --runs 10 --dataset 'ogbn-products' --batchsize 2048 --max_duration 60 --kmeans_device 'cuda' --kmeans_batch 300000 --hidden '1024,1024,256' --size '10,10,10' --wt 20 --wl 4 --tau 0.9 --ns 0.1 --lr 0.01 --epochs 400 --wd 0 --dropout 0
```
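
Graph clustering on these benchmarks is typically evaluated with label-permutation-invariant metrics such as NMI and ARI. The training scripts above perform their own evaluation; the snippet below is only an illustration of how such metrics are computed with scikit-learn:

```python
# Illustrative only -- the training scripts above compute their own evaluation.
import numpy as np
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score

y_true = np.array([0, 0, 1, 1, 2, 2])   # ground-truth classes (toy example)
y_pred = np.array([1, 1, 0, 0, 2, 2])   # predicted cluster ids

print("NMI:", normalized_mutual_info_score(y_true, y_pred))   # 1.0 here: clusters match up to relabeling
print("ARI:", adjusted_rand_score(y_true, y_pred))            # 1.0 here as well
```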

# Citation
Please cite our paper if you find this project useful:

```bib
@inproceedings{magi,
title = {Revisiting Modularity Maximization for Graph Clustering: A Contrastive Learning Perspective},
author = {Yunfei Liu and Jintang Li and Yuehe Chen and Ruofan Wu and Baokun Wang and Jing Zhou and Sheng Tian and Shuheng Shen and Xing Fu and Changhua Meng and Weiqiang Wang and Liang Chen},
booktitle = {{KDD}},
publisher = {{ACM}},
year = {2024},
}
```