https://github.com/edisonleeeee/step
- Host: GitHub
- URL: https://github.com/edisonleeeee/step
- Owner: EdisonLeeeee
- Created: 2022-06-17T09:08:20.000Z (almost 4 years ago)
- Default Branch: master
- Last Pushed: 2022-10-14T10:27:49.000Z (over 3 years ago)
- Last Synced: 2024-11-13T03:37:22.828Z (over 1 year ago)
- Language: Python
- Size: 30.3 KB
- Stars: 7
- Watchers: 3
- Forks: 1
- Open Issues: 2
# Less Can be More: Unsupervised Graph Pruning for Large-scale Dynamic Graphs
PyTorch implementation of the paper "Less Can be More: Unsupervised Graph Pruning for Large-scale Dynamic Graphs".
## Requirements
- torch == 1.8.1
- pytorch-lightning == 1.6.4
- torch_scatter == 2.0.8
- scikit-learn == 1.0.2
- scipy == 1.7.3
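To confirm that an environment matches the pinned versions above, a small check along these lines may help. It is only a sketch: `check_requirements` is a hypothetical helper that is not part of this repository, and the distribution names (for example `torch-scatter`) are assumptions about how the packages are registered in your environment.

```python
from importlib import metadata

# Pinned versions copied from the requirements list above.
# The distribution names (e.g. "torch-scatter") are assumptions; adjust
# them if your installer registers the packages under different names.
REQUIRED = {
    "torch": "1.8.1",
    "pytorch-lightning": "1.6.4",
    "torch-scatter": "2.0.8",
    "scikit-learn": "1.0.2",
    "scipy": "1.7.3",
}


def check_requirements(required=REQUIRED, get_version=metadata.version):
    """Return {package: (wanted, found)} for every package that does not match."""
    mismatches = {}
    for name, wanted in required.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Local build tags such as "1.8.1+cu111" still count as a match.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in check_requirements().items():
        print(f"{name}: wanted {wanted}, found {found or 'not installed'}")
```

An empty result means every listed package is installed at the pinned version.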
## Preprocessing
### Dataset
Create a folder named `dataset` to store the data files, then download the datasets:
- [Wikipedia](http://snap.stanford.edu/jodie/wikipedia.csv)
- [Reddit](http://snap.stanford.edu/jodie/reddit.csv)
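The download step can also be scripted. The sketch below uses only the Python standard library and the two URLs listed above; the helper names (`target_path`, `download`) and the assumption that each file should be stored as `dataset/<name>.csv` are illustrative and not taken from the repository.

```python
import urllib.request
from pathlib import Path

# URLs copied from the dataset links above.
DATASET_URLS = {
    "wikipedia": "http://snap.stanford.edu/jodie/wikipedia.csv",
    "reddit": "http://snap.stanford.edu/jodie/reddit.csv",
}


def target_path(name, root="dataset"):
    """Local path for a dataset; the `<name>.csv` file name is an assumption."""
    return Path(root) / f"{name}.csv"


def download(name, root="dataset"):
    """Download one dataset into `root`, skipping files that already exist."""
    path = target_path(name, root)
    path.parent.mkdir(parents=True, exist_ok=True)
    if not path.exists():
        urllib.request.urlretrieve(DATASET_URLS[name], str(path))
    return path
```

Calling `download("wikipedia")` and `download("reddit")` would then fetch both files into `dataset/`, leaving any file that is already present untouched.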
### Preprocess the data
We follow the data preprocessing method of [TGAT](https://openreview.net/pdf?id=rJeW1yHYwH) ([repo](https://github.com/StatsDLMathsRecomSys/Inductive-representation-learning-on-temporal-graphs#inductive-representation-learning-on-temporal-graphs-iclr-2020)).
Features are saved in the dense `.npy` binary format. If edge features or node features are absent, they are replaced by a vector of zeros.

    python build_dataset_graph.py --data wikipedia --bipartite
    python build_dataset_graph.py --data reddit --bipartite
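The zero-vector fallback described above can be illustrated with NumPy. This is a minimal sketch, not the code in `build_dataset_graph.py`: the function name `save_features` and its parameters are hypothetical, and the feature dimension has to be supplied by the caller.

```python
import numpy as np


def save_features(path, num_rows, dim, features=None):
    """Save a dense feature matrix as .npy, using zeros when none is given.

    Hypothetical helper: `num_rows` is the number of nodes or edges and
    `dim` is the feature dimension to use for the zero fallback.
    """
    if features is None:
        # No features available: fall back to an all-zero matrix.
        features = np.zeros((num_rows, dim), dtype=np.float32)
    features = np.asarray(features, dtype=np.float32)
    if features.shape[0] != num_rows:
        raise ValueError("expected one feature row per node or edge")
    np.save(path, features)  # dense binary .npy file
    return features
```

Files written this way can be read back with `np.load(path)`.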
## Model Training
Train the graph pruning network on an unsupervised task.

    python train_gsn.py --data_set wikipedia --prior_ratio 0.5 --learning_rate 1e-3
## Inference
Prune the edge data inductively with the graph pruning network trained above.

    python edge_pruning.py --data_set wikipedia --output_edge_txt ./result/edge_pred.txt --ckpt_file ./lightning_logs_gsn/lightning_logs/version_0/checkpoints/epoch=10.ckpt
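The `--ckpt_file` path above points at one specific run and epoch, which will differ between trainings. As a convenience, a helper like the following could locate a checkpoint to pass in. It is a sketch that assumes the `version_<n>/checkpoints/epoch=<k>.ckpt` layout seen in the command above; `latest_checkpoint` is a hypothetical name, and the most recent epoch is not necessarily the best-performing one.

```python
import re
from pathlib import Path


def latest_checkpoint(log_root):
    """Return the checkpoint with the highest (version, epoch) under `log_root`.

    Returns None when no matching checkpoint file is found.
    """
    best, best_key = None, None
    for ckpt in Path(log_root).glob("version_*/checkpoints/*.ckpt"):
        # ckpt.parts[-3] is the "version_<n>" directory name.
        version = re.search(r"version_(\d+)", ckpt.parts[-3])
        epoch = re.search(r"epoch=(\d+)", ckpt.name)
        if version is None or epoch is None:
            continue
        key = (int(version.group(1)), int(epoch.group(1)))
        if best_key is None or key > best_key:
            best, best_key = ckpt, key
    return best
```

For example, `latest_checkpoint("./lightning_logs_gsn/lightning_logs")` returns a `Path` that can be passed to `--ckpt_file`.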
## Evaluation
Use a GNN to evaluate the performance of graph pruning. This requires a GNN model trained on the supervised task, e.g., by running the following commands for dynamic node classification.

    python train_gnn.py --mode origin --data_set wikipedia
    python eval_gnn.py --data_set wikipedia --mode gsn --pruning_ratio 0.5 --mask_edge --output_edge_txt ./result/edge_pred.txt --ckpt_file ./lightning_logs_gnn/lightning_logs/version_0/checkpoints/epoch=10.ckpt
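To give an intuition for what masking edges at a given pruning ratio can look like, here is a minimal pure-Python sketch. It rests on assumptions that this README does not confirm: that each edge has a single score, that a higher score means the edge is more worth keeping, and that `--pruning_ratio` is the fraction of edges removed. `keep_mask` is a hypothetical helper, not a function from `eval_gnn.py`.

```python
def keep_mask(scores, pruning_ratio):
    """Boolean keep-mask that drops the lowest-scoring fraction of edges.

    Assumes higher score = more important edge, and that `pruning_ratio`
    is the fraction of edges to remove (both are assumptions).
    """
    if not 0.0 <= pruning_ratio <= 1.0:
        raise ValueError("pruning_ratio must be in [0, 1]")
    num_pruned = int(len(scores) * pruning_ratio)
    # Edge indices sorted from lowest to highest score (stable for ties).
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    pruned = set(order[:num_pruned])
    return [i not in pruned for i in range(len(scores))]


mask = keep_mask([0.9, 0.1, 0.5, 0.3], pruning_ratio=0.5)
print(mask)  # [True, False, True, False]
```

With four edges and a ratio of 0.5, the two lowest-scoring edges are masked out and the other two are kept.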