https://github.com/bupt-gamma/space4hgnn
Space4HGNN: A Novel, Modularized and Reproducible Platform to Evaluate Heterogeneous Graph Neural Network
- Host: GitHub
- URL: https://github.com/bupt-gamma/space4hgnn
- Owner: BUPT-GAMMA
- License: apache-2.0
- Created: 2021-11-18T06:17:10.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2022-04-25T03:26:07.000Z (about 4 years ago)
- Last Synced: 2025-03-24T07:06:03.978Z (about 1 year ago)
- Topics: dgl, heterogeneous-graph-neural-network, openhgnn, sigir2022
- Language: Python
- Homepage:
- Size: 1.33 MB
- Stars: 28
- Watchers: 0
- Forks: 5
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# Space4HGNN: A Novel, Modularized and Reproducible Platform to Evaluate Heterogeneous Graph Neural Network
Paper: [Space4HGNN: A Novel, Modularized and Reproducible Platform to Evaluate Heterogeneous Graph Neural Network](https://arxiv.org/abs/2202.09177)
Following [GraphGym](https://github.com/snap-stanford/GraphGym), we release Space4HGNN, a platform for designing and evaluating Heterogeneous Graph Neural Networks (HGNNs). It is implemented with PyTorch and DGL and uses the OpenHGNN package.
We have integrated the code into [OpenHGNN](https://github.com/BUPT-GAMMA/OpenHGNN). This README introduces the *space4hgnn* part of OpenHGNN and explains how to run it.
## The file tree of Space4HGNN in OpenHGNN
```tree
.
├── README.md
├── openhgnn
│   ├── __init__.py
│   ├── dataset
│   │   ├── LinkPredictionDataset.py
│   │   ├── NodeClassificationDataset.py
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── academic_graph.py
│   │   ├── base_dataset.py
│   │   ├── hgb_dataset.py
│   │   └── utils.py
│   ├── layers
│   │   ├── GeneralGNNLayer.py
│   │   ├── GeneralHGNNLayer.py
│   │   ├── HeteroGraphConv.py
│   │   ├── HeteroLinear.py
│   │   ├── MetapathConv.py
│   │   ├── SkipConnection.py
│   │   └── __init__.py
│   ├── models
│   │   ├── __init__.py
│   │   ├── base_model.py
│   │   ├── general_HGNN.py
│   │   └── homo_GNN.py
│   ├── tasks
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── base_task.py
│   │   ├── link_prediction.py
│   │   └── node_classification.py
│   ├── trainerflow
│   │   ├── README.md
│   │   ├── link_prediction.py
│   │   └── node_classification.py
│   └── utils
│       ├── __init__.py
│       ├── activation.py
│       ├── evaluator.py
│       └── utils.py
├── requirements.txt
├── setup.py
├── space4hgnn
│   ├── README.md
│   ├── __init__.py
│   ├── figure
│   │   ├── distribution.py
│   │   └── rank.py
│   ├── generate_yaml.py
│   ├── parallel.sh
│   ├── prediction
│   │   └── excel
│   │       └── gather_all_Csv.py
│   └── utils.py
└── space4hgnn.py
```
## How to run
### 1 Install
The installation process is the same as for OpenHGNN; see [Get Started](https://github.com/BUPT-GAMMA/OpenHGNN#get-started).
### 2 Run a single experiment
#### 2.1 Generate designs randomly
Here we will generate a random design combination for each dataset and save it in a `.yaml` file. The candidate designs are listed in [`./space4hgnn/generate_yaml.py`](./generate_yaml.py).
```bash
python ./space4hgnn/generate_yaml.py --gnn_type gcnconv --times 1 --key has_bn --configfile test
```
``--gnn_type -a``, specify the GNN type; the options are gcnconv, gatconv, sageconv and ginconv.
``--times -t``, the ID of the yaml file, used to distinguish different random samples.
``--key -k``, specify a design dimension.
``--configfile -c``, specify the name of the directory in which to store the configuration yaml file.
**Note:** The ``.yaml`` file is saved to ``yaml_file_path``, which is determined by these four arguments.
```python
yaml_file_path = './space4hgnn/config/{}/{}/{}_{}.yaml'.format(configfile, key, gnn_type, times)
# Here yaml_file_path = './space4hgnn/config/test/has_bn/gcnconv_1.yaml' with the above example code
```
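To prepare one configuration per GNN type, the command above can be repeated for each option. The following is a minimal sketch that only builds and prints the commands together with the YAML path each one is expected to write, using the long flags from the example. The helper name `generate_command` is hypothetical and is not part of the repository.

```python
import shlex

# GNN types listed as options for --gnn_type in this README.
GNN_TYPES = ["gcnconv", "gatconv", "sageconv", "ginconv"]


def generate_command(gnn_type, times, key, configfile):
    """Return (argv, yaml_path) for one generate_yaml.py call (hypothetical helper)."""
    argv = [
        "python", "./space4hgnn/generate_yaml.py",
        "--gnn_type", gnn_type,
        "--times", str(times),
        "--key", key,
        "--configfile", configfile,
    ]
    # Same format string as yaml_file_path in this README.
    yaml_path = "./space4hgnn/config/{}/{}/{}_{}.yaml".format(
        configfile, key, gnn_type, times
    )
    return argv, yaml_path


if __name__ == "__main__":
    for gnn_type in GNN_TYPES:
        argv, yaml_path = generate_command(gnn_type, 1, "has_bn", "test")
        # Print only; pass argv to subprocess.run(argv, check=True) to execute.
        print(shlex.join(argv), "->", yaml_path)
```

Run it from the OpenHGNN root so that the relative paths resolve.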
#### 2.2 Launch an experiment
```bash
python space4hgnn.py -m general_HGNN -u metapath -t node_classification -d HGBn-ACM -g 0 -r 5 -a gcnconv -s 1 -k has_bn -v True -c test -p HGB
```
``--model -m`` name of the model
``--subgraph_extraction -u`` subgraph extraction method
``--task -t`` name of the task
``--dataset -d`` name of the dataset
``--gpu -g`` which GPU to use. If you do not have a GPU, set ``-g -1``.
``--repeat -r`` number of times to repeat the experiment, default 5
``--gnn_type -a`` GNN type
``--times -s`` ID of the yaml file, the same as when generating random designs
``--key -k`` a design dimension
``--value -v`` the value of the ``key`` design dimension
``--configfile -c`` the directory from which the yaml file is loaded
``--predictfile -p`` the path under which prediction files are stored
We implement three model families in Space4HGNN: the Homogenization model family, the Relation model family and the Meta-path model family.
For the **Homogenization model family**, the ``--subgraph_extraction`` parameter can be omitted:
```bash
python space4hgnn.py -m homo_GNN -t node_classification -d HGBn-ACM -g 0 -r 5 -a gcnconv -s 1 -k has_bn -v True -c test -p HGB
```
For the **Relation model family**, ``--model`` is ``general_HGNN`` and ``--subgraph_extraction`` is ``relation``:
```bash
python space4hgnn.py -m general_HGNN -u relation -t node_classification -d HGBn-ACM -g 0 -r 5 -a gcnconv -s 1 -k has_bn -v True -c test -p HGB
```
For the **Meta-path model family**, ``--model`` is ``general_HGNN`` and ``--subgraph_extraction`` is ``metapath``:
```bash
python space4hgnn.py -m general_HGNN -u metapath -t node_classification -d HGBn-ACM -g 0 -r 5 -a gcnconv -s 1 -k has_bn -v True -c test -p HGB
```
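The three commands differ only in `-m` and `-u`. The sketch below makes that mapping explicit by building the argument list for a chosen family. The dictionary keys and the helper `launch_command` are hypothetical names used for illustration; the flag values are copied from the examples above.

```python
import shlex

# (model, subgraph_extraction) per family, taken from the example commands.
# The family keys themselves are illustrative labels, not CLI values.
FAMILIES = {
    "homogenization": ("homo_GNN", None),
    "relation": ("general_HGNN", "relation"),
    "metapath": ("general_HGNN", "metapath"),
}


def launch_command(family, dataset="HGBn-ACM", gnn_type="gcnconv", times=1,
                   key="has_bn", value="True", configfile="test",
                   predictfile="HGB", gpu=0, repeat=5,
                   task="node_classification"):
    """Build the space4hgnn.py argument list for one model family."""
    model, subgraph = FAMILIES[family]
    argv = ["python", "space4hgnn.py", "-m", model]
    if subgraph is not None:
        # The homogenization family omits --subgraph_extraction.
        argv += ["-u", subgraph]
    argv += ["-t", task, "-d", dataset, "-g", str(gpu), "-r", str(repeat),
             "-a", gnn_type, "-s", str(times), "-k", key, "-v", value,
             "-c", configfile, "-p", predictfile]
    return argv


if __name__ == "__main__":
    for family in FAMILIES:
        print(shlex.join(launch_command(family)))
```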
**Note:**
As when generating the yaml file, the experiment loads the design configuration from ``yaml_file_path``. It then saves the results to a `.csv` file at `prediction_file_path`.
```python
yaml_file_path = './space4hgnn/config/{}/{}/{}_{}.yaml'.format(configfile, key, gnn_type, times)
# Here yaml_file_path = './space4hgnn/config/test/has_bn/gcnconv_1.yaml'
prediction_file_path = './space4hgnn/prediction/excel/{}/{}_{}/{}_{}_{}_{}.csv'.format(predictfile, key, value, model_family, gnn_type, times, dataset)
# Here prediction_file_path = './space4hgnn/prediction/excel/HGB/has_bn_True/metapath_gcnconv_1_HGBn-ACM.csv'
```
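When checking which runs have already finished, it can help to compute the expected result file from the same arguments. This is a minimal sketch that follows the `prediction_file_path` format string above; the helper name `prediction_csv_path` and its `root` parameter are assumptions for illustration.

```python
from pathlib import Path


def prediction_csv_path(predictfile, key, value, model_family, gnn_type,
                        times, dataset,
                        root="./space4hgnn/prediction/excel"):
    """Return the expected result CSV path for one run (hypothetical helper)."""
    directory = Path(root) / predictfile / "{}_{}".format(key, value)
    name = "{}_{}_{}_{}.csv".format(model_family, gnn_type, times, dataset)
    return directory / name


if __name__ == "__main__":
    path = prediction_csv_path("HGB", "has_bn", True, "metapath",
                               "gcnconv", 1, "HGBn-ACM")
    # pathlib drops the leading "./", so this prints
    # space4hgnn/prediction/excel/HGB/has_bn_True/metapath_gcnconv_1_HGBn-ACM.csv
    print(path.as_posix())
    print("finished" if path.exists() else "not found")
```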
### 3 Run a batch of experiments
An example:
```bash
./space4hgnn/parallel.sh 0 5 has_bn True node_classification test_paral test_paral
```
It generates the configuration files for a batch of experiments and then launches them.
The arguments are, in order:
1. The GPU to use. Here it is 0.
2. The number of repeats. Here it is 5.
3. The design dimension. Here it is ``has_bn`` (batch normalization).
4. The choice for the design dimension. Here ``has_bn`` is set to ``True``.
5. The task name. Here it is ``node_classification``.
6. Configfile, the path under which configuration files are saved. Here it is ``test_paral``.
7. Predictfile, the path under which prediction files are saved. Here it is ``test_paral``.
**Note:**
If you encounter the error ``bash: ./space4hgnn/parallel.sh: Permission denied``, make the script executable with ``chmod +x ./space4hgnn/parallel.sh``.
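Because `parallel.sh` takes seven positional arguments, it is easy to pass them in the wrong order. The sketch below wraps the call in a function with named parameters. The wrapper `run_batch` is hypothetical; only the argument order is taken from the list above, and by default it returns the command without running it.

```python
import subprocess


def run_batch(gpu, repeat, key, value, task, configfile, predictfile,
              script="./space4hgnn/parallel.sh", dry_run=True):
    """Build (and optionally run) the parallel.sh command (hypothetical wrapper)."""
    # Positional order: gpu, repeat, key, value, task, configfile, predictfile.
    argv = [script, str(gpu), str(repeat), key, str(value), task,
            configfile, predictfile]
    if not dry_run:
        # Must be run from the OpenHGNN root; raises if the script fails.
        subprocess.run(argv, check=True)
    return argv


if __name__ == "__main__":
    print(" ".join(run_batch(0, 5, "has_bn", True, "node_classification",
                             "test_paral", "test_paral")))
```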
### 4 Analyze the results
#### 4.1 Gather all results
To gather the results of all experiments into one ``.csv`` file, run the following command:
```bash
python ./space4hgnn/prediction/excel/gather_all_Csv.py -p ./space4hgnn/prediction/excel/HGB
```
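The internals of `gather_all_Csv.py` are not described here. As a rough illustration of what gathering means, the following sketch merges every `.csv` file under a directory into a single file using only the standard library. The function name, the output file name `all_results.csv` and the added `source_file` column are all assumptions, not the script's actual behavior.

```python
import csv
from pathlib import Path


def gather_csv(root, out_name="all_results.csv"):
    """Merge every .csv under root into root/out_name; return the row count."""
    root = Path(root)
    out_path = root / out_name
    rows, fieldnames = [], []
    for path in sorted(root.rglob("*.csv")):
        if path == out_path:
            continue  # skip the merged file left by a previous call
        with path.open(newline="") as f:
            reader = csv.DictReader(f)
            # Collect the union of column names, keeping first-seen order.
            for name in reader.fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)
            for row in reader:
                # Record which file (and therefore which run) the row came from.
                row["source_file"] = path.relative_to(root).as_posix()
                rows.append(row)
    if "source_file" not in fieldnames:
        fieldnames.append("source_file")
    with out_path.open("w", newline="") as f:
        # Missing columns are written as empty strings; unknown keys are ignored.
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="",
                                extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

For example, `gather_csv("./space4hgnn/prediction/excel/HGB")` would write the merged file into that directory.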
#### 4.2 Analyze with figures
We provide ``./figure/result.csv``, which records the experimental results.
##### 4.2.1 Ranking analysis
We analyze the results with average ranking, following [GraphGym](https://github.com/snap-stanford/GraphGym#3-analyze-the-results). The corresponding code is in [`figure/rank.py`](./figure/rank.py).
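The idea behind average ranking is to compare the choices of one design dimension within each experimental setting where everything else is held fixed, rank them there, and then average each choice's rank over all settings. The sketch below is a pure-Python illustration of that idea, not the code in `figure/rank.py`. The record layout is an assumption, ties simply share the smallest rank, and the scores in the usage example are made-up toy values.

```python
from collections import defaultdict


def average_rank(records):
    """Average rank per design choice.

    records: iterable of (setting, choice, score), where a higher score is
    better and `setting` identifies everything except the design choice.
    Rank 1 is best; tied choices share the smallest rank.
    """
    by_setting = defaultdict(dict)
    for setting, choice, score in records:
        by_setting[setting][choice] = score

    rank_sum = defaultdict(float)
    count = defaultdict(int)
    for scores in by_setting.values():
        for choice, score in scores.items():
            # Rank = 1 + number of choices that strictly beat this one.
            rank = 1 + sum(other > score for other in scores.values())
            rank_sum[choice] += rank
            count[choice] += 1
    return {choice: rank_sum[choice] / count[choice] for choice in rank_sum}


if __name__ == "__main__":
    # Toy values for illustration only, not experimental results.
    toy = [
        (("HGBn-ACM", "gcnconv"), "True", 0.91),
        (("HGBn-ACM", "gcnconv"), "False", 0.89),
        (("HGBn-ACM", "gatconv"), "True", 0.90),
        (("HGBn-ACM", "gatconv"), "False", 0.92),
    ]
    print(average_rank(toy))
```

A lower average rank means the choice wins more often across settings.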

##### 4.2.2 Distribution estimates
We analyze the results with distribution estimates, following [NDS](https://github.com/facebookresearch/nds). The corresponding code is in [`figure/distribution.py`](./figure/distribution.py).
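NDS compares design spaces through the empirical distribution function (EDF) of model errors: for n sampled models with errors e_i, F(e) is the fraction of models with e_i < e. The sketch below computes that quantity in plain Python. It is only an illustration of the EDF idea; whether `figure/distribution.py` uses this exact estimator is an assumption, and the function name `edf` is hypothetical.

```python
def edf(errors, thresholds):
    """Empirical distribution function of model errors.

    For each threshold e, return the fraction of models whose error is
    strictly below e. A curve further to the left means more models in
    the design space reach low error.
    """
    n = len(errors)
    if n == 0:
        raise ValueError("errors must not be empty")
    return [sum(err < t for err in errors) / n for t in thresholds]


if __name__ == "__main__":
    # Toy errors for illustration only, not experimental results.
    print(edf([0.10, 0.20, 0.30, 0.40], [0.15, 0.25, 0.50]))
```

When the recorded metric is an accuracy-style score, it can be converted to an error (for example, 1 minus the score) before calling `edf`.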

## Cite
Please kindly cite our paper if you use this code:
```
@inproceedings{zhao2022space4hgnn,
title={Space4HGNN: A Novel, Modularized and Reproducible Platform to Evaluate Heterogeneous Graph Neural Network},
author={Zhao, Tianyu and Yang, Cheng and Li, Yibo and Gan, Quan and Wang, Zhenyi and Liang, Fengqi and Zhao, Huan and Shao, Yingxia and Wang, Xiao and Shi, Chuan},
booktitle={SIGIR},
year={2022}
}
```
## Acknowledgement
The code is built on [GraphGym](https://github.com/snap-stanford/GraphGym), a platform for designing and evaluating graph neural networks.