Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/xovee/ccgl
TKDE 22. CCGL: Contrastive Cascade Graph Learning.
- Host: GitHub
- URL: https://github.com/xovee/ccgl
- Owner: Xovee
- License: mit
- Created: 2020-08-31T12:47:29.000Z (about 4 years ago)
- Default Branch: master
- Last Pushed: 2024-03-12T09:57:23.000Z (8 months ago)
- Last Synced: 2024-03-12T11:04:53.564Z (8 months ago)
- Topics: acm, aps, data-augmentation, dblp, graph, information-cascade, information-diffusion, popularity-prediction, self-supervised-learning, twitter, weibo
- Language: Python
- Homepage: https://www.xoveexu.com
- Size: 13.5 MB
- Stars: 28
- Watchers: 3
- Forks: 5
- Open Issues: 0
- Metadata Files:
  - Readme: README.md
  - License: LICENSE
README
# CCGL: Contrastive Cascade Graph Learning
![](https://img.shields.io/badge/TKDE-22-blue)
![](https://img.shields.io/badge/python-3.7-green)
![](https://img.shields.io/badge/tensorflow-2.3-green)
![](https://img.shields.io/badge/cudatoolkit-10.1-green)
![](https://img.shields.io/badge/cudnn-7.6.5-green)

This repo provides a reference implementation of the Contrastive Cascade Graph Learning (**CCGL**) framework as described in the paper:
> CCGL: Contrastive Cascade Graph Learning
> [Xovee Xu](https://www.xoveexu.com), [Fan Zhou](https://dblp.org/pid/63/3122-2.html), [Kunpeng Zhang](http://www.terpconnect.umd.edu/~kpzhang/), and [Siyuan Liu](https://directory.smeal.psu.edu/sxl68)
> IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 35, no. 5, pp. 4539--4554, May 2023
> [arXiv:2107.12576](https://arxiv.org/abs/2107.12576)
> https://doi.org/10.1109/TKDE.2022.3151829

## Dataset
You can download all five datasets (Weibo, Twitter, ACM, APS, and DBLP) via any one of the following links:
Google Drive|Dropbox|Onedrive|Baidu Netdisk
:---:|:---:|:---:|:---:
||| `trqg`

(`trqg` is the extraction code for the Baidu Netdisk link.)

## Environment Settings
Our experiments are conducted on Ubuntu 20.04 with a single NVIDIA 1080Ti GPU, 48 GB RAM, and an Intel i7-8700K CPU. CCGL is implemented with `Python 3.7`, `TensorFlow 2.3`, `CUDA 10.1`, and `cuDNN 7.6.5`.
Create a virtual environment and install GPU-support packages via [Anaconda](https://www.anaconda.com/):
```shell
# create virtual environment
conda create --name=ccgl python=3.7 cudatoolkit=10.1 cudnn=7.6.5

# activate virtual environment
conda activate ccgl

# install other dependencies
pip install -r requirements.txt
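
# (optional) sanity check: confirm TensorFlow can see the GPU
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"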
```
## Usage
Here we take the Weibo dataset as an example to demonstrate usage.
### Preprocess
Step 1: divide, filter, and generate labeled and unlabeled cascades:
```shell
cd ccgl
# labeled cascades
python src/gene_cas.py --input=./datasets/weibo/ --unlabel=False
# unlabeled cascades
python src/gene_cas.py --input=./datasets/weibo/ --unlabel=True
```
Step 2: augment both labeled and unlabeled cascades (here we use the `AugSIM` strategy):
```shell
python src/augmentor.py --input=./datasets/weibo/ --aug_strategy=AugSIM
```
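For intuition only, here is a toy sketch of perturbing a cascade by dropping and adding adopters. It is not the repo's `AugSIM` implementation (see `src/augmentor.py`), which augments cascades by simulating the diffusion process:

```python
import random

def augment_cascade(edges, p_add=0.1, p_drop=0.1, seed=None):
    """Toy augmentation of a cascade tree given as (parent, child) edges:
    drop a fraction of leaf adopters, then attach simulated new adopters.
    Illustrative only -- not the repo's AugSIM."""
    rng = random.Random(seed)
    parents = {p for p, _ in edges}
    leaves = [c for _, c in edges if c not in parents]
    # remove leaves only, so the remaining tree stays connected
    dropped = {leaf for leaf in leaves if rng.random() < p_drop}
    kept = [(p, c) for p, c in edges if c not in dropped]
    nodes = sorted({n for e in kept for n in e})
    next_id = max(nodes) + 1
    for _ in range(int(p_add * len(nodes))):  # attach simulated adopters
        kept.append((rng.choice(nodes), next_id))
        next_id += 1
    return kept

print(augment_cascade([(0, 1), (0, 2), (1, 3)], seed=42))
```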
Step 3: generate cascade embeddings:
```shell
python src/gene_emb.py --input=./datasets/weibo/
```
### Pre-training
```shell
python src/pre_training.py --name=weibo-0 --input=./datasets/weibo/ --projection_head=4-1
```
The saved pre-training model is named `weibo-0`.
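Pre-training optimizes a contrastive objective between two augmented views of each cascade. As a rough sketch of such a loss (a SimCLR-style NT-Xent using the default temperature of 0.1; the repo's actual objective in `src/pre_training.py` may differ):

```python
import tensorflow as tf

def nt_xent(z1, z2, temperature=0.1):
    """SimCLR-style contrastive loss between two views (z1, z2) of the same
    batch of cascade embeddings; matching pairs are the positives."""
    z1 = tf.math.l2_normalize(z1, axis=1)
    z2 = tf.math.l2_normalize(z2, axis=1)
    logits = tf.matmul(z1, z2, transpose_b=True) / temperature  # pairwise cosine similarity
    labels = tf.range(tf.shape(z1)[0])  # positives lie on the diagonal
    loss = tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)
    return tf.reduce_mean(loss)
```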
### Fine-tuning
```shell
python src/fine_tuning.py --name=weibo-0 --num=0 --input=./datasets/weibo/ --projection_head=4-1
```
Here we load the pre-trained model `weibo-0` and save the teacher network as `weibo-0-0`.

### Distillation
```shell
python src/distilling.py --name=weibo-0-0 --num=0 --input=./datasets/weibo/ --projection_head=4-1
```
Here we load the teacher network `weibo-0-0` and save the student network as `weibo-0-0-student-0`.
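Conceptually, distillation trains the student to match the teacher's predictions while still fitting the labels. A schematic of such a blended objective (the `alpha` weight and the exact loss terms are assumptions; see `src/distilling.py` for the real code):

```python
import tensorflow as tf

mse = tf.keras.losses.MeanSquaredError()

def distill_loss(y_true, y_teacher, y_student, alpha=0.5):
    """Blend of label loss and teacher-matching loss;
    `alpha` is a hypothetical balancing weight."""
    return (1 - alpha) * mse(y_true, y_student) + alpha * mse(y_teacher, y_student)
```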
### (Optional) Run the Base model
```shell
python src/base_model.py --input=./datasets/weibo/
```
## CCGL model weights
We provide pre-trained, fine-tuned, and distilled CCGL model weights. Please see details in the following table.
Model|Dataset|Label Fraction|Projection Head|MSLE|Weights
:---|:---|:---|:---|:---|:---
Pre-trained CCGL model|Weibo|100%|4-1|-|[Download](./results/pre_training_weight/weibo-100.h5)
Pre-trained CCGL model|Weibo|10%|4-4|-|[Download](./results/pre_training_weight/weibo-10.h5)
Pre-trained CCGL model|Weibo|1%|4-3|-|[Download](./results/pre_training_weight/weibo-1.h5)
Fine-tuned CCGL model|Weibo|100%|4-1|2.70|[Download](./results/fine_tuning_weight/weibo-100-0.h5)
Fine-tuned CCGL model|Weibo|10%|4-4|2.87|[Download](./results/fine_tuning_weight/weibo-10-0.h5)
Fine-tuned CCGL model|Weibo|1%|4-3|3.30|[Download](./results/fine_tuning_weight/weibo-1-0.h5)
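The MSLE column reports the mean squared logarithmic error between predicted and true popularity. For reference, a common formulation in cascade popularity prediction (the log base and the `+1` shift follow common practice and may differ from the repo's evaluation code):

```python
import numpy as np

def msle(y_true, y_pred):
    """Mean squared logarithmic error; the +1 shift avoids log(0).
    Cascade-prediction work commonly uses log base 2."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((np.log2(y_true + 1.0) - np.log2(y_pred + 1.0)) ** 2))
```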
Load weights into the model:

```python
# construct the model first; carefully check the projection head design
# (different checkpoints use different numbers of Dense layers)
...
# load weights for fine-tuning, distillation, or evaluation
model.load_weights(weight_path)
```
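For a concrete picture of the round trip, here is a minimal, self-contained Keras sketch with a configurable projection-head depth. The layer sizes and shapes are placeholders, not the repo's architecture, and the model you construct must match the checkpoint you load:

```python
import tensorflow as tf

def build_model(emb_dim=64, head_layers=4, width=128):
    """Toy encoder plus a projection head of configurable depth; placeholder
    shapes only -- the real architecture lives in the repo's src/ scripts."""
    inputs = tf.keras.Input(shape=(emb_dim,))
    x = tf.keras.layers.Dense(width, activation="relu")(inputs)  # stand-in encoder
    for _ in range(head_layers):  # projection head depth (cf. "4-1", "4-4", "4-3")
        x = tf.keras.layers.Dense(width, activation="relu")(x)
    outputs = tf.keras.layers.Dense(1)(x)  # popularity prediction
    return tf.keras.Model(inputs, outputs)

model = build_model()
model.save_weights("toy.h5")  # round-trip demo with a matching architecture
model.load_weights("toy.h5")  # the same call loads the released .h5 files
```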
Check `src/fine_tuning.py` and `src/distilling.py` for *weight-loading* examples.
## Default hyper-parameter settings
Unless otherwise specified, we use the following default hyper-parameter settings.
Param|Value|Param|Value
:---|---:|:---|---:
Augmentation strength|0.1|Pre-training epochs|30
Augmentation strategy|AugSIM|Projection Head (100%)|4-1
Batch size|64|Projection Head (10%)|4-4
Early stopping patience|20|Projection Head (1%)|4-3
Embedding dimension|64|Model size|128 (4x)
Learning rate|5e-4|Temperature|0.1
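For convenience, the same defaults collected as a plain Python dict (a hypothetical helper, not a structure the repo defines; the authoritative flags live in the `src/*.py` argument parsers):

```python
# Default hyper-parameters from the table above, collected as a plain dict
# (hypothetical helper; the authoritative flags live in the src/*.py parsers).
DEFAULTS = {
    "aug_strength": 0.1,
    "aug_strategy": "AugSIM",
    "batch_size": 64,
    "early_stopping_patience": 20,
    "embedding_dim": 64,
    "learning_rate": 5e-4,
    "model_size": 128,  # 4x
    "pretraining_epochs": 30,
    "projection_head": {"100%": "4-1", "10%": "4-4", "1%": "4-3"},
    "temperature": 0.1,
}
```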
## Change Logs
- Jul 21, 2021: fixed a bug and some annotations
## Cite
If you find our paper & code useful for your research, please consider citing us:
```bibtex
@article{xu2022ccgl,
author = {Xovee Xu and Fan Zhou and Kunpeng Zhang and Siyuan Liu},
title = {{CCGL}: Contrastive Cascade Graph Learning},
journal = {IEEE Transactions on Knowledge and Data Engineering (TKDE)},
volume = {35},
number = {5},
pages = {4539--4554},
year = {2022},
doi = {10.1109/TKDE.2022.3151829},
}
```

We also have a [survey paper](https://dl.acm.org/doi/10.1145/3433000?cid=99659687041) that might interest you:
```bibtex
@article{zhou2021survey,
author = {Fan Zhou and Xovee Xu and Goce Trajcevski and Kunpeng Zhang},
title = {A Survey of Information Cascade Analysis: Models, Predictions, and Recent Advances},
journal = {ACM Computing Surveys (CSUR)},
volume = {54},
number = {2},
year = {2021},
articleno = {27},
numpages = {36},
doi = {10.1145/3433000},
}
```
## Acknowledgment
We would like to thank [Xiuxiu Qi](https://qhemu.github.io/xiuxiuqi/), [Ce Li](https://github.com/celi52), [Qing Yang](https://www.linkedin.com/in/庆-杨-43ba1a142), and Wenxiong Li for sharing their computing resources and helping us test the code. We would also like to show our gratitude to the authors of [SimCLR](https://github.com/google-research/simclr) (and [Sayak Paul](https://github.com/sayakpaul)), [node2vec](https://github.com/eliorc/node2vec), [DeepHawkes](https://github.com/CaoQi92/DeepHawkes), and others for sharing their code and datasets.
## Contact
For any questions, please open an issue or drop an email to `xovee at live.com`.