{"id":18797646,"url":"https://github.com/lirongwu/graphmixup","last_synced_at":"2025-07-27T14:09:11.081Z","repository":{"id":37364709,"uuid":"505676502","full_name":"LirongWu/GraphMixup","owner":"LirongWu","description":"Code for ECML-PKDD 2022 paper \"GraphMixup: Improving Class-Imbalanced Node Classification by Reinforcement Mixup and Self-supervised Context Prediction\"","archived":false,"fork":false,"pushed_at":"2023-06-07T07:29:54.000Z","size":32394,"stargazers_count":23,"open_issues_count":0,"forks_count":6,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-13T16:56:28.745Z","etag":null,"topics":["graph-algorithms","graph-self-supervised-learning","imbalanced-classification","imbalanced-data","reinforcement-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LirongWu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2022-06-21T03:28:14.000Z","updated_at":"2025-01-19T13:13:54.000Z","dependencies_parsed_at":"2025-04-13T16:44:37.315Z","dependency_job_id":null,"html_url":"https://github.com/LirongWu/GraphMixup","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/LirongWu/GraphMixup","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LirongWu%2FGraphMixup","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LirongWu%2FGraphMixup/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LirongWu%2FGraphM
ixup/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LirongWu%2FGraphMixup/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/LirongWu","download_url":"https://codeload.github.com/LirongWu/GraphMixup/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LirongWu%2FGraphMixup/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267368932,"owners_count":24076093,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-27T02:00:11.917Z","response_time":82,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["graph-algorithms","graph-self-supervised-learning","imbalanced-classification","imbalanced-data","reinforcement-learning"],"created_at":"2024-11-07T22:08:59.493Z","updated_at":"2025-07-27T14:09:11.049Z","avatar_url":"https://github.com/LirongWu.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# GraphMixup\n\n\nThis is a PyTorch implementation of GraphMixup. The code includes the following modules:\n\n* Dataset Loader (Cora, BlogCatalog, and Wiki-CS)\n\n* Various Architectures (GCN, SAGE, GAT, and SEM)\n\n* Five compared baselines (Origin, Over-Sampling, Re-weight, SMOTE, and Embed-SMOTE)\n\n* Training paradigm (joint learning, pre-training, and fine-tuning) for node classification on three datasets\n\n* Visualization 
and evaluation metrics \n\n  \n\n## Main Requirements\n\n* networkx==2.5\n* numpy==1.19.2\n* scikit-learn==0.24.1\n* scipy==1.5.2\n* torch==1.6.0\n\n\n\n## Description\n\n* train.py  \n  * train() -- Train a new model for the node classification task on the *Cora, BlogCatalog, and Wiki-CS* datasets\n  * test() -- Test the learned model for the node classification task on the *Cora, BlogCatalog, and Wiki-CS* datasets\n  * save_model() -- Save the pre-trained model\n  * load_model() -- Load model for fine-tuning\n* data_load.py  \n  \n  * load_cora() -- Load Cora Dataset\n  * load_BlogCatalog() -- Load BlogCatalog Dataset\n  * load_wiki_cs() -- Load Wiki-CS Dataset\n* models.py  \n  \n  * GraphConvolution() -- GCN Layer\n  * SageConv() -- SAGE Layer\n  * SemanticLayer() -- Semantic Feature Layer\n  * GraphAttentionLayer() -- GAT Layer\n  * PairwiseDistance() -- Perform self-supervised Local-Path Prediction\n  * DistanceCluster() -- Perform self-supervised Global-Path Prediction\n* utils.py  \n  * src_upsample() -- Perform interpolation in the input space\n  * src_smote() -- Perform interpolation in the embedding space\n  * mixup() -- Perform mixup in the semantic relation space\n* QLearning.py  \n  * GNN_env() -- Calculate rewards, perform actions, and update states\n  * isTerminal() -- Determine whether the termination conditions have been met\n\n\n\n## Running the code\n\n1. Install the required dependency packages\n\n2. To get the results on a specific *dataset*, first run pre-training with suitable hyperparameters\n\n  ```\npython train.py --dataset data_name --setting pre-train\n  ```\n\nwhere *data_name* is one of the three datasets (Cora, BlogCatalog, and Wiki-CS). The pre-trained model will be saved to the corresponding checkpoint folder in **./checkpoint** for evaluation.\n\n3. 
To fine-tune the pre-trained model, run\n\n  ```\npython train.py --dataset data_name --setting fine-tune --load model_path\n  ```\n\nwhere *model_path* is the path to the saved pre-trained model.\n\n4. We provide five compared baselines in this code. They can be selected via the '--setting' argument:\n\n- Origin: Vanilla backbone models with *'--setting raw'*\n- Over-Sampling: Repeat nodes in the minority classes with *'--setting over-sampling'*\n- Re-weight: Give samples from minority classes a larger weight when calculating the loss with *'--setting re-weight'*\n- SMOTE: Interpolation in the input space with *'--setting smote'*\n- Embed-SMOTE: Perform SMOTE in the intermediate embedding space with *'--setting embed_smote'*\n\nUsing *Embed-SMOTE* as an example: \n\n  ```\npython train.py --dataset cora --setting embed_smote\n  ```\n\n\n\n## Citation\n\nIf you find this project useful for your research, please use the following BibTeX entry.\n\n```\n@inproceedings{wu2023graphmixup,\n  title={Graphmixup: Improving class-imbalanced node classification by reinforcement mixup and self-supervised context prediction},\n  author={Wu, Lirong and Xia, Jun and Gao, Zhangyang and Lin, Haitao and Tan, Cheng and Li, Stan Z},\n  booktitle={Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2022, Grenoble, France, September 19--23, 2022, Proceedings, Part IV},\n  pages={519--535},\n  year={2023},\n  organization={Springer}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flirongwu%2Fgraphmixup","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flirongwu%2Fgraphmixup","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flirongwu%2Fgraphmixup/lists"}