{"id":13782002,"url":"https://github.com/YingtongDou/CARE-GNN","last_synced_at":"2025-05-11T15:32:15.971Z","repository":{"id":39563826,"uuid":"285392640","full_name":"YingtongDou/CARE-GNN","owner":"YingtongDou","description":"Code for CIKM 2020 paper Enhancing Graph Neural Network-based Fraud Detectors against Camouflaged Fraudsters","archived":false,"fork":false,"pushed_at":"2022-10-11T11:15:56.000Z","size":40674,"stargazers_count":264,"open_issues_count":0,"forks_count":55,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-04-09T14:15:48.620Z","etag":null,"topics":["datamining","deep-learning","fraud-detection","fraud-prevention","graphneuralnetwork","machine-learning","reinforcement-learning","security"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2008.08692","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/YingtongDou.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-08-05T20:05:39.000Z","updated_at":"2025-03-26T10:13:23.000Z","dependencies_parsed_at":"2022-08-27T19:22:01.869Z","dependency_job_id":null,"html_url":"https://github.com/YingtongDou/CARE-GNN","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/YingtongDou%2FCARE-GNN","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/YingtongDou%2FCARE-GNN/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/YingtongDou%2FCARE-GNN/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/YingtongDou%2FCARE-GNN/manif
ests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/YingtongDou","download_url":"https://codeload.github.com/YingtongDou/CARE-GNN/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253588647,"owners_count":21932292,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["datamining","deep-learning","fraud-detection","fraud-prevention","graphneuralnetwork","machine-learning","reinforcement-learning","security"],"created_at":"2024-08-03T18:01:31.890Z","updated_at":"2025-05-11T15:32:10.947Z","avatar_url":"https://github.com/YingtongDou.png","language":"Python","funding_links":[],"categories":["2020"],"sub_categories":[],"readme":"# CARE-GNN\n\nA PyTorch implementation for the [CIKM 2020](https://www.cikm2020.org/) paper below:  \n**Enhancing Graph Neural Network-based Fraud Detectors against Camouflaged Fraudsters**.  \n[Yingtong Dou](http://ytongdou.com/), [Zhiwei Liu](https://sites.google.com/view/zhiwei-jim), [Li Sun](https://www.researchgate.net/profile/Li_Sun118), Yutong Deng, [Hao Peng](https://penghao-buaa.github.io/), [Philip S. Yu](https://www.cs.uic.edu/PSYu/).  \n\\[[Paper](https://arxiv.org/pdf/2008.08692.pdf)\\]\\[[Toolbox](https://github.com/safe-graph/DGFraud)\\]\\[[DGL Example](https://github.com/dmlc/dgl/tree/master/examples/pytorch/caregnn)\\]\\[[Benchmark](https://paperswithcode.com/paper/enhancing-graph-neural-network-based-fraud)\\]\n\n## Bug Fixes and Update (06/2021)\n\n### Similarity score\nThe feature and label similarity scores presented in Table 2 of the paper are incorrect. 
The updated equations for calculating the two similarity scores are shown below:\n\u003cp align=\"center\"\u003e\n    \u003cbr\u003e\n    \u003ca href=\"https://github.com/YingtongDou/CARE-GNN\"\u003e\n        \u003cimg src=\"https://github.com/YingtongDou/CARE-GNN/blob/master/eq_simi.png\" width=\"500\"/\u003e\n    \u003c/a\u003e\n    \u003cbr\u003e\n\u003cp\u003e\n\nThe code for calculating the similarity scores is in [simi_comp.py](https://github.com/YingtongDou/CARE-GNN/blob/master/simi_comp.py).\n\nThe updated similarity scores for the two datasets are shown below. Note that we compute the similarity scores only for positive nodes (fraudsters) to demonstrate their camouflage.\n\n| YelpChi  | rur  | rtr  | rsr  | homo  |\n|-------|--------|--------|--------|--------|\n| Avg. Feature Similarity | 0.991   |   0.988    |  0.988  | 0.988  |\n| Avg. Label Similarity |  0.909  |   0.176   |  0.186  | 0.184  |\n\n| Amazon  | upu  | usu  | uvu  | homo  |\n|-------|--------|--------|--------|--------|\n| Avg. Feature Similarity | 0.711   |   0.687    |  0.697  | 0.687  |\n| Avg. Label Similarity |  0.167  |   0.056   |  0.053  | 0.072  |\n\n### Relation weight in Figure 3\n\nAccording to this [issue](https://github.com/YingtongDou/CARE-GNN/issues/5), the weighted aggregation of CARE-Weight (a variant of CARE-GNN) has an error. After fixing it, the relation weights no longer converge to the same value. Thus, the relation weight subfigure in Figure 3 and its associated conclusion are wrong.\n\n### Extended version of CARE-GNN\n\nPlease check out [RioGNN](https://github.com/safe-graph/RioGNN), a GNN model that extends CARE-GNN with more reinforcement learning modules. We are actively developing an efficient multi-layer version of CARE-GNN. 
Stay tuned.\n\n\n## Overview\n\n\u003cp align=\"center\"\u003e\n    \u003cbr\u003e\n    \u003ca href=\"https://github.com/YingtongDou/CARE-GNN\"\u003e\n        \u003cimg src=\"https://github.com/YingtongDou/CARE-GNN/blob/master/model.png\" width=\"900\"/\u003e\n    \u003c/a\u003e\n    \u003cbr\u003e\n\u003cp\u003e\n\n**CA**mouflage-**RE**sistant **G**raph **N**eural **N**etwork **(CARE-GNN)** is a GNN-based fraud detector that operates on a multi-relation graph and is equipped with three modules that enhance its performance against camouflaged fraudsters.\n\nThe three enhancement modules are:\n\n- **A label-aware similarity measure** which measures the similarity scores between a center node and its neighboring nodes;\n- **A similarity-aware neighbor selector** which leverages top-p sampling and reinforcement learning to select the optimal number of neighbors under each relation;\n- **A relation-aware neighbor aggregator** which directly aggregates information from different relations using the optimal neighbor selection thresholds as weights.\n\nCARE-GNN has the following advantages:\n\n- **Adaptability.** CARE-GNN adaptively selects the best neighbors for aggregation given an arbitrary multi-relation graph;\n- **High-efficiency.** CARE-GNN achieves high computational efficiency without attention or deep reinforcement learning;\n- **Flexibility.** Many other neural modules and external knowledge can be plugged into CARE-GNN.\n\nWe have integrated more than **eight** GNN-based fraud detectors as a TensorFlow [toolbox](https://github.com/safe-graph/DGFraud). \n\n## Setup\n\nYou can download the project and install the required packages using the following commands:\n\n```bash\ngit clone https://github.com/YingtongDou/CARE-GNN.git\ncd CARE-GNN\npip3 install -r requirements.txt\n```\n\nTo run the code, you need **Python 3.6** or later. \n\n## Running\n\n1. In the CARE-GNN directory, run `unzip /data/Amazon.zip` and `unzip /data/YelpChi.zip` to unzip the datasets; \n2. 
Run `python data_process.py` to generate the adjacency lists used by CARE-GNN;\n3. Run `python train.py` to run CARE-GNN with default settings.\n\nFor other datasets and parameter settings, please refer to the argument parser in `train.py`. Our model supports both CPU and GPU modes.\n\n## Running on your datasets\n\nTo run CARE-GNN on your datasets, you need to prepare the following data:\n\n- Multiple single-relation graphs over the same set of nodes, each stored in `scipy.sparse` matrix format. You can use `sparse_to_adjlist()` in `utils.py` to convert each sparse matrix into the adjacency lists used by CARE-GNN;\n- A NumPy array of node labels. Currently, CARE-GNN only supports binary classification;\n- A node feature matrix stored in `scipy.sparse` matrix format. \n\n### Repo Structure\nThe repository is organized as follows:\n- `data/`: dataset files;\n- `data_process.py`: converts sparse matrices to adjacency lists;\n- `graphsage.py`: model code for the vanilla [GraphSAGE](https://github.com/williamleif/graphsage-simple/) model;\n- `layers.py`: CARE-GNN layer implementations;\n- `model.py`: CARE-GNN model implementations;\n- `train.py`: training and testing all models;\n- `utils.py`: utility functions for data I/O and model evaluation.\n\n## Citation\nIf you use our code, please cite the paper below:\n```bibtex\n@inproceedings{dou2020enhancing,\n  title={Enhancing Graph Neural Network-based Fraud Detectors against Camouflaged Fraudsters},\n  author={Dou, Yingtong and Liu, Zhiwei and Sun, Li and Deng, Yutong and Peng, Hao and Yu, Philip S},\n  booktitle={Proceedings of the 29th ACM International Conference on Information and Knowledge Management (CIKM'20)},\n  year={2020}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FYingtongDou%2FCARE-GNN","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FYingtongDou%2FCARE-GNN","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FYingtongDou%2FCARE-GNN/lists"}