{"id":19585517,"url":"https://github.com/safe-graph/dgfraud-tf2","last_synced_at":"2025-04-27T11:33:42.846Z","repository":{"id":41465796,"uuid":"367144146","full_name":"safe-graph/DGFraud-TF2","owner":"safe-graph","description":"A Deep Graph-based Toolbox for Fraud Detection in TensorFlow 2.X","archived":false,"fork":false,"pushed_at":"2022-04-20T21:36:35.000Z","size":31432,"stargazers_count":121,"open_issues_count":4,"forks_count":30,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-05-22T00:04:25.022Z","etag":null,"topics":["anomaly-detection","datamining","datascience","dblp-dataset","financial-engineering","fraud-detection","fraud-prevention","graph-algorithms","graphneuralnetwork","machine-learning","opensource","outlier-detection","security","security-tools","spam-detection","toolkit","yelp-dataset"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/safe-graph.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-05-13T18:47:01.000Z","updated_at":"2024-05-03T12:47:47.000Z","dependencies_parsed_at":"2022-08-16T08:30:28.336Z","dependency_job_id":null,"html_url":"https://github.com/safe-graph/DGFraud-TF2","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/safe-graph%2FDGFraud-TF2","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/safe-graph%2FDGFraud-TF2/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/safe-graph%2FDGFraud-TF2/releases","manifests_url":"https://repo
s.ecosyste.ms/api/v1/hosts/GitHub/repositories/safe-graph%2FDGFraud-TF2/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/safe-graph","download_url":"https://codeload.github.com/safe-graph/DGFraud-TF2/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224069541,"owners_count":17250455,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anomaly-detection","datamining","datascience","dblp-dataset","financial-engineering","fraud-detection","fraud-prevention","graph-algorithms","graphneuralnetwork","machine-learning","opensource","outlier-detection","security","security-tools","spam-detection","toolkit","yelp-dataset"],"created_at":"2024-11-11T07:54:59.217Z","updated_at":"2024-11-11T07:55:00.398Z","avatar_url":"https://github.com/safe-graph.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\r\n    \u003cbr\u003e\r\n    \u003ca href=\"https://image.flaticon.com/icons/svg/1671/1671517.svg\"\u003e\r\n        \u003cimg src=\"https://github.com/safe-graph/DGFraud-TF2/blob/main/logo.png\" width=\"550\"/\u003e\r\n    \u003c/a\u003e\r\n    \u003cbr\u003e\r\n\u003cp\u003e\r\n\u003cp align=\"center\"\u003e\r\n    \u003ca href=\"https://travis-ci.com/github/safe-graph/DGFraud-TF2\"\u003e\r\n        \u003cimg alt=\"travis-ci\" src=\"https://travis-ci.com/safe-graph/DGFraud-TF2.svg?token=wicswr4X2g4v8jddTpUv\u0026branch=main\"\u003e\r\n    \u003c/a\u003e\r\n    \u003ca href=\"https://www.tensorflow.org/install\"\u003e\r\n        \u003cimg 
alt=\"Tensorflow\" src=\"https://img.shields.io/badge/tensorflow-2.X-orange\"\u003e\r\n    \u003c/a\u003e\r\n    \u003ca href=\"https://www.python.org/\"\u003e\r\n        \u003cimg alt=\"Python\" src=\"https://img.shields.io/badge/python-3.6%20%7C%203.7%20%7C%203.8%20%7C%203.9-blue\"\u003e\r\n    \u003c/a\u003e\r\n    \u003ca href=\"https://github.com/safe-graph/DGFraud-TF2/archive/main.zip\"\u003e\r\n        \u003cimg alt=\"PRs\" src=\"https://img.shields.io/badge/PRs-welcome-brightgreen.svg\"\u003e\r\n    \u003c/a\u003e\r\n    \u003ca href=\"https://github.com/safe-graph/DGFraud-TF2/pulls\"\u003e\r\n        \u003cimg alt=\"GitHub release\" src=\"https://img.shields.io/github/v/release/safe-graph/DGFraud-TF2?include_prereleases\"\u003e\r\n    \u003c/a\u003e\r\n\u003c/p\u003e\r\n\r\n\u003ch3 align=\"center\"\u003e\r\n\u003cp\u003eA Deep Graph-based Toolbox for Fraud Detection in TensorFlow 2.X\r\n\u003c/h3\u003e\r\n\r\n[Introduction](#introduction) | [Useful Resources](#useful-resources) | [Installation](#installation) |  [Datasets](#datasets) | [User Guide](#user-guide) | [Implemented Models](#implemented-models) | [How to Contribute](#how-to-contribute)\r\n\r\n\r\n## Introduction\r\n\r\n**DGFraud-TF2** is a Graph Neural Network (GNN) based toolbox for fraud detection. It is the Tensorflow 2.X version of [DGFraud](https://github.com/safe-graph/DGFraud), which is implemented using TF 1.X. It integrates the implementation \u0026 comparison of state-of-the-art GNN-based fraud detection models. 
The introduction of implemented models can be found [here](#implemented-models).\r\n\r\nWe welcome contributions to this repo like adding new fraud detectors and extending the features of the toolbox.\r\n\r\nIf you use the toolbox in your project, please cite the paper below and the [algorithms](#implemented-models) you used:\r\n\r\nCIKM'20 ([PDF](https://arxiv.org/pdf/2008.08692.pdf))\r\n```bibtex\r\n@inproceedings{dou2020enhancing,\r\n  title={Enhancing Graph Neural Network-based Fraud Detectors against Camouflaged Fraudsters},\r\n  author={Dou, Yingtong and Liu, Zhiwei and Sun, Li and Deng, Yutong and Peng, Hao and Yu, Philip S},\r\n  booktitle={Proceedings of the 29th ACM International Conference on Information and Knowledge Management (CIKM'20)},\r\n  year={2020}\r\n}\r\n```\r\n\r\n\r\n## Useful Resources\r\n- [PyGOD: A Python Library for Graph Outlier Detection (Anomaly Detection)](https://github.com/pygod-team/pygod)\r\n- [UGFraud: An Unsupervised Graph-based Toolbox for Fraud Detection](https://github.com/safe-graph/UGFraud)\r\n- [Graph-based Fraud Detection Paper List](https://github.com/safe-graph/graph-fraud-detection-papers) \r\n- [Awesome Fraud Detection Papers](https://github.com/benedekrozemberczki/awesome-fraud-detection-papers)\r\n- [PyOD: A Python Toolbox for Scalable Outlier Detection (Anomaly Detection)](https://github.com/yzhao062/pyod)\r\n- [PyODD: An End-to-end Outlier Detection System](https://github.com/datamllab/pyodds)\r\n- [DGL: Deep Graph Library](https://github.com/dmlc/dgl)\r\n- [Realtime Fraud Detection with GNN on DGL](https://github.com/awslabs/realtime-fraud-detection-with-gnn-on-dgl)\r\n- [Outlier Detection DataSets (ODDS)](http://odds.cs.stonybrook.edu/)\r\n\r\n## Installation\r\n```bash\r\ngit clone https://github.com/safe-graph/DGFraud-TF2.git\r\ncd DGFraud-TF2\r\npython setup.py install\r\n```\r\n### Requirements\r\n```bash\r\n* python\u003e=3.6\r\n* tensorflow\u003e=2.0\r\n* numpy\u003e=1.16.4\r\n* 
scipy\u003e=1.2.0\r\n```\r\n## Datasets\r\n\r\n### DBLP\r\nWe use the pre-processed DBLP dataset from [Jhy1993/HAN](https://github.com/Jhy1993/HAN).\r\nYou can run FdGars, Player2Vec, GeniePath and GEM on the DBLP dataset.\r\nUnzip the archive before using the dataset:\r\n```bash\r\ncd dataset\r\nunzip DBLP4057_GAT_with_idx_tra200_val_800.zip\r\n```\r\n\r\n### Example dataset\r\nWe implement example graphs for SemiGNN, GAS and GEM in `data_loader.py` because those models require unique graph structures or node types that cannot be found in open-source datasets.\r\n\r\n\r\n### Yelp dataset\r\nFor [GraphConsis](https://arxiv.org/abs/2005.00625) and [GraphSAGE](https://arxiv.org/pdf/1706.02216.pdf), we preprocessed the [Yelp Spam Review Dataset](http://odds.cs.stonybrook.edu/yelpchi-dataset/) with reviews as nodes and three relations as edges.\r\n\r\nThe dataset in `.mat` format is located at `/dataset/YelpChi.zip`. The `.mat` file includes:\r\n- `net_rur, net_rtr, net_rsr`: three sparse matrices representing three homo-graphs defined in the [GraphConsis](https://arxiv.org/abs/2005.00625) paper;\r\n- `features`: a sparse matrix of 32-dimensional handcrafted features;\r\n- `label`: a numpy array with the ground truth of nodes. `1` represents spam and `0` represents benign.\r\n\r\nThe YelpChi data preprocessing details can be found in our [CIKM'20](https://arxiv.org/pdf/2008.08692.pdf) paper.\r\nTo get the complete metadata of the Yelp dataset, please email [ytongdou@gmail.com](mailto:ytongdou@gmail.com).\r\n\r\n## User Guide\r\n\r\n### Running the example code\r\nYou can find the implemented models in the `algorithms` directory. 
For example, you can run Player2Vec using:\r\n```bash\r\npython Player2Vec_main.py \r\n```\r\nYou can specify parameters for models when running the code.\r\n\r\n### Running on your datasets\r\nHave a look at the `load_data_dblp()` function in `utils/utils.py` for an example.\r\n\r\nIn order to use your own data, you have to provide:\r\n* adjacency matrices or adjlists (for GAS);\r\n* a feature matrix;\r\n* a label matrix.\r\n\r\nThen split the feature matrix and label matrix into training data and testing data.\r\n\r\nYou can specify a dataset as follows:\r\n```bash\r\npython xx_main.py --dataset your_dataset \r\n```\r\nor by editing `xx_main.py`.\r\n\r\n### The structure of code\r\nThe repository is organized as follows:\r\n- `algorithms/` contains the implemented models and the corresponding example code;\r\n- `layers/` contains all GNN layers used by implemented models;\r\n- `dataset/` contains the necessary dataset files;\r\n- `utils/` contains:\r\n    * code for loading and splitting the data (`data_loader.py`);\r\n    * various utilities (`utils.py`).\r\n\r\n\r\n## Implemented Models\r\n\r\n### Model Source\r\n\r\n| Model  | Paper  | Venue  | Reference  |\r\n|-------|--------|--------|--------|\r\n| **SemiGNN** | [A Semi-supervised Graph Attentive Network for Financial Fraud Detection](https://arxiv.org/pdf/2003.01171)  | ICDM 2019  | [BibTex](https://github.com/safe-graph/DGFraud/blob/master/reference/semignn.txt) |\r\n| **Player2Vec** | [Key Player Identification in Underground Forums over Attributed Heterogeneous Information Network Embedding Framework](http://mason.gmu.edu/~lzhao9/materials/papers/lp0110-zhangA.pdf)  | CIKM 2019  | [BibTex](https://github.com/safe-graph/DGFraud/blob/master/reference/player2vec.txt)|\r\n| **GAS** | [Spam Review Detection with Graph Convolutional Networks](https://arxiv.org/abs/1908.10679)  | CIKM 2019 | [BibTex](https://github.com/safe-graph/DGFraud/blob/master/reference/gas.txt) |\r\n| **FdGars** | [FdGars: Fraudster Detection via Graph 
Convolutional Networks in Online App Review System](https://dl.acm.org/citation.cfm?id=3316586)  | WWW 2019 | [BibTex](https://github.com/safe-graph/DGFraud/blob/master/reference/fdgars.txt) |\r\n| **GeniePath** | [GeniePath: Graph Neural Networks with Adaptive Receptive Paths](https://arxiv.org/abs/1802.00910)  | AAAI 2019 | [BibTex](https://github.com/safe-graph/DGFraud/blob/master/reference/geniepath.txt)  |\r\n| **GEM** | [Heterogeneous Graph Neural Networks for Malicious Account Detection](https://arxiv.org/pdf/2002.12307.pdf)  | CIKM 2018 |[BibTex](https://github.com/safe-graph/DGFraud/blob/master/reference/gem.txt) |\r\n| **GraphSAGE** | [Inductive Representation Learning on Large Graphs](https://arxiv.org/pdf/1706.02216.pdf)  | NIPS 2017  | [BibTex](https://github.com/safe-graph/DGFraud/blob/master/reference/graphsage.txt) |\r\n| **GraphConsis** | [Alleviating the Inconsistency Problem of Applying Graph Neural Network to Fraud Detection](https://arxiv.org/pdf/2005.00625.pdf)  | SIGIR 2020  | [BibTex](https://github.com/safe-graph/DGFraud/blob/master/reference/graphconsis.txt) |\r\n| **HACUD** | [Cash-Out User Detection Based on Attributed Heterogeneous Information Network with a Hierarchical Attention Mechanism](https://aaai.org/ojs/index.php/AAAI/article/view/3884)  | AAAI 2019 |  [BibTex](https://github.com/safe-graph/DGFraud/blob/master/reference/hacud.txt) |\r\n\r\n\r\n### Model Comparison\r\n| Model  | Application  | Graph Type  | Base Model  |\r\n|-------|--------|--------|--------|\r\n| **SemiGNN** | Financial Fraud  | Heterogeneous   | GAT, LINE, DeepWalk |\r\n| **Player2Vec** | Cyber Criminal  | Heterogeneous | GAT, GCN|\r\n| **GAS** | Opinion Fraud  | Heterogeneous | GCN, GAT |\r\n| **FdGars** |  Opinion Fraud | Homogeneous | GCN |\r\n| **GeniePath** | Financial Fraud | Homogeneous | GAT  |\r\n| **GEM** | Financial Fraud  | Heterogeneous |GCN |\r\n| **GraphSAGE** | Opinion Fraud  | Homogeneous   | GraphSAGE |\r\n| **GraphConsis** | Opinion Fraud  
| Heterogeneous   | GraphSAGE |\r\n| **HACUD** | Financial Fraud | Heterogeneous | GAT |\r\n\r\n\r\n\r\n## How to Contribute\r\nYou are welcome to contribute to this open-source toolbox. Currently, you can create a PR or email [bdscsafegraph@gmail.com](mailto:bdscsafegraph@gmail.com) with inquiries.\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsafe-graph%2Fdgfraud-tf2","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsafe-graph%2Fdgfraud-tf2","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsafe-graph%2Fdgfraud-tf2/lists"}