{"id":13481143,"url":"https://github.com/benedekrozemberczki/SEAL-CI","last_synced_at":"2025-03-27T11:31:53.355Z","repository":{"id":43537155,"uuid":"183039041","full_name":"benedekrozemberczki/SEAL-CI","owner":"benedekrozemberczki","description":"A PyTorch implementation of  \"Semi-Supervised Graph Classification: A Hierarchical Graph Perspective\" (WWW 2019)","archived":false,"fork":false,"pushed_at":"2022-11-06T21:14:17.000Z","size":5326,"stargazers_count":210,"open_issues_count":0,"forks_count":43,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-03-14T11:51:12.729Z","etag":null,"topics":["active-learning","active-learning-module","deep-learning","deep-learning-algorithms","deepwalk","gcn","gnn","graph-classification","graph-convolution","graph-convolutional-networks","graph-embedding","graph-neural-network","graph-neural-networks","graph-representation-learning","machine-learning-algorithms","network-embedding","node-classification","node2vec","pytorch","tensorflow"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/benedekrozemberczki.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null},"funding":{"github":["benedekrozemberczki"]}},"created_at":"2019-04-23T15:01:21.000Z","updated_at":"2025-03-11T02:14:37.000Z","dependencies_parsed_at":"2023-01-22T09:45:37.680Z","dependency_job_id":null,"html_url":"https://github.com/benedekrozemberczki/SEAL-CI","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benedekrozemberczki%2FS
EAL-CI","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benedekrozemberczki%2FSEAL-CI/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benedekrozemberczki%2FSEAL-CI/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benedekrozemberczki%2FSEAL-CI/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/benedekrozemberczki","download_url":"https://codeload.github.com/benedekrozemberczki/SEAL-CI/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245836252,"owners_count":20680339,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["active-learning","active-learning-module","deep-learning","deep-learning-algorithms","deepwalk","gcn","gnn","graph-classification","graph-convolution","graph-convolutional-networks","graph-embedding","graph-neural-network","graph-neural-networks","graph-representation-learning","machine-learning-algorithms","network-embedding","node-classification","node2vec","pytorch","tensorflow"],"created_at":"2024-07-31T17:00:49.124Z","updated_at":"2025-03-27T11:31:53.061Z","avatar_url":"https://github.com/benedekrozemberczki.png","language":"Python","funding_links":["https://github.com/sponsors/benedekrozemberczki"],"categories":["Deep Learning","Uncategorized","Paper implementations｜论文实现","Paper implementations"],"sub_categories":["Uncategorized","Other libraries｜其他库:","Other libraries:"],"readme":"SEAL\n======\n 
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/semi-supervised-graph-classification-a/graph-classification-on-proteins)](https://paperswithcode.com/sota/graph-classification-on-proteins?p=semi-supervised-graph-classification-a) [![codebeat badge](https://codebeat.co/badges/295bfd20-f420-40c6-ad2d-ae1e06ae97aa)](https://codebeat.co/projects/github-com-benedekrozemberczki-seal-ci-master) [![repo size](https://img.shields.io/github/repo-size/benedekrozemberczki/SEAL.svg)](https://github.com/benedekrozemberczki/SEAL/archive/master.zip) [![benedekrozemberczki](https://img.shields.io/twitter/follow/benrozemberczki?style=social\u0026logo=twitter)](https://twitter.com/intent/follow?screen_name=benrozemberczki)\n\n \nA **PyTorch** implementation of **Semi-Supervised Graph Classification: A Hierarchical Graph Perspective (WWW 2019)**\n\n\u003cp align=\"center\"\u003e\n  \u003cimg width=\"800\" src=\"seal.jpg\"\u003e\n\u003c/p\u003e\n  \n### Abstract\n\u003cp align=\"justify\"\u003e\nNode classification and graph classification are two graph learning problems that predict the class label of a node and the class label of a graph respectively. A node of a graph usually represents a real-world entity, e.g., a user in a social network, or a protein in a protein-protein interaction network. In this work, we consider a more challenging but practically useful setting, in which a node itself is a graph instance. This leads to a hierarchical graph perspective which arises in many domains such as social network, biological network and document collection. For example, in a social network, a group of people with shared interests forms a user group, whereas a number of user groups are interconnected via interactions or common members. We study the node classification problem in the hierarchical graph where a `node' is a graph instance, e.g., a user group in the above example. 
As labels are usually limited in real-world data, we design two novel semi-supervised solutions named Semi-supervised graph classification via Cautious/Active Iteration (or SEAL-C/AI in short). SEAL-C/AI adopt an iterative framework that takes turns to build or update two classifiers, one working at the graph instance level and the other at the hierarchical graph level. To simplify the representation of the hierarchical graph, we propose a novel supervised, self-attentive graph embedding method called SAGE, which embeds graph instances of arbitrary size into fixed-length vectors. Through experiments on synthetic data and Tencent QQ group data, we demonstrate that SEAL-C/AI not only outperform competing methods by a significant margin in terms of accuracy/Macro-F1, but also generate meaningful interpretations of the learned representations. \u003c/p\u003e\n\nThis repository provides a PyTorch implementation of SEAL-CI as described in the paper:\n\n\u003e Semi-Supervised Graph Classification: A Hierarchical Graph Perspective.\n\u003e Jia Li, Yu Rong, Hong Cheng, Helen Meng, Wenbing Huang, Junzhou Huang.\n\u003e WWW, 2019.\n\u003e [[Paper]](https://arxiv.org/pdf/1904.05003.pdf)\n\nA TensorFlow implementation of the model is available [[here]](https://github.com/xiyou3368/SAGE).\n\n### Requirements\nThe codebase is implemented in Python 3.5.2. Package versions used for development are listed below.\n```\nnetworkx          2.4\ntqdm              4.28.1\nnumpy             1.15.4\npandas            0.23.4\ntexttable         1.5.0\nscipy             1.1.0\nargparse          1.1.0\ntorch             1.1.0\ntorch-scatter     1.4.0\ntorch-sparse      0.4.3\ntorch-cluster     1.4.5\ntorch-geometric   1.3.2\ntorchvision       0.3.0\n```\n### Datasets\n\n#### Graphs\nThe code takes graphs for training from an input folder where each graph is stored as a JSON file. Graphs used for testing are also stored as JSON files. Every node id and node label has to be indexed from 0. 
Keys of dictionaries are stored as strings in order to make JSON serialization possible.\n\nThe graphs file **has to be unzipped** in the input folder.\n\nEvery JSON file has the following key-value structure:\n\n```javascript\n{\"edges\": [[0, 1],[1, 2],[2, 3],[3, 4]],\n \"features\": {\"0\": [\"A\",\"B\"], \"1\": [\"B\",\"K\"], \"2\": [\"C\",\"F\",\"A\"], \"3\": [\"A\",\"B\"], \"4\": [\"B\"]},\n \"label\": \"A\"}\n```\nThe **edges** key has an edge list value which describes the connectivity structure. The **features** key has features for each node which are stored as a dictionary -- within this nested dictionary, node identifiers are keys and feature lists are values. The **label** key has a value which is the class membership.\n\n#### Hierarchical graph\n\nThe hierarchical graph is stored as an edge list, where integer graph identifiers serve as the node identifiers. Node pairs are separated by commas in the comma separated values file. This edge list file has a header.\n\n### Options\nTraining a SEAL-CI model is handled by the `src/main.py` script which provides the following command line arguments.\n\n#### Input and output options\n```\n  --graphs                STR    Training graphs folder.      Default is `input/graphs/`.\n  --hierarchical-graph    STR    Macro level graph.           Default is `input/synthetic_edges.csv`.\n```\n#### Model options\n```\n  --epochs                      INT     Number of epochs.                  Default is 10.\n  --budget                      INT     Nodes to be added.                 Default is 20.\n  --labeled-count               INT     Number of labeled instances.       Default is 100.\n  --first-gcn-dimensions        INT     Graph level GCN 1st filters.       Default is 16.\n  --second-gcn-dimensions       INT     Graph level GCN 2nd filters.       Default is 8.\n  --first-dense-neurons         INT     SAGE aggregator neurons.           Default is 16.\n  --second-dense-neurons        INT     SAGE attention neurons.        
    Default is 4.\n  --macro-gcn-dimensions        INT     Macro level GCN neurons.           Default is 16.\n  --weight-decay                FLOAT   Weight decay of Adam.              Default is 5*10^-5.\n  --gamma                       FLOAT   Regularization parameter.          Default is 10^-5.\n  --learning-rate               FLOAT   Adam learning rate.                Default is 0.01.\n```\n### Examples\nThe following commands train a model and score it on the unlabeled instances. Training a model on the default dataset:\n```\npython src/main.py\n```\n\u003cp align=\"center\"\u003e\n  \u003cimg width=\"500\" src=\"seal.gif\"\u003e\n\u003c/p\u003e\n\nTraining each SEAL-CI model for 100 epochs.\n```\npython src/main.py --epochs 100\n```\nChanging the budget size.\n```\npython src/main.py --budget 200\n```\n\n------------------------------------------------------------\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbenedekrozemberczki%2FSEAL-CI","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbenedekrozemberczki%2FSEAL-CI","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbenedekrozemberczki%2FSEAL-CI/lists"}