{"id":13767493,"url":"https://github.com/amazon-science/tgl","last_synced_at":"2025-06-13T12:02:38.286Z","repository":{"id":38059780,"uuid":"461600360","full_name":"amazon-science/tgl","owner":"amazon-science","description":null,"archived":false,"fork":false,"pushed_at":"2023-12-25T09:27:39.000Z","size":22855,"stargazers_count":204,"open_issues_count":15,"forks_count":37,"subscribers_count":11,"default_branch":"main","last_synced_at":"2025-04-02T16:46:22.348Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/amazon-science.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-02-20T19:55:34.000Z","updated_at":"2025-03-19T06:23:29.000Z","dependencies_parsed_at":"2024-08-03T16:05:32.424Z","dependency_job_id":"4919398e-0e7a-4129-9738-0f52894e53f4","html_url":"https://github.com/amazon-science/tgl","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2Ftgl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2Ftgl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2Ftgl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amazon-science%2Ftgl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/amazon-science","download_url":"https://codeload.github.c
om/amazon-science/tgl/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248112362,"owners_count":21049645,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-03T16:01:09.148Z","updated_at":"2025-04-09T21:20:43.129Z","avatar_url":"https://github.com/amazon-science.png","language":"Python","funding_links":[],"categories":["Temporal Graph Learning (TGL)"],"sub_categories":[],"readme":"# TGL: A General Framework for Temporal Graph Training on Billion-Scale Graphs\n\n## Overview\n\nThis repo is the open-source code for our work *TGL: A General Framework for Temporal Graph Training on Billion-Scale Graphs*.\n\n## Requirements\n- python \u003e= 3.6.13\n- pytorch \u003e= 1.8.1\n- pandas \u003e= 1.1.5\n- numpy \u003e= 1.19.5\n- dgl \u003e= 0.6.1\n- pyyaml \u003e= 5.4.1\n- tqdm \u003e= 4.61.0\n- pybind11 \u003e= 2.6.2\n- g++ \u003e= 7.5.0\n- openmp \u003e= 201511\n\nOur temporal sampler is implemented in C++. Please compile the sampler first with the following command:\n\u003e python setup.py build_ext --inplace\n\n## Dataset\n\n[2022/06/29] We noticed that we uploaded the wrong version of the GDELT dataset and have uploaded the correct version. Please re-download all the files in the GDELT folder. Sorry for any inconvenience.\n\nThe four datasets used in our paper are available to download from an AWS S3 bucket using the `down.sh` script. 
The total download size is around 350GB.\n\nTo use your own dataset, you need to put the following files in the folder `\\DATA\\\\\u003cNameOfYourDataset\u003e\\`\n\n1. `edges.csv`: The file that stores temporal edge information. The CSV should have the header `,src,dst,time,ext_roll`, where the columns are the edge index (starting from zero), source node index (starting from zero), destination node index, timestamp, and extrapolation roll (0 for training edges, 1 for validation edges, 2 for test edges). The CSV should be sorted by time in ascending order.\n2. `ext_full.npz`: The T-CSR representation of the temporal graph. We provide a script to generate this file from `edges.csv`. You can run the script with the following command:\n    \u003epython gen_graph.py --data \\\u003cNameOfYourDataset\u003e\n3. `edge_features.pt` (optional): The torch tensor that stores the edge features row-wise with shape (num edges, dim edge features). *Note: at least one of `edge_features.pt` or `node_features.pt` must be present.*\n4. `node_features.pt` (optional): The torch tensor that stores the node features row-wise with shape (num nodes, dim node features). *Note: at least one of `edge_features.pt` or `node_features.pt` must be present.*\n5. `labels.csv` (optional): The file that contains node labels for the dynamic node classification task. The CSV should have the header `,node,time,label,ext_roll`, where the columns are the node label index (starting from zero), node index (starting from zero), timestamp, node label, and extrapolation roll (0 for training node labels, 1 for validation node labels, 2 for test node labels). The CSV should be sorted by time in ascending order.\n\n## Configuration Files\n\nWe provide example configuration files for five temporal GNN methods: JODIE, DySAT, TGAT, TGN and APAN. 
The configuration files for single-GPU training are located at `/config/`, while the configuration files for multi-GPU training are located at `/config/dist/`.\n\nThe provided configuration files have all been tested and work. If you want to use your own network architecture, please refer to `/config/readme.yml` for the meaning of each entry in the YAML configuration file. As our framework is still under development, it is possible that some combinations of the configurations will lead to bugs.\n\n## Run\n\nCurrently, our framework only supports the extrapolation setting (inference for the future).\n\n### Single GPU Link Prediction\n\u003epython train.py --data \\\u003cNameOfYourDataset\u003e --config \\\u003cPathToConfigFile\u003e\n\n### MultiGPU Link Prediction\n\u003epython -m torch.distributed.launch --nproc_per_node=\\\u003cNumberOfGPUs+1\u003e train_dist.py --data \\\u003cNameOfYourDataset\u003e --config \\\u003cPathToConfigFile\u003e --num_gpus \\\u003cNumberOfGPUs\u003e\n\n### Dynamic Node Classification\n\nCurrently, TGL only supports performing dynamic node classification using the dynamic node embeddings generated in link prediction. 
\n\nFor single-GPU models, run directly:\n\u003epython train_node.py --data \\\u003cNameOfYourDATA\u003e --config \\\u003cPathToConfigFile\u003e --model \\\u003cPathToSavedModel\u003e\n\nFor multi-GPU models, you first need to generate the dynamic node embeddings:\n\u003epython -m torch.distributed.launch --nproc_per_node=\\\u003cNumberOfGPUs+1\u003e extract_node_dist.py --data \\\u003cNameOfYourDataset\u003e --config \\\u003cPathToConfigFile\u003e --num_gpus \\\u003cNumberOfGPUs\u003e --model \\\u003cPathToSavedModel\u003e\n\nAfter generating the node embeddings for multi-GPU models, run:\n\u003epython train_node.py --data \\\u003cNameOfYourDATA\u003e --model \\\u003cPathToSavedModel\u003e\n\n## Security\n\nSee [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.\n\n## Cite\n\nIf you use TGL in a scientific publication, we would appreciate citations to the following paper:\n\n```\n@article{zhou2022tgl,\n    title={{TGL}: A General Framework for Temporal GNN Training on Billion-Scale Graphs},\n    author={Zhou, Hongkuan and Zheng, Da and Nisa, Israt and Ioannidis, Vasileios and Song, Xiang and Karypis, George},\n    year = {2022},\n    journal = {Proc. VLDB Endow.},\n    volume = {15},\n    number = {8},\n}\n```\n\n## License\n\nThis project is licensed under the Apache-2.0 License.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famazon-science%2Ftgl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Famazon-science%2Ftgl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famazon-science%2Ftgl/lists"}