{"id":13567094,"url":"https://github.com/snap-stanford/ogb","last_synced_at":"2025-05-13T15:12:48.271Z","repository":{"id":37602844,"uuid":"223495162","full_name":"snap-stanford/ogb","owner":"snap-stanford","description":"Benchmark datasets, data loaders, and evaluators for graph machine learning","archived":false,"fork":false,"pushed_at":"2025-05-02T20:29:57.000Z","size":4449,"stargazers_count":1998,"open_issues_count":37,"forks_count":403,"subscribers_count":40,"default_branch":"master","last_synced_at":"2025-05-02T21:31:32.159Z","etag":null,"topics":["datasets","deep-learning","graph-machine-learning","graph-neural-networks"],"latest_commit_sha":null,"homepage":"https://ogb.stanford.edu","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/snap-stanford.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-11-22T22:13:57.000Z","updated_at":"2025-05-02T20:30:01.000Z","dependencies_parsed_at":"2023-02-15T13:15:45.442Z","dependency_job_id":"5dc5b304-993e-46d2-8f50-d57bc5f03835","html_url":"https://github.com/snap-stanford/ogb","commit_stats":{"total_commits":598,"total_committers":34,"mean_commits":17.58823529411765,"dds":0.403010033444816,"last_synced_commit":"f631af76359c9687b2fe60905557bbb241916258"},"previous_names":[],"tags_count":16,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/snap-stanford%2Fogb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/snap-stanford%2Fogb/tags","releases_url":"https://repos.ecosyste.ms/api/
v1/hosts/GitHub/repositories/snap-stanford%2Fogb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/snap-stanford%2Fogb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/snap-stanford","download_url":"https://codeload.github.com/snap-stanford/ogb/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253969266,"owners_count":21992264,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["datasets","deep-learning","graph-machine-learning","graph-neural-networks"],"created_at":"2024-08-01T13:02:23.442Z","updated_at":"2025-05-13T15:12:43.260Z","avatar_url":"https://github.com/snap-stanford.png","language":"Python","funding_links":[],"categories":["Python","图数据处理","Graph","Evaluation and Monitoring"],"sub_categories":["Others"],"readme":"\u003cp align='center'\u003e\n  \u003cimg width='40%' src='https://snap-stanford.github.io/ogb-web/assets/img/OGB_rectangle.png' /\u003e\n\u003c/p\u003e\n\n--------------------------------------------------------------------------------\n\n[![PyPI](https://img.shields.io/pypi/v/ogb)](https://pypi.org/project/ogb/)\n[![License](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/snap-stanford/ogb/blob/master/LICENSE)\n\n## Overview\n\nThe Open Graph Benchmark (OGB) is a collection of benchmark datasets, data loaders, and evaluators for graph machine learning. 
Datasets cover a variety of graph machine learning tasks and real-world applications.\nThe OGB data loaders are fully compatible with popular graph deep learning frameworks, including [PyTorch Geometric](https://pytorch-geometric.readthedocs.io/en/latest/) and [Deep Graph Library (DGL)](https://www.dgl.ai/). They provide automatic dataset downloading, standardized dataset splits, and unified performance evaluation.\n\n\u003cp align='center'\u003e\n  \u003cimg width='80%' src='https://snap-stanford.github.io/ogb-web/assets/img/ogb_overview.png' /\u003e\n\u003c/p\u003e\n\nOGB aims to provide graph datasets that cover important graph machine learning tasks, diverse dataset scales, and rich domains.\n\n**Graph ML tasks:** We cover three fundamental graph machine learning tasks: prediction at the level of nodes, links, and graphs.\n\n**Diverse scale:** Small-scale graph datasets can be processed on a single GPU, while medium- and large-scale graphs might require multiple GPUs or clever sampling/partitioning techniques.\n\n**Rich domains:** Graph datasets come from diverse domains, ranging from scientific domains to social/information networks, and also include heterogeneous knowledge graphs. 
\n\n\u003cp align='center'\u003e\n  \u003cimg width='70%' src='https://snap-stanford.github.io/ogb-web/assets/img/dataset_overview.png' /\u003e\n\u003c/p\u003e\n\nOGB is an ongoing effort, and we plan to expand our coverage in the future.\n\n## Installation\nYou can install OGB using Python's package manager `pip`.\n**If you have previously installed ogb, please make sure you update the version to 1.3.6.**\nThe release notes are available [here](https://github.com/snap-stanford/ogb/releases/tag/1.3.6).\n\n#### Requirements\n - Python\u003e=3.6\n - PyTorch\u003e=1.6\n - DGL\u003e=0.5.0 or torch-geometric\u003e=2.0.2\n - NumPy\u003e=1.16.0\n - pandas\u003e=0.24.0\n - urllib3\u003e=1.24.0\n - scikit-learn\u003e=0.20.0\n - outdated\u003e=0.2.0\n\n#### Pip install\nThe recommended way to install OGB is with `pip`:\n```bash\npip install ogb\n```\n\nThen check the installed version:\n```bash\npython -c \"import ogb; print(ogb.__version__)\"\n# This should print \"1.3.6\". Otherwise, please update the version by\npip install -U ogb\n```\n\n\n#### From source\nYou can also install OGB from source. This is recommended if you want to contribute to OGB.\n```bash\ngit clone https://github.com/snap-stanford/ogb\ncd ogb\npip install -e .\n```\n\n## Package Usage\nWe highlight two key features of OGB, namely, (1) easy-to-use data loaders, and (2) standardized evaluators.\n#### (1) Data loaders\nWe provide easy-to-use PyTorch Geometric and DGL data loaders. They handle dataset downloading as well as standardized dataset splitting.\nBelow, using PyTorch Geometric, a few lines of code are sufficient to prepare and split the dataset! 
The DGL data loaders offer the same convenience!\n```python\nfrom ogb.graphproppred import PygGraphPropPredDataset\nfrom torch_geometric.loader import DataLoader\n\n# Download and process data at './dataset/ogbg_molhiv/'\ndataset = PygGraphPropPredDataset(name='ogbg-molhiv')\n\n# Standardized train/valid/test split indices\nsplit_idx = dataset.get_idx_split()\ntrain_loader = DataLoader(dataset[split_idx['train']], batch_size=32, shuffle=True)\nvalid_loader = DataLoader(dataset[split_idx['valid']], batch_size=32, shuffle=False)\ntest_loader = DataLoader(dataset[split_idx['test']], batch_size=32, shuffle=False)\n```\n\n#### (2) Evaluators\nWe also provide standardized evaluators for easy evaluation and comparison of different methods. The evaluator takes `input_dict` (a dictionary whose format is specified in `evaluator.expected_input_format`) as input, and returns a dictionary storing the performance metric appropriate for the given dataset.\nThe standardized evaluation protocol allows researchers to reliably compare their methods.\n```python\nfrom ogb.graphproppred import Evaluator\n\nevaluator = Evaluator(name='ogbg-molhiv')\n# You can learn the input and output format specification of the evaluator as follows.\n# print(evaluator.expected_input_format)\n# print(evaluator.expected_output_format)\n# y_true and y_pred are placeholders for your ground-truth labels and model predictions;\n# for this dataset, both are arrays of shape (num_graphs, num_tasks).\ninput_dict = {'y_true': y_true, 'y_pred': y_pred}\nresult_dict = evaluator.eval(input_dict)  # E.g., {'rocauc': 0.7321}\n```\n\n## Citing OGB / OGB-LSC\nIf you use OGB or [OGB-LSC](https://ogb.stanford.edu/docs/lsc/) datasets in your work, please cite our papers (BibTeX below).\n```\n@article{hu2020ogb,\n  title={Open Graph Benchmark: Datasets for Machine Learning on Graphs},\n  author={Hu, Weihua and Fey, Matthias and Zitnik, Marinka and Dong, Yuxiao and Ren, Hongyu and Liu, Bowen and Catasta, Michele and Leskovec, Jure},\n  journal={arXiv preprint arXiv:2005.00687},\n  year={2020}\n}\n```\n```\n@article{hu2021ogblsc,\n  title={OGB-LSC: A Large-Scale Challenge for Machine Learning on Graphs},\n  
author={Hu, Weihua and Fey, Matthias and Ren, Hongyu and Nakata, Maho and Dong, Yuxiao and Leskovec, Jure},\n  journal={arXiv preprint arXiv:2103.09430},\n  year={2021}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsnap-stanford%2Fogb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsnap-stanford%2Fogb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsnap-stanford%2Fogb/lists"}