{"id":24735607,"url":"https://github.com/tony-y/cgnn","last_synced_at":"2025-10-29T07:06:13.028Z","repository":{"id":64594787,"uuid":"181129678","full_name":"Tony-Y/cgnn","owner":"Tony-Y","description":"Crystal Graph Neural Networks","archived":false,"fork":false,"pushed_at":"2024-04-20T03:15:25.000Z","size":2169,"stargazers_count":108,"open_issues_count":0,"forks_count":24,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-10-18T04:51:34.587Z","etag":null,"topics":["colab-notebook","data-mining","deep-learning","graph-convolutional-networks","graph-neural-networks","graph-theory","materials-science","neural-networks","oqmd","pytorch"],"latest_commit_sha":null,"homepage":"https://tony-y.github.io/cgnn/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Tony-Y.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2019-04-13T06:19:07.000Z","updated_at":"2025-08-17T20:16:48.000Z","dependencies_parsed_at":"2024-04-20T04:23:52.107Z","dependency_job_id":"931b4eb3-30be-4c1a-9e05-cdb1da683f69","html_url":"https://github.com/Tony-Y/cgnn","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/Tony-Y/cgnn","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Tony-Y%2Fcgnn","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Tony-Y%2Fcgnn/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Tony-Y%2Fcgnn/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Tony-Y%2Fcgnn/manif
ests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Tony-Y","download_url":"https://codeload.github.com/Tony-Y/cgnn/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Tony-Y%2Fcgnn/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":281577702,"owners_count":26524886,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-29T02:00:06.901Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["colab-notebook","data-mining","deep-learning","graph-convolutional-networks","graph-neural-networks","graph-theory","materials-science","neural-networks","oqmd","pytorch"],"created_at":"2025-01-27T20:45:50.930Z","updated_at":"2025-10-29T07:06:13.010Z","avatar_url":"https://github.com/Tony-Y.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Crystal Graph Neural Networks\n[![GitHub release (latest by 
date)](https://img.shields.io/github/v/release/Tony-Y/cgnn)](https://github.com/Tony-Y/cgnn/releases)\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE)\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/crystal-graph-neural-networks-for-data-mining/formation-energy-on-oqmd-v12)](https://paperswithcode.com/sota/formation-energy-on-oqmd-v12?p=crystal-graph-neural-networks-for-data-mining)\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/crystal-graph-neural-networks-for-data-mining/band-gap-on-oqmd-v12)](https://paperswithcode.com/sota/band-gap-on-oqmd-v12?p=crystal-graph-neural-networks-for-data-mining)\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/crystal-graph-neural-networks-for-data-mining/total-magnetization-on-oqmd-v12)](https://paperswithcode.com/sota/total-magnetization-on-oqmd-v12?p=crystal-graph-neural-networks-for-data-mining)\n\n[![Develop branch](https://img.shields.io/badge/develop-v1.1-red)](https://github.com/Tony-Y/cgnn/tree/dev_v1.1)\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/oqm9hk-a-large-scale-graph-dataset-for/formation-energy-on-oqm9hk)](https://paperswithcode.com/sota/formation-energy-on-oqm9hk?p=oqm9hk-a-large-scale-graph-dataset-for)\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/oqm9hk-a-large-scale-graph-dataset-for/band-gap-on-oqm9hk)](https://paperswithcode.com/sota/band-gap-on-oqm9hk?p=oqm9hk-a-large-scale-graph-dataset-for)\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/oqm9hk-a-large-scale-graph-dataset-for/total-magnetization-on-oqm9hk)](https://paperswithcode.com/sota/total-magnetization-on-oqm9hk?p=oqm9hk-a-large-scale-graph-dataset-for)\n\n\u003ctable\u003e\n\u003ctr\u003e\n\u003cth\u003e\n\u003ca href=\"https://www.youtube.com/watch?v=ghzHOLm0FCE\"\u003e\u003cimg 
src=\"http://img.youtube.com/vi/ghzHOLm0FCE/mqdefault.jpg\" alt=\"iCrucible Demo\"/\u003e\u003c/a\u003e\n\u003c/th\u003e\n\u003cth\u003e\n\u003ci\u003e[Demo Video]\u003cbr\u003eThe iOS app, iCrucible, uses the CGNN technology to discover new compounds.\u003c/i\u003e\n\u003c/th\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\nThis repository contains the original implementation of the CGNN architectures described in the paper [\"Crystal Graph Neural Networks for Data Mining in Materials Science\"](https://storage.googleapis.com/rimcs_cgnn/cgnn_matsci_May_27_2019.pdf).\n\n\u003cp align=\"center\"\u003e\u003cimg src=\"figs/SiO2.png\" alt=\"Logo\" width=\"200\"/\u003e\u003c/p\u003e\n\n[Gilmer, *et al.*](#Gilmer2017) investigated various graph neural networks for predicting molecular properties, and proposed the neural message passing framework that unifies them. [Xie, *et al.*](#Xie2018) studied graph neural networks to predict bulk properties of crystalline materials, and used a multi-graph named a crystal graph. [Schütt, *et al.*](#Scheutt2018) proposed a deep learning architecture with an implicit graph neural network not only to predict material properties, but also to perform molecular dynamics simulations. These studies use bond distances as features for machine learning. 
In contrast, the CGNN architectures use no bond distances to predict bulk properties at equilibrium states of crystalline materials at 0 K and 0 Pa, such as the formation energy, the unit cell volume, the band gap, and the total magnetization.\n\nNote that the crystal graph represents only a repeating unit of [a periodic graph or a crystal net](https://en.wikipedia.org/wiki/Periodic_graph_(crystallography)) in crystallography.\n\n## Requirements\n\n* Python 3.7\n* PyTorch 1.1+\n* Pandas\n* Matplotlib (necessary for plotting scripts)\n\n## Installation\n\n```\ngit clone https://github.com/Tony-Y/cgnn.git\nCGNN_HOME=`pwd`/cgnn\n```\n\n## Usage\n\nThe user guide on [this GitHub Pages site](https://Tony-Y.github.io/cgnn/) provides a complete explanation of the CGNN architectures and a description of the program options. Usage examples are contained in the directory `cgnn/examples`.\n\n### Dataset Files\nThe CGNN code needs the following files:\n\n* `targets.csv` contains all target values.\n* `graph_data.npz` contains all node and neighbor lists of graphs.\n* `config.json` defines node vectors.\n* `split.json` defines data splitting (train/val/test).\n\n#### Target Values\n`targets.csv` must have a header row consisting of `name` and target names such as `formation_energy_per_atom`, `volume_deviation`, `band_gap`, and `magnetization_per_atom`. The `name` column must store an identifier, such as an ID number or string, that is unique to each example in the dataset. The target columns must store numerical values; `NaN` and `None` are not allowed.\n\n#### Crystal Graphs\nYou can create a graph data file (`graph_data.npz`) as follows:\n```python\nimport numpy as np\n\ngraphs = dict()\nfor name, structure in dataset:\n    nodes = ... # A species-index list\n    neighbors = ... 
# A list of neighbor lists\n    graphs[name] = (nodes, neighbors)\nnp.savez_compressed('graph_data.npz', graph_dict=graphs)\n```\nwhere `name` is the same identifier as in `targets.csv` for each example.\n\n`tools/mp_graph.py` creates graph data from structures given in the Materials Project structure format. This tool is used when the OQMD dataset is compiled.\n\n#### Node Vectors\nYou can create a configuration file (`config.json`) using the one-hot encoding as follows:\n\n```python\nimport json\nimport numpy as np\n\nn_species = ... # The number of node species\nconfig = dict()\nconfig[\"node_vectors\"] = np.eye(n_species, n_species).tolist()\nwith open(\"config.json\", 'w') as f:\n    json.dump(config, f)\n```\n\n#### Data Splitting\nYou can create a data-splitting file (`split.json`) as follows:\n\n```python\nimport json\n\nsplit = dict()\nsplit[\"train\"] = ... # The index list for the training set\nsplit[\"val\"] = ... # The index list for the validation set\nsplit[\"test\"] = ... # The index list for the testing set\nwith open(\"split.json\", 'w') as f:\n    json.dump(split, f)\n```\nwhere each index, which must be a non-negative integer, is a row label of the data frame into which the CSV file `targets.csv` is read.\n\n### Training\nA training script example:\n\n```shell\nNodeFeatures=... # The size of a node vector\nDATASET=${CGNN_HOME}/YourDataset\npython ${CGNN_HOME}/src/cgnn.py \\\n  --num_epochs 100 \\\n  --batch_size 512 \\\n  --lr 0.001 \\\n  --n_node_feat ${NodeFeatures} \\\n  --n_hidden_feat 64 \\\n  --n_graph_feat 128 \\\n  --n_conv 3 \\\n  --n_fc 2 \\\n  --dataset_path ${DATASET} \\\n  --split_file ${DATASET}/split.json \\\n  --target_name formation_energy_per_atom \\\n  --milestones 80 \\\n  --gamma 0.1\n```\n\nYou can see the training history using `tools/plot_history.py`, which plots the root mean squared errors (RMSEs) and the mean absolute errors (MAEs) for the training and validation sets. 
The values of the loss (the mean squared error, MSE) and the MAE are written to `history.csv` for every epoch.\n\n```shell\npython ${CGNN_HOME}/tools/plot_history.py\n```\n\nAfter training ends, predictions for the testing set are written to `test_predictions.csv`. You can compare the predictions with the target values using `tools/plot_test.py`.\n\n```shell\npython ${CGNN_HOME}/tools/plot_test.py\n```\n\n### Prediction\nPredictions for new data are made using the testing-only mode of the program. First, prepare a new dataset with a testing set that includes all examples to be predicted. The prediction configuration must have all the same parameters as the training configuration except for the total number of epochs, which must be zero for testing only. In addition, you must specify the model to be loaded using `--load_model YourModel`.\n\n```shell\nNodeFeatures=... # The size of a node vector (same as in training)\nMODEL=... # The model to be loaded\nDATASET=${CGNN_HOME}/NewDataset\npython ${CGNN_HOME}/src/cgnn.py \\\n  --num_epochs 0 \\\n  --batch_size 512 \\\n  --lr 0.001 \\\n  --n_node_feat ${NodeFeatures} \\\n  --n_hidden_feat 64 \\\n  --n_graph_feat 128 \\\n  --n_conv 3 \\\n  --n_fc 2 \\\n  --dataset_path ${DATASET} \\\n  --split_file ${DATASET}/split.json \\\n  --target_name formation_energy_per_atom \\\n  --milestones 80 \\\n  --gamma 0.1 \\\n  --load_model ${MODEL}\n```\n\n## The Open Quantum Materials Database\nThe OQMD v1.2 contains 563k entries, and is available from [the OQMD site](http://oqmd.org). The detailed setup of the database is described in the README in the directory `cgnn/OQMD`. Alternatively, you may use the OQMD v1.2 dataset available at [this link](https://doi.org/10.5281/zenodo.7118055). A [data loading tutorial](https://github.com/Tony-Y/oqmd-v1.2-dataset-for-cgnn/blob/main/OQMD_v1_2_dataset_for_CGNN.ipynb) is also available. 
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Tony-Y/oqmd-v1.2-dataset-for-cgnn/blob/main/OQMD_v1_2_dataset_for_CGNN.ipynb)\n\n*Note that there is an abnormal entry in this dataset. The information is available at [this page](https://github.com/Tony-Y/oqmd-v1.2-dataset-for-cgnn#abnormal-entry).*\n\n## Citation\nWhen you mention this work, please cite [the CGNN paper](https://storage.googleapis.com/rimcs_cgnn/cgnn_matsci_May_27_2019.pdf):\n```\n@techreport{yamamoto2019cgnn,\n  Author = {Takenori Yamamoto},\n  Title = {Crystal Graph Neural Networks for Data Mining in Materials Science},\n  Address = {Yokohama, Japan},\n  Institution = {Research Institute for Mathematical and Computational Sciences, LLC},\n  Year = {2019},\n  Note = {https://github.com/Tony-Y/cgnn}\n}\n```\n\n## References\n\n1. \u003ca name=\"Gilmer2017\"\u003eJustin Gilmer\u003c/a\u003e, *et al.*, \"Neural Message Passing for Quantum Chemistry\", *Proceedings of the 34th International Conference on Machine Learning* (2017) [arXiv](https://arxiv.org/abs/1704.01212) [GitHub](https://github.com/brain-research/mpnn)\n2. \u003ca name=\"Xie2018\"\u003eTian Xie\u003c/a\u003e, *et al.*, \"Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties\", *Phys. Rev. Lett.* **120**, 145301 (2018) [DOI](https://dx.doi.org/10.1103%2FPhysRevLett.120.145301) [arXiv](https://arxiv.org/abs/1710.10324) [GitHub](https://github.com/txie-93/cgcnn)\n3. \u003ca name=\"Scheutt2018\"\u003eKristof T. Schütt\u003c/a\u003e, *et al.*, \"SchNet - a deep learning architecture for molecules and materials\", *J. Chem. 
Phys.* **148**, 241722 (2018) [DOI](https://doi.org/10.1063/1.5019779) [arXiv](https://arxiv.org/abs/1712.06113) [GitHub](https://github.com/atomistic-machine-learning/schnetpack)\n\n## License\n\nApache License 2.0\n\n(c) 2019-2024 Takenori Yamamoto\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftony-y%2Fcgnn","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftony-y%2Fcgnn","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftony-y%2Fcgnn/lists"}