{"id":28700502,"url":"https://github.com/deepgraphlearning/confgf","last_synced_at":"2025-06-26T05:02:43.282Z","repository":{"id":47589750,"uuid":"374559686","full_name":"DeepGraphLearning/ConfGF","owner":"DeepGraphLearning","description":"Implementation of Learning Gradient Fields for Molecular Conformation Generation (ICML 2021).","archived":false,"fork":false,"pushed_at":"2021-09-29T02:25:38.000Z","size":3456,"stargazers_count":166,"open_issues_count":8,"forks_count":37,"subscribers_count":9,"default_branch":"main","last_synced_at":"2025-06-14T11:08:08.936Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DeepGraphLearning.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-06-07T06:33:00.000Z","updated_at":"2025-05-25T04:24:50.000Z","dependencies_parsed_at":"2022-08-30T21:21:30.250Z","dependency_job_id":null,"html_url":"https://github.com/DeepGraphLearning/ConfGF","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/DeepGraphLearning/ConfGF","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DeepGraphLearning%2FConfGF","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DeepGraphLearning%2FConfGF/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DeepGraphLearning%2FConfGF/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DeepGraphLearning%2FConfGF/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DeepGraphLearning","download_url":"https:
//codeload.github.com/DeepGraphLearning/ConfGF/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DeepGraphLearning%2FConfGF/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262003930,"owners_count":23243347,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-14T11:08:07.421Z","updated_at":"2025-06-26T05:02:43.254Z","avatar_url":"https://github.com/DeepGraphLearning.png","language":"Python","readme":"![ConfGF](assets/logo.png)\n\n----------------------------\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://github.com/DeepGraphLearning/ConfGF/blob/main/LICENSE)\n\n[[PDF]](https://arxiv.org/abs/2105.03902) | [[Slides]](https://drive.google.com/file/d/1wA5Qu98dYPmEdoGt1QQcYfoUJG3Ndnec/view?usp=sharing)\n\n\nThe official implementation of Learning Gradient Fields for Molecular Conformation Generation (ICML 2021 **Long talk**)  \n\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/sampling.png\" /\u003e \n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/demo.gif\" width=\"300\"\u003e\n\u003c/p\u003e\n\n## Installation\n\n### Install via Conda (Recommended)\n\n\n```bash\n# Clone the environment\nconda env create -f env.yml\n\n# Activate the environment\nconda activate confgf\n\n# Install Library\ngit clone https://github.com/DeepGraphLearning/ConfGF.git\ncd ConfGF\npython setup.py install\n```\n\n### Install Manually\n\n```bash\n# Create conda environment\nconda create -n confgf python=3.7\n\n# Activate the 
environment\nconda activate confgf\n\n# Install packages\nconda install -y -c pytorch pytorch=1.7.0 torchvision torchaudio cudatoolkit=10.2\nconda install -y -c rdkit rdkit==2020.03.2.0\nconda install -y scikit-learn pandas decorator ipython networkx tqdm matplotlib\nconda install -y -c conda-forge easydict\npip install pyyaml\n\n# Install PyTorch Geometric\npip install torch-scatter -f https://pytorch-geometric.com/whl/torch-1.7.0+cu102.html\npip install torch-sparse -f https://pytorch-geometric.com/whl/torch-1.7.0+cu102.html\npip install torch-cluster -f https://pytorch-geometric.com/whl/torch-1.7.0+cu102.html\npip install torch-spline-conv -f https://pytorch-geometric.com/whl/torch-1.7.0+cu102.html\npip install torch-geometric==1.6.3\n\n# Install Library\ngit clone https://github.com/DeepGraphLearning/ConfGF.git\ncd ConfGF\npython setup.py install\n```\n\n\n## Dataset \n### Official Dataset\nThe official raw GEOM dataset is available [[here]](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/JNGTDF).\n\n### Preprocessed dataset\nWe provide the preprocessed datasets (GEOM, ISO17) in a [[Google Drive folder]](https://drive.google.com/drive/folders/10dWaj5lyMY0VY4Zl0zDPCa69cuQUGb-6?usp=sharing).\nFor the ISO17 dataset, we use the default split of [[GraphDG]](https://github.com/gncs/graphdg).\n\n### Prepare your own GEOM dataset from scratch (optional)\nDownload the raw GEOM dataset and unpack it.\n\n```bash\ntar xvf ~/rdkit_folder.tar.gz -C ~/GEOM\n```\n\nPreprocess the raw GEOM dataset.\n\n```bash\npython script/process_GEOM_dataset.py --base_path GEOM --dataset_name qm9 --confmin 50 --confmax 500\npython script/process_GEOM_dataset.py --base_path GEOM --dataset_name drugs --confmin 50 --confmax 100\n```\n\nThe final folder structure will look like this: \n\n```\nGEOM\n|___rdkit_folder  # raw dataset\n|   |___qm9 # raw qm9 dataset\n|   |___drugs # raw drugs dataset\n|   |___summary_drugs.json\n|   |___summary_qm9.json\n|   \n|___qm9_processed\n|   
|___train_data_40k.pkl\n|   |___val_data_5k.pkl\n|   |___test_data_200.pkl\n|   \n|___drugs_processed\n|   |___train_data_39k.pkl\n|   |___val_data_5k.pkl\n|   |___test_data_200.pkl\n|\niso17_processed\n|___iso17_split-0_train_processed.pkl\n|___iso17_split-0_test_processed.pkl\n|\n...\n```\n\n## Training\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/training.png\" /\u003e \n\u003c/p\u003e\n\nAll hyper-parameters and training details are provided in config files (`./config/*.yml`); feel free to tune these parameters.\n\nYou can train the model with the following commands:\n\n```bash\npython -u script/train.py --config_path ./config/qm9_default.yml\npython -u script/train.py --config_path ./config/drugs_default.yml\npython -u script/train.py --config_path ./config/iso17_default.yml\n```\n\nModel checkpoints are saved to the directory specified in the config file.\n\n## Generation\n\nWe provide the checkpoints of three trained models, i.e., `qm9_default`, `drugs_default` and `iso17_default`, in a [[Google Drive folder]](https://drive.google.com/drive/folders/10dWaj5lyMY0VY4Zl0zDPCa69cuQUGb-6?usp=sharing).\n\nYou can generate conformations of a molecule by feeding its SMILES into the model:\n\n```bash\npython -u script/gen.py --config_path ./config/qm9_default.yml --generator ConfGF --smiles c1ccccc1\npython -u script/gen.py --config_path ./config/qm9_default.yml --generator ConfGFDist --smiles c1ccccc1\n```\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/benzene-crop.png\" width=\"300\"\u003e\n\u003c/p\u003e\n\nHere we use the models trained on `GEOM-QM9` to generate conformations for benzene. The `--generator` argument selects the type of generator, either `ConfGF` or `ConfGFDist`. 
See the ablation study (Table 5) in the original paper for more details.\n\nYou can also generate conformations for an entire test set.\n```bash\npython -u script/gen.py --config_path ./config/qm9_default.yml --generator ConfGF \\\n                        --start 0 --end 200\n\npython -u script/gen.py --config_path ./config/qm9_default.yml --generator ConfGFDist \\\n                        --start 0 --end 200\n\npython -u script/gen.py --config_path ./config/drugs_default.yml --generator ConfGF \\\n                        --start 0 --end 200\n\npython -u script/gen.py --config_path ./config/drugs_default.yml --generator ConfGFDist \\\n                        --start 0 --end 200\n```\nHere `--start` and `--end` indicate the range of the test set to use. All hyper-parameters related to generation can be set in the config files.\n\nConformations of some drug-like molecules generated by ConfGF are provided below.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/drug_samples.png\" /\u003e \n\u003c/p\u003e\n\n## Get Results\nThe results of all benchmark tasks can be calculated from the generated conformations.\n\nWe report the results of each task in the following tables. **Results of `ConfGF` and `ConfGFDist` are re-evaluated with the current code base, which successfully reproduces the results reported in the original paper. Results of other models are taken directly from the original paper.**\n\n### Task 1. 
Conformation Generation\n\nThe COV and MAT scores on the GEOM datasets can be calculated using the following commands:\n\n```bash\npython -u script/get_task1_results.py --input dir_of_QM9_samples --core 10 --threshold 0.5\n\npython -u script/get_task1_results.py --input dir_of_Drugs_samples --core 10 --threshold 1.25\n```\n\n\n\n\nTable: COV and MAT scores on GEOM-QM9\n\n\n|    QM9     | COV-Mean (%) | COV-Median (%) | MAT-Mean (Å) | MAT-Median (Å) |\n| :--------: | :----------: | :------------: | :----------------------------------: | :------------------------------------: |\n| **ConfGF** |  **91.06**   |   **95.76**    |              **0.2649**              |               **0.2668**               |\n| **ConfGFDist** |    85.37     |     88.59      |                0.3435                |                 0.3548                 |\n|    CGCF    |    78.05     |     82.48      |                0.4219                |                 0.3900                 |\n|  GraphDG   |    73.33     |     84.21      |                0.4245                |                 0.3973                 |\n|   CVGAE    |     0.09     |      0.00      |                1.6713                |                 1.6088                 |\n|   RDKit    |    83.26     |     90.78      |                0.3447                |                 0.2935                 |\n\n\n\nTable: COV and MAT scores on GEOM-Drugs\n\n\n\n|   Drugs    | COV-Mean (%) | COV-Median (%) | MAT-Mean (Å) | MAT-Median (Å) |\n| :--------: | :----------: | :------------: | :----------------------------------: | :------------------------------------: |\n| **ConfGF** |  **62.54**   |   **71.32**    |              **1.1637**              |               **1.1617**               |\n| **ConfGFDist** |    49.96     |     48.12      |                1.2845                |                 1.2827                 |\n|    CGCF    |    53.96     |     57.06      |                1.2487                |                 
1.2247                 |\n|  GraphDG   |     8.27     |      0.00      |                1.9722                |                 1.9845                 |\n|   CVGAE    |     0.00     |      0.00      |                3.0702                |                 2.9937                 |\n|   RDKit    |    60.91     |     65.70      |                1.2026                |                 1.1252                 |\n\n\n\n### Task 2. Distributions Over Distances\n\nThe MMD metrics on the ISO17 dataset can be calculated using the following command:\n\n```bash\npython -u script/get_task2_results.py --input dir_of_ISO17_samples\n```\n\n\n\nTable: Distributions over distances\n\n|   Method   | Single-Mean | Single-Median | Pair-Mean  | Pair-Median | All-Mean   | All-Median |\n| :--------: | :---------: | :-----------: | :--------: | :---------: | :--------: | :--------: |\n| **ConfGF** |   0.3430    |    0.2473     |   0.4195   |   0.3081    | **0.5432** | **0.3868** |\n| **ConfGFDist** | **0.3348**  |    0.2011     | **0.4080** | **0.2658**  | 0.5821     | 0.3974     |\n|    CGCF    |   0.4490    |  **0.1786**   |   0.5509   |   0.2734    | 0.8703     | 0.4447     |\n|  GraphDG   |   0.7645    |    0.2346     |   0.8920   |   0.3287    | 1.1949     | 0.5485     |\n|   CVGAE    |   4.1789    |    4.1762     |   4.9184   |   5.1856    | 5.9747     | 5.9928     |\n|   RDKit    |   3.4513    |    3.1602     |   3.8452   |   3.6287    | 4.0866     | 3.7519     |\n\n\n\n\n## Visualizing molecules with PyMol\n\n### Start Setup\n\n1. `pymol -R`\n2. `Display - Background - White`\n3. `Display - Color Space - CMYK`\n4. `Display - Quality - Maximal Quality`\n5. `Display Grid`\n   1. by object:  use `set grid_slot, int, mol_name` to put the molecule into the corresponding slot\n   2. by state: align all conformations in a single slot\n   3. by object-state: align all conformations and put them in separate slots. (`grid_slot` doesn't work!)\n6. 
`Setting - Line and Sticks - Ball and Stick on - Ball and Stick ratio: 1.5`\n7. `Setting - Line and Sticks - Stick radius: 0.2 - Stick Hydrogen Scale: 1.0`\n\n### Show Molecule\n\n1. To show molecules\n\n   1. `hide everything`\n   2. `show sticks`\n\n2. To align molecules: `align name1, name2`\n\n3. Convert an RDKit mol to PyMol\n\n   ```python\n   from rdkit import Chem\n   from rdkit.Chem import PyMol\n   # Connects to the PyMol server started with `pymol -R`\n   v = PyMol.MolViewer()\n   rdmol = Chem.MolFromSmiles('C')\n   v.ShowMol(rdmol, name='mol')\n   v.SaveFile('mol.pkl')\n   ```\n\n### Make the trajectory for Langevin dynamics\n1. Load a sequence of PyMol objects named `traj*.pkl` into PyMol, where `traji.pkl` is the `i-th` conformation in the trajectory.\n2. Join states: `join_states mol, traj*, 0`\n3. Delete the objects that are no longer needed: `delete traj*`\n4. `Movie - Program - State Loop - Full Speed`\n5. Export the movie to a sequence of PNG files: `File - Export Movie As - PNG Images`\n6. Use Photoshop to convert the PNG sequence to a GIF with a transparent background.\n\n\n## Citation\nPlease consider citing the following paper if you find our code helpful. Thank you!\n```\n@inproceedings{shi*2021confgf,\ntitle={Learning Gradient Fields for Molecular Conformation Generation},\nauthor={Shi, Chence and Luo, Shitong and Xu, Minkai and Tang, Jian},\nbooktitle={International Conference on Machine Learning},\nyear={2021}\n}\n```\n\n## Contact\nChence Shi (chence.shi@umontreal.ca)\n\nShitong Luo (luost26@gmail.com)","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeepgraphlearning%2Fconfgf","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdeepgraphlearning%2Fconfgf","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeepgraphlearning%2Fconfgf/lists"}