{"id":13751739,"url":"https://github.com/MolecularAI/GraphINVENT","last_synced_at":"2025-05-09T18:32:37.482Z","repository":{"id":37583515,"uuid":"289293101","full_name":"MolecularAI/GraphINVENT","owner":"MolecularAI","description":"Graph neural networks for molecular design.","archived":true,"fork":false,"pushed_at":"2023-03-11T11:55:32.000Z","size":22749,"stargazers_count":352,"open_issues_count":10,"forks_count":74,"subscribers_count":15,"default_branch":"master","last_synced_at":"2024-05-06T00:03:26.374Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/MolecularAI.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2020-08-21T14:45:19.000Z","updated_at":"2024-04-28T04:50:51.000Z","dependencies_parsed_at":"2024-01-13T12:35:15.612Z","dependency_job_id":null,"html_url":"https://github.com/MolecularAI/GraphINVENT","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MolecularAI%2FGraphINVENT","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MolecularAI%2FGraphINVENT/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MolecularAI%2FGraphINVENT/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MolecularAI%2FGraphINVENT/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/MolecularAI","download_url":"https://codeload.github.com/MolecularAI/GraphINVENT/tar.gz/refs/heads/m
aster","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":213786220,"owners_count":15638379,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-03T09:00:53.713Z","updated_at":"2024-08-03T09:02:44.417Z","avatar_url":"https://github.com/MolecularAI.png","language":"Python","funding_links":[],"categories":["分子","Ranked by starred repositories","Generative Molecular Design"],"sub_categories":["网络服务_其他"],"readme":"**Please note: this repository is no longer being maintained.**\n\n# GraphINVENT\n\n![cover image](./cover-image.png)\n\n## Description\nGraphINVENT is a platform for graph-based molecular generation using graph neural networks. GraphINVENT uses a tiered deep neural network architecture to probabilistically generate new molecules a single bond at a time. All models implemented in GraphINVENT can quickly learn to build molecules resembling training set molecules without any explicit programming of chemical rules. The models have been benchmarked using the MOSES distribution-based metrics, showing how the best GraphINVENT model compares well with state-of-the-art generative models.\n\n## Updates\nThe following versions of GraphINVENT exist in this repository:\n* v1.0 (and all commits up to here) is the label corresponding to the \"original\" version, and corresponds with the publications below.\n* v2.0 is an outdated version, created March 10, 2021.\n* v3.0 is the latest version, created August 20, 2021.\n\n*20-08-2021*:\n\nLarge update: \n* Added a reinforcement learning framework to allow for fine-tuning models. 
Fine-tuning jobs can now be run using the --job-type \"fine-tune\" flag. \n* An example submission script for fine-tuning jobs was added (`submit-fine-tuning.py`), and the old example submission script was renamed (`submit.py` --\u003e `submit-pre-training.py`).\n* Note: the tutorials have not yet been updated to reflect the changes; this will be done soon, but for now be aware that there may be small discrepancies between what is written in the tutorials and the actual instructions. I will delete this bullet point when I have updated the tutorials.\n\n*26-03-2021*:\n\nSmall update: \n* Pre-trained models created with GraphINVENT v1.0 can now be used with GraphINVENT v2.0.\n\n*10-03-2021*:\n\nThe biggest changes in v2.0 from v1.0 are summarized below:\n* Data preprocessing was updated for readability (now done in `DataProcesser.py`).\n* Graph generation was updated for readability (now done in `Generator.py`), and some bugs related to how implicit Hs and chirality were handled on the GPU were fixed (not used before, despite being available for preprocessing/training).\n* Data analysis code was updated for readability (now done in `Analyzer.py`).\n* The learning rate decay scheme was changed from a custom learning rate scheduler to the OneCycle scheduler (so far, it appears to be working well enough, and with a reduced set of parameters).\n* The code now runs using the latest version of PyTorch (1.8.0); the previous version was running using PyTorch 1.3. 
The environment has correspondingly been updated (and renamed \"GraphINVENT-env\" -\u003e \"graphinvent\").\n* Redundant hyperparameters were removed; additionally, hyperparameters seen not to improve things were removed from `defaults.py`, such as the optimizer weight decay (now just 0.0) and weights initialization (fixed to Xavier uniform now).\n* Some old functions, such as `models.py` and `loss.py`, were consolidated into `Workflow.py`.\n* A validation loss calculation was added to keep track of model training.\n\nAdditionally, minor typos and bugs were corrected, and the docstrings and error messages were updated. Examples of minor bugs/changes:\n* Bug in how the fraction of properly terminated graphs (and the fraction valid of properly terminated graphs) was calculated (wrong function for data type, which led to errors in rare instances).\n* Errors in how analysis histograms were written to TensorBoard; these were also of questionable utility so are now simply removed.\n* Some values (like the \"NLL diff\") were removed, as they were also not found to be useful.\n\nIf you spot any issues (big or small) since the update, please create an issue or a pull request (if you are able to fix it), and we will be happy to make changes.\n\n## Prerequisites\n* Anaconda or Miniconda with Python 3.6 or 3.8.\n* (for GPU-training only) CUDA-enabled GPU.\n\n## Instructions and tutorials\nFor detailed guides on how to use GraphINVENT, see the [tutorials](./tutorials/).\n\n## Examples\nAn example training set is available in [./data/gdb13_1K/](./data/gdb13_1K/). 
It is a small (1K) subset of GDB-13 and is already preprocessed.\n\n## Contributors\n[@rociomer](https://www.github.com/rociomer)\n\n[@rastemo](https://www.github.com/rastemo)\n\n[@edvardlindelof](https://www.github.com/edvardlindelof)\n\n[@sararromeo](https://www.github.com/sararromeo)\n\n[@JuanViguera](https://www.github.com/JuanViguera)\n\n[@psolsson](https://www.github.com/psolsson)\n\n## Contributions\n\nContributions are welcome in the form of issues or pull requests. To report a bug, please submit an issue. Thank you to everyone who has used the code and provided feedback thus far.\n\n\n## References\n### Relevant publications\nIf you use GraphINVENT in your research, please reference our [publication](https://doi.org/10.1088/2632-2153/abcf91).\n\nAdditional details related to the development of GraphINVENT are available in our [technical note](https://doi.org/10.1002/ail2.18). You might find this note useful if you're interested in either exploring different hyperparameters or developing your own generative models.\n\nThe references in BibTeX format are available below:\n\n```\n@article{mercado2020graph,\n  author = \"Rocío Mercado and Tobias Rastemo and Edvard Lindelöf and Günter Klambauer and Ola Engkvist and Hongming Chen and Esben Jannik Bjerrum\",\n  title = \"{Graph Networks for Molecular Design}\",\n  journal = {Machine Learning: Science and Technology},\n  year = {2020},\n  publisher = {IOP Publishing},\n  doi = \"10.1088/2632-2153/abcf91\"\n}\n\n@article{mercado2020practical,\n  author = \"Rocío Mercado and Tobias Rastemo and Edvard Lindelöf and Günter Klambauer and Ola Engkvist and Hongming Chen and Esben Jannik Bjerrum\",\n  title = \"{Practical Notes on Building Molecular Graph Generative Models}\",\n  journal = {Applied AI Letters},\n  year = {2020},\n  publisher = {Wiley Online Library},\n  doi = \"10.1002/ail2.18\"\n}\n```\n\n### Related work\n#### MPNNs\nThe MPNN implementations used in this work were pulled from Edvard Lindelöf's repo in 
October 2018, while he was a master's student in the MAI group. This work is available at\n\nhttps://github.com/edvardlindelof/graph-neural-networks-for-drug-discovery.\n\nHis master's thesis, describing the EMN implementation, can be found at\n\nhttps://odr.chalmers.se/handle/20.500.12380/256629.\n\n#### MOSES\nThe MOSES repo is available at https://github.com/molecularsets/moses.\n\n#### GDB-13\nThe example dataset provided is a subset of GDB-13. This was obtained by randomly sampling 1000 structures from the entire GDB-13 dataset. The full dataset is available for download at http://gdb.unibe.ch/downloads/.\n\n\n#### RL-GraphINVENT\nVersion 3.0 incorporates Sara's work into the latest GraphINVENT framework: [repo](https://github.com/olsson-group/RL-GraphINVENT) and [paper](https://doi.org/10.33774/chemrxiv-2021-9w3tc). Her work was presented at the [RL4RealLife](https://sites.google.com/view/RL4RealLife) workshop at ICML 2021.\n\n#### Exploring graph traversal algorithms in GraphINVENT\nIn [this](https://doi.org/10.33774/chemrxiv-2021-5c5l1) pre-print, we look into the effect of different graph traversal algorithms on the types of structures that are generated by GraphINVENT. We find that a BFS generally leads to better molecules than a DFS, unless the model is overtrained, at which point both graph traversal algorithms lead to indistinguishable sets of structures.\n\n## License\n\nGraphINVENT is licensed under the MIT license and is free and provided as-is.\n\n## Link\nhttps://github.com/MolecularAI/GraphINVENT/\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FMolecularAI%2FGraphINVENT","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FMolecularAI%2FGraphINVENT","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FMolecularAI%2FGraphINVENT/lists"}