{"id":20837813,"url":"https://github.com/astrazeneca/hsqc_structure_elucidation","last_synced_at":"2025-05-08T20:34:43.659Z","repository":{"id":199478496,"uuid":"697310287","full_name":"AstraZeneca/hsqc_structure_elucidation","owner":"AstraZeneca","description":"Implementation of the SGNN graph neural network for 1H and 13C NMR prediction and a tool for distinguishing different molecules based on HSQC simulations","archived":false,"fork":false,"pushed_at":"2023-09-29T15:08:55.000Z","size":56660,"stargazers_count":15,"open_issues_count":0,"forks_count":0,"subscribers_count":10,"default_branch":"main","last_synced_at":"2025-03-31T17:59:19.068Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AstraZeneca.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-09-27T13:23:17.000Z","updated_at":"2025-03-03T02:19:45.000Z","dependencies_parsed_at":null,"dependency_job_id":"d534de2b-1ff0-4aa5-ab93-108bc6f6fd07","html_url":"https://github.com/AstraZeneca/hsqc_structure_elucidation","commit_stats":null,"previous_names":["astrazeneca/hsqc_structure_elucidation"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AstraZeneca%2Fhsqc_structure_elucidation","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AstraZeneca%2Fhsqc_structure_elucidation/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AstraZeneca%2Fhsqc_structure_elucidation/releases","manifests_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/repositories/AstraZeneca%2Fhsqc_structure_elucidation/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AstraZeneca","download_url":"https://codeload.github.com/AstraZeneca/hsqc_structure_elucidation/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253145972,"owners_count":21861309,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-18T01:08:39.848Z","updated_at":"2025-05-08T20:34:43.604Z","avatar_url":"https://github.com/AstraZeneca.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# HSQC Spectra Simulation and Matching Tool\n [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1WawzugDDloQSxrToI6T66gm9cfXZqWAm?usp=sharing\n)\n![Maturity level-0](https://img.shields.io/badge/Maturity%20Level-ML--0-red)\n\nImplementation of the following publication: \n\n**Advancing HSQC Spectral Matching: A Comparative Study of Peak-Matching and Simulation Techniques for Molecular Identification**\n\nLink to the paper: [link](https://dummy.com) (dummy)\n\n\u003cimg src=\"./dump/graphical_abstract.png\" width=\"500\"\u003e\n\n*The tool allows for identification of correct molecules among similar analogues (molecules with very similar molecular weights or different regio- or stereoisomers)*\n\n\n\n## What is this?\n![Alt text](./dump/Paper_1_Figure_v3_GITHUB.png)\n\nThis tool provides a comprehensive platform for simulating and matching 
Heteronuclear Single Quantum Coherence (HSQC) spectra, which can be used to facilitate molecular structure elucidation. \nThe tool provides an implementation of machine-learning-based 1H and 13C NMR chemical shift prediction using a graph-based neural network (ML), originally published as:\n**Scalable graph neural network for NMR chemical shift prediction**  [URL](https://pubs.rsc.org/en/Content/ArticleLanding/2022/CP/D2CP04542G)\nby *Jongmin Han, Hyungu Kang, Seokho Kang, Youngchun Kwon, Dongseon Lee and Youn-Suk Choi* \n\nFurthermore, it incorporates four distinct HSQC simulation techniques: ACD-Labs (ACD), MestReNova (MNova), Gaussian NMR calculations (DFT), and a graph-based neural network (ML). For DFT and ML, we've supplemented the techniques with a self-implemented 2D HSQC reconstruction logic. We've also devised three peak-matching strategies: Minimum-Sum (MinSum), Euclidean-Distance (EucDist), and Hungarian-Distance (HungDist). Each can be combined with one of three selectable padding approaches: zero-padding (Zero), peak-truncated (Trunc), and nearest-neighbor double assignment (NN). \n\u003cimg src=\"./dump/peak_padding.png\" width=\"800\"\u003e\n\nThe tool handles molecules with very similar molecular weights as well as different regio- or stereoisomers, thereby facilitating the identification of correct molecules among similar analogues. Additionally, our methodology shows robust performance in resolving ambiguous structural assignments, as demonstrated on a set of previously misassigned molecules.\nThe tool is linked with a Google Colab notebook that allows users to apply our methodology to their own data, run the ML NMR prediction, and learn how to generate simulated spectra with commercial software. It also provides instructions on processing real spectra and conducting similarity comparisons using the algorithms we've implemented. 
This hands-on, interactive tool is designed to enhance user understanding and practical application of the methodologies used.\n\nTry it out yourself using the following Google Colab Notebook:\n [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1WawzugDDloQSxrToI6T66gm9cfXZqWAm?usp=sharing\n) \nhttps://colab.research.google.com/drive/1WawzugDDloQSxrToI6T66gm9cfXZqWAm?usp=sharing\n## Want to see a short video demonstration and user tutorials?\n| Tutorial: Colab Notebook |  Tutorial: ACD HSQC Simulation |  Tutorial: MNova HSQC Simulation |  Tutorial: Experimental Data Preparation |\n|:-:|:-:|:-:|:-:|\n| [![](./dump/Colab_Notebook.PNG)](https://youtu.be/w59bVTpJmZY) | [![](./dump/MNova.PNG)](https://youtu.be/RyMQuRYtpbM) | [![](./dump/ACD.PNG)](https://youtu.be/xymw0ZRF8Xo) | [![](./dump/experimental_peak_picking.PNG)](https://youtu.be/NYW5bve198U) |\n\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fastrazeneca%2Fhsqc_structure_elucidation","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fastrazeneca%2Fhsqc_structure_elucidation","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fastrazeneca%2Fhsqc_structure_elucidation/lists"}