{"id":15140699,"url":"https://github.com/quantori/structure-seer","last_synced_at":"2026-02-16T21:32:33.011Z","repository":{"id":217291362,"uuid":"728768148","full_name":"quantori/structure-seer","owner":"quantori","description":"The implementation, training and evaluation of a Structure Seer machine learning model designed for reconstruction of adjacency of a molecular graph from the labelling of its nodes.","archived":false,"fork":false,"pushed_at":"2024-01-11T11:40:18.000Z","size":45344,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-01-27T12:28:43.450Z","etag":null,"topics":["cheminformatics","graph","graph-convolutional-network","machine-learning","ml","molecular-graph","molecular-graph-learning","molecule","molecule-generation","nmr-data","nmr-spectroscopy"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/quantori.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-12-07T16:45:51.000Z","updated_at":"2024-01-15T10:24:28.000Z","dependencies_parsed_at":null,"dependency_job_id":"12872868-0513-4a82-b50f-b06e19814e5d","html_url":"https://github.com/quantori/structure-seer","commit_stats":{"total_commits":23,"total_committers":2,"mean_commits":11.5,"dds":0.04347826086956519,"last_synced_commit":"16f330e72de3ab6dc86fc27dacf3eb5527c9a4b7"},"previous_names":["quantori/structure-seer"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pk
g:github/quantori/structure-seer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quantori%2Fstructure-seer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quantori%2Fstructure-seer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quantori%2Fstructure-seer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quantori%2Fstructure-seer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/quantori","download_url":"https://codeload.github.com/quantori/structure-seer/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quantori%2Fstructure-seer/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29519330,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-16T18:37:19.720Z","status":"ssl_error","status_checked_at":"2026-02-16T18:36:46.920Z","response_time":115,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cheminformatics","graph","graph-convolutional-network","machine-learning","ml","molecular-graph","molecular-graph-learning","molecule","molecule-generation","nmr-data","nmr-spectroscopy"],"created_at":"2024-09-26T08:40:21.432Z","updated_at":"2026-02-16T21:32:32.993Z","avatar_url":"https://github.com/quantori.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![DOI:10.1039/D3DD00178D](http://img.shields.io/badge/DOI-10.1039/D3DD00178D-ebe534.svg)](https://doi.org/10.1039/D3DD00178D)\n\n![PyTorch](https://img.shields.io/badge/PyTorch-%23EE4C2C.svg?style=for-the-badge\u0026logo=PyTorch\u0026logoColor=white)\n# Structure Seer\n\nThis repository contains the implementation, training, and evaluation code for the Structure Seer model, which is designed to\nreconstruct the adjacency of a molecular graph from the labelling of its nodes.\nThe model architecture is described in detail in:\n[Structure Seer - a machine learning model for chemical structure elucidation\nfrom a node labelling of a molecular graph, Digital Discovery, 2023](https://doi.org/10.1039/D3DD00178D)\n\n## Datasets\n\nThe repository does not contain the initial datasets used for training. 
\n- Small example datasets for detailed model evaluation are provided in ```./example_dataset```\n- Model weights trained on the QM9 and PubChem datasets are stored in ```./weights```\n\n## Abstract\n\nThe repository contains the implementation of a novel graph-convolution-based machine-learning model that\nis designed to provide a quantitative probabilistic prediction of the connectivity of the atoms, based on the\nelemental composition of the molecule along with a list of atom-attributed isotropic shielding\nconstants. The suggested approach holds significant potential for scalability, as it can harness vast amounts\nof information on known chemical structures for the model's learning process. The model architecture allows for\ndirect structure reconstruction through prediction of the molecular graph adjacency based solely on the\nlabelling of its nodes, which potentially allows dealing with molecules of any size and composition\n(provided an appropriate training dataset is available) without a significant increase in the computational resources required.\n\n## Key approaches\n\n### Unification of adjacency matrix representation\n\nThe primary challenge in generating the adjacency matrix is that it is not a graph invariant: it depends on the order in which the nodes are listed.\nFor a given graph with G nodes, there are G! node orderings, each of which yields an adjacency matrix describing the same connectivity.\nTo tackle this issue, the adjacency matrix representation needs to be unified. Typically, in the machine-readable\nrepresentation of a molecule, its atoms are stored in depth-first tree traversal order.\nWhile this order contains information about the stored structure, it cannot be easily reconstructed when only\nthe elemental composition of the molecule and the isotropic shielding constant for each atom are known. 
\nSince the shielding constant provides a unique characterisation of an atom's chemical environment, it can be\nemployed to standardise the representation of the adjacency matrix in conjunction with element information.\n\n### Generic adjacency matrix\n\nThe architecture of the Structure Seer model bears similarities to other GCN-based models used for diverse tasks\ninvolving molecular graphs. However, its distinctive design is centred around encoding the molecule\nsolely based on node labelling, which allows for the generation of the complete adjacency matrix.\nThis feature makes the architecture applicable to a broad range of atom adjacency reconstruction tasks.\n\n## Training\n\nThe training procedure is described in the Jupyter notebook ```./training.ipynb```.\nCustomise the procedure by adjusting the global variables in the second code cell.\nThe source code of the main training function is in ```./training/train_model.py```.\n\nTo train the model in Google Colab, extract the repository to Google Drive under ```./MyDrive```.\n\n## Evaluation\n\nFor model evaluation, use ```./model_evaluation.ipynb``` with the pretrained model weights.\nSmall example datasets for detailed model evaluation are provided in ```./example_dataset```.\n\n## Code examples\n\nExplore model usage and functionality in ```./structure_seer_code_examples.ipynb```,\nwhich includes illustrative examples.\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fquantori%2Fstructure-seer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fquantori%2Fstructure-seer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fquantori%2Fstructure-seer/lists"}