{"id":22704970,"url":"https://github.com/akensert/pepgraph","last_synced_at":"2025-03-29T20:23:00.329Z","repository":{"id":244405732,"uuid":"815151976","full_name":"akensert/pepgraph","owner":"akensert","description":"Graph tensors and networks with TensorFlow and its ExtensionType API.","archived":false,"fork":false,"pushed_at":"2024-06-14T13:16:50.000Z","size":13,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-04T21:17:18.469Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/akensert.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-06-14T13:13:08.000Z","updated_at":"2024-06-14T13:16:53.000Z","dependencies_parsed_at":"2024-06-14T14:41:30.161Z","dependency_job_id":"b5819069-48c1-4fe6-9116-51ed28c3e121","html_url":"https://github.com/akensert/pepgraph","commit_stats":null,"previous_names":["akensert/pepgraph"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/akensert%2Fpepgraph","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/akensert%2Fpepgraph/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/akensert%2Fpepgraph/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/akensert%2Fpepgraph/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/akensert","download_url":"https://c
odeload.github.com/akensert/pepgraph/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246238709,"owners_count":20745581,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-10T09:08:36.344Z","updated_at":"2025-03-29T20:23:00.308Z","avatar_url":"https://github.com/akensert.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cimg src=\"https://github.com/akensert/pepgraph/blob/main/docs/_static/pepgraph-logo-pixel.png\" alt=\"pepgraph-title\" width=\"90%\"\u003e\n\n**Work in progress**: Inspired by TF-GNN, this project aims to implement Graph Tensors and Graph Neural Networks using TF's ExtensionType API. While focusing on molecular structures such as peptides, PepGraph can be used for other graph data as well. \n\n\u003e As Keras 3 does not currently support extension types, this project requires Keras 2 (and TF\u003c=2.15).\n\n## Models \n\n*in progress*\n\n## Graph Tensor \n\n*in progress*\n\nObtain a `GraphTensor` instance encoding multiple peptides as a single disjoint graph. In addition to atoms and bonds, virtual nodes are added that correspond to the residues (amino acids) of the peptides. 
Relevant atoms are linked to these virtual nodes in a unidirectional way; the features of the virtual nodes can subsequently be extracted for sequence modeling (using e.g., an LSTM).\n\n\u003e Current modules are experimental and may change in the future.\n\n```python \nfrom pepgraph import GraphTensor, Context, NodeSet, EdgeSet\n\npeptide_graph = GraphTensor(\n    context=Context({\n        \"n_residues\": [1, 2, 1]\n    }),\n    node_sets={        \n        \"atoms\": NodeSet(\n            sizes=[5, 9, 6], \n            features=[\n                \"N\", \"C\", \"C\", \"O\", \"O\",\n                \"N\", \"C\", \"C\", \"O\", \"N\", \"C\", \"C\", \"O\", \"O\",\n                \"N\", \"C\", \"C\", \"C\", \"O\", \"O\"\n            ]\n        ),   \n        \"residues\": NodeSet(\n            sizes=[1, 2, 1], \n            features=[\n                \"Gly\", \n                \"Gly\", \"Gly\", \n                \"Ala\"\n            ]\n        ),          \n    },\n    edge_sets={\n        \"bonds\": EdgeSet(\n            sizes=[8, 16, 10], \n            source=(\n                \"atoms\", [\n                    0,  1,  1,  2,  2,  2,  3,  4,  \n                    5,  6,  6,  7,  7,  7,  8,  9,  9, 10, 10, 11, 11, 11, 12, 13, \n                    14, 15, 15, 15, 16, 17, 17, 17, 18, 19\n                ]\n            ),\n            target=(\n                \"atoms\", [\n                    1,  0,  2,  1,  3,  4,  2,  2,  \n                    6,  5,  7,  6,  8,  9,  7,  7, 10, 9, 11, 10, 12, 13, 11, 11, \n                    15, 14, 16, 17, 15, 15, 18, 19, 17, 17\n                ]\n            )\n        ),\n        \"virtual_bonds\": EdgeSet(\n            sizes=[5, 9, 6], \n            source=(\n                \"atoms\", [\n                    0, 1, 2, 3, 4,\n                    5, 6, 7, 8, 9, 10, 11, 12, 13,\n                    14, 15, 16, 17, 18, 19,\n                ]\n            ),\n            target=(\n                \"residues\", [\n                
    0, 0, 0, 0, 0,\n                    1, 1, 1, 1, 2, 2, 2, 2, 2,\n                    3, 3, 3, 3, 3, 3,\n                ]\n            )\n        ),\n    }\n)\n```\n\nAlthough this `GraphTensor` instance contains nested structures of variable sizes, it can be used with TensorFlow's Dataset API, and thus iterated over efficiently for modeling:\n\n```python\nimport tensorflow as tf\n\nds = tf.data.Dataset.from_tensor_slices(peptide_graph)\nds = ds.shuffle(3)\nds = ds.batch(2)\n\nfor x in ds:\n    print(x.node_sets[\"atoms\"].features)\n    print(x.edge_sets[\"bonds\"].source[1])\n```\n\nSave graphs to disk (via `tf.io.TFRecordWriter`):\n```python\nfrom pepgraph.datasets import records\n\npeptide_graphs = [peptide_graph[i] for i in range(3)]\nrecords.write(peptide_graphs, \"/tmp/tf_records/\")\n```\n\nLoad graphs from disk (via `tf.data.TFRecordDataset`):\n```python\nds = records.load(\"/tmp/tf_records/\")\nfor x in ds.shuffle(3).batch(2):\n    pass\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fakensert%2Fpepgraph","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fakensert%2Fpepgraph","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fakensert%2Fpepgraph/lists"}