{"id":21074724,"url":"https://github.com/onesparse/graphony","last_synced_at":"2025-04-12T21:45:33.791Z","repository":{"id":126963882,"uuid":"398027499","full_name":"OneSparse/Graphony","owner":"OneSparse","description":null,"archived":false,"fork":false,"pushed_at":"2021-10-06T22:28:28.000Z","size":3840,"stargazers_count":5,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-26T15:54:34.410Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/OneSparse.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-08-19T17:43:29.000Z","updated_at":"2024-11-19T03:41:55.000Z","dependencies_parsed_at":"2023-06-19T07:52:26.723Z","dependency_job_id":null,"html_url":"https://github.com/OneSparse/Graphony","commit_stats":null,"previous_names":["onesparse/graphony","graphegon/graphony"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OneSparse%2FGraphony","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OneSparse%2FGraphony/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OneSparse%2FGraphony/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OneSparse%2FGraphony/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/OneSparse","download_url":"https://codeload.github.com/OneSparse/Graphony/tar.gz/refs/heads/ma
in","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248637835,"owners_count":21137538,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-19T19:17:27.654Z","updated_at":"2025-04-12T21:45:33.769Z","avatar_url":"https://github.com/OneSparse.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Graphony\n\nGraphony is a Python library for doing high-performance graph analysis\nusing the GraphBLAS over sparse and hypersparse data sets.\n\nGraphony uses\n[pygraphblas](https://graphegon.github.io/pygraphblas/pygraphblas/index.html)\nto store graph data in sparse [GraphBLAS\nMatrices](http://graphblas.org) and node and edge properties in\n[PostgreSQL](https://postgresql.org).\n\nGraphony's primary role is to easily construct graph matrices and\nmanage symbolic names and properties for graphs, nodes, properties and\nedges, and can be used to easily construct, save and manage graph data\nin a simple project directory format.\n\nGraphs can be:\n\n  - [Simple](https://en.wikipedia.org/wiki/Graph_(discrete_mathematics)#Graph):\n    an edge connects one source to one destination.\n\n  - [Hypergraph](https://en.wikipedia.org/wiki/Hypergraph): a graph\n    with at least one *hyperedge* connecting multiple source nodes to\n    multiple destinations.\n\n  - [Multigraph](https://en.wikipedia.org/wiki/Multigraph): multiple\n    edges can exist between a source and destination.\n\n  - [Property\n    
Graph](http://graphdatamodeling.com/Graph%20Data%20Modeling/GraphDataModeling/page/PropertyGraphs.html):\n    Nodes and Edges can have arbitrary JSON properties.\n\n# Creating Graphs\n\nThis documentation is also a runnable Python test called a\n[doctest]().  In order to run and verify this documentation, we must\nfirst create some helper objects like a function `p()` that will\niterate results into a list and \"pretty print\" them.  We also have to\nset up a test PostgreSQL database and initialize it with the base\nGraphony tables.\n\nThe core object of Graphony is a `Graph()`. A new Graph can be created\nwith a connection string to an existing initialized database:\n\n\u003c!--phmdoctest-setup--\u003e\n```python3\nimport os\nimport pprint\nimport postgresql\nfrom pathlib import Path\nfrom pygraphblas import FP64, INT64, gviz\nfrom graphony import Graph, Node\np = lambda r: pprint.pprint(sorted(list(r)))\npgdata, conn = postgresql.setup()\npostgresql.psql(f'-d \"{conn}\" -f dbinit/01_init_graphony.sql -f dbinit/02_karate_demo.sql')\nG = Graph(conn)\n```\n\nThe `Graph` object `G` is used throughout the following documentation\nto demonstrate the features of Graphony.  A Graphony `Graph` consists\nof four concepts:\n\n  - `Graph`: Top level object that contains all graph data in\n    sub-graphs called *properties*.\n\n  - `Property`: A named, typed sub-graph that holds edges.  A\n    property consists of two GraphBLAS [Incidence\n    Matrices](https://en.wikipedia.org/wiki/Incidence_matrix) that can\n    be multiplied to project an adjacency with themselves, or any\n    other combination of properties.\n\n  - `Edge`: Property edges can be simple point to point edges or\n    hyperedges that represent properties between multiple incoming and\n    outgoing nodes.\n\n  - `Node`: A node in the graph.\n\n# Simple Graphs\n\nGraphs consist of multiple typed subgraphs called *properties*.  
All\nproperties share node ids across a Graph object, but each property can\nstore edge weights of different data types like int, float, or bool.\nInternally, \"simple\" graphs (non-hyper) are stored as [Adjacency\nMatrices](https://en.wikipedia.org/wiki/Adjacency_matrix).\n\nBefore you can add an edge, a property to hold it must be declared\nfirst.  The default edge type is `bool` if you don't specify one:\n\n```python3\n\u003e\u003e\u003e G.add_property('friend')\n```\n\nEdges can be added directly into the Graph with the `+=` operator.  In\nits simplest form, an edge is a Python tuple with 2 elements, a\nsource and a destination:\n\n```python3\n\u003e\u003e\u003e G.friend += ('bob', 'alice')\n\n\u003e\u003e\u003e G.friend.draw(weights=False, filename='docs/imgs/G_friend_1')\n\u003cgraphviz.dot.Digraph object at ...\u003e\n```\n![G_friend_1.png](docs/imgs/G_friend_1.png)\n\nUsing strings like `'bob'` and `'alice'` as edge endpoints creates new\ngraph nodes automatically.  You can also create nodes explicitly and\nprovide attributes for them:\n\n```python3\n\u003e\u003e\u003e jane = Node(G, 'jane', favorite_color='blue')\n\u003e\u003e\u003e jane.favorite_color\n'blue'\n\u003e\u003e\u003e G.friend += ('alice', jane)\n\n\u003e\u003e\u003e G.friend.draw(weights=False, filename='docs/imgs/G_friend_2')\n\u003cgraphviz.dot.Digraph object at ...\u003e\n```\n![G_friend_2.png](docs/imgs/G_friend_2.png)\n\nNow there are two edges in the `friend` property, one from bob to\nalice and the other from alice to jane.\n\n```python3\n\u003e\u003e\u003e p(G.friend)\n[friend(bob, alice), friend(alice, jane)]\n```\n\nAn iterator of property tuples can also be provided to the `+=`\noperator which will consume them and add them to the property:\n\n```python3\n\u003e\u003e\u003e G.friend += [('bob', 'sal'), ('alice', 'rick')]\n\n\u003e\u003e\u003e G.friend.draw(weights=False, filename='docs/imgs/G_friend_3')\n\u003cgraphviz.dot.Digraph object at 
...\u003e\n```\n![G_friend_3.png](docs/imgs/G_friend_3.png)\n\nAs shown above, tuples are stored as boolean edges whose weights are\nalways `True` and therefore can be omitted.\n\n# Hypergraphs\n\nA [Hypergraph](https://en.wikipedia.org/wiki/Hypergraph) is a\ngeneralization of a graph in which an edge can join any number of\nvertices, in contrast to a simple graph, shown above, where an edge\nhas exactly two endpoints and can connect only one vertex to one\nother vertex.\n\nIn Graphony a hypergraph can be created in any *incidence* property by\npassing the `incidence=True` flag.  This causes the property to be\nstored internally as two [Incidence\nMatrices](https://en.wikipedia.org/wiki/Incidence_matrix) which can\nrepresent non-simple graphs like the hypergraph shown here:\n\n```python3\n\u003e\u003e\u003e G.add_property('manages', incidence=True)\n```\n\nNew hyperedges can be defined by passing a nested tuple of nodes as\neither the source or destinations, or both, for a hyperedge.\n\n```python3\n\u003e\u003e\u003e G.manages += [('bob', ('rick', 'alice')), (('alice', 'bob'), 'jane')]\n\n\u003e\u003e\u003e G.manages.draw(weights=True, filename='docs/imgs/G_manages_1')\n\u003cgraphviz.dot.Digraph object at ...\u003e\n```\n![G_manages_1.png](docs/imgs/G_manages_1.png)\n\nHere a hyperedge with one source and two destinations is created from\nbob to rick and alice, and another with two sources and one\ndestination is created from alice and bob to jane.\n\n# Property Graph\n\nGraphs can have any number of properties, each with a particular\nGraphBLAS type.  
In general this is referred to as a Property Graph.\nAs shown above the default property type is `bool` which creates\nunweighted edges, but graph edge types can be specified on a\nper-property basis:\n\n```python3\n\u003e\u003e\u003e G.add_property('distance', int)\n\u003e\u003e\u003e G.distance += [('bob', 'alice', 422), ('alice', 'jane', 42)]\n\n\u003e\u003e\u003e G.distance.draw(weights=True, filename='docs/imgs/G_distance_2')\n\u003cgraphviz.dot.Digraph object at ...\u003e\n```\n![G_distance_2.png](docs/imgs/G_distance_2.png)\n\nSupported Python types include `bool`, `int`, `float` and `complex`\nwhich are converted into the GraphBLAS types `GrB_BOOL`, `GrB_INT64`,\n`GrB_FP64` and `GxB_FC64` for storage.  You can also pass a specific\nGraphBLAS type if you want different precision or a custom type.\n\n# Graph Querying\n\nCurrently our graph looks like this; it contains 3 properties,\n`friend`, `manages` and `distance`:\n\n```python3\n\u003e\u003e\u003e G.draw(weights=True, filename='docs/imgs/G_all_1')\n\u003cgraphviz.dot.Digraph object at ...\u003e\n```\n\n![G_all_1.png](docs/imgs/G_all_1.png)\n\nGraphs have a call interface like `G(...)` that can be used to query\nindividual edges.  A query consists of three optional arguments for\n`source`, `property` and `destination`.  
The default value for all\nthree is None, which acts as a wildcard that matches all values.\n\n```python3\n\u003e\u003e\u003e p(G())\n[friend(bob, alice),\n friend(bob, sal),\n friend(alice, jane),\n friend(alice, rick),\n manages((bob), (alice, rick), (True, True)),\n manages((bob, alice), (jane), (True)),\n distance(bob, alice, 422),\n distance(alice, jane, 42)]\n```\n\nOnly print edges where `bob` is the source:\n\n```python3\n\u003e\u003e\u003e p(G(source='bob'))\n[friend(bob, alice),\n friend(bob, sal),\n manages((bob), (alice, rick), (True, True)),\n manages((bob, alice), (jane), (True)),\n distance(bob, alice, 422)]\n```\n\nOnly print edges where `manages` is the property:\n\n```python3\n\u003e\u003e\u003e p(G(property='manages'))\n[manages((bob), (alice, rick), (True, True)),\n manages((bob, alice), (jane), (True))]\n\n```\n\nOnly print edges where `jane` is the destination:\n\n```python3\n\u003e\u003e\u003e p(G(destination='jane'))\n[friend(alice, jane),\n manages((bob, alice), (jane), (True)),\n distance(alice, jane, 42)]\n```\n\nOnly print edges where `bob` `manages` `jane`.\nNote in this case it returns a single hyperedge, the one in which bob is a\nsource and jane is a destination:\n\n```python3\n\u003e\u003e\u003e p(G(source='bob', property='manages', destination='jane'))\n[manages((bob, alice), (jane), (True))]\n```\n\n# Loading Graphs from SQL\n\nAny tuple-producing iterator can be used to construct Graphs.\nGraphony offers a shorthand helper for this.  Any query that produces\n2 or 3 columns can be used to produce edges into the graph.\n\n```python3\n\u003e\u003e\u003e G.add_property('karate')\n\u003e\u003e\u003e G.karate += G.sql(\"select 'k_' || s_id, 'k_' || d_id from graphony.karate\")\n\n\u003e\u003e\u003e G.karate.draw(weights=False, filename='docs/imgs/G_karate_3',\n...               
directed=False, graph_attr=dict(layout='sfdp'))\n\u003cgraphviz.dot.Graph object at ...\u003e\n```\n![G_karate_3.png](docs/imgs/G_karate_3.png)\n\nAll the edges are in the karate property, as defined in the SQL\nquery above:\n\n```python3\n\u003e\u003e\u003e len(G.karate)\n78\n```\n\n# Multigraphs\n\nIn a [Multigraph](https://en.wikipedia.org/wiki/Multigraph) multiple\nedges can exist between two nodes.  A good example is a [De\nBruijn](https://en.wikipedia.org/wiki/De_Bruijn_graph) graph, a\ndirected graph that represents overlapping sequences of symbols.\n\nThese graphs are used in bioinformatics to analyze and assemble long\nsequences of genetic data.  Construction involves iterating a sequence\nof genetic information and constructing multiple edges between pairs\nof nodes.\n\n```python3\n\u003e\u003e\u003e from more_itertools import windowed\n\u003e\u003e\u003e G.add_property('debruijn', incidence=True)\n\u003e\u003e\u003e def kmer(t, k=3): \n...     return (tuple(map(\"\".join, windowed(i, k-1))) for i in map(\"\".join, windowed(t, k)))\n\n\u003e\u003e\u003e G.debruijn += kmer('ATCGATCGGATGACAGACACAATTC')\n\u003e\u003e\u003e G.debruijn.draw(graph_attr=dict(layout='circo'), weights=False, concentrate=True, filename='docs/imgs/G_debruijn_1')\n\u003cgraphviz...\u003e\n```\n![G_debruijn_1.png](docs/imgs/G_debruijn_1.png)\n\nOnce the graph is built up it can be \"collapsed\" into a weighted\ngraph, where the multi-edges between nodes are summed up into a single\nedge.  In the GraphBLAS this can be accomplished by calling the\nproperty with a semiring:\n\n```python3\n\u003e\u003e\u003e M = G.debruijn(INT64.plus_pair)\n\u003e\u003e\u003e gviz.draw_graph(M, weights=True, label_vector=G.debruijn.label_vector(M), \n...                 
graph_attr=dict(layout='circo'), filename='docs/imgs/G_debruijn_2')\n\u003cgraphviz...\u003e\n```\n![G_debruijn_2.png](docs/imgs/G_debruijn_2.png)\n\n# Example Weighted De Bruijn using BioPython\n\nHere's an example of using [Biopython](https://biopython.org/) to\ncreate a weighted De Bruijn graph of a Circovirus:\n\n```python3\n\u003e\u003e\u003e from Bio import SeqIO, Entrez\n\u003e\u003e\u003e Entrez.email = \"info@graphegon.com\"\n\u003e\u003e\u003e handle = Entrez.efetch(db=\"nucleotide\", id=\"MZ299081\", rettype=\"gb\", retmode=\"text\")\n\u003e\u003e\u003e record = SeqIO.read(handle, \"genbank\")\n\u003e\u003e\u003e handle.close()\n\u003e\u003e\u003e from more_itertools import windowed\n\u003e\u003e\u003e G.add_property('circovirus', incidence=True)\n\u003e\u003e\u003e def kmer(t, k=3): \n...     return (tuple(map(\"\".join, windowed(i, k-1))) for i in map(\"\".join, windowed(t, k)))\n\u003e\u003e\u003e seq = str(record.seq)\n\u003e\u003e\u003e G.circovirus += kmer(seq, 3)\n\u003e\u003e\u003e M = G.circovirus(INT64.plus_pair)\n\u003e\u003e\u003e gviz.draw_graph(M, weights=True, labels=True, label_vector=G.circovirus.label_vector(M),\n...                 graph_attr=dict(layout='sfdp'), filename='docs/imgs/G_circovirus_1')\n\u003cgraphviz...\u003e\n```\n![G_circovirus_1.png](docs/imgs/G_circovirus_1.png)\n\n\u003c!--phmdoctest-teardown--\u003e\n```python3\npostgresql.teardown(pgdata)\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fonesparse%2Fgraphony","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fonesparse%2Fgraphony","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fonesparse%2Fgraphony/lists"}