Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/lepistabioinformatics/tax2graph
Parse taxonomic data from Catalog Of Life to a graph database.
https://github.com/lepistabioinformatics/tax2graph
graph neo4j neo4j-graph species-trees taxonomy
Last synced: about 2 hours ago
JSON representation
Parse taxonomic data from Catalog Of Life to a graph database.
- Host: GitHub
- URL: https://github.com/lepistabioinformatics/tax2graph
- Owner: LepistaBioinformatics
- Created: 2020-07-08T19:52:44.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2020-08-26T19:23:48.000Z (over 4 years ago)
- Last Synced: 2024-04-25T08:01:44.522Z (10 months ago)
- Topics: graph, neo4j, neo4j-graph, species-trees, taxonomy
- Language: Python
- Homepage:
- Size: 192 KB
- Stars: 4
- Watchers: 1
- Forks: 2
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# tax2graph
![Glomerellales graph](https://github.com/sgelias/tax2graph/blob/master/tax2graph/data/glomerellales-graph.png)
## Goal
`tax2graph` provide a simple way to convert i4Life tabled taxonomy to Neo4J graph representation.
## Example data
A example data are available on data folder. It contains the Sordariomycetes taxonomy downloaded from "i4Life WP4 Download Service of the Catalogue of Life:
Darwin Core Archive Export" (see http://www.catalogueoflife.org/DCA_Export/).## Usage
Initialize a connection dict containing at last the password key to perform queries to Neo4J database:
```python
import os
from tax2graph import GraphBuilder, ConnectionTypeconnection_variables: ConnectionType = {
"password": str(os.getenv('NEO_PASSWORD'))
}builder = GraphBuilder(connection_variables)
```
And build the graph from Sordariomycetes tab-separated file:
```python
builder.read('tax2graph/data/sordariomycetes/taxa.txt', low_memory=False)
builder.build_col_graph()
```
The second parameter (low_memoty) is optional and is the same of `pandas.read_csv()` (see details in https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html).
After load the base taxonomy, the Manager class can be used to perform queries. To perform simple queries use:
```python
from tax2graph import Manager
manager = Manager(connection_variables)
manager.get_node('Glomerellales')
manager.get_parent('Glomerellales')
```
In the above code the constructor of Manager class require the 'connection_variables' to be started, like GraphBuilder class. After started both methods `get_node` and `get_parent` are called. Both receives a simgle parameter as string indicating the name of the Node to search. The former return an dict containing the target (e.g. order Glomerellales) node and the last return the parent node (e.g. class Sordariomycetes).
The manager class also contain a single method to create custom nodes. To this, simpleously use `set_custom_node`, as example:
```python
from tax2graph import CustomNodeType
custom_node: CustomNodeType = {
'taxonRank': 'species',
'description': 'A custom clade'
}manager.set_custom_node(custom_node, 'Colletotrichum')
```
This method receive two parameters: the first is a dict of CustomNodeType type that contains two keys, as `taxonRank` and `description`; and the second is a string indicating the parent node in with the *custom_node* will be connected.
You can also include properties to relationships. To do, simpleously create an instance of CustomRelPropertiesType and include an additional parameter as the third argument of `set_custom_node` method.
```python
from datetime import datetime
from tax2graph import CustomRelPropertiesTyperelationship_properties: CustomRelPropertiesType = {
'created': datetime.now()
}manager.set_custom_node(custom_node, 'Colletotrichum', relationship_properties)
```
---
Feel free to add new features and contribute through pull requests. Be happy!!