Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/jervenbolleman/sapfhir
Join genomic variation graphs with public data or internal medical data (e.g. FHIR) through FAIR data access, using W3C SPARQL as the standard protocol.
biohackeu20 graph-database rdf sparql
Last synced: 19 days ago
- Host: GitHub
- URL: https://github.com/jervenbolleman/sapfhir
- Owner: JervenBolleman
- License: gpl-3.0
- Created: 2020-11-17T18:46:54.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2024-03-27T20:26:28.000Z (9 months ago)
- Last Synced: 2024-05-18T07:42:15.392Z (8 months ago)
- Topics: biohackeu20, graph-database, rdf, sparql
- Language: Java
- Homepage:
- Size: 234 KB
- Stars: 3
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# SapFhir
Join genomic variation graphs with public data or internal medical data (e.g. FHIR)
through FAIR data access, using W3C SPARQL as the standard protocol.

# Status
This is an [RDF4J](https://rdf4j.org/) SAIL implementation that can take any handlegraph4j
implementation and expose it as a [W3C SPARQL 1.1](https://www.w3.org/TR/sparql11-query/) endpoint.

It is functionally complete. Performance depends heavily on the specific handlegraph implementation.
It is currently read-only, but could be made read/write.
An active query optimizer can significantly rewrite queries to improve
performance.

# Example queries
```sparql
# Find the ten nodes with the most forward-to-forward links (needs a lot of RAM)
PREFIX vg: <http://biohackathon.org/resource/vg#>
SELECT ?node
WHERE
{
  ?node vg:linksForwardToForward ?node2 .
}
GROUP BY ?node
# DESC puts the most connected nodes first
ORDER BY DESC(COUNT(?node2))
LIMIT 10
```

```sparql
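# Counts the number of sequences of length 1 in the graph
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX vg: <http://biohackathon.org/resource/vg#>
SELECT
  (COUNT(?n) AS ?c)
WHERE {
  ?n rdf:value ?sequence .
  # SPARQL tests equality with a single "="
  FILTER(strlen(?sequence) = 1)
}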
```

```sparql
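# Counts the number of sequences with an R ambiguous nucleotide code.
# handlegraph4j lowercases all DNA sequences, so match on 'r'.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX vg: <http://biohackathon.org/resource/vg#>
SELECT
  (COUNT(?n) AS ?c)
WHERE {
  ?n rdf:value ?sequence .
  FILTER(contains(?sequence, 'r'))
}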
```

```sparql
# List all the Paths in the variation graph
PREFIX vg: <http://biohackathon.org/resource/vg#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT
  ?path
  ?pathLabel
WHERE {
  ?path a vg:Path ;
        rdfs:label ?pathLabel .
}
```
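
The example queries above can be sent to a SPARQL 1.1 endpoint over HTTP. The following is a minimal Java sketch of the SPARQL 1.1 Protocol's query-via-GET operation, using only the JDK's `java.net.http` client (Java 11+). The endpoint URL `http://localhost:8080/sparql` and the class name `SparqlProtocolSketch` are placeholders and not part of SapFhir, and the `vg:` namespace IRI is assumed to be the vg ontology namespace; adjust them to match the actual deployment.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class SparqlProtocolSketch {

    // Hypothetical endpoint URL (assumption): replace with wherever the SAIL is exposed.
    public static final String ENDPOINT = "http://localhost:8080/sparql";

    /** Builds a SPARQL 1.1 Protocol "query via GET" request for the given query text. */
    public static HttpRequest buildRequest(String endpoint, String sparql) {
        // URLEncoder produces form encoding ("+" for spaces); use %20 for a plain URI query string.
        // Literal "+" characters in the query are already encoded as %2B, so the replace is safe.
        String encoded = URLEncoder.encode(sparql, StandardCharsets.UTF_8).replace("+", "%20");
        return HttpRequest.newBuilder(URI.create(endpoint + "?query=" + encoded))
                // Ask for the standard SPARQL JSON results format.
                .header("Accept", "application/sparql-results+json")
                .GET()
                .build();
    }

    /** Sends the request and returns the raw response body (a JSON results document on success). */
    public static String execute(HttpRequest request) throws IOException, InterruptedException {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) {
        // The "list all Paths" example query; the vg namespace IRI is an assumption.
        String query = String.join("\n",
                "PREFIX vg: <http://biohackathon.org/resource/vg#>",
                "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
                "SELECT ?path ?pathLabel",
                "WHERE { ?path a vg:Path ; rdfs:label ?pathLabel . }");
        HttpRequest request = buildRequest(ENDPOINT, query);
        // Only prints the request line; no network call is made here.
        System.out.println(request.method() + " " + request.uri());
    }
}
```

`main` only builds and prints the request. Call `execute(request)` once an endpoint is actually running at the chosen URL.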