{"id":19388845,"url":"https://github.com/linkml/sparqlfun","last_synced_at":"2025-04-23T23:31:48.977Z","repository":{"id":40659642,"uuid":"429988987","full_name":"linkml/sparqlfun","owner":"linkml","description":"sparql templates for linkml (alpha)","archived":false,"fork":false,"pushed_at":"2022-04-30T00:10:38.000Z","size":6431,"stargazers_count":12,"open_issues_count":1,"forks_count":0,"subscribers_count":15,"default_branch":"main","last_synced_at":"2025-04-02T22:33:08.526Z","etag":null,"topics":["linked-data","linkml","obofoundry","ontobee","ontologies","rdflib","sparql","sparql-templates","ubergraph"],"latest_commit_sha":null,"homepage":"https://linkml.github.io/sparqlfun/home/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"cc0-1.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/linkml.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-11-20T02:28:24.000Z","updated_at":"2025-04-02T21:44:04.000Z","dependencies_parsed_at":"2022-08-10T00:10:41.903Z","dependency_job_id":null,"html_url":"https://github.com/linkml/sparqlfun","commit_stats":null,"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linkml%2Fsparqlfun","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linkml%2Fsparqlfun/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linkml%2Fsparqlfun/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linkml%2Fsparqlfun/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/linkml","download_url":"https://codeload.github.com/linkml/sparqlfun/tar
.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250532130,"owners_count":21446123,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["linked-data","linkml","obofoundry","ontobee","ontologies","rdflib","sparql","sparql-templates","ubergraph"],"created_at":"2024-11-10T10:13:52.018Z","updated_at":"2025-04-23T23:31:48.319Z","avatar_url":"https://github.com/linkml.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# sparqlfun\n\nLinkML based SPARQL template library and execution engine\n\n - modularized core library of [SPARQL templates](https://github.com/linkml/sparqlfun/tree/main/sparqlfun/schema)\n    - generic templates using common vocabs (rdf, owl, skos, ...)\n    - OBO and biology specific, e.g. Ubergraph\n    - coming soon: uniprot, wikidata, etc\n - Fully FAIR description of templates\n    - Each template has a URI (e.g.: https://linkml.io/sparqlfun/PairwiseCommonDescendant)\n    - Each template parameter has a URI (e.g. 
https://linkml.io/sparqlfun/subject)\n    - Full metadata including descriptions of each\n    - Templates described in YAML, RDF, SHACL, ShEx, ...\n - Rich expressive language for modeling templates\n     - uses [LinkML](https://linkml.io/linkml/) as base language\n - optional python bindings / [object model](https://github.com/linkml/sparqlfun/blob/main/sparqlfun/model.py) using LinkML\n - supports both SELECT and CONSTRUCT\n - optional export to TSV, JSON, YAML, RDF\n - extensive [endpoint metadata](https://github.com/linkml/sparqlfun/tree/main/sparqlfun/config)\n\nThis is currently alpha software; interfaces and organization may change\n\n## Browse the default templates\n\n* [http://linkml.io/sparqlfun/](http://linkml.io/sparqlfun/)\n\nNote: currently not all metadata from the yaml is shown in the generated docs\n\n## Command Line\n\nUse the [sparqlfun:PairwiseCommonSubClassAncestor](https://linkml.io/sparqlfun/PairwiseCommonSubClassAncestor) template\n\n```bash\nsparqlfun query -e ubergraph -T PairwiseCommonSubClassAncestor node1=GO:0046220 node2=GO:0008295\n```\n\nresults:\n\n```yaml\nresults:\n- node1: GO:0046220\n  node2: GO:0008295\n  predicate1: rdfs:subClassOf\n  predicate2: rdfs:subClassOf\n  ancestor: GO:0009987\n- node1: GO:0046220\n  node2: GO:0008295\n  predicate1: rdfs:subClassOf\n  predicate2: rdfs:subClassOf\n  ancestor: GO:0044237\n- node1: GO:0046220\n  node2: GO:0008295\n  predicate1: rdfs:subClassOf\n  predicate2: rdfs:subClassOf\n  ancestor: GO:0044271\n...\n```\n\n## Local RDF files\n\nIf you specify the `-f` / `--format` option then `-e` is assumed to be a path to a file on disk:\n\n```bash\nsparqlfun query -e go.owl.ttl -f ttl -T PairwiseCommonSubClassAncestor node1=GO:0046220 node2=GO:0008295\n```\n\n\n## List all templates\n\n```bash\nsparqlfun endpoints\n```\n\n## Python\n\n```python\nfrom sparqlfun import SparqlEngine\nfrom sparqlfun.model import PairwiseCommonSubClassAncestor\n\nse = 
SparqlEngine(endpoint='ubergraph')\nse.bind_prefixes(GO='http://purl.obolibrary.org/obo/GO_')\nfor apair in se.query(PairwiseCommonSubClassAncestor(node1='GO:0046220', node2='GO:0008295')).results:\n        print(f'ROW={apair.node1} \u003c-\u003e {apair.node2} ANCESTOR = {apair.ancestor}')\n```\n\nFor more examples, see [tests/] in GitHub\n\n\n## Browsing the templates\n\n - source is in [sparqlfun/schema](https://github.com/linkml/sparqlfun/tree/main/sparqlfun/schema)\n     - add new templates here\n - Browse the generated markdown on the site\n     - markdown is auto-created from the yaml schema\n\nYou can also list templates here:\n\n```bash\nsparqlfun templates\n```\n\nor for detailed view:\n\n```bash\nsparqlfun templates --detail\n```\n\n## How it works\n\n\n### Basics\n\nTemplates are defined as YAML files following the LinkML schema.\n\nA yaml file with a single template might look like this:\n\n```yaml\nschema:\n   id: http://example.org/my-vocab/templates\nprefixes:\n   my: http://example.org/my-vocab/\nclasses:\n  my template:\n    slots:\n      - my_var1\n      - my_var2\n    annotations:\n      sparql.select: |-\n        SELECT  * WHERE { ... ?my_var1 ... ?my_var2}\n      \nslots:\n  my_var1:\n    description: about my var 1\n  my_var2:\n    description: about my var 2\n```      \n\nThis defines a template `MyTemplate` with two slots/parameters, and an\narbitrarily complex SPARQL select query.\n\nThe YAML file is broken into blocks that minimally include 3 sections:\n\n- schema metadata, including prefix declarations\n- your templates, which are in the `classes` section\n- your parameters/variables, which are in the `slots` section\n\nNote that the definitions of the slots go in a different section from\nthe classes/templates. 
You are encouraged to \"reuse\" slots across templates.\nHowever, you can use an\n[attribute declaration as a shortcut](https://linkml.io/linkml/faq/modeling.html?highlight=attributes#when-should-i-use-attributes-vs-slots)\nif you don't want to reuse.\n\nThe above can be used in queries:\n\n```bash\nsparqlfun -e ubergraph -T MyTemplate my_var2=MY_VAL\n```\n\nYou can ground any or all of your vars on the command line (if you ground all then your SELECT is effectively an ASK query).\n\nHowever, the features go beyond other templating systems, and leverage\nthe fact that LinkML is a fully-fledged rich modeling language with bindings to JSON-Schema, SHACL, ShEx, etc.\n\nFor example, you will get markdown documentation describing your templates. This markdown documentation will be even richer if you annotate your schemas with metadata such as\n\n - descriptions\n - ranges for slots\n - mappings and URIs for your templates and slots\n\nThis brings a number of tangible benefits:\n\n - your templates can be strongly typed\n - templates can be compiled to multiple other forms\n - templates are turned into Python dataclasses, giving an optional ORM-like layer, IDE support, etc\n - in future, applications will be able to use template metadata\n    - documentation on each slot\n    - pickers for fields such as dates, enums, etc \n    - e.g. 
if a template slot has a range of class MyClass, applications could provide autocomplete\n\n### Template Inheritance\n\nTemplates can be [inherited](https://linkml.io/linkml/schemas/inheritance), facilitating reuse and composition patterns\n\nTo illustrate, consider a simple \"base\" template to query a triple:\n\n```yaml\nclasses:\n  triple:\n    aliases:\n      - statement\n    description: \u003e-\n      Represents an RDF triple\n    slots:\n      - subject\n      - predicate\n      - object\n    class_uri: rdf:Statement\n    in_subset:\n      - base table\n    annotations:\n      sparql.select: SELECT  * WHERE { ?subject ?predicate ?object}\n```\n\nThis is arguably not a particularly useful template in isolation - you may as\nwell query directly with sparql (nevertheless it can be useful to have\ntemplates for even this simple pattern, to facilitate generation of\nAPIs etc)\n\nNew templates can use this as a base class, and inherit from it, which means that slots will be\ninherited, eliminating some boilerplate and the need to redefine them\n\n\n```yaml\nclasses:\n  quad:\n    is_a: triple\n    slots:\n      - graph  ## s/p/o slots inherited from triple\n    annotations:\n      sparql.select: SELECT  * WHERE {GRAPH ?graph { ?subject ?predicate ?object}}\n```\n\nInheritance allows even more powerful features using the LinkML\n`classification_rules` construct. 
Let's say we want to represent type\ntriples as children of generic triples:\n\n\n```yaml\nrdf type triple:\n    is_a: triple\n    description: \u003e-\n      A triple that indicates the asserted type of the subject entity\n    slot_usage:\n      object:\n        description: \u003e-\n          The entity type\n        range: class node\n    classification_rules:\n      - is_a: triple\n        slot_conditions:\n          predicate:\n            equals_string: rdf:type\n```\n\n**Note we don't need to specify a SPARQL template here** - the\ntemplate is autogenerated from the classification rule.\n\nUse of inheritance is a matter of choice. You may find it simpler to have some level of redundancy\nand repeat information in similar templates. Note you will still get a decent\namount of reuse via a common vocabulary of slots\n\n### SPARQL CONSTRUCT and nested/inlined objects\n\nExample CONSTRUCT query:\n\n```yaml\nobo class:\n    is_a: class node\n    class_uri: owl:Class\n    slots:\n      - definition\n      - exact_synonyms\n    annotations:\n      sparql.construct: |-\n        CONSTRUCT {\n          ?id a owl:Class ;\n              IAO:0000115 ?definition ;\n              oboInOwl:hasExactSynonym ?exact_synonyms\n        }\n        WHERE {\n          ?id a owl:Class .\n          OPTIONAL { ?id IAO:0000115 ?definition } .\n          OPTIONAL { ?id oboInOwl:hasExactSynonym ?exact_synonyms } .\n        }\n\n...\n\nslots:\n  definition:\n    slot_uri: IAO:0000115\n  exact_synonyms:\n    slot_uri: oboInOwl:hasExactSynonym\n    multivalued: true\n```\n\nWe can then query this as follows:\n\n```bash\nsparqlfun -e ontobee -T OboClass id=GO:0000023\n```\n\nThe results will be nested following the LinkML specification for the model\n\n```json\n{\n  \"results\": [\n    {\n      \"id\": \"GO:0000023\",\n      \"definition\": \"The chemical reactions and pathways involving the disaccharide maltose (4-O-alpha-D-glucopyranosyl-D-glucopyranose), an intermediate in the 
catabolism of glycogen and starch.\",\n      \"exact_synonyms\": [\n        \"malt sugar metabolic process\",\n        \"malt sugar metabolism\",\n        \"maltose metabolism\"\n      ]\n    }\n  ],\n  \"@type\": \"ResultSet\"\n}\n```\n\n(note: templates are also compiled to JSON-Schema, which can be used for additional validation)\n\nYou can also get the turtle as returned by the triplestore using `-f ttl`:\n\n```turtle\n@prefix ns1: \u003chttp://www.geneontology.org/formats/oboInOwl#\u003e .\n@prefix ns2: \u003chttp://purl.obolibrary.org/obo/\u003e .\n@prefix ns3: \u003chttps://w3id.org/sparqlfun/\u003e .\n\nns2:GO_0000023 a \u003chttp://www.w3.org/2002/07/owl#Class\u003e ;\n    ns2:IAO_0000115 \"The chemical reactions and pathways involving the disaccharide maltose (4-O-alpha-D-glucopyranosyl-D-glucopyranose), an intermediate in the catabolism of glycogen and starch.\" ;\n    ns1:hasExactSynonym \"malt sugar metabolic process\",\n        \"malt sugar metabolism\",\n        \"maltose metabolism\" .\n\n[] a ns3:ResultSet ;\n    ns3:results ns2:GO_0000023 .\n```\n\nWith `-t tsv` the linkml csv dumper will attempt to flatten the nested structure to TSV as closely as possible, e.g. 
using internal pipe separators for multivalued slots\n\n### Multiple Values\n\nParameters can be passed as lists, which will be translated to `VALUES` statements\n\n```bash\nsparqlfun -e ontobee -T OboClass id=GO:0000023,GO:0000024\n```\n\n### Modularity\n\nLinkML allows importing so templates can be modularized using [imports](https://linkml.io/linkml/schemas/imports)\n\n__NOTE__ In future this repo may be split up, with the bio/obo specific features migrating to a new repo.\n\n### Use of Jinja commands\n\nYou can incorporate additional logic via Jinja2 templating instructions:\n\n```yaml\nobo class filtered:\n    is_a: class node\n    class_uri: owl:Class\n    slots:\n      - definition\n      - exact_synonyms\n    annotations:\n      sparql.construct: |-\n        CONSTRUCT {\n          ?id a owl:Class ;\n              IAO:0000115 ?definition ;\n              oboInOwl:hasExactSynonym ?exact_synonyms\n        }\n        WHERE {\n          ?id a owl:Class .\n          OPTIONAL { ?id IAO:0000115 ?definition } .\n          OPTIONAL { ?id oboInOwl:hasExactSynonym ?exact_synonyms } .\n          {% if query_has_subclass_ancestor %}\n          ?id rdfs:subClassOf ?query_has_subclass_ancestor\n          {% endif %}\n        }\n\nslots:\n  query_has_subclass_ancestor:\n    range: class node\n    description: transitive is_a parent\n    in_subset:\n       - ubergraph  ## requires relation-graph closure\n```\n\n## Supported Endpoints\n\nThis framework can be used with any SPARQL endpoint. 
However, the\ncurrent pre-defined templates are geared towards the combination of\nOBO-style ontologies together with storage patterns employed in\ntriplestores such as ubergraph and ontobee.\n\n - [ubergraph](https://github.com/INCATools/ubergraph)\n\nIn particular, ubergraph uses the relation-graph inference tool to\npre-compute inferred direct triples from TBox existential axioms,\nallowing for simple and powerful queries over inferred ontologies\n\nSee the config files in sparqlfun/config for a list of all pre-defined endpoints\n\nExample:\n\n```yaml\nendpoints:\n   ubergraph:\n      url: https://stars-app.renci.org/ubergraph/sparql\n      example_queries:\n         - query_template: PairwiseCommonSubClassAncestor\n           bindings:\n              node1: GO:0046220\n              node2: GO:0008295\n```\n\nSee config_schema.yaml for the schema for endpoints\n\nNote there is a rich metadata model that is intended to facilitate\napplications and automated testing. It should be possible to automatically determine\nwhich templates are compatible with which endpoints based on provided metadata.\n\n## Adding your own templates\n\nCurrently this library is easiest to use if you are working with the existing pre-defined templates (PRs are welcome)\n\nHowever, you can use the framework with your own templates for your own triple data.\n__THIS IS NOT YET WELL-SUPPORTED__\nThere are a couple of steps involved,\nin future this should be easier.\n\nFirst you need to create your own yaml file. This needs to conform to\nthe LinkML metamodel - we recommend just copying an existing template\nto do this. 
Some of this may seem like unnecessary boilerplate at this\nstage, but it will come in useful later.\n\nNext you need to compile the template:\n\n```bash\ngen-python my_template.yaml \u003e my_template.py\n```\n\nThis requires [linkml](https://linkml.io/linkml/) (this library uses linkml as a developer dependency)\n\nYou will need to pass BOTH of these as arguments to sparqlfun (`-m` and `-S`)\n\nTODO:\n\n - add a dependency to the full linkml framework\n - allow dynamic compilation of templates\n\n## See also\n\nThis was inspired by and designed as a replacement for the powerful but arcane [sparqlprog](https://github.com/cmungall/sparqlprog/) system.\n\nTODO: list other SPARQL template frameworks\n\n## TODOs\n\n - Better Document\n     - framework\n     - templates\n     - How-tos for use with Python, SHACL, ...\n     - exemplar notebooks\n - Unify with SQL/rdftab functionality in semantic-sql\n - Cypher bindings\n - Split into bio-specific\n - Expose more ubergraph awesomeness\n - FastAPI/serverless endpoint\n - Expose more validation\n - Integrate visualization / obographviz\n - compilation to other frameworks, e.g. grlc\n - Chaining\n    - inject output from one into another and merge results, e.g. to get labels\n    - similar to wikidata services\n - UI/yasgui integration\n - generation from dosdp (use dosdp-query algorithm)\n - Templates for\n    - uniprot\n    - gocams\n    - wikidata\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flinkml%2Fsparqlfun","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flinkml%2Fsparqlfun","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flinkml%2Fsparqlfun/lists"}