{"id":25843058,"url":"https://github.com/incatools/semantic-sql","last_synced_at":"2025-04-05T09:09:32.145Z","repository":{"id":38285086,"uuid":"364391002","full_name":"INCATools/semantic-sql","owner":"INCATools","description":"SQL and SQLite builds of OWL ontologies","archived":false,"fork":false,"pushed_at":"2025-02-25T19:18:52.000Z","size":8039,"stargazers_count":43,"open_issues_count":13,"forks_count":5,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-03-29T08:11:14.059Z","etag":null,"topics":["linkml","oaklib","obofoundry","ontologies","owl","relation-graph","sparql","sql"],"latest_commit_sha":null,"homepage":"https://incatools.github.io/semantic-sql/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/INCATools.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":".github/CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":".github/CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-05-04T21:31:08.000Z","updated_at":"2025-03-07T08:50:19.000Z","dependencies_parsed_at":"2023-02-01T17:30:57.229Z","dependency_job_id":"36fdea6a-77d9-41c7-b8ff-ceac232828d7","html_url":"https://github.com/INCATools/semantic-sql","commit_stats":{"total_commits":176,"total_committers":7,"mean_commits":"25.142857142857142","dds":"0.10795454545454541","last_synced_commit":"b87e6239504e6a24fe1876c15f8716f249049628"},"previous_names":[],"tags_count":17,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/INCATools%2Fsemantic-sql","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/INCATools%2Fsemantic
-sql/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/INCATools%2Fsemantic-sql/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/INCATools%2Fsemantic-sql/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/INCATools","download_url":"https://codeload.github.com/INCATools/semantic-sql/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247312082,"owners_count":20918344,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["linkml","oaklib","obofoundry","ontologies","owl","relation-graph","sparql","sql"],"created_at":"2025-03-01T06:37:50.490Z","updated_at":"2025-04-05T09:09:32.105Z","avatar_url":"https://github.com/INCATools.png","language":"Python","readme":"# SemSQL: standard SQL views for RDF/OWL ontologies\n\n[![PyPI version](https://badge.fury.io/py/semsql.svg)](https://badge.fury.io/py/semsql)\n![](https://github.com/incatools/semantic-sql/workflows/Build/badge.svg)\n\n\nThis project provides a standard collection of SQL tables/views for ontologies, such that you can make queries like this,\nto find all terms starting with `Abnormality` in [HPO](https://obofoundry.org/ontology/hp).\n\n```sql\n$ sqlite db/hp.db\nsqlite\u003e SELECT * FROM rdfs_label_statement WHERE value LIKE 'Abnormality of %';\n```\n\n|stanza|subject|predicate|object|value|datatype|language|\n|---|---|---|---|---|---|---|\n|HP:0000002|HP:0000002|rdfs:label||Abnormality of body height|xsd:string||\n|HP:0000014|HP:0000014|rdfs:label||Abnormality of the 
bladder|xsd:string||\n|HP:0000022|HP:0000022|rdfs:label||Abnormality of male internal genitalia|xsd:string||\n|HP:0000032|HP:0000032|rdfs:label||Abnormality of male external genitalia|xsd:string||\n\n\nReady-made SQLite3 builds can also be downloaded for any ontology in [OBO](http://obofoundry.org), using URLs such as https://s3.amazonaws.com/bbop-sqlite/hp.db.gz\n\n[relation-graph](https://github.com/balhoff/relation-graph/) is used to pre-generate tables of [entailed edges](https://incatools.github.io/semantic-sql/EntailedEdge/). For example,\nall is-a and part-of ancestors of [finger](http://purl.obolibrary.org/obo/UBERON_0002389) in Uberon:\n\n```sql\n$ sqlite db/uberon.db\nsqlite\u003e SELECT * FROM entailed_edge WHERE subject='UBERON:0002389' and predicate IN ('rdfs:subClassOf', 'BFO:0000050');\n```\n\n|subject, predicate, object|\n|---|\n|UBERON:0002389, BFO:0000050, UBERON:0015212|\n|UBERON:0002389, BFO:0000050, UBERON:5002389|\n|UBERON:0002389, BFO:0000050, UBERON:5002544|\n|UBERON:0002389, rdfs:subClassOf, UBERON:0000061|\n|UBERON:0002389, rdfs:subClassOf, UBERON:0000465|\n|UBERON:0002389, rdfs:subClassOf, UBERON:0000475|\n\nSQLite provides many advantages\n\n- files can be downloaded and subsequently queried without network latency\n- compared to querying a static rdf, owl, or obo file, there is no startup/parse delay\n- robust and performant\n- excellent support in many languages\n\nAlthough the focus is on SQLite, this library can also be used for other DBMSs like PostgreSQL, MySQL, Oracle, etc\n\n## Tutorials\n\n- SemSQL: [notebooks/SemanticSQL-Tutorial.ipynb](https://github.com/INCATools/semantic-sql/blob/main/notebooks/SemanticSQL-Tutorial.ipynb)\n- Using OAK: [part 7 of OAK tutorial](https://incatools.github.io/ontology-access-kit/intro/tutorial07.html)\n\n## Installation\n\nSemSQL comes with a helper Python library. Use of this is optional. 
To install:\n\n```bash\npip install semsql\n```\n\n## Download ready-made SQLite databases\n\nPre-generated SQLite databases are created weekly for all OBO ontologies and a selection of others (see [ontologies.yaml](https://github.com/INCATools/semantic-sql/blob/main/src/semsql/builder/registry/ontologies.yaml)).\n\nTo download:\n\n```bash\nsemsql download obi -o obi.db\n```\n\nOr simply download using a URL of the form:\n\n- https://s3.amazonaws.com/bbop-sqlite/hp.db.gz\n\n## Attaching databases\n\nIf you are using sqlite3, then databases can be attached to facilitate cross-database joins.\n\nFor example, many ontologies use ORCID URIs as the object of `dcterms:contributor` and `dcterms:creator` statements, but these are left \"dangling\". Metadata about these ORCIDs is available in the semsql orcid database instance (derived from [wikidata-orcid-ontology](https://github.com/cthoyt/wikidata-orcid-ontology)), in the [Orcid table](https://incatools.github.io/semantic-sql/Orcid).\n\nYou can use [ATTACH DATABASE](https://www.sqlite.org/lang_attach.html) to connect two databases, for example:\n\n```sql\n$ sqlite3 db/cl.db\nsqlite\u003e attach 'db/orcid.db' as orcid_db;\nsqlite\u003e select * from contributor inner join orcid_db.orcid on (orcid.id=contributor.object) where orcid.label like 'Chris%';\nobo:cl.owl|obo:cl.owl|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall\nCL:0010001|CL:0010001|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall\nCL:0010002|CL:0010002|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall\nCL:0010003|CL:0010003|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall\nCL:0010004|CL:0010004|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. 
Mungall\nUBERON:0000093|UBERON:0000093|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall\nUBERON:0000094|UBERON:0000094|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall\nUBERON:0000095|UBERON:0000095|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall\nUBERON:0000179|UBERON:0000179|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall\nUBERON:0000201|UBERON:0000201|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall\nUBERON:0000202|UBERON:0000202|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall\nUBERON:0000203|UBERON:0000203|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall\nUBERON:0000204|UBERON:0000204|dcterms:contributor|orcid:0000-0002-6601-2165||||orcid:0000-0002-6601-2165|Christopher J. Mungall\n```\n\n## Creating a SQLite database from an OWL file\n\nThere are two protocols for doing this:\n\n1. install build dependencies\n2. use Docker\n\nIn either case:\n\n- The input MUST be in RDF/XML serialization and have the suffix `.owl`\n- use robot to convert if the format is different\n\nWe are planning to simplify this process in the future.\n\n### 1. Build a SQLite database directly\n\nThis requires some basic technical knowledge about how to install things on your machine\nand how to put things in your PATH. It does not require Docker.\n\nRequirements:\n\n- [rdftab.rs](https://github.com/ontodev/rdftab.rs)\n- [relation-graph](https://github.com/balhoff/relation-graph) `2.3.1` or higher\n\nAfter installing these and putting both `relation-graph` and `rdftab.rs` in your PATH:\n\n```bash\nsemsql make foo.db\n```\n\nThis assumes `foo.owl` is in the same folder.\n\n### 2. 
Use Docker\n\nThere are two Docker images that can be used:\n\n- [ODK](https://hub.docker.com/r/obolibrary/odkfull)\n- [semantic-sql](https://hub.docker.com/repository/docker/linkml/semantic-sql)\n\nThe ODK image may lag behind.\n\n```bash\ndocker run -v $PWD:/work -w /work -ti linkml/semantic-sql semsql make foo.db\n```\n\n## Schema\n\nSee [Schema Documentation](https://incatools.github.io/semantic-sql/)\n\nThe [source schema](https://github.com/INCATools/semantic-sql/tree/main/src/semsql/linkml) is in [LinkML](https://linkml.io) - this is then compiled down to SQL tables and views.\n\nThe basic idea is as follows:\n\nThere are a small number of \"base tables\":\n\n* [statements](https://incatools.github.io/semantic-sql/Statements/)\n* [prefix](https://incatools.github.io/semantic-sql/Prefix/)\n* [entailed_edge](https://incatools.github.io/semantic-sql/EntailedEdge/) - populated by relation-graph\n\nAll other tables are actually views (derived tables), and are provided for convenience.\n\n## ORM Layer\n\nA SemSQL relational database can be accessed in exactly the same way as any other SQL database.\n\nFor convenience, we provide a Python Object-Relational Mapping (ORM) layer using SQLAlchemy.\nThis allows for code like the following, which joins [RdfsSubclassOfStatement](https://incatools.github.io/semantic-sql/RdfsSubclassOfStatement) and [existential restrictions](https://incatools.github.io/semantic-sql/OwlSomeValuesFrom):\n\n```python\nimport logging\n\nfrom sqlalchemy import create_engine\nfrom sqlalchemy.orm import sessionmaker\n# Import path assumed from the semsql package layout\nfrom semsql.sqla.semsql import OwlSomeValuesFrom, RdfsSubclassOfStatement\n\nengine = create_engine(\"sqlite:////path/to/go.db\")\nSessionClass = sessionmaker(bind=engine)\nsession = SessionClass()\n# Pair each subClassOf axiom with the existential restriction it points to\nq = session.query(RdfsSubclassOfStatement)\nq = q.add_entity(OwlSomeValuesFrom)\nq = q.join(OwlSomeValuesFrom, RdfsSubclassOfStatement.object == OwlSomeValuesFrom.id)\n\nlines = []\nfor ax, ex in q.all():\n    line = f'{ax.subject} subClassOf {ex.on_property} SOME {ex.filler}'\n    logging.info(line)\n    lines.append(line)\n```\n\n(this example is just for illustration - to do the same thing there is a 
simpler Edge relation)\n\n## Applications\n\nThe semsql Python library is intentionally low level - we recommend using the [ontology-access-kit](https://github.com/INCATools/ontology-access-kit)\n\nFor example:\n\n```bash\nrunoak -i db/envo.db search t~biome\n```\n\nYou can also pass in an OWL file and have the SQLite database built on the fly:\n\n```bash\nrunoak -i sqlite:envo.owl search t~biome\n```\n\nEven if using OAK, it can be useful to access SQL tables directly to do complex multi-join queries in a performant way.\n\n## Optimization\n\n```bash\npoetry run semsql view2table edge --full-index | sqlite3 $db/mydb.db\n```\n\nSee [indexes](indexes) for some ready-made indexes\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fincatools%2Fsemantic-sql","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fincatools%2Fsemantic-sql","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fincatools%2Fsemantic-sql/lists"}