{"id":42358639,"url":"https://github.com/ontodev/rdftab.rs","last_synced_at":"2026-01-27T16:38:03.240Z","repository":{"id":48920465,"uuid":"269104056","full_name":"ontodev/rdftab.rs","owner":"ontodev","description":"RDF Tables in Rust","archived":false,"fork":false,"pushed_at":"2022-08-26T14:39:09.000Z","size":123,"stargazers_count":17,"open_issues_count":17,"forks_count":3,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-05-01T02:41:12.764Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ontodev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-06-03T14:03:53.000Z","updated_at":"2024-12-12T05:01:50.000Z","dependencies_parsed_at":"2023-01-16T21:15:50.019Z","dependency_job_id":null,"html_url":"https://github.com/ontodev/rdftab.rs","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/ontodev/rdftab.rs","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ontodev%2Frdftab.rs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ontodev%2Frdftab.rs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ontodev%2Frdftab.rs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ontodev%2Frdftab.rs/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ontodev","download_url":"https://codeload.github.com/ontodev/rdftab.rs/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ontodev%2Frdfta
b.rs/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28816563,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-27T12:25:15.069Z","status":"ssl_error","status_checked_at":"2026-01-27T12:25:05.297Z","response_time":168,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-01-27T16:38:03.180Z","updated_at":"2026-01-27T16:38:03.231Z","avatar_url":"https://github.com/ontodev.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# rdftab.rs: RDF Tables with Rust\n\n`rdftab` reads RDFXML and generates a `statements` table like this:\n\nstanza | subject | predicate          | object                   | value | datatype | language\n-------|---------|--------------------|--------------------------|-------|----------|----------\nex:foo | ex:foo  | rdfs:label         |                          | Foo   |          |\nex:foo | ex:foo  | rdfs:label         |                          | Fou   |          | fr\nex:foo | ex:foo  | ex:size            |                          | 123   | xsd:int  |\nex:foo | ex:foo  | ex:link            | \u003chttp://example.com/foo\u003e |       |          |\nex:foo | ex:foo  | rdf:type           | owl:Class                |       |          |\nex:foo | ex:foo  | rdfs:subClassOf    | _:b1                     |       |          |\nex:foo | _:b1    | rdf:type           | 
owl:Restriction          |       |          |\nex:foo | _:b1    | owl:onProperty     | ex:part-of               |       |          |\nex:foo | _:b1    | owl:someValuesFrom | ex:bar                   |       |          |\n\nThis is an early prototype that only works with RDFXML input and SQLite databases.\nWe use the Rust programming language to read and insert as quickly as possible,\nusing as little memory as possible.\n\n## Usage\n\n1. download the binary for your platform\n   from the \"Assets\" section of the latest release on the\n   [Releases](https://github.com/ontodev/rdftab.rs/releases) page.\n2. make sure that the binary is executable\n3. create a SQLite database file with a [`prefix`](src/prefix.sql) table\n4. run `rdftab` with the database you want to use, and the RDFXML input as STDIN\n5. query your database with SQLite\n\n```\n$ curl -L -o rdftab https://github.com/ontodev/rdftab.rs/releases/download/v0.1.1/rdftab-x86_64-apple-darwin\n$ chmod +x rdftab\n$ sqlite3 example.db \u003c test/prefix.sql\n$ ./rdftab example.db \u003c test/example.owl\n$ sqlite3 example.db\n\u003e select * from statements limit 3;\n```\n\n## Build\n\nIf we haven't provided a binary for your platform,\nor you want to modify the `rdftab` code,\nyou can build the code as you would any Rust project:\n\n1. install Rust tools: [`rustup`](https://rustup.rs)\n2. clone this repository: `git clone https://github.com/ontodev/rdftab.rs \u0026\u0026 cd rdftab.rs`\n3. 
run [`cargo build`](https://doc.rust-lang.org/cargo/guide/working-on-an-existing-project.html)\n\n## Motivation\n\nRDF data consists of subject-predicate-object triples that form a graph.\nWith SPARQL we can perform complex queries over that graph.\nWith OWLAPI we can interpret that graph as a rich set of logical axioms.\nBut loading a large RDF graph into OWLAPI or a triplestore for SPARQL\ncan be slow and require a lot of memory.\n\nIn many cases the queries we want to run are actually quite simple.\nWe often just want all the triples associated with a set of terms,\nor all the subjects that match a given predicate and object.\nIn these cases, SQLite is actually very fast, efficient, and effective.\nBetter yet, you can use SQLite from the command line\nor pretty much any programming language.\n\n## Examples\n\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003cth\u003eTask\u003c/th\u003e\n    \u003cth\u003eSQL\u003c/th\u003e\n    \u003cth\u003eSPARQL\u003c/th\u003e\n  \u003c/tr\u003e\n\n  \u003ctr\u003e\n    \u003ctd\u003eGet subjects with labels\u003c/td\u003e\n    \u003ctd\u003e\n      \u003cpre lang=\"sql\"\u003eSELECT subject, value AS label\nFROM statements\nWHERE predicate = \"rdfs:label\";\u003c/pre\u003e\n    \u003c/td\u003e\n    \u003ctd\u003e\n      \u003cpre lang=\"sparql\"\u003eSELECT ?subject ?label\nWHERE {\n  ?subject rdfs:label ?label .\n}\u003c/pre\u003e\n    \u003c/td\u003e\n  \u003c/tr\u003e\n\n  \u003ctr\u003e\n    \u003ctd\u003eGet OWL classes with labels\u003c/td\u003e\n    \u003ctd\u003e\n      \u003cpre lang=\"sql\"\u003eSELECT s1.subject, s2.value AS label\nFROM statements s1\nJOIN statements s2 ON s2.subject = s1.subject\nWHERE s1.predicate = \"rdf:type\"\n  AND s1.object = \"owl:Class\"\n  AND s2.predicate = \"rdfs:label\";\u003c/pre\u003e\n    \u003c/td\u003e\n    \u003ctd\u003e\n      \u003cpre lang=\"sparql\"\u003eSELECT ?subject ?label\nWHERE {\n  ?subject\n    rdf:type owl:Class ;\n    rdfs:label ?label .\n}\u003c/pre\u003e\n    
\u003c/td\u003e\n  \u003c/tr\u003e\n\n  \u003ctr\u003e\n    \u003ctd\u003eGet all triples for a subject, including nested anonymous structures such as OWL class expressions and OWL annotation axioms\u003c/td\u003e\n    \u003ctd\u003e\n      \u003cpre lang=\"sql\"\u003eSELECT *\nFROM statements\nWHERE stanza = \"ex:foo\";\u003c/pre\u003e\n    \u003c/td\u003e\n    \u003ctd\u003e\n    Annoying...\n    \u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n\n## Design\n\nIf you've worked with RDF before,\nall of these columns in the example above should be familiar,\nexcept for `stanza`.\nWe'll discuss stanzas in a moment.\n\nIn each of these columns, values are encoded pretty much as you would in Turtle syntax:\n\n- IRIs (URLs) are wrapped in angle brackets: `\u003chttp://example.com/foo\u003e`\n- prefixed names use a prefix from the `prefix` table: `ex:foo`\n- blank nodes start with `_:`: `_:b1234`\n\nSome differences from Turtle syntax:\n\n- literals are multiline strings, without enclosing quotation marks or escaping\n- language tags do not include an `@`\n\nThis means it's quite simple to convert this table to Turtle format.\nAs a first pass:\n\n```sql\nSELECT\n  \"@prefix \" || prefix || \": \u003c\" || base || \"\u003e .\"\nFROM prefix\nUNION ALL\nSELECT \n   subject\n|| \" \"\n|| predicate\n|| \" \"\n|| coalesce(\n     object,\n     \"\"\"\" || value || \"\"\"^^\" || datatype,\n     \"\"\"\" || value || \"\"\"@\" || language,\n     \"\"\"\" || value || \"\"\"\"\n   )\n|| \" .\"\nFROM statements;\n```\n\nThe [`src/turtle.sql`](src/turtle.sql) file is a more complete example,\nwith better escaping of special characters.\n\n### Objects\n\nWe use four columns to encode RDF objects, which fall into four types:\n\n1. IRI: use the `object` column; `value`, `datatype`, and `language` are NULL\n2. Plain literal: use the `value` column; `object`, `datatype`, and `language` are NULL\n3. 
Typed literal: use the `value` and `datatype` columns; `object` and `language` are NULL\n4. Language-tagged literal: use the `value` and `language` columns; `object` and `datatype` are NULL\n\n### Prefixes\n\nWhile any IRI can be wrapped in angle brackets,\nit's much easier for people to read prefixed names.\nWhen reading RDFXML, `rdftab` uses a `prefix` table from your SQLite database,\nand tries to convert each IRI it encounters into a prefixed name.\n[`src/prefix.sql`](src/prefix.sql) provides an example.\n\nSome warnings:\n\n- Since SQL simply compares strings, not expanded IRIs,\n  it's your job to ensure that your prefixes are consistent across your data.\n- Turtle prefixed names are a superset of XML QNames and a subset of CURIEs.\n  `rdftab`'s prefix handling is currently very primitive.\n  Depending on your choices of prefixes and the IRIs in your RDF,\n  `rdftab` may generate prefixed names that are not valid in Turtle.\n\n### Stanzas\n\nThe RDF graph structure is exceedingly simple.\nTo encode data with more structure than a simple triple,\nwe usually construct some sort of tree using blank nodes as subjects.\nTo encode an OWL class expression \"rdfs:subClassOf (ex:part-of some ex:bar)\"\nwe use a little tree like this:\n\n```ttl\nex:foo rdfs:subClassOf _:b1 .\n_:b1 rdf:type owl:Restriction .\n_:b1 owl:onProperty ex:part-of .\n_:b1 owl:someValuesFrom ex:bar .\n```\n\nWhen we want to query for all the information about `ex:foo`,\nwe can't simply ask for all the subjects matching `ex:foo`.\nWe also have to query for `_:b1`.\nIn general, we have to recurse through these trees of blank nodes.\n\nTurtle provides some \"syntactic sugar\" for nested anonymous structures,\nand Turtle processors also group together all the triples about a given subject:\n\n```ttl\nex:foo\n  rdfs:label \"Foo\", \"Fou\"@fr ;\n  ex:size \"123\"^^xsd:int ;\n  ex:link \u003chttp://example.com/foo\u003e ;\n  rdf:type owl:Class ;\n  rdfs:subClassOf [\n    rdf:type owl:Restriction ;\n    
owl:onProperty ex:part-of ;\n    owl:someValuesFrom ex:bar\n  ] .\n```\n\nWhen we ask for all the information about `ex:foo` this is what we want!\nIn the [Turtle grammar](https://www.w3.org/TR/turtle/#sec-grammar-grammar)\nthis is just called `triples`,\nbut we call it a \"stanza\".\nRDFXML has a similar stanza structure,\nwhere each child element of the root element is a tree\nspecifying a particular subject,\nand various nested anonymous structures are encoded in the XML tree structure.\n\n```xml\n\u003cowl:Class rdf:about=\"http://example.com/foo\"\u003e\n  \u003crdfs:label\u003eFoo\u003c/rdfs:label\u003e\n  \u003crdfs:label xml:lang=\"fr\"\u003eFou\u003c/rdfs:label\u003e\n  \u003cex:size rdf:datatype=\"http://www.w3.org/2001/XMLSchema#int\"\u003e123\u003c/ex:size\u003e\n  \u003cex:link rdf:resource=\"http://example.com/foo\"/\u003e\n  \u003crdfs:subClassOf\u003e\n    \u003cowl:Restriction\u003e\n      \u003cowl:onProperty rdf:resource=\"http://example.com/part-of\"/\u003e\n      \u003cowl:someValuesFrom rdf:resource=\"http://example.com/bar\"/\u003e\n    \u003c/owl:Restriction\u003e\n  \u003c/rdfs:subClassOf\u003e\n\u003c/owl:Class\u003e\n```\n\n(See [example.owl](test/example.owl).)\n\nTo encode stanza information in the `statements` table,\n`rdftab` uses a [slightly modified version](https://github.com/ontodev/rio)\nof [`rio`](https://github.com/oxigraph/rio)\nthat emits a special triple when a child element of the RDFXML root element is closed.\nThis information is used to associate the \"top-level subject\"\nwith all the triples that came out of that element.\nWe put that top-level subject in the `stanza` column.\n\nLooking back to our main example,\nyou can see that the subjects `ex:foo` and `_:b1` both have the same stanza `ex:foo`.\nNow when we query SQLite for `stanza = \"ex:foo\"`\nwe will get all the triples for the subject `ex:foo`\n**and** all of the nested anonymous structures.\n\nNote that the `stanza` column is usually a named subject,\nbut there are 
also cases where the top-level subject is a blank node.\n\n### OWL Annotation Axioms\n\nOWL Annotation Axioms provide a way to make statements about other statements in the RDF graph.\nFor example, we can add a comment on a label:\n\n```ttl\nex:foo rdfs:label \"Foo\" .\n[ rdf:type owl:Axiom ;\n  owl:annotatedSource ex:foo ;\n  owl:annotatedProperty rdfs:label ;\n  owl:annotatedTarget \"Foo\" ;\n  rdfs:comment \"A silly label\"\n] .\n```\n\nThe top-level subject for the OWL Annotation Axiom is a blank node.\nHowever, when we query for `ex:foo` we want to get this information as well.\nSo `rdftab` looks for the `owl:annotatedSource` predicate,\nand uses the object of that triple as the stanza.\n\nstanza | subject | predicate             | object                   | value         | datatype | language\n-------|---------|-----------------------|--------------------------|---------------|----------|----------\nex:foo | ex:foo  | rdfs:label            |                          | Foo           |          |\nex:foo | _:b1    | rdf:type              | owl:Axiom                |               |          |\nex:foo | _:b1    | owl:annotatedSource   | ex:foo                   |               |          |\nex:foo | _:b1    | owl:annotatedProperty | rdfs:label               |               |          |\nex:foo | _:b1    | owl:annotatedTarget   |                          | Foo           |          |\nex:foo | _:b1    | rdfs:comment          |                          | A silly label |          |\n\n\n### Stanza edge cases for OWL\n\nNote that OWL does not prioritize the directionality of some symmetric axiom types - for example, when you have a disjointness axiom or equivalence axiom connecting two named classes. In this case, the stanza is chosen arbitrarily.\n\nE.g. 
in an ontology that has `A disjointWith B` we get:\n\nstanza|subject|predicate|object|value|datatype|language\n---|---|---|---|---|---|---\nex:B|ex:B|rdfs:label||B||\nex:B|ex:B|owl:disjointWith|ex:a|||\nex:B|ex:B|rdf:type|owl:Class|||\nex:a|ex:a|rdfs:label||A||\nex:a|ex:a|rdf:type|owl:Class|||\n\nThis means that the convenience query shown above does not reliably fetch all axioms for a class.\nFor example, in the above querying on A would not get the disjointness axiom\n\nAdditionally, the stanza name may not be meaningful for GCIs.\nFor example, given an axiom in Manchester syntax:\n\n```\ndevelops-from some (part-of some B) DisjointWith develops-from some (part-of some A)\n```\n\nwhich gives you this RDF/OWL:\n\n```xml\n   \u003cowl:Restriction\u003e\n        \u003cowl:onProperty rdf:resource=\"http://example.com/develops-from\"/\u003e\n        \u003cowl:someValuesFrom\u003e\n            \u003cowl:Restriction\u003e\n                \u003cowl:onProperty rdf:resource=\"http://example.com/part-of\"/\u003e\n                \u003cowl:someValuesFrom rdf:resource=\"http://example.com/B\"/\u003e\n            \u003c/owl:Restriction\u003e\n        \u003c/owl:someValuesFrom\u003e\n        \u003cowl:disjointWith\u003e\n            \u003cowl:Restriction\u003e\n                \u003cowl:onProperty rdf:resource=\"http://example.com/develops-from\"/\u003e\n                \u003cowl:someValuesFrom\u003e\n                    \u003cowl:Restriction\u003e\n                        \u003cowl:onProperty rdf:resource=\"http://example.com/part-of\"/\u003e\n                        \u003cowl:someValuesFrom rdf:resource=\"http://example.com/a\"/\u003e\n                    \u003c/owl:Restriction\u003e\n                \u003c/owl:someValuesFrom\u003e\n            \u003c/owl:Restriction\u003e\n        \u003c/owl:disjointWith\u003e\n    \u003c/owl:Restriction\u003e\n```\n\nwith RDFTab we 
get:\n\nstanza|subject|predicate|object|value|datatype|language\n---|---|---|---|---|---|---\nowl:disjointWith|_:riog00000011|owl:disjointWith|_:riog00000013|||\nowl:disjointWith|_:riog00000013|owl:someValuesFrom|_:riog00000014|||\nowl:disjointWith|_:riog00000014|owl:someValuesFrom|ex:a|||\nowl:disjointWith|_:riog00000014|owl:onProperty|ex:part-of|||\nowl:disjointWith|_:riog00000014|rdf:type|owl:Restriction|||\nowl:disjointWith|_:riog00000013|owl:onProperty|ex:develops-from|||\nowl:disjointWith|_:riog00000013|rdf:type|owl:Restriction|||\nowl:disjointWith|_:riog00000011|owl:someValuesFrom|_:riog00000012|||\nowl:disjointWith|_:riog00000012|owl:someValuesFrom|ex:B|||\nowl:disjointWith|_:riog00000012|owl:onProperty|ex:part-of|||\nowl:disjointWith|_:riog00000012|rdf:type|owl:Restriction|||\nowl:disjointWith|_:riog00000011|owl:onProperty|ex:develops-from|||\nowl:disjointWith|_:riog00000011|rdf:type|owl:Restriction|||\n\nIn this case, to fetch all axioms for class ex:a or ex:B we need to iteratively query to walk up the graph\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fontodev%2Frdftab.rs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fontodev%2Frdftab.rs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fontodev%2Frdftab.rs/lists"}