https://github.com/ontodev/ldtab

Linked Data Tables: General documentation, specifications, and tests

Host: GitHub
URL: https://github.com/ontodev/ldtab
Owner: ontodev
License: bsd-3-clause
Created: 2023-11-28T16:55:15.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2024-05-05T17:04:40.000Z (about 2 years ago)
Last Synced: 2024-05-05T18:23:59.589Z (about 2 years ago)
Language: Makefile
Size: 29.3 KB
Stars: 0
Watchers: 3
Forks: 0
Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE

# ldtab: Linked Data Tables

`ldtab` reads an RDF graph and generates a `statements` table like this:

| assertion | retraction | graph | subject | predicate | object | datatype | annotation |
|-----------|------------|-------|---------|-----------|--------|----------|------------|
| 1 | 0 | graph | pizza:Pizza | skos:prefLabel | Pizza | @en | |
| 1 | 0 | graph | pizza:Pizza | rdfs:seeAlso | | _IRI | |
| 1 | 0 | graph | pizza:Pizza | rdfs:label | Pizza | @en | |
| 1 | 0 | graph | pizza:Pizza | rdfs:subClassOf | {"owl:onProperty":[{"datatype":"_IRI","object":"pizza:hasBase"}],"owl:someValuesFrom":[{"datatype":"_IRI","object":"pizza:PizzaBase"}],"rdf:type":[{"datatype":"_IRI","object":"owl:Restriction"}]} | _JSON | |
| 1 | 0 | graph | pizza:Pizza | rdfs:subClassOf | pizza:Food | _IRI | |
| 1 | 0 | graph | pizza:Pizza | rdf:type | owl:Class | _IRI | |

The design of `ldtab` is still in development. 

A prototype implementation is available in Clojure: [ldtab.clj](https://github.com/ontodev/ldtab.clj).

This implementation uses Jena to parse input RDF graphs and supports SQLite and PostgreSQL databases.

## Motivation

The motivation for `ldtab` is threefold:

1. facilitate work with *large RDF graphs*,

2. *simplify* certain SPARQL queries for complex RDF structures involving blank nodes,

3. enable text-based *diffs* between different versions of an RDF graph.

The following provides more details and examples for each of these goals. 

### 1. Querying large RDF Graphs 

RDF data consists of subject-predicate-object triples that form a graph.

With SPARQL we can perform queries over that graph.

However, loading a large RDF graph into a triplestore for SPARQL can be slow and require a lot of memory (similar issues exist with tools for OWL ontologies).

Yet, in many cases the queries we want to run are actually quite simple.

We often just want all the triples associated with a set of terms, or all the subjects that match a given predicate and object.

In these cases, SQLite is both efficient and effective.

Consider the following examples:

  

Task: Get subjects with labels

SQL:

```sql
SELECT subject, object AS label
FROM statements
WHERE predicate = 'rdfs:label';
```

SPARQL:

```sparql
SELECT ?subject ?label
WHERE {
  ?subject rdfs:label ?label .
}
```

Task: Get OWL classes with labels

SQL:

```sql
SELECT s1.subject, s2.object AS label
FROM statements s1
JOIN statements s2 ON s2.subject = s1.subject
WHERE s1.predicate = 'rdf:type'
  AND s1.object = 'owl:Class'
  AND s2.predicate = 'rdfs:label';
```

SPARQL:

```sparql
SELECT ?subject ?label
WHERE {
  ?subject
    rdf:type owl:Class ;
    rdfs:label ?label .
}
```
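The SQL side of this comparison can be tried without a triplestore. The following is a minimal sketch using Python's built-in `sqlite3` module, not the `ldtab.clj` implementation. The column names come from the example `statements` table; the column types, the helper names `build_db` and `owl_classes_with_labels`, and the `pizza:hasBase` rows are assumptions made for illustration.

```python
import sqlite3

# Column names follow the README's example table.
# The SQL types and constraints are assumptions, not the ldtab specification.
SCHEMA = """
CREATE TABLE statements (
  assertion  INTEGER NOT NULL,
  retraction INTEGER NOT NULL DEFAULT 0,
  graph      TEXT NOT NULL,
  subject    TEXT NOT NULL,
  predicate  TEXT NOT NULL,
  object     TEXT NOT NULL,
  datatype   TEXT NOT NULL,
  annotation TEXT
)
"""

ROWS = [
    (1, 0, "graph", "pizza:Pizza", "rdf:type", "owl:Class", "_IRI", None),
    (1, 0, "graph", "pizza:Pizza", "rdfs:label", "Pizza", "@en", None),
    (1, 0, "graph", "pizza:Pizza", "rdfs:subClassOf", "pizza:Food", "_IRI", None),
    # Illustrative rows (not from the README): a labelled term that is not an owl:Class.
    (1, 0, "graph", "pizza:hasBase", "rdf:type", "owl:ObjectProperty", "_IRI", None),
    (1, 0, "graph", "pizza:hasBase", "rdfs:label", "has base", "@en", None),
]


def build_db(rows):
    """Create an in-memory database holding a populated statements table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO statements VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)
    return conn


def owl_classes_with_labels(conn):
    """Self-join: subjects typed as owl:Class, paired with their rdfs:label."""
    query = """
        SELECT s1.subject, s2.object AS label
        FROM statements s1
        JOIN statements s2 ON s2.subject = s1.subject
        WHERE s1.predicate = ?
          AND s1.object = ?
          AND s2.predicate = ?
        ORDER BY s1.subject
    """
    return conn.execute(query, ("rdf:type", "owl:Class", "rdfs:label")).fetchall()


conn = build_db(ROWS)
labelled_classes = owl_classes_with_labels(conn)
print(labelled_classes)
```

Passing the values as bound parameters sidesteps quoting differences between SQLite and PostgreSQL for string literals.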

    

  

### 2. Simplify Complex Queries 

Querying RDF data for an entity can be annoying and error-prone if the entity's representation involves complex structures, such as compound OWL class expressions or OWL annotation axioms.

In `ldtab`, such queries can be constructed in a straightforward manner:

  

Task: Get all relevant RDF triples for a subject (including nested anonymous structures such as OWL class expressions)

SQL:

```sql
SELECT *
FROM statements
WHERE subject = 'pizza:Pizza';
```

SPARQL: Annoying...
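To show why a single `SELECT` is enough, here is a hedged sketch of reading the nested structure back out. It assumes, as the example table suggests, that rows with datatype `_JSON` hold a JSON object in the `object` column. The table is reduced to four columns for brevity, and the helper name `describe` is hypothetical.

```python
import json
import sqlite3

# The restriction from the README's example row, written as a Python dict.
RESTRICTION = {
    "owl:onProperty": [{"datatype": "_IRI", "object": "pizza:hasBase"}],
    "owl:someValuesFrom": [{"datatype": "_IRI", "object": "pizza:PizzaBase"}],
    "rdf:type": [{"datatype": "_IRI", "object": "owl:Restriction"}],
}

conn = sqlite3.connect(":memory:")
# Simplified table: only the columns this sketch needs (an assumption for brevity).
conn.execute(
    "CREATE TABLE statements (subject TEXT, predicate TEXT, object TEXT, datatype TEXT)"
)
conn.executemany(
    "INSERT INTO statements VALUES (?, ?, ?, ?)",
    [
        ("pizza:Pizza", "rdf:type", "owl:Class", "_IRI"),
        ("pizza:Pizza", "rdfs:subClassOf", "pizza:Food", "_IRI"),
        ("pizza:Pizza", "rdfs:subClassOf", json.dumps(RESTRICTION, sort_keys=True), "_JSON"),
    ],
)


def describe(conn, subject):
    """Return (predicate, object) pairs, decoding objects whose datatype is _JSON."""
    rows = conn.execute(
        "SELECT predicate, object, datatype FROM statements "
        "WHERE subject = ? ORDER BY rowid",
        (subject,),
    )
    return [(p, json.loads(o) if d == "_JSON" else o) for p, o, d in rows]


triples = describe(conn, "pizza:Pizza")
# The anonymous class expression arrives as one nested value, with no blank-node joins.
nested = [obj for _, obj in triples if isinstance(obj, dict)]
filler = nested[0]["owl:someValuesFrom"][0]["object"]
print(filler)
```

Because the blank-node structure travels inside one row, the query does not need to know how deeply the class expression is nested.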

    

  

### 3. Text-based Diffs between RDF Graphs

An RDF graph can be serialized in many equivalent ways.

Even for a given concrete syntax, the serialization of an RDF graph is not uniquely determined.

In practice, existing tools rarely guarantee that they will output exactly the same serialization (using a single concrete syntax) of a given RDF graph.

This makes it challenging to track changes in RDF graphs (or OWL ontologies) with popular version control systems such as Git.

`ldtab` provides support to serialize an RDF graph in a uniquely determined manner, enabling text-based `diff`s in version control systems.
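The idea can be illustrated with a small sketch. It assumes that sorting the rows and writing them as TSV is an acceptable canonical form; this is not necessarily the ordering or file format that `ldtab` itself uses. The function name `serialize` and the changed label in `v2` are hypothetical.

```python
import csv
import difflib
import io
import random

COLUMNS = ["assertion", "retraction", "graph", "subject",
           "predicate", "object", "datatype", "annotation"]


def serialize(rows):
    """Write rows as TSV in sorted order, so equal sets of rows give identical text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(sorted(rows))
    return buffer.getvalue()


v1 = [
    (1, 0, "graph", "pizza:Pizza", "rdf:type", "owl:Class", "_IRI", ""),
    (1, 0, "graph", "pizza:Pizza", "rdfs:label", "Pizza", "@en", ""),
    (1, 0, "graph", "pizza:Pizza", "rdfs:subClassOf", "pizza:Food", "_IRI", ""),
]

# The same rows in a different order serialize to the same text.
shuffled = v1[:]
random.Random(0).shuffle(shuffled)
same_text = serialize(v1) == serialize(shuffled)

# Version 2 changes the label (an illustrative edit, not from the README).
v2 = [row[:5] + ("Pizza pie",) + row[6:] if row[4] == "rdfs:label" else row
      for row in v1]

diff = list(difflib.unified_diff(
    serialize(v1).splitlines(), serialize(v2).splitlines(),
    fromfile="v1.tsv", tofile="v2.tsv", lineterm="",
))
# Keep only the added and removed rows, skipping the two file header lines.
changed = [line for line in diff
           if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))]
print("\n".join(diff))
```

With one row per statement and a fixed order, a change to a single triple shows up as one removed line and one added line.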
