https://github.com/weblyzard/streaming-sparql

Cross-server SPARQL query library with support for incremental, streaming result processing.

Host: GitHub
URL: https://github.com/weblyzard/streaming-sparql
Owner: weblyzard
License: apache-2.0
Created: 2017-08-26T09:53:49.000Z (over 8 years ago)
Default Branch: master
Last Pushed: 2024-05-31T13:37:26.000Z (over 1 year ago)
Last Synced: 2025-04-04T18:12:45.471Z (8 months ago)
Language: Java
Homepage:
Size: 140 KB
Stars: 6
Watchers: 13
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

awesome-semantic-web - streaming-sparql

README

## Streaming SPARQL
[![Build Status](https://github.com/weblyzard/streaming-sparql/actions/workflows/build.yml/badge.svg)](https://github.com/weblyzard/streaming-sparql/actions/workflows/build.yml)

Provides robust, incremental processing of streaming results received from SPARQL servers.
The `StreamingResultSet` iterator yields results as they are received from the server.

## Javadoc

http://javadoc.io/doc/com.weblyzard.sparql/streaming-sparql/

## Example code:
```java
try (StreamingResultSet s = StreamingQueryExecutor.getResultSet("http://dbpedia.org/sparql", "SELECT ?s ?p ?o WHERE { ?s ?p ?o. } LIMIT 5")) {
    // Rows are printed as the iterator hands them out, not after the full result has arrived
    while (s.hasNext()) {
        System.out.println("Tuple " + s.getRowNumber() + ": " + s.next());
    }
}
```
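
To illustrate what "yielding results as they are received" means, independently of this library, here is a minimal sketch of an iterator that reads a tab-separated SPARQL result one line at a time. `TsvRowIterator` is a hypothetical class written for this illustration and is not part of Streaming SPARQL. It assumes the response is in the SPARQL TSV results format (a header line with the variable names, then one row per line). It is not meant to describe how `StreamingResultSet` is implemented.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical helper (not part of Streaming SPARQL): iterates lazily over TSV result rows. */
public class TsvRowIterator implements Iterator<Map<String, String>>, AutoCloseable {
    private final BufferedReader reader;
    private final String[] variables;
    private String nextLine; // the only buffered row; null once the stream is exhausted

    public TsvRowIterator(Reader source) throws IOException {
        this.reader = new BufferedReader(source);
        String header = reader.readLine();
        if (header == null) {
            throw new IOException("Empty result: missing TSV header line");
        }
        // The header lists the variable names, e.g. "?s\t?p\t?o".
        this.variables = header.split("\t", -1);
        this.nextLine = reader.readLine();
    }

    @Override
    public boolean hasNext() {
        return nextLine != null;
    }

    @Override
    public Map<String, String> next() {
        if (nextLine == null) {
            throw new NoSuchElementException();
        }
        String[] values = nextLine.split("\t", -1);
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < variables.length; i++) {
            row.put(variables[i], i < values.length ? values[i] : "");
        }
        try {
            // Read ahead exactly one line, so memory use does not grow with the result size.
            nextLine = reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return row;
    }

    @Override
    public void close() throws IOException {
        reader.close();
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the HTTP response body of a SPARQL endpoint.
        String tsv = "?s\t?p\n<http://example.org/a>\t<http://example.org/b>\n";
        try (TsvRowIterator rows = new TsvRowIterator(new StringReader(tsv))) {
            while (rows.hasNext()) {
                System.out.println(rows.next());
            }
        }
    }
}
```

Because only one row is buffered at a time, the caller can start processing before the server has finished sending the result.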

## Command line client

Streaming SPARQL also provides a command line client for testing queries.

### Usage

```bash
java -jar ./streaming-client-0.0.7-SNAPSHOT.jar
QueryEntitites [URL] [Query]
URL ... URL to the linked data repository
Query ... The query to perform on the server
```

### Example
```bash
java -jar ./streaming-client-0.0.7-SNAPSHOT.jar http://localhost:8080/rdf4j-sesame/test "SELECT ?s ?p ?o WHERE { ?s ?p ?o. } LIMIT 5"
```
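
For readers curious what a query like the one above looks like at the HTTP level, the following sketch prepares a GET request according to the SPARQL 1.1 Protocol, where the query text is passed URL-encoded in the `query` parameter. `SparqlRequestSketch` and its `prepare` method are hypothetical names used only for illustration, and the `Accept` header value is an assumption. The command line client may build its requests differently.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch (not part of Streaming SPARQL): prepares a SPARQL protocol GET request. */
public class SparqlRequestSketch {

    static HttpURLConnection prepare(String endpoint, String query) throws IOException {
        // SPARQL 1.1 Protocol: the query string travels in the URL-encoded "query" parameter.
        String url = endpoint + "?query=" + URLEncoder.encode(query, StandardCharsets.UTF_8);
        // openConnection() only creates the connection object; no network traffic happens yet.
        HttpURLConnection connection = (HttpURLConnection) URI.create(url).toURL().openConnection();
        connection.setRequestMethod("GET");
        // Assumption: ask for tab-separated results, which can be consumed line by line.
        connection.setRequestProperty("Accept", "text/tab-separated-values");
        return connection;
    }

    public static void main(String[] args) throws IOException {
        HttpURLConnection connection = prepare(
                "http://localhost:8080/rdf4j-sesame/test",
                "SELECT ?s ?p ?o WHERE { ?s ?p ?o. } LIMIT 5");
        System.out.println(connection.getURL());
    }
}
```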

## Background

We have been using Fuseki and RDF4j with very large result sets (more than 100 million tuples), which led to
instabilities in the native client libraries that were extremely difficult to debug.

Example error messages on the server side have been:

```
[2017-05-04 19:50:14] Fuseki WARN [1450] Runtime IO Exception (client left?) RC = 500 : org.eclipse.jetty.io.EofException
org.apache.jena.atlas.RuntimeIOException: org.eclipse.jetty.io.EofException
```

```
[2017-05-04 19:50:14] Fuseki WARN (HttpChannel.java:468) (and one from ServletHandler.java:631):
java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 30001/30000 m
```

These problems triggered the development of Streaming SPARQL, which has proven to be very robust, even for queries that take more than one hour to process and transfer multiple gigabytes of results.
(Note: you will need to call `getResultSet` with a higher timeout to prevent TimeoutExceptions on the server.)

## Compatibility

Streaming SPARQL is known to work with Jena, OpenRDF, RDF4j and Virtuoso.

## Changelog

Please refer to the [releases](https://github.com/weblyzard/streaming-sparql/releases) page.
