https://github.com/weblyzard/streaming-sparql
Cross-server SPARQL query library with support for incremental, streaming result processing.
- Host: GitHub
- URL: https://github.com/weblyzard/streaming-sparql
- Owner: weblyzard
- License: apache-2.0
- Created: 2017-08-26T09:53:49.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2024-05-31T13:37:26.000Z (over 1 year ago)
- Last Synced: 2025-04-04T18:12:45.471Z (8 months ago)
- Language: Java
- Homepage:
- Size: 140 KB
- Stars: 6
- Watchers: 13
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
## Streaming SPARQL
Provides robust, incremental processing of streaming results received from SPARQL servers.
The `StreamingResultSet` iterator yields results as they are received from the server.
## Javadoc
http://javadoc.io/doc/com.weblyzard.sparql/streaming-sparql/
## Example code:
```java
// Query the endpoint and print each row as soon as it is received.
try (StreamingResultSet s = StreamingQueryExecutor.getResultSet("http://dbpedia.org/sparql", "SELECT ?s ?p ?o WHERE { ?s ?p ?o. } LIMIT 5")) {
    while (s.hasNext()) {
        System.out.println("Tuple " + s.getRowNumber() + ": " + s.next());
    }
}
```
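To illustrate what "yielding results as they are received" means in practice, the following is a minimal sketch of an iterator that parses one row at a time from a tab-separated SPARQL result stream. It is not the library's implementation: the class name `TsvRowIterator` is hypothetical, and the sketch assumes the server answers in the SPARQL 1.1 TSV results format (a header line of variables followed by one row per line).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical helper for illustration only; not part of Streaming SPARQL. */
class TsvRowIterator implements Iterator<Map<String, String>> {
    private final BufferedReader reader;
    private final String[] variables;
    private String nextLine; // look-ahead: at most one unparsed row is held here

    TsvRowIterator(Reader source) {
        this.reader = new BufferedReader(source);
        try {
            // The TSV header lists the query variables, e.g. "?s\t?p\t?o".
            String header = reader.readLine();
            this.variables = header == null ? new String[0] : header.split("\t", -1);
            this.nextLine = reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public boolean hasNext() {
        return nextLine != null;
    }

    @Override
    public Map<String, String> next() {
        if (nextLine == null) {
            throw new NoSuchElementException();
        }
        String[] values = nextLine.split("\t", -1);
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < variables.length; i++) {
            // Missing trailing cells are treated as unbound (empty string).
            row.put(variables[i], i < values.length ? values[i] : "");
        }
        try {
            // Advance by a single line; earlier rows are never kept in memory.
            nextLine = reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return row;
    }

    public static void main(String[] args) {
        // A canned response stands in for the HTTP body of a real endpoint.
        String response = "?s\t?p\t?o\n<http://a>\t<http://b>\t\"c\"\n<http://d>\t<http://e>\t\"f\"\n";
        Iterator<Map<String, String>> rows = new TsvRowIterator(new StringReader(response));
        while (rows.hasNext()) {
            System.out.println(rows.next());
        }
    }
}
```

Because the iterator holds only the current line (plus the reader's small internal buffer), memory use stays roughly constant regardless of how many rows the server sends.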
## Command line client
Streaming SPARQL also provides a command line client for testing queries.
### Usage
```bash
java -jar ./streaming-client-0.0.7-SNAPSHOT.jar
QueryEntitites [URL] [Query]
URL ... URL to the linked data repository
Query ... The query to perform on the server
```
### Example
```bash
java -jar ./streaming-client-0.0.7-SNAPSHOT.jar http://localhost:8080/rdf4j-sesame/test "SELECT ?s ?p ?o WHERE { ?s ?p ?o. } LIMIT 5"
```
## Background
We have been using Fuseki and RDF4j with very large result sets (> 100 million tuples), which led to
instabilities with the native libraries that were extremely difficult to debug.
Example error messages on the server side:
```
[2017-05-04 19:50:14] Fuseki WARN [1450] Runtime IO Exception (client left?) RC = 500 : org.eclipse.jetty.io.EofException
org.apache.jena.atlas.RuntimeIOException: org.eclipse.jetty.io.EofException
```
```
[2017-05-04 19:50:14] Fuseki WARN (HttpChannel.java:468) (and one from ServletHandler.java:631):
java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 30001/30000 m
```
These problems triggered the development of Streaming SPARQL, which has proven to be very robust, even for queries that take more than one hour to process and transfer multiple gigabytes of results.
(Note: you will need to call `getResultSet` with a higher timeout to prevent `TimeoutException`s on the server.)
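The exact `getResultSet` overload that accepts a timeout is not reproduced in this README; please check the Javadoc for its signature. As a general illustration of the client-side timeouts involved in a long-running query, here is a sketch that uses only the JDK's `HttpURLConnection`. The class `SparqlHttpTimeouts`, the method `prepare`, and the chosen durations are assumptions for this example, and the server's own idle timeout still has to be configured on the server.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

/** Hypothetical helper for illustration only; not part of Streaming SPARQL. */
class SparqlHttpTimeouts {

    /** Builds (but does not open) a GET request for a SPARQL query with explicit timeouts. */
    static HttpURLConnection prepare(String endpoint, String query,
                                     Duration connectTimeout, Duration readTimeout) {
        try {
            // The SPARQL protocol passes the query text in the "query" parameter.
            String url = endpoint + "?query=" + URLEncoder.encode(query, StandardCharsets.UTF_8);
            HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestProperty("Accept", "text/tab-separated-values");
            // Connect timeout: how long to wait for the TCP connection to be established.
            conn.setConnectTimeout(Math.toIntExact(connectTimeout.toMillis()));
            // Read timeout: the longest allowed gap between chunks of the response,
            // which must cover the time the server needs before it sends the first row.
            conn.setReadTimeout(Math.toIntExact(readTimeout.toMillis()));
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpURLConnection conn = prepare(
                "http://localhost:8080/rdf4j-sesame/test",
                "SELECT ?s ?p ?o WHERE { ?s ?p ?o. } LIMIT 5",
                Duration.ofSeconds(10),
                Duration.ofHours(2));
        System.out.println("read timeout (ms): " + conn.getReadTimeout());
    }
}
```

No network traffic occurs until the response is actually requested (for example via `getInputStream()`), so the timeouts can be inspected and adjusted beforehand.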
## Compatibility
Streaming SPARQL is known to work with Jena, OpenRDF, RDF4j and Virtuoso.
## Changelog
Please refer to the [releases](https://github.com/weblyzard/streaming-sparql/releases) page.