https://github.com/jervenbolleman/sparql-readonly

Test and experimental code, not for any production just for me to learn

Host: GitHub
URL: https://github.com/jervenbolleman/sparql-readonly
Owner: JervenBolleman
License: other
Created: 2017-04-25T08:06:19.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2021-01-12T18:45:29.000Z (almost 4 years ago)
Last Synced: 2024-05-18T07:42:16.430Z (8 months ago)
Language: Java
Size: 110 KB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md

README

The aim of this project is to provide a SPARQL-capable triple store for datasets that are bulk loaded and then never change, with a small number of predicates (<1024), a limited number of graphs (<128), but a large number of triples (>20 billion). In other words, datasets that look like UniProt in 2016.

Literals, IRIs and blank nodes are stored in separate dictionaries, each ideally optimized for its contents. IRIs are split across multiple dictionaries: the first level selects on the namespace (by default everything up to the last '/', unless overridden), and the second holds the rest of the IRI, the local name. We aim to detect when the local name consists of digits, in which case it will be stored in a bitset. Because the store is read-only, the dictionaries will be stored sorted.
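
The IRI handling described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the class and method names are hypothetical, and the rule that a "numeric" local name is 1 to 9 ASCII digits with no leading zero is an assumption made so that each name maps to exactly one bit.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.TreeSet;

// Hypothetical sketch: class and method names are illustrative, not the project's API.
final class IriDictionarySketch {

    // Split an IRI after its last '/' into { namespace, localName }.
    static String[] split(String iri) {
        int cut = iri.lastIndexOf('/') + 1; // 0 if the IRI contains no '/'
        return new String[] { iri.substring(0, cut), iri.substring(cut) };
    }

    // Assumption: "numeric" means 1 to 9 ASCII digits without a leading zero,
    // so the value fits in an int and "007" is not confused with "7".
    static boolean isNumeric(String localName) {
        int n = localName.length();
        if (n == 0 || n > 9) return false;
        if (n > 1 && localName.charAt(0) == '0') return false;
        for (int i = 0; i < n; i++) {
            char c = localName.charAt(i);
            if (c < '0' || c > '9') return false;
        }
        return true;
    }

    // Numeric local names of one namespace, stored as set bits.
    static BitSet numericLocalNames(String[] iris, String namespace) {
        BitSet bits = new BitSet();
        for (String iri : iris) {
            String[] parts = split(iri);
            if (parts[0].equals(namespace) && isNumeric(parts[1])) {
                bits.set(Integer.parseInt(parts[1]));
            }
        }
        return bits;
    }

    // Remaining local names of one namespace as a sorted, duplicate-free array.
    static String[] sortedLocalNames(String[] iris, String namespace) {
        TreeSet<String> names = new TreeSet<>();
        for (String iri : iris) {
            String[] parts = split(iri);
            if (parts[0].equals(namespace) && !isNumeric(parts[1])) {
                names.add(parts[1]);
            }
        }
        return names.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Example IRIs chosen for illustration only.
        String[] iris = {
            "http://purl.uniprot.org/taxonomy/9606",
            "http://purl.uniprot.org/uniprot/P12345",
            "http://purl.uniprot.org/taxonomy/10090",
            "http://purl.uniprot.org/uniprot/A0A000"
        };
        BitSet taxa = numericLocalNames(iris, "http://purl.uniprot.org/taxonomy/");
        String[] proteins = sortedLocalNames(iris, "http://purl.uniprot.org/uniprot/");
        System.out.println("taxonomy contains 9606: " + taxa.get(9606));
        // In a sorted dictionary the id of a value is simply its array index.
        System.out.println("id of P12345: " + Arrays.binarySearch(proteins, "P12345"));
    }
}
```

Keeping the dictionary sorted means a lookup is a binary search and no separate id column is needed, which is only practical because the data never changes after loading.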

During loading we need two passes over the data: the first to create the value dictionaries, the second to build the triple tables. There will be one triple table for each combination of predicate and value type.
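
A rough sketch of the two-pass load, under simplifying assumptions: triples are already parsed into `{subject, predicate, object}` string arrays, a single sorted dictionary stands in for the separate per-type dictionaries, and the value type of an object is guessed from its N-Triples-style prefix. All names here are hypothetical and not taken from the project.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch of the two-pass load; not the project's implementation.
final class TwoPassLoadSketch {

    // Assumption: literals start with '"', blank nodes with "_:", anything else is an IRI.
    static String kindOf(String term) {
        if (term.startsWith("\"")) return "LITERAL";
        if (term.startsWith("_:")) return "BNODE";
        return "IRI";
    }

    // Pass 1: collect every distinct subject and object into one sorted array.
    static String[] buildDictionary(List<String[]> triples) {
        TreeSet<String> values = new TreeSet<>();
        for (String[] t : triples) {
            values.add(t[0]);
            values.add(t[2]);
        }
        return values.toArray(new String[0]);
    }

    // Pass 2: one table of { subjectId, objectId } rows per predicate + object kind.
    static Map<String, List<int[]>> buildTables(List<String[]> triples, String[] dictionary) {
        Map<String, List<int[]>> tables = new TreeMap<>();
        for (String[] t : triples) {
            int subjectId = Arrays.binarySearch(dictionary, t[0]);
            int objectId = Arrays.binarySearch(dictionary, t[2]);
            String tableKey = t[1] + " " + kindOf(t[2]);
            tables.computeIfAbsent(tableKey, k -> new ArrayList<>())
                  .add(new int[] { subjectId, objectId });
        }
        return tables;
    }

    public static void main(String[] args) {
        // Made-up example data.
        List<String[]> triples = Arrays.asList(
            new String[] { "ex:a", "ex:knows", "ex:b" },
            new String[] { "ex:a", "ex:name", "\"Alice\"" },
            new String[] { "ex:b", "ex:knows", "ex:a" });
        String[] dictionary = buildDictionary(triples);
        Map<String, List<int[]>> tables = buildTables(triples, dictionary);
        for (Map.Entry<String, List<int[]>> e : tables.entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue().size() + " row(s)");
        }
    }
}
```

The second pass can only run once the first has finished, because an id is a position in the sorted dictionary and that position is not known until every value has been seen.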