https://github.com/Fedict/dcattools

Various DCAT tools for updating data.gov.be

# Enhancers / SPARQL scripts

## Scripts.txt file

The scraper jar contains one scripts.txt file per scraper package / data source (stored as a resource).
This file contains the list of additional data files to be loaded and/or additional SPARQL scripts to be run.

E.g. scripts.txt in `be.gov.data.scrapers.bipt`:

```
bipt/data-publ-contact.ttl
bipt/sparql-publ-contact.qry
bipt/sparql-license.qry
bipt/sparql-theme.qry
sparql-geo-belgium.qry
data-media.ttl
sparql-map-media.qry
clear-skos.qry
```

Data are loaded and scripts are executed in the order in which they appear in the file.
Files ending in `.ttl` are treated as data files; files ending in `.qry` are SPARQL query files.
Lines starting with a `#` are comments and are ignored.

Files prefixed with `<name>/` (such as `bipt/` in the example above) are specific files located in the scraper package `be.gov.data.scrapers.<name>`;
generic queries / data are part of the `be.gov.data.scrapers` package and are not prefixed.
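
The rules above can be illustrated with a small Python sketch. It is not the project's own (Java) loader: the function name `parse_scripts`, the skipping of blank lines, the error raised for unknown extensions, and the resource path layout (package name with dots replaced by slashes) are all assumptions made for illustration.

```python
from typing import List, Tuple

BASE_PACKAGE = "be.gov.data.scrapers"  # generic package named in the text


def parse_scripts(text: str) -> List[Tuple[str, str]]:
    """Return (kind, resource_path) pairs in file order.

    kind is "data" for .ttl files and "query" for .qry files.
    """
    # Assumption: resources live under the package path inside the jar.
    base_dir = BASE_PACKAGE.replace(".", "/")
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # comments are ignored; skipping blank lines is assumed
        if line.endswith(".ttl"):
            kind = "data"
        elif line.endswith(".qry"):
            kind = "query"
        else:
            # Assumption: the real loader may treat unknown files differently.
            raise ValueError(f"unknown file type: {line}")
        # "bipt/x.ttl" resolves into the bipt sub-package,
        # an unprefixed name resolves into the generic package.
        entries.append((kind, f"{base_dir}/{line}"))
    return entries


example = """\
# contact details
bipt/data-publ-contact.ttl
bipt/sparql-publ-contact.qry
sparql-geo-belgium.qry
"""

for kind, path in parse_scripts(example):
    print(kind, path)
```

Because the entries are returned as an ordered list, a caller can load data and run queries strictly in file order, which matters when a query depends on a data file loaded just before it.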

## Note on mappings

Mapping, for example, free-text keywords to the controlled list of DCAT themes is done by
loading an RDF file with a SKOS mapping (using `skos:altLabel` or `skos:exactMatch`), running a
SPARQL Update query, and removing the SKOS triples afterwards.

DCAT uses several pre-defined controlled vocabularies (e.g. file types, geo...),
but these vocabularies are not directly supported by Drupal / the data.gov.be tools.

Therefore, one has to manually create the taxonomies in Drupal, and map them to
the URIs of the controlled DCAT vocabularies.

For example:

    @prefix skos: <http://www.w3.org/2004/02/skos/core#> .

    <http://data.gov.be/en/taxonomy/term/32>
        skos:exactMatch <...> ;
        skos:prefLabel "AGRICULTURE, FISHERIES, FORESTRY, FOOD"@en ;
        skos:altLabel "arbre"@fr, "paysage"@fr .

This indicates that the harvested metadata uses terms like "arbre" and "paysage",
which both correspond to the http://data.gov.be/en/taxonomy/term/32
term on data.gov.be, and also (via skos:exactMatch) to the Agriculture theme
on the EU Data portal.

The same goes for the taxonomy of organizations, geo coverage and file types.
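
The load / update / clean-up sequence described in this note can be sketched in plain Python on a set of triples. This is only an illustration of the idea, not the project's SPARQL scripts: the function name `map_keywords_to_themes`, the example dataset URI, and the use of `dcat:keyword` and `dcat:theme` as source and target properties are assumptions, and language tags on literals are left out for brevity.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

SKOS = "http://www.w3.org/2004/02/skos/core#"
# Assumption: keywords and themes use these DCAT properties.
DCAT_KEYWORD = "http://www.w3.org/ns/dcat#keyword"
DCAT_THEME = "http://www.w3.org/ns/dcat#theme"


def map_keywords_to_themes(data: Set[Triple], mapping: Set[Triple]) -> Set[Triple]:
    # Step 1: load the SKOS mapping next to the harvested data.
    store = data | mapping

    # Step 2: the "update" step. Add a theme for every keyword that
    # matches an altLabel of a taxonomy term.
    label_to_term = {o: s for s, p, o in store if p == SKOS + "altLabel"}
    themes = {
        (s, DCAT_THEME, label_to_term[o])
        for s, p, o in store
        if p == DCAT_KEYWORD and o in label_to_term
    }
    store |= themes

    # Step 3: remove the SKOS triples again so they are not published.
    return {t for t in store if not t[1].startswith(SKOS)}


TERM = "http://data.gov.be/en/taxonomy/term/32"
mapping = {
    (TERM, SKOS + "altLabel", "arbre"),
    (TERM, SKOS + "altLabel", "paysage"),
}
# Hypothetical harvested dataset carrying a free-text keyword.
data = {("http://example.org/dataset/1", DCAT_KEYWORD, "arbre")}

result = map_keywords_to_themes(data, mapping)
for triple in sorted(result):
    print(triple)
```

In the real tools the second step would be a SPARQL Update query from one of the `.qry` files, and the final step corresponds to a clean-up script such as `clear-skos.qry` in the scripts.txt example.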