https://github.com/pudo/graphkit
Process data based on JSON schema
- Host: GitHub
- URL: https://github.com/pudo/graphkit
- Owner: pudo
- License: mit
- Archived: true
- Created: 2015-08-06T14:26:43.000Z (about 9 years ago)
- Default Branch: master
- Last Pushed: 2015-09-20T14:32:10.000Z (about 9 years ago)
- Last Synced: 2024-07-27T00:45:04.786Z (2 months ago)
- Language: Python
- Size: 295 KB
- Stars: 12
- Watchers: 5
- Forks: 2
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# graphkit [![Build Status](https://travis-ci.org/pudo/graphkit.svg?branch=master)](https://travis-ci.org/pudo/graphkit)
GraphKit is a pipeline processing tool for graph-based data extraction,
transformation and analysis. The tool's graph model is based on annotated
[JSON schema](http://json-schema.org/) definitions.

A typical pipeline might extract data from a set of CSV files or database
tables, translate them to JSON using a given schema, combine them into an
RDF graph, perform de-duplication and data integration, and eventually run
a set of queries on the resulting graph.

## Stages

The following stages / operations should be supported in the graph processing
pipeline:

* ``csv:read``: Generate an iterator from a CSV file.
* ``readtable``: Generate an iterator from a SQL database table.
* ``json:map``: Apply a JSON schema mapping to the data coming from a source.
* ``rdf:load``: Import the data from a JSON stream into a triple store.
* ``rdf:dedupe``: Apply sameAs mappings based on some external mapping file.
* ``rdf:sparql``: Run a SPARQL query.
* ``mql:query``: Run an MQL query.
* ``rdf:dump``: Export RDF data to a file.
* ``json:unmap``: Apply a JSON schema mapping to convert objects to a flat table.
* ``csv:write``: Export data to a CSV file.

To link flat data structures to nested object graphs matching JSON schema
definitions, ``jsonmapping`` is used.

## Tests

The test suite normally runs in its own ``virtualenv`` and performs a
coverage check in addition to the tests. To run it on a system with
``virtualenv`` and ``make`` installed, type:

```bash
$ make test
```
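
## Example sketch

To make the stage list above more concrete, here is a minimal sketch of how the ``csv:read``, ``json:map``, ``json:unmap`` and ``csv:write`` stages could be chained as plain Python generators. It uses only the standard library and is not GraphKit's actual API: the function names, the ``MAPPING`` layout and the sample data are all assumptions made for illustration, and the real ``jsonmapping`` format may look quite different.

```python
import csv
import io


def csv_read(fh):
    """Yield each CSV row as a dict (hypothetical analogue of csv:read)."""
    yield from csv.DictReader(fh)


def json_map(rows, mapping):
    """Nest flat rows into objects (hypothetical analogue of json:map).

    `mapping` is an assumed layout: {object_name: {property: column}}.
    """
    for row in rows:
        yield {
            name: {prop: row[col] for prop, col in props.items()}
            for name, props in mapping.items()
        }


def json_unmap(objects, mapping):
    """Flatten nested objects back into rows (analogue of json:unmap)."""
    for obj in objects:
        yield {
            col: obj[name][prop]
            for name, props in mapping.items()
            for prop, col in props.items()
        }


def csv_write(rows, fh, fieldnames):
    """Write rows to a CSV file handle (analogue of csv:write)."""
    writer = csv.DictWriter(fh, fieldnames=fieldnames)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Assumed sample mapping and data, for illustration only.
MAPPING = {
    "person": {"name": "person_name"},
    "company": {"name": "company_name"},
}
SOURCE = "person_name,company_name\nAlice,Acme\nBob,Initech\n"

objects = list(json_map(csv_read(io.StringIO(SOURCE)), MAPPING))
out = io.StringIO()
csv_write(json_unmap(objects, MAPPING), out, ["person_name", "company_name"])
```

Because every stage consumes and produces an iterator, stages can be composed in any order that makes sense, and rows are processed one at a time rather than loaded into memory. The RDF and query stages are left out here, since they would need a triple store.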