https://github.com/datafabricrus/rya-beam-pipelines
Apache Beam Pipelines for Apache Rya
- Host: GitHub
- URL: https://github.com/datafabricrus/rya-beam-pipelines
- Owner: DataFabricRus
- License: mit
- Created: 2018-08-07T09:53:26.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2019-01-11T16:33:42.000Z (over 7 years ago)
- Last Synced: 2025-02-17T15:15:38.956Z (over 1 year ago)
- Topics: apache-beam, apache-rya, google-dataflow
- Language: Java
- Homepage:
- Size: 67.4 KB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Apache Beam Pipelines for Apache Rya
Pipelines:
* `bulkload` - loads RDF data into the triplestore.
* `statistics` - reads triples from the SPO index, computes statistics about them (the [Prospects Table](https://github.com/apache/incubator-rya/blob/master/extras/rya.manual/src/site/markdown/eval.md)) and writes the results to a separate index.
* `elasticsearch` - reads triples from the SPO index, builds a full-text index and writes it to Elasticsearch.
> At the moment, only [DataFabric's fork](http://github.com/DataFabricRus/incubator-rya) of Apache Rya is supported.
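
To give a rough idea of what the `statistics` pipeline produces, the following is a minimal plain-Java sketch of per-term occurrence counts of the kind a Prospects-style table holds. It is not taken from this repository and does not use Apache Beam or Rya: the `ProspectsSketch` class, the `Triple` type, the `countTerms` method and the `position:value` key format are all hypothetical names chosen for illustration, and the real pipeline may compute more or different statistics.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ProspectsSketch {

    /** A minimal RDF statement. Hypothetical type, not Rya's own statement class. */
    public static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        public Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /**
     * Counts how often each value occurs as a subject, as a predicate and as an object.
     * Keys are prefixed with the position in which the value was seen (assumed format).
     */
    public static Map<String, Long> countTerms(List<Triple> triples) {
        Map<String, Long> counts = new TreeMap<>();
        for (Triple t : triples) {
            counts.merge("subject:" + t.subject, 1L, Long::sum);
            counts.merge("predicate:" + t.predicate, 1L, Long::sum);
            counts.merge("object:" + t.object, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        List<Triple> triples = Arrays.asList(
            new Triple("ex:alice", "ex:knows", "ex:bob"),
            new Triple("ex:alice", "ex:knows", "ex:carol"),
            new Triple("ex:bob", "ex:name", "\"Bob\""));

        // Print one line per counted term.
        countTerms(triples).forEach((term, n) -> System.out.println(term + " -> " + n));
    }
}
```

In a distributed pipeline the same idea would typically be expressed as a keyed count over the triples read from the SPO index; the loop above only shows the shape of the result.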
## Supported runners
The current implementations have been tested only with [Google Dataflow](https://cloud.google.com/dataflow/).