{"id":25536797,"url":"https://github.com/balhoff/blazegraph-runner","last_synced_at":"2025-09-10T08:41:07.265Z","repository":{"id":23039502,"uuid":"97999461","full_name":"balhoff/blazegraph-runner","owner":"balhoff","description":"Simple CLI for Blazegraph","archived":false,"fork":false,"pushed_at":"2024-08-05T16:00:05.000Z","size":105,"stargazers_count":9,"open_issues_count":18,"forks_count":5,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-03-25T11:01:35.800Z","etag":null,"topics":["blazegraph","monarchinitiative","rdf"],"latest_commit_sha":null,"homepage":null,"language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/balhoff.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2017-07-22T01:48:03.000Z","updated_at":"2023-05-01T15:29:44.000Z","dependencies_parsed_at":"2023-10-20T18:16:36.419Z","dependency_job_id":null,"html_url":"https://github.com/balhoff/blazegraph-runner","commit_stats":null,"previous_names":[],"tags_count":18,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/balhoff%2Fblazegraph-runner","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/balhoff%2Fblazegraph-runner/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/balhoff%2Fblazegraph-runner/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/balhoff%2Fblazegraph-runner/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/balhoff","download_url":"https://codeload.github.com/balhoff/blazegraph-runner/tar.gz/refs/heads/m
aster","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248424450,"owners_count":21101178,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["blazegraph","monarchinitiative","rdf"],"created_at":"2025-02-20T04:37:48.302Z","updated_at":"2025-04-11T14:50:59.501Z","avatar_url":"https://github.com/balhoff.png","language":"Scala","funding_links":[],"categories":[],"sub_categories":[],"readme":"# blazegraph-runner\n`blazegraph-runner` provides a simple command-line wrapper for the [Blazegraph](https://www.blazegraph.com) open source \nRDF database. It provides operations on an \"offline\" database, so that you can easily load data or execute queries against \na Blazegraph journal file, without needing to run it as an HTTP SPARQL server.\n\n## Usage\n\n```\nUsage\n\n blazegraph-runner [options] command [command options]\n\nOptions\n\n   --informat   : Input format\n   --journal    : Blazegraph journal file\n   --outformat  : Output format\n   --properties : Blazegraph properties file\n\nCommands\n\n   construct \u003cquery\u003e \u003coutput\u003e : SPARQL construct\n\n   dump [command options] \u003coutput\u003e : Dump Blazegraph database to an RDF file\n      --graph : Named graph to load triples into\n\n   load [command options] \u003cdata-files\u003e ... 
: Load triples\n      --base=STRING\n      --graph              : Named graph to load triples into\n      --use-ontology-graph\n\n   reason [command options] : Materialize inferences\n      --append-graph-name=STRING : If a target-graph is not provided, append this text to the end of source graph name to use as target graph for inferred statements.\n      --merge-sources            : Merge all selected source graphs into one set of statements before reasoning. Inferred statements will be stored in provided `target-graph`, or else in the default graph. If `merge-sources` is false (default), source graphs will be reasoned separately and in parallel.\n      --ontology                 : Ontology to use as rule source. If the passed value is a valid filename, the ontology will be read from the file. Otherwise, if the value is an ontology IRI, it will be loaded from the database if such a graph exists, or else, from the web.\n      --parallelism=NUM          : Maximum graphs to simultaneously either read from database or run reasoning on.\n      --reasoner=STRING          : Reasoner choice: 'arachne' (default) or 'whelk'\n      --rules-file               : Reasoning rules in Jena syntax.\n      --source-graphs            : Space-separated graph IRIs on which to perform reasoning (must be passed as one shell argument).\n      --source-graphs-query      : File name or query text of SPARQL select used to obtain graph names on which to perform reasoning. The query must return a column named `source_graph`.\n      --target-graph             : Named graph to store inferred statements.\n\n   select \u003cquery\u003e \u003coutput\u003e : SPARQL select\n\n   update \u003cupdate\u003e : SPARQL update\n```\n\n## Commands\n\nThere are a number of general options that apply to all commands:\n\n- **journal**: Blazegraph journal file\n- **properties**: Blazegraph properties file. 
If this is not set, a [default properties file](https://github.com/balhoff/blazegraph-runner/blob/master/src/main/resources/org/renci/blazegraph/blazegraph.properties) is used, which enables named graphs and text indexing.\n- **informat**: Input format. Valid values for this option depend on the command.\n- **outformat**: Output format. Valid values for this option depend on the command.\n\n\n### Load\n\nLoad RDF data from files into a Blazegraph journal. A list of files or folders can be passed to the command; folders will \nbe recursively searched for data files.\n\n```\nblazegraph-runner load --journal=blazegraph.jnl --graph=\"http://example.org/mydata\" --informat=rdfxml mydata1.rdf mydata2.rdf\n```\n\nWhen loading multiple files (or a folder with files), by default each file is loaded under its own graph, currently named using the file's path. (This can be exploited in a SPARQL query, for example to distinguish between triples coming from different files.)\n\nIf your data files are OWL ontologies, `blazegraph-runner` can efficiently search within each file to find the ontology IRI \nif you want to use it as the target named graph:\n\n```\nblazegraph-runner load --journal=blazegraph.jnl --use-ontology-graph=true --informat=rdfxml go.owl\n```\n\nIf you set `--use-ontology-graph=true` and also provide a value for `--graph`, the `--graph` value will be used as a fallback \nin the case that an ontology IRI is not found.\n\n### Dump\n\nExport RDF data from a Blazegraph journal to a file. If a value for `--graph` is provided, only data from that graph is \nexported. If `--graph` is not provided, data from the default graph will be exported. *In the future this command should \nbe extended to dump all graphs to separate files or dump all data to a quad format.*\n\n```\nblazegraph-runner dump --journal=blazegraph.jnl --graph=\"http://example.org/mydata\" --outformat=turtle mydata.ttl\n```\n\n### Select\n\nQuery a Blazegraph journal using SPARQL SELECT. 
Results can be output as TSV, XML, or JSON.\n\n```\nblazegraph-runner select --journal=blazegraph.jnl --outformat=tsv myquery.rq mydata.tsv\n```\n\n### Construct\n\nQuery a Blazegraph journal using SPARQL CONSTRUCT. Results can be output as Turtle, RDF/XML, or N-Triples.\n\n```\nblazegraph-runner construct --journal=blazegraph.jnl --outformat=turtle myquery.rq mydata.ttl\n```\n\n### Update\n\nApply a SPARQL UPDATE to modify data in a Blazegraph journal.\n\n```\nblazegraph-runner update --journal=blazegraph.jnl myupdate.rq\n```\n\n### Reason\n\nMaterialize inferences derived from data in a Blazegraph journal, and store the inferred triples back to the journal. Reasoning rules are applied in memory, by default using the [Arachne](https://github.com/balhoff/arachne) reasoner; the `--reasoner` option listed under Usage also accepts `whelk`. This command has a number of different options:\n\n- **rules-file**: a file of reasoning rules in [Jena rules format](https://jena.apache.org/documentation/inference/index.html) (not all Jena rule constructs are supported by Arachne).\n- **ontology**: an OWL ontology to convert to reasoning rules. If the passed value is a valid filename, the ontology will be read from the file. Otherwise, if the value is an IRI, it will be loaded from the Blazegraph journal if such a graph exists, or else `blazegraph-runner` will attempt to download it from the web.\n- **target-graph**: the graph IRI in which to store inferred triples.\n- **append-graph-name**: if `target-graph` is not provided, text provided with this option will be appended to the graph name of a given source graph to create a target graph IRI in which to store inferred triples.\n- **source-graphs**: space-separated list of graph IRIs on which to perform reasoning (must be passed as one shell argument).\n- **source-graphs-query**: file name or query text of SPARQL SELECT query used to obtain graph IRIs on which to perform reasoning. 
The query must return a column named `source_graph`.\n- **merge-sources**: whether to merge all selected source graphs into one set of statements before reasoning. Inferred statements will be stored in provided `target-graph`, or else in the default graph. If `merge-sources` is false (default), source graphs will be reasoned separately and in parallel, with results stored either together in `target-graph` or separately using `append-graph-name`.\n- **parallelism**: set the number of concurrent workers to use for reasoning on a set of graphs. Arachne is single-threaded, but if reasoning is applied independently to a set of graphs, this can occur in parallel.\n\nThis command line will select all named graphs from the database, materialize inferences for each one separately (up to 8 simultaneously), using rules derived from the RO ontology, and store the inferred triples in separate graphs corresponding to each source graph:\n```\nblazegraph-runner reason --journal=blazegraph.jnl --ontology=\"http://purl.obolibrary.org/obo/ro.owl\" --source-graphs-query=graphs.rq --append-graph-name=\"#inferred\" --merge-sources=false --parallelism=8\n```\n\n`graphs.rq` could look like this:\n\n```sparql\nSELECT DISTINCT ?source_graph\nWHERE {\n  GRAPH ?source_graph { ?s ?p ?o . }\n}\n```\n\n## Building\n\nIf you clone the `blazegraph-runner` repository and want to build locally, you will need to have [SBT](https://www.scala-sbt.org) installed.\n\n#### Package a local version to run from the repo:\n\n```\nsbt stage\n./target/universal/stage/bin/blazegraph-runner \u003coptions\u003e\n```\n\n#### Zip up a distribution\n\n```\nsbt universal:packageZipTarball\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbalhoff%2Fblazegraph-runner","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbalhoff%2Fblazegraph-runner","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbalhoff%2Fblazegraph-runner/lists"}