{"id":20950419,"url":"https://github.com/atomgraph/json2rdf","last_synced_at":"2025-05-14T03:32:37.887Z","repository":{"id":44940322,"uuid":"202337439","full_name":"AtomGraph/JSON2RDF","owner":"AtomGraph","description":"Streaming generic JSON to RDF converter","archived":false,"fork":false,"pushed_at":"2023-08-31T19:06:57.000Z","size":47,"stargazers_count":79,"open_issues_count":4,"forks_count":12,"subscribers_count":9,"default_branch":"master","last_synced_at":"2024-04-16T18:33:32.062Z","etag":null,"topics":["docker-image","json","json-converter","json-ld","json2rdf","knowledge-graph","linked-data","rdf","semantic-web","sparql","streaming","transformer"],"latest_commit_sha":null,"homepage":"https://hub.docker.com/r/atomgraph/json2rdf","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AtomGraph.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-08-14T11:32:41.000Z","updated_at":"2024-04-04T21:19:52.000Z","dependencies_parsed_at":"2022-09-17T03:52:05.997Z","dependency_job_id":null,"html_url":"https://github.com/AtomGraph/JSON2RDF","commit_stats":null,"previous_names":[],"tags_count":8,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AtomGraph%2FJSON2RDF","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AtomGraph%2FJSON2RDF/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AtomGraph%2FJSON2RDF/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AtomGraph%2FJSON2RDF/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/A
tomGraph","download_url":"https://codeload.github.com/AtomGraph/JSON2RDF/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225275712,"owners_count":17448387,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docker-image","json","json-converter","json-ld","json2rdf","knowledge-graph","linked-data","rdf","semantic-web","sparql","streaming","transformer"],"created_at":"2024-11-19T00:48:26.520Z","updated_at":"2024-11-19T00:48:27.126Z","avatar_url":"https://github.com/AtomGraph.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# JSON2RDF\nStreaming generic JSON to RDF converter\n\nReads JSON data and streams N-Triples output. 
The conversion algorithm is similar to that of [JSON-LD](https://www.w3.org/TR/json-ld11-api/) but accepts arbitrary JSON and does not require a `@context`.\n\nThe resulting RDF representation is lossless with the exception of array ordering and some [datatype round-tripping](https://www.w3.org/TR/json-ld11-api/#data-round-tripping).\nThe lost ordering should not be a problem in the majority of cases, as RDF applications tend to impose their own value-based ordering using SPARQL `ORDER BY`.\n\nA common use case is feeding the JSON2RDF output into a triplestore or SPARQL processor and using a SPARQL `CONSTRUCT` query to map the generic RDF to more specific RDF that uses terms from some vocabulary.\nSPARQL is an inherently more flexible RDF mapping mechanism than JSON-LD `@context`.\n\n## Build\n\n    mvn clean install\n\nThat should produce an executable JAR file `target/json2rdf-jar-with-dependencies.jar` in which dependency libraries will be included.\n\n## Maven\n\nEach version is released to the Maven central repository as [`com.atomgraph.etl.json/json2rdf`](https://central.sonatype.com/artifact/com.atomgraph.etl.json/json2rdf)\n\n## Usage\n\nThe JSON data is read from `stdin`, the resulting RDF data is written to `stdout`.\n\nJSON2RDF is available as a `.jar` as well as a Docker image [atomgraph/json2rdf](https://hub.docker.com/r/atomgraph/json2rdf) (recommended).\n\nParameters:\n* `base` - the base URI for the data. 
Property namespace is constructed by adding `#` to the base URI.\n\nOptions:\n* `--input-charset` - JSON input encoding, by default UTF-8\n* `--output-charset` - RDF output encoding, by default UTF-8\n\n## Examples\n\nJSON2RDF output is streaming and produces N-Triples, therefore we pipe it through [`riot`](https://jena.apache.org/documentation/io/) to get a more readable Turtle output.\n\n***\n\nBob DuCharme's blog post on using JSON2RDF: [Converting JSON to RDF](http://www.bobdc.com/blog/json2rdf/).\n\n***\n\nJSON data in [`ordinary-json-document.json`](https://www.w3.org/TR/json-ld11/#interpreting-json-as-json-ld)\n```json\n{\n  \"name\": \"Markus Lanthaler\",\n  \"homepage\": \"http://www.markus-lanthaler.com/\",\n  \"image\": \"http://twitter.com/account/profile_image/markuslanthaler\"\n}\n```\n\nJava execution from shell:\n\n```bash\ncat ordinary-json-document.json | java -jar json2rdf-jar-with-dependencies.jar https://localhost/ | riot --formatted=TURTLE\n```\n\nAlternatively, Docker execution from shell:\n```bash\ncat ordinary-json-document.json | docker run --rm -i -a stdin -a stdout -a stderr atomgraph/json2rdf https://localhost/ | riot --formatted=TURTLE\n```\n\nNote that using Docker you need to [bind](https://docs.docker.com/engine/reference/commandline/run/#attach-to-stdinstdoutstderr--a) `stdin`/`stdout`/`stderr` streams.\n\nTurtle output\n\n```turtle\n[ \u003chttps://localhost/#homepage\u003e  \"http://www.markus-lanthaler.com/\" ;\n  \u003chttps://localhost/#image\u003e     \"http://twitter.com/account/profile_image/markuslanthaler\" ;\n  \u003chttps://localhost/#name\u003e      \"Markus Lanthaler\"\n] .\n```\n\nThe following SPARQL query can be used to map this generic RDF to the desired target RDF, e.g. 
a structure that uses [schema.org](https://schema.org) vocabulary.\n\n```sparql\nBASE \u003chttps://localhost/\u003e\nPREFIX : \u003c#\u003e\nPREFIX schema: \u003chttp://schema.org/\u003e\n\nCONSTRUCT\n{\n  ?person schema:homepage ?homepage ;\n    schema:image ?image ;\n    schema:name ?name .\n}\n{\n  ?person :homepage ?homepageStr ;\n    :image ?imageStr ;\n    :name ?name .\n  BIND (URI(?homepageStr) AS ?homepage)\n  BIND (URI(?imageStr) AS ?image)\n}\n```\n\nTurtle output after the mapping\n\n```turtle\n[ \u003chttp://schema.org/homepage\u003e  \u003chttp://www.markus-lanthaler.com/\u003e ;\n  \u003chttp://schema.org/image\u003e     \u003chttp://twitter.com/account/profile_image/markuslanthaler\u003e ;\n  \u003chttp://schema.org/name\u003e      \"Markus Lanthaler\"\n] .\n```\n\n***\n\nJSON data in [`city-distances.json`](https://www.w3.org/TR/xslt-30/#json-to-xml-mapping)\n\n```json\n{\n  \"desc\"    : \"Distances between several cities, in kilometers.\",\n  \"updated\" : \"2014-02-04T18:50:45\",\n  \"uptodate\": true,\n  \"author\"  : null,\n  \"cities\"  : {\n    \"Brussels\": [\n      {\"to\": \"London\",    \"distance\": 322},\n      {\"to\": \"Paris\",     \"distance\": 265},\n      {\"to\": \"Amsterdam\", \"distance\": 173}\n    ],\n    \"London\": [\n      {\"to\": \"Brussels\",  \"distance\": 322},\n      {\"to\": \"Paris\",     \"distance\": 344},\n      {\"to\": \"Amsterdam\", \"distance\": 358}\n    ],\n    \"Paris\": [\n      {\"to\": \"Brussels\",  \"distance\": 265},\n      {\"to\": \"London\",    \"distance\": 344},\n      {\"to\": \"Amsterdam\", \"distance\": 431}\n    ],\n    \"Amsterdam\": [\n      {\"to\": \"Brussels\",  \"distance\": 173},\n      {\"to\": \"London\",    \"distance\": 358},\n      {\"to\": \"Paris\",     \"distance\": 431}\n    ]\n  }\n}\n```\n\nJava execution from shell:\n```bash\ncat city-distances.json | java -jar json2rdf-jar-with-dependencies.jar https://localhost/ | riot --formatted=TURTLE\n```\n\nAlternatively, Docker 
execution from shell:\n```bash\ncat city-distances.json | docker run --rm -i -a stdin -a stdout -a stderr atomgraph/json2rdf https://localhost/ | riot --formatted=TURTLE\n```\n\nTurtle output\n\n```turtle\n[ \u003chttps://localhost/#cities\u003e    [ \u003chttps://localhost/#Amsterdam\u003e  [ \u003chttps://localhost/#distance\u003e  \"431\"^^\u003chttp://www.w3.org/2001/XMLSchema#int\u003e ;\n                                                                     \u003chttps://localhost/#to\u003e        \"Paris\"\n                                                                   ] ;\n                                   \u003chttps://localhost/#Amsterdam\u003e  [ \u003chttps://localhost/#distance\u003e  \"358\"^^\u003chttp://www.w3.org/2001/XMLSchema#int\u003e ;\n                                                                     \u003chttps://localhost/#to\u003e        \"London\"\n                                                                   ] ;\n                                   \u003chttps://localhost/#Amsterdam\u003e  [ \u003chttps://localhost/#distance\u003e  \"173\"^^\u003chttp://www.w3.org/2001/XMLSchema#int\u003e ;\n                                                                     \u003chttps://localhost/#to\u003e        \"Brussels\"\n                                                                   ] ;\n                                   \u003chttps://localhost/#Brussels\u003e   [ \u003chttps://localhost/#distance\u003e  \"322\"^^\u003chttp://www.w3.org/2001/XMLSchema#int\u003e ;\n                                                                     \u003chttps://localhost/#to\u003e        \"London\"\n                                                                   ] ;\n                                   \u003chttps://localhost/#Brussels\u003e   [ \u003chttps://localhost/#distance\u003e  \"265\"^^\u003chttp://www.w3.org/2001/XMLSchema#int\u003e ;\n                                                                     
\u003chttps://localhost/#to\u003e        \"Paris\"\n                                                                   ] ;\n                                   \u003chttps://localhost/#Brussels\u003e   [ \u003chttps://localhost/#distance\u003e  \"173\"^^\u003chttp://www.w3.org/2001/XMLSchema#int\u003e ;\n                                                                     \u003chttps://localhost/#to\u003e        \"Amsterdam\"\n                                                                   ] ;\n                                   \u003chttps://localhost/#London\u003e     [ \u003chttps://localhost/#distance\u003e  \"358\"^^\u003chttp://www.w3.org/2001/XMLSchema#int\u003e ;\n                                                                     \u003chttps://localhost/#to\u003e        \"Amsterdam\"\n                                                                   ] ;\n                                   \u003chttps://localhost/#London\u003e     [ \u003chttps://localhost/#distance\u003e  \"322\"^^\u003chttp://www.w3.org/2001/XMLSchema#int\u003e ;\n                                                                     \u003chttps://localhost/#to\u003e        \"Brussels\"\n                                                                   ] ;\n                                   \u003chttps://localhost/#London\u003e     [ \u003chttps://localhost/#distance\u003e  \"344\"^^\u003chttp://www.w3.org/2001/XMLSchema#int\u003e ;\n                                                                     \u003chttps://localhost/#to\u003e        \"Paris\"\n                                                                   ] ;\n                                   \u003chttps://localhost/#Paris\u003e      [ \u003chttps://localhost/#distance\u003e  \"431\"^^\u003chttp://www.w3.org/2001/XMLSchema#int\u003e ;\n                                                                     \u003chttps://localhost/#to\u003e        \"Amsterdam\"\n                                                               
    ] ;\n                                   \u003chttps://localhost/#Paris\u003e      [ \u003chttps://localhost/#distance\u003e  \"344\"^^\u003chttp://www.w3.org/2001/XMLSchema#int\u003e ;\n                                                                     \u003chttps://localhost/#to\u003e        \"London\"\n                                                                   ] ;\n                                   \u003chttps://localhost/#Paris\u003e      [ \u003chttps://localhost/#distance\u003e  \"265\"^^\u003chttp://www.w3.org/2001/XMLSchema#int\u003e ;\n                                                                     \u003chttps://localhost/#to\u003e        \"Brussels\"\n                                                                   ]\n                                 ] ;\n  \u003chttps://localhost/#desc\u003e      \"Distances between several cities, in kilometers.\" ;\n  \u003chttps://localhost/#updated\u003e   \"2014-02-04T18:50:45\" ;\n  \u003chttps://localhost/#uptodate\u003e  true\n] .\n```\n### Mapping Twitter export to RDF\n\nYou can [download your Twitter data](https://twitter.com/settings/download_your_data) which includes tweets in `tweets.js`. 
Remove the `window.YTD.tweets.part0 = ` string and save the rest as `tweets.json`.\n\nTo get the RDF output, save the following query as `tweets.rq`\n\n```sparql\nBASE            \u003chttps://twitter.com/\u003e\nPREFIX :        \u003c#\u003e\nPREFIX xsd:     \u003chttp://www.w3.org/2001/XMLSchema#\u003e\nPREFIX sioc:    \u003chttp://rdfs.org/sioc/ns#\u003e\nPREFIX dct:     \u003chttp://purl.org/dc/terms/\u003e\n\nCONSTRUCT\n{\n    ?tweet a sioc:Post ;\n        sioc:id ?id ;\n        dct:created ?created ;\n        sioc:content ?content ;\n        sioc:reply_of ?reply_of .\n}\n{\n    ?tweet_obj :id ?id ;\n        :created_at ?created_at_string ;\n        :full_text ?content .\n    OPTIONAL\n    {\n        ?tweet_obj :in_reply_to_status_id ?in_reply_to_status_id ;\n            :in_reply_to_screen_name ?in_reply_to_screen_name .\n        BIND(URI(CONCAT(?in_reply_to_screen_name, \"/status/\", ?in_reply_to_status_id)) AS ?reply_of)\n    }\n\n    BIND(\"atomgraphhq\" AS ?username)\n    BIND(URI(CONCAT(?username, \"/status/\", ?id)) AS ?tweet)\n    BIND(SUBSTR(?created_at_string, 27, 4) AS ?year_string)\n    BIND(SUBSTR(?created_at_string, 5, 3) AS ?month_string)\n    BIND(SUBSTR(?created_at_string, 9, 2) AS ?day_string)\n    VALUES (?month_string ?month_number_string)\n    {\n         (\"Jan\"    \"01\")\n         (\"Feb\"    \"02\")\n         (\"Mar\"    \"03\")\n         (\"Apr\"    \"04\")\n         (\"May\"    \"05\")\n         (\"Jun\"    \"06\")\n         (\"Jul\"    \"07\")\n         (\"Aug\"    \"08\")\n         (\"Sep\"    \"09\")\n         (\"Oct\"    \"10\")\n         (\"Nov\"    \"11\")\n         (\"Dec\"    \"12\")\n    }\n    BIND(SUBSTR(?created_at_string, 12, 8) AS ?time)\n    BIND(SUBSTR(?created_at_string, 21, 3) AS ?tz_hours)\n    BIND(SUBSTR(?created_at_string, 24, 2) AS ?tz_minutes)\n    BIND(STRDT(CONCAT(?year_string, \"-\", ?month_number_string, \"-\", ?day_string, \"T\", ?time, ?tz_hours, \":\", ?tz_minutes), xsd:dateTime) AS 
?created)\n}\n```\nSet your Twitter handle in the query as the value bound to `?username`, and then run this command:\n```bash\ncat tweets.json | docker run --rm -i -a stdin -a stdout -a stderr atomgraph/json2rdf https://twitter.com/ \u003e tweets.nt \u0026\u0026 \\\n    sparql --data tweets.nt --query tweets.rq \u003e tweets.ttl\n```\nOutput sample:\n```turtle\n\u003chttps://twitter.com/atomgraphhq/status/1535239790693699587\u003e\n        a              sioc:Post ;\n        dct:created    \"2022-06-10T12:37:44+00:00\"^^xsd:dateTime ;\n        sioc:content   \"Follow it on GitHub!\\nhttps://t.co/pu5KkOoIOX\" ;\n        sioc:id        \"1535239790693699587\" ;\n        sioc:reply_of  \u003chttps://twitter.com/atomgraphhq/status/1535211486582382593\u003e .\n```\nImprovements to the mapping query are welcome.\n\n## Performance\n\nLargest dataset tested so far: 2.95 GB / 30459482 lines of JSON to 4.5 GB / 21964039 triples in 2m10s.\nHardware: x64 Windows 10 PC with Intel Core i5-7200U 2.5 GHz CPU and 16 GB RAM.\n\n## Dependencies\n\n* [javax.json](https://mvnrepository.com/artifact/org.glassfish/javax.json)\n* [Apache Jena](https://jena.apache.org/)\n* [picocli](https://picocli.info)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fatomgraph%2Fjson2rdf","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fatomgraph%2Fjson2rdf","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fatomgraph%2Fjson2rdf/lists"}