https://github.com/rubensworks/rdf-parse.js

Parses RDF from any serialization

# RDF Parse

[![Build status](https://github.com/rubensworks/rdf-parse.js/workflows/CI/badge.svg)](https://github.com/rubensworks/rdf-parse.js/actions?query=workflow%3ACI)
[![Coverage Status](https://coveralls.io/repos/github/rubensworks/rdf-parse.js/badge.svg?branch=master)](https://coveralls.io/github/rubensworks/rdf-parse.js?branch=master)
[![npm version](https://badge.fury.io/js/rdf-parse.svg)](https://www.npmjs.com/package/rdf-parse)

This library parses _RDF streams_ based on _content type_ (or file name)
and outputs [RDF/JS](http://rdf.js.org/)-compliant quads as a stream.

This is useful in situations where you have _RDF in some serialization_,
and you just need the _parsed triples/quads_,
without having to concern yourself with picking the correct parser.

The following RDF serializations are supported:

| **Name** | **Content type** | **Extensions** |
| -------- | ---------------- | ------------- |
| [TriG](https://www.w3.org/TR/trig/) | `application/trig` | `.trig` |
| [N-Quads](https://www.w3.org/TR/n-quads/) | `application/n-quads` | `.nq`, `.nquads` |
| [Turtle](https://www.w3.org/TR/turtle/) | `text/turtle` | `.ttl`, `.turtle` |
| [N-Triples](https://www.w3.org/TR/n-triples/) | `application/n-triples` | `.nt`, `.ntriples` |
| [Notation3](https://www.w3.org/TeamSubmission/n3/) | `text/n3` | `.n3` |
| [JSON-LD](https://json-ld.org/) | `application/ld+json`, `application/json` | `.json`, `.jsonld` |
| [RDF/XML](https://www.w3.org/TR/rdf-syntax-grammar/) | `application/rdf+xml` | `.rdf`, `.rdfxml`, `.owl` |
| [RDFa](https://www.w3.org/TR/rdfa-in-html/) and script RDF data tags [HTML](https://html.spec.whatwg.org/multipage/)/[XHTML](https://www.w3.org/TR/xhtml-rdfa/) | `text/html`, `application/xhtml+xml` | `.html`, `.htm`, `.xhtml`, `.xht` |
| [Microdata](https://w3c.github.io/microdata-rdf/) | `text/html`, `application/xhtml+xml` | `.html`, `.htm`, `.xhtml`, `.xht` |
| [RDFa](https://www.w3.org/TR/2008/REC-SVGTiny12-20081222/metadata.html#MetadataAttributes) in [SVG](https://www.w3.org/TR/SVGTiny12/)/[XML](https://html.spec.whatwg.org/multipage/) | `image/svg+xml`, `application/xml` | `.xml`, `.svg`, `.svgz` |
| [SHACL Compact Syntax](https://w3c.github.io/shacl/shacl-compact-syntax/) | `text/shaclc` | `.shaclc`, `.shc` |
| [Extended SHACL Compact Syntax](https://github.com/jeswr/shaclcjs#extended-shacl-compact-syntax) | `text/shaclc-ext` | `.shaclce`, `.shce` |

Internally, this library makes use of RDF parsers from the [Comunica framework](https://github.com/comunica/comunica),
which enable streaming processing of RDF.

The following fully spec-compliant parsers are used:

* [N3.js](https://github.com/rdfjs/n3.js)
* [jsonld-streaming-parser.js](https://github.com/rubensworks/jsonld-streaming-parser.js)
* [microdata-rdf-streaming-parser.js](https://github.com/rubensworks/microdata-rdf-streaming-parser.js)
* [rdfa-streaming-parser.js](https://github.com/rubensworks/rdfa-streaming-parser.js)
* [rdfxml-streaming-parser.js](https://github.com/rdfjs/rdfxml-streaming-parser.js)
* [shaclcjs](https://github.com/jeswr/shaclcjs)

## Installation

```bash
$ npm install rdf-parse
```

or

```bash
$ yarn add rdf-parse
```

This package also works out-of-the-box in browsers via tools such as [webpack](https://webpack.js.org/) and [browserify](http://browserify.org/).

## Require

```typescript
import { rdfParser } from "rdf-parse";
```

_or_

```javascript
const { rdfParser } = require("rdf-parse");
```

## Usage

### Parsing by content type

The `rdfParser.parse` method takes in a text stream containing RDF in any serialization,
and an options object, and outputs an [RDFJS stream](http://rdf.js.org/stream-spec/#stream-interface) that emits RDF quads.

```javascript
const textStream = require('streamify-string')(`
<http://ex.org/s> <http://ex.org/p> <http://ex.org/o1>, <http://ex.org/o2>.
`);

rdfParser.parse(textStream, { contentType: 'text/turtle', baseIRI: 'http://example.org' })
.on('data', (quad) => console.log(quad))
.on('error', (error) => console.error(error))
.on('end', () => console.log('All done!'));
```
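
When all quads are needed at once rather than one by one, the event handlers above can be wrapped in a promise. The following is a minimal sketch, not part of the library: `collectQuads` is a hypothetical helper name, and a stand-in object stream replaces the `rdfParser.parse` call so the snippet runs without any dependency.

```javascript
const { Readable } = require('node:stream');

// Hypothetical helper: resolves with every item the stream emits,
// and rejects as soon as the stream emits 'error'.
function collectQuads(quadStream) {
  return new Promise((resolve, reject) => {
    const quads = [];
    quadStream
      .on('data', (quad) => quads.push(quad))
      .on('error', reject)
      .on('end', () => resolve(quads));
  });
}

// With rdf-parse, the call would look like this (not run here):
//   const quads = await collectQuads(
//     rdfParser.parse(textStream, { contentType: 'text/turtle', baseIRI: 'http://example.org' }));

// Stand-in object stream with one placeholder item instead of a real quad.
const demoStream = Readable.from([{ subject: 's', predicate: 'p', object: 'o' }]);
collectQuads(demoStream).then((quads) => console.log(`Collected ${quads.length} item(s)`));
```

Buffering everything gives up the memory benefits of streaming, so this is only appropriate for documents that comfortably fit in memory.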

### Parsing by file name

Sometimes, the content type of an RDF document is unknown.
For those cases, this library allows you to provide the path or URL of the RDF document,
and the serialization is determined from its file extension.

For example, Turtle documents can be detected using the `.ttl` extension.

```javascript
const textStream = require('streamify-string')(`
<http://ex.org/s> <http://ex.org/p> <http://ex.org/o1>, <http://ex.org/o2>.
`);

rdfParser.parse(textStream, { path: 'http://example.org/myfile.ttl', baseIRI: 'http://example.org' })
.on('data', (quad) => console.log(quad))
.on('error', (error) => console.error(error))
.on('end', () => console.log('All done!'));
```
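
To illustrate the idea of extension-based detection, the sketch below maps a path or URL to a content type using a subset of the table of supported serializations above. It is an assumption-laden illustration, not the library's internal lookup: `guessContentType` and `EXTENSION_TO_CONTENT_TYPE` are hypothetical names, and extensions shared by several content types (such as `.html` or `.json`) are left out.

```javascript
const { posix } = require('node:path');

// Subset of the extension table from this README (unambiguous entries only).
const EXTENSION_TO_CONTENT_TYPE = {
  '.trig': 'application/trig',
  '.nq': 'application/n-quads',
  '.nquads': 'application/n-quads',
  '.ttl': 'text/turtle',
  '.turtle': 'text/turtle',
  '.nt': 'application/n-triples',
  '.ntriples': 'application/n-triples',
  '.n3': 'text/n3',
  '.jsonld': 'application/ld+json',
  '.rdf': 'application/rdf+xml',
  '.rdfxml': 'application/rdf+xml',
  '.owl': 'application/rdf+xml',
};

// Hypothetical helper: returns a content type, or undefined when the extension is unknown.
function guessContentType(pathOrUrl) {
  // Resolving against a dummy base lets relative paths and absolute URLs both parse,
  // and drops any query string or fragment before the extension is read.
  const { pathname } = new URL(pathOrUrl, 'file:///');
  const extension = posix.extname(pathname).toLowerCase();
  return EXTENSION_TO_CONTENT_TYPE[extension];
}

console.log(guessContentType('http://example.org/myfile.ttl'));
console.log(guessContentType('data/dump.nq?version=2'));
```

In practice, passing `path` to `rdfParser.parse` as shown above makes such a helper unnecessary; the sketch only shows what kind of lookup the option implies.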

### Getting all known content types

With `rdfParser.getContentTypes()`, you can retrieve a list of all content types for which a parser is available.
Note that this method returns a promise that can be `await`-ed.

`rdfParser.getContentTypesPrioritized()` returns an object instead,
with content types as keys, and numerical priorities as values.

```javascript
// Both methods return promises; use `await` inside an async function or an ES module.

// An array of content types
console.log(await rdfParser.getContentTypes());

// An object of prioritized content types
console.log(await rdfParser.getContentTypesPrioritized());
```
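
One possible use of the prioritized object is building an HTTP `Accept` header for content negotiation. The sketch below is not part of the library: `toAcceptHeader` is a hypothetical helper, the input values are made up, and it assumes the priorities can be read as quality values between 0 and 1 (values outside that range are clamped).

```javascript
// Hypothetical helper: turns { contentType: priority } into an Accept header value,
// listing the highest priority first.
function toAcceptHeader(prioritized) {
  return Object.entries(prioritized)
    .sort(([, a], [, b]) => b - a)
    .map(([type, priority]) => {
      // Clamp to the 0..1 range allowed for q-values.
      const q = Math.min(1, Math.max(0, priority));
      // A q-value of 1 is the default and can be omitted; otherwise keep at most 3 decimals.
      return q === 1 ? type : `${type};q=${Number(q.toFixed(3))}`;
    })
    .join(', ');
}

// Made-up priorities; real values would come from
// `await rdfParser.getContentTypesPrioritized()`.
const examplePriorities = {
  'text/turtle': 0.6,
  'application/trig': 1,
  'application/rdf+xml': 0.5,
};

console.log(toAcceptHeader(examplePriorities));
```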

### Obtaining prefixes

Using the `'prefix'` event, you can obtain the prefixes encountered while parsing documents in formats such as Turtle and TriG.

```javascript
rdfParser.parse(textStream, { contentType: 'text/turtle' })
.on('prefix', (prefix, iri) => console.log(prefix + ':' + iri))
```
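
Collected prefixes can be used, for example, to abbreviate IRIs when printing quads. The sketch below is an illustration under assumptions: `compactIri` is a hypothetical helper, the prefix map is filled with made-up values instead of real `'prefix'` events, and the result is not checked against the Turtle rules for valid prefixed names.

```javascript
// A prefix map as it might be filled from the 'prefix' event (not run here).
// The typeof check covers both a plain string and an RDF/JS NamedNode as `iri`:
//   rdfParser.parse(textStream, { contentType: 'text/turtle' })
//     .on('prefix', (prefix, iri) => {
//       prefixes[prefix] = typeof iri === 'string' ? iri : iri.value;
//     });
const prefixes = {
  ex: 'http://example.org/',
  foaf: 'http://xmlns.com/foaf/0.1/',
};

// Hypothetical helper: abbreviates an IRI using the longest matching namespace,
// and returns the IRI unchanged when no namespace matches.
function compactIri(iri, prefixMap) {
  let best = null;
  for (const [prefix, namespace] of Object.entries(prefixMap)) {
    if (iri.startsWith(namespace) && (best === null || namespace.length > best.namespace.length)) {
      best = { prefix, namespace };
    }
  }
  return best === null ? iri : `${best.prefix}:${iri.slice(best.namespace.length)}`;
}

console.log(compactIri('http://xmlns.com/foaf/0.1/name', prefixes)); // foaf:name
console.log(compactIri('http://unknown.example/x', prefixes)); // http://unknown.example/x
```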

### Obtaining contexts

Using the `'context'` event, you can obtain all contexts (`@context`) when parsing JSON-LD documents.

Multiple contexts can be found, and the context values that are emitted correspond exactly to the context value as included in the JSON-LD document.

```javascript
rdfParser.parse(textStream, { contentType: 'application/ld+json' })
.on('context', (context) => console.log(context))
```

## License

This software is written by [Ruben Taelman](http://rubensworks.net/).

This code is released under the [MIT license](http://opensource.org/licenses/MIT).