https://github.com/wooorm/dioscuri
A gemtext (`text/gemini`) parser with support for streaming, ASTs, and CSTs
- Host: GitHub
- URL: https://github.com/wooorm/dioscuri
- Owner: wooorm
- License: mit
- Created: 2021-01-06T11:25:01.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2022-11-22T18:14:04.000Z (about 2 years ago)
- Last Synced: 2024-12-08T15:48:53.589Z (about 1 month ago)
- Topics: ast, cst, gemini, gemtext, html, mdast, parse
- Language: JavaScript
- Homepage:
- Size: 191 KB
- Stars: 41
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: readme.md
- Funding: funding.yml
- License: license
Awesome Lists containing this project
- awesome-gemini - dioscuri - A Gemtext parser with interfaces to transform to and from mdast (markdown ast) and to compile to HTML. (Tools / Gemtext converters)
README
# dioscuri
[![Build][build-badge]][build]
[![Coverage][coverage-badge]][coverage]
[![Downloads][downloads-badge]][downloads]
[![Size][size-badge]][size]

A gemtext (`text/gemini`) parser with support for streaming, ASTs, and CSTs.
Do you:
* 🤨 think that HTTP and HTML are bloated?
* 😔 feel markdown has superfluous features?
* 🤔 find gopher too light?
* 🥰 like BRUTALISM?

Then [Gemini][] might be for you (see [this post][devault] or [this
one][christine] on why it’s cool).

## Contents
* [What is this?](#what-is-this)
* [When should I use this?](#when-should-i-use-this)
* [Install](#install)
* [Use](#use)
* [API](#api)
    * [`buffer(doc, encoding?, options?)`](#bufferdoc-encoding-options)
    * [`stream(options?)`](#streamoptions)
    * [`fromGemtext(doc, encoding?)`](#fromgemtextdoc-encoding)
    * [`toGemtext(tree)`](#togemtexttree)
    * [`fromMdast(tree, options?)`](#frommdasttree-options)
    * [`toMdast(tree)`](#tomdasttree)
* [gast](#gast)
    * [`Root`](#root)
    * [`Break`](#break)
    * [`Heading`](#heading)
    * [`Link`](#link)
    * [`List`](#list)
    * [`ListItem`](#listitem)
    * [`Pre`](#pre)
    * [`Quote`](#quote)
    * [`Text`](#text)
* [Types](#types)
* [Compatibility](#compatibility)
* [Related](#related)
* [Contribute](#contribute)
* [Security](#security)
* [License](#license)

## What is this?
**Dioscuri** (named for the gemini twins Castor and Pollux) is a
tokenizer/lexer/parser/etc for gemtext (the `text/gemini` markup format).
It gives you several things:

* buffering and streaming interfaces that compile to HTML
* interfaces to create **[unist][]** compliant abstract syntax trees and
serialize those back to gemtext
* interfaces to transform to and from **[mdast][]** (markdown ast)
* parts that could be used to generate CSTs

These tools can be used if you currently have markdown but want to transform it to
gemtext.
Or if you want to combine your posts into an RSS feed or on your “homepage”.
And many other things!

## When should I use this?
Use this for all your gemtext needs!
## Install
This package is [ESM only][esm].
In Node.js (version 14.14+, 16.0+), install with [npm][]:
```sh
npm install dioscuri
```

In Deno with [`esm.sh`][esmsh]:
```js
import * as dioscuri from 'https://esm.sh/dioscuri@1'
```

In browsers with [`esm.sh`][esmsh]:

```html
<script type="module">
  import * as dioscuri from 'https://esm.sh/dioscuri@1?bundle'
</script>
```
## Use
See each interface below for examples.
## API
This package exports the identifiers `buffer`, `stream`, `fromGemtext`,
`toGemtext`, `fromMdast`, `toMdast`.
The raw `compiler` and `parser` are also exported.
There is no default export.

### `buffer(doc, encoding?, options?)`
Compile gemtext to HTML.
###### `doc`
Gemtext to parse (`string` or [`Buffer`][buffer]).
###### `encoding`
[Character encoding][encoding] to understand `doc` as when it’s a
[`Buffer`][buffer] (`string`, default: `'utf8'`).

###### `options.defaultLineEnding`
Value to use for line endings not in `doc` (`string`, default: first line
ending or `'\n'`).

Generally, dioscuri copies line endings (`'\n'` or `'\r\n'`) in the document over
to the compiled HTML.
In some cases, such as `> a`, extra line endings are added:
`<blockquote>\n<p>a</p>\n</blockquote>`.

###### `options.allowDangerousProtocol`
Whether to allow potentially dangerous protocols in URLs (`boolean`, default:
`false`).
URLs relative to the current protocol are always allowed (such as, `image.jpg`).
Otherwise, the allowed protocols are `gemini`, `http`, `https`, `irc`, `ircs`,
`mailto`, and `xmpp`.

###### Returns
Compiled HTML (`string`).
###### Example
Say we have a gemtext document, `example.gmi`:
```gemini
# Hello, world!

Some text

=> https://example.com An example

> A quote

* List
```

…and our module `example.js` looks as follows:
```js
import fs from 'node:fs/promises'
import {buffer} from 'dioscuri'

const doc = await fs.readFile('example.gmi')
console.log(buffer(doc))
```

…now running `node example.js` yields:
```html
Hello, world!
Some text
A quote
- List
```
### `stream(options?)`
Streaming interface to compile gemtext to HTML.
`options` is the same as the buffering interface above.
###### Example
Assuming the same `example.gmi` as before and an `example.js` like this:
```js
import fs from 'node:fs'
import {stream} from 'dioscuri'

fs.createReadStream('example.gmi')
  .on('error', handleError)
  .pipe(stream())
  .pipe(process.stdout)

function handleError(error) {
  throw error // Handle your error here!
}
```
…then running `node example.js` yields the same as before.
### `fromGemtext(doc, encoding?)`
Parse gemtext to an AST (**[gast][]**).
`doc` and `encoding` are the same as the buffering interface above.
###### Returns
[Root][].
###### Example
Assuming the same `example.gmi` as before and an `example.js` like this:
```js
import fs from 'node:fs/promises'
import {fromGemtext} from 'dioscuri'
const doc = await fs.readFile('example.gmi')
console.dir(fromGemtext(doc), {depth: null})
```
…now running `node example.js` yields (positional info removed for brevity):
```js
{
  type: 'root',
  children: [
    {type: 'heading', rank: 1, value: 'Hello, world!'},
    {type: 'break'},
    {type: 'text', value: 'Some text'},
    {type: 'break'},
    {type: 'link', url: 'https://example.com', value: 'An example'},
    {type: 'break'},
    {type: 'quote', value: 'A quote'},
    {type: 'break'},
    {type: 'list', children: [{type: 'listItem', value: 'List'}]}
  ]
}
```
### `toGemtext(tree)`
Serialize **[gast][]**.
###### Example
Say our script `example.js` looks as follows:
```js
import {toGemtext} from 'dioscuri'
const tree = {
  type: 'root',
  children: [
    {type: 'heading', rank: 1, value: 'Hello, world!'},
    {type: 'break'},
    {type: 'text', value: 'Some text'}
  ]
}
console.log(toGemtext(tree))
```
…then running `node example.js` yields:
```gemini
# Hello, world!
Some text
```
### `fromMdast(tree, options?)`
Transform **[mdast][]** to **[gast][]**.
###### `options.endlinks`
Place links at the end of the document (`boolean`, default: `false`).
The default is to place links before the next heading.
###### `options.tight`
Do not put blank lines between blocks (`boolean`, default: `false`).
The default is to place breaks between each block (paragraph, heading, etc).
###### Returns
**[gast][]**, probably.
Some mdast nodes have no gast representation so they are dropped.
If you pass one of those in as `tree`, you’ll get `undefined` out.
###### Example
Say we have a markdown document `example.md`:
````markdown
# Hello, world!

Some text, *emphasis*, **strong**\
`code()`, and ~~scratch that~~strikethrough.

Here’s a [link](https://example.com 'Just an example'), [link reference][*],
and images: ![image reference][*], ![](example.png 'Another example').

***

> Some
> quotes

* a list
* with another item

1. “Ordered”
2. List

```
A
Poem
```

```js
console.log(1)
```

| Name | Value |
| ---- | ----- |
| Beep | 1.2 |
| Boop | 3.14 |

* [x] Checked
* [ ] Unchecked

Footnotes[^†], ^[even inline].

[*]: https://example.org "URL definition"
[^†]: Footnote definition
````
…and our module `example.js` looks as follows:
```js
import fs from 'node:fs/promises'
import {gfm} from 'micromark-extension-gfm'
import {footnote} from 'micromark-extension-footnote'
import {fromMarkdown} from 'mdast-util-from-markdown'
import {gfmFromMarkdown} from 'mdast-util-gfm'
import {footnoteFromMarkdown} from 'mdast-util-footnote'
import {fromMdast, toGemtext} from 'dioscuri'
const mdast = fromMarkdown(await fs.readFile('example.md'), {
  extensions: [gfm(), footnote({inlineNotes: true})],
  mdastExtensions: [gfmFromMarkdown, footnoteFromMarkdown]
})
console.log(toGemtext(fromMdast(mdast)))
```
…now running `node example.js` yields:
````gemini
# Hello, world!
Some text, emphasis, strong code(), and strikethrough.
Here’s a link[1], link reference[2], and images: image reference[2], [3].
> Some quotes
* a list
* with another item
* “Ordered”
* List
```
A
Poem
```
```js
console.log(1)
```
```csv
Name,Value
Beep,1.2
Boop,3.14
```
* ✓ Checked
* ✗ Unchecked
Footnotes[a], [b].
=> https://example.com [1] Just an example
=> https://example.org [2] URL definition
=> example.png [3] Another example
[a] Footnote definition
[b] even inline
````
### `toMdast(tree)`
Transform **[gast][]** to **[mdast][]**.
###### Returns
**[mdast][]**, probably.
Some gast nodes have no mdast representation so they are dropped.
If you pass one of those in as `tree`, you’ll get `undefined` out.
###### Example
Say we have a gemtext document `example.gmi`:
```gemini
# Hello, world!
Some text
=> https://example.com An example
> A quote
* List
```
…and our module `example.js` looks as follows:
```js
import fs from 'node:fs/promises'
import {fromGemtext, toMdast} from 'dioscuri'
const doc = await fs.readFile('example.gmi')
console.dir(toMdast(fromGemtext(doc)), {depth: null})
```
…now running `node example.js` yields (position info removed for brevity):
```js
{
  type: 'root',
  children: [
    {
      type: 'heading',
      depth: 1,
      children: [{type: 'text', value: 'Hello, world!'}]
    },
    {
      type: 'paragraph',
      children: [{type: 'text', value: 'Some text'}]
    },
    {
      type: 'paragraph',
      children: [
        {
          type: 'link',
          url: 'https://example.com',
          title: null,
          children: [{type: 'text', value: 'An example'}]
        }
      ]
    },
    {
      type: 'blockquote',
      children: [
        {type: 'paragraph', children: [{type: 'text', value: 'A quote'}]}
      ]
    },
    {
      type: 'list',
      ordered: false,
      spread: false,
      children: [
        {
          type: 'listItem',
          spread: false,
          children: [
            {type: 'paragraph', children: [{type: 'text', value: 'List'}]}
          ]
        }
      ]
    }
  ]
}
```
## gast
**[gast][]** extends **[unist][]**, a format for syntax trees, to benefit from
its ecosystem of utilities.
### `Root`
```idl
interface Root <: Parent {
  type: 'root'
  children: [Break | Heading | Link | List | Pre | Quote | Text]
}
```
**Root** ([**Parent**][dfn-parent]) represents a document.
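
As a small illustration of this shape, the sketch below builds a `Root` by hand and places a `Break` between every two blocks, similar in spirit to what `fromMdast` is described as doing when `tight` is off.
The helper name `rootWithBreaks` is hypothetical and not part of dioscuri.

```js
// Hypothetical helper (not part of dioscuri): wrap block nodes in a root
// and put a `break` node between every two neighbors.
function rootWithBreaks(blocks) {
  const children = []

  for (const [index, node] of blocks.entries()) {
    if (index > 0) children.push({type: 'break'})
    children.push(node)
  }

  return {type: 'root', children}
}

const tree = rootWithBreaks([
  {type: 'heading', rank: 1, value: 'Hello, world!'},
  {type: 'text', value: 'Some text'}
])

console.dir(tree, {depth: null})
```

The resulting object can be passed to `toGemtext` like the hand-written tree in its example.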
### `Break`
```idl
interface Break <: Node {
  type: 'break'
}
```
**Break** ([**Node**][dfn-node]) represents a hard break.
### `Heading`
```idl
interface Heading <: Literal {
  type: 'heading'
  rank: 1 | 2 | 3
  value: string?
}
```
**Heading** ([**Literal**][dfn-literal]) represents a heading of a section.
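
Because headings are direct children of the root and carry a `rank`, a plain outline can be derived without any tree-walking utility.
The following is a minimal sketch; the `outline` function and the sample tree are made up for illustration.

```js
// Hypothetical helper: turn the headings of a gast root into outline lines,
// indented by two spaces per rank below 1.
function outline(tree) {
  return tree.children
    .filter((node) => node.type === 'heading')
    .map((node) => '  '.repeat(node.rank - 1) + (node.value || ''))
}

const tree = {
  type: 'root',
  children: [
    {type: 'heading', rank: 1, value: 'Hello'},
    {type: 'break'},
    {type: 'text', value: 'Some text'},
    {type: 'break'},
    {type: 'heading', rank: 2, value: 'Install'}
  ]
}

console.log(outline(tree).join('\n'))
```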
### `Link`
```idl
interface Link <: Literal {
  type: 'link'
  url: string
  value: string?
}
```
**Link** ([**Literal**][dfn-literal]) represents a resource.
A `url` field must be present.
It represents a URL to the resource.
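
For example, the links of a document can be gathered by filtering the root’s children.
This is a sketch under the assumption that a missing `value` should fall back to the URL as a label; `collectLinks` is a hypothetical name, not a dioscuri export.

```js
// Hypothetical helper: list every link in a gast root.
// Assumption: when a link has no `value`, use its URL as the label.
function collectLinks(tree) {
  return tree.children
    .filter((node) => node.type === 'link')
    .map((node) => ({url: node.url, label: node.value || node.url}))
}

const tree = {
  type: 'root',
  children: [
    {type: 'text', value: 'Some text'},
    {type: 'break'},
    {type: 'link', url: 'https://example.com', value: 'An example'},
    {type: 'link', url: 'gemini://example.org'}
  ]
}

console.log(collectLinks(tree))
```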
### `List`
```idl
interface List <: Parent {
  type: 'list'
  children: [ListItem]
}
```
**List** ([**Parent**][dfn-parent]) represents an enumeration.
### `ListItem`
```idl
interface ListItem <: Literal {
  type: 'listItem'
  value: string?
}
```
**ListItem** ([**Literal**][dfn-literal]) represents an item in a list.
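
Since list items are literals rather than parents, a list can be built from an array of strings.
A minimal sketch, with `listFromStrings` being a hypothetical helper:

```js
// Hypothetical helper: build a gast `list` node from plain strings.
function listFromStrings(values) {
  return {
    type: 'list',
    children: values.map((value) => ({type: 'listItem', value}))
  }
}

const list = listFromStrings(['a list', 'with another item'])

console.dir(list, {depth: null})
```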
### `Pre`
```idl
interface Pre <: Literal {
  type: 'pre'
  alt: string?
  value: string?
}
```
**Pre** ([**Literal**][dfn-literal]) represents preformatted text.
An `alt` field may be present.
When present, the node represents computer code, and the field gives the
language of computer code being marked up.
### `Quote`
```idl
interface Quote <: Literal {
  type: 'quote'
  value: string?
}
```
**Quote** ([**Literal**][dfn-literal]) represents a quote.
### `Text`
```idl
interface Text <: Literal {
  type: 'text'
  value: string
}
```
**Text** ([**Literal**][dfn-literal]) represents a paragraph.
## Types
This package is fully typed with [TypeScript][].
It exports the additional types `Value` (for the input, string or buffer),
`BufferEncoding` (`'utf8'` etc), `CompileOptions` (options to turn things to a
string), and `FromMdastOptions` (options to turn things into gast).
## Compatibility
This package is at least compatible with all maintained versions of Node.js.
As of now, that is Node.js 14.14+ and 16.0+.
It also works in Deno and modern browsers.
## Related
* [`@derhuerst/gemini`](https://github.com/derhuerst/gemini)
  – gemini protocol server and client
* [`gemini-fetch`](https://github.com/RangerMauve/gemini-fetch)
  – load gemini protocol data the way you would fetch from HTTP in JavaScript
## Contribute
Yes please!
See [How to Contribute to Open Source][contribute].
## Security
Gemtext is safe.
As for the generated HTML: that’s safe by default.
Pass `allowDangerousProtocol: true` if you want to live dangerously.
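
To make the rule described under `options.allowDangerousProtocol` concrete, here is a rough sketch of such a check.
It is not dioscuri’s actual implementation: the `isAllowed` function and the way it detects relative URLs are assumptions for illustration, and only the list of protocols comes from this readme.

```js
// Protocols this readme lists as allowed by default.
const safeProtocols = new Set([
  'gemini',
  'http',
  'https',
  'irc',
  'ircs',
  'mailto',
  'xmpp'
])

// Hypothetical check, not dioscuri's own code.
function isAllowed(url, allowDangerousProtocol = false) {
  if (allowDangerousProtocol) return true

  const colon = url.indexOf(':')
  const delimiter = url.search(/[/?#]/)

  // No colon, or a colon only after `/`, `?`, or `#`: treat as relative.
  if (colon === -1 || (delimiter !== -1 && delimiter < colon)) return true

  return safeProtocols.has(url.slice(0, colon).toLowerCase())
}

console.log(isAllowed('image.jpg'))
console.log(isAllowed('javascript:alert(1)'))
```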
## License
[MIT][license] © [Titus Wormer][author]
[build-badge]: https://github.com/wooorm/dioscuri/workflows/main/badge.svg
[build]: https://github.com/wooorm/dioscuri/actions
[coverage-badge]: https://img.shields.io/codecov/c/github/wooorm/dioscuri.svg
[coverage]: https://codecov.io/github/wooorm/dioscuri
[downloads-badge]: https://img.shields.io/npm/dm/dioscuri.svg
[downloads]: https://www.npmjs.com/package/dioscuri
[size-badge]: https://img.shields.io/bundlephobia/minzip/dioscuri.svg
[size]: https://bundlephobia.com/result?p=dioscuri
[npm]: https://docs.npmjs.com/cli/install
[esm]: https://gist.github.com/sindresorhus/a39789f98801d908bbc7ff3ecc99d99c
[esmsh]: https://esm.sh
[typescript]: https://www.typescriptlang.org
[contribute]: https://opensource.guide/how-to-contribute/
[license]: license
[author]: https://wooorm.com
[gemini]: https://gemini.circumlunar.space
[unist]: https://github.com/syntax-tree/unist
[mdast]: https://github.com/syntax-tree/mdast
[devault]: https://drewdevault.com/2020/11/01/What-is-Gemini-anyway.html
[christine]: https://christine.website/blog/gemini-web-fear-missing-out-2020-08-02
[encoding]: https://nodejs.org/api/buffer.html#buffer_buffers_and_character_encodings
[buffer]: https://nodejs.org/api/buffer.html
[gast]: #gast
[root]: #root
[dfn-parent]: https://github.com/syntax-tree/unist#parent
[dfn-node]: https://github.com/syntax-tree/unist#node
[dfn-literal]: https://github.com/syntax-tree/unist#literal