https://github.com/frencojobs/fenceparser

A tiny, well-tested parser for parsing metadata out of fenced code blocks in Markdown

Host: GitHub
URL: https://github.com/frencojobs/fenceparser
Owner: frencojobs
License: mit
Created: 2021-07-12T12:18:21.000Z (over 4 years ago)
Default Branch: main
Last Pushed: 2023-05-05T20:01:55.000Z (over 2 years ago)
Last Synced: 2025-03-26T19:01:44.430Z (7 months ago)
Topics: markdown
Language: TypeScript
Homepage: https://npm.im/fenceparser
Size: 272 KB
Stars: 15
Watchers: 2
Forks: 2
Open Issues: 2
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE

## Overview

Assuming you have this code fence in your Markdown,

````md
```ts twoslash {1-3, 5} title="Hello, World"
````

Parsing it with [remark](https://github.com/remarkjs/remark) yields two pieces of information about that code block, `lang` and `meta`, like this:

```json
{
  "lang": "ts",
  "meta": "twoslash {1-3, 5} title=\"Hello, World\""
}
```

Use `fenceparser` to parse the `meta` string into a useful object.

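```js
import parse from 'fenceparser'

// The `meta` string produced by remark for the fence above
const meta = 'twoslash {1-3, 5} title="Hello, World"'

console.log(parse(meta))
// {
//   twoslash: true,
//   highlight: { '1-3': true, '5': true },
//   title: 'Hello, World'
// }
```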

> The parser intentionally does not handle the language part, since that is usually handled by the Markdown parser.

However, if you want to allow looser syntax such as `ts{1-3, 5}` in addition to `ts {1-3, 5}` (as [gatsby-remark-vscode](https://github.com/andrewbranch/gatsby-remark-vscode) does, for example), remark won't parse the language correctly.

```json5
{
  "lang": "ts{1-3,", // because remark uses space to split
  "meta": "5}"
}
```
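The split shown above can be reproduced in a few lines. This is a minimal sketch that assumes the info string is divided at the first run of whitespace; `splitInfo` is a hypothetical helper written for illustration, not remark's actual code.

```javascript
// Hypothetical helper: splits a fence info string at the first run of
// whitespace, approximating how a Markdown parser separates lang from meta.
function splitInfo(info) {
  const trimmed = info.trim()
  const match = /\s+/.exec(trimmed)
  // No whitespace at all: everything is the language, there is no meta
  if (!match) return { lang: trimmed, meta: null }
  return {
    lang: trimmed.slice(0, match.index),
    meta: trimmed.slice(match.index + match[0].length),
  }
}

console.log(splitInfo('ts {1-3, 5}')) // lang is 'ts', meta is '{1-3, 5}'
console.log(splitInfo('ts{1-3, 5}')) // lang is 'ts{1-3,', meta is '5}'
```

Because the only space in `ts{1-3, 5}` sits inside the braces, the language ends up carrying half of the highlight object.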

In these cases, you can use the library's `lex` function to get a properly tokenized array. You can then take the first element as `lang`. For example,

```js
import {lex, parse} from 'fenceparser'
// Notice this ^ parse is not the same as the default export function

// `node` is the code node produced by remark
const full = [node.lang, node.meta].join(' ') // Join them back
const tokens = lex(full)
const lang = tokens.shift() // ts
const meta = parse(tokens) // { highlight: {'1-3': true, '5': true} }
```
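Consumers of the parsed result often need concrete line numbers rather than keys such as `'1-3'`. The following is a minimal sketch of one way to expand a `highlight` object like the one shown above; `expandHighlight` is a hypothetical helper and not part of `fenceparser`, and it assumes only keys that are a single number or a `start-end` range with a value of `true` should count.

```javascript
// Hypothetical helper, not part of fenceparser: expands keys such as '1-3'
// and '5' into a sorted array of line numbers.
function expandHighlight(highlight = {}) {
  const lines = new Set()
  for (const [key, enabled] of Object.entries(highlight)) {
    if (enabled !== true) continue // assumption: skip nested or non-flag values
    const match = /^(\d+)(?:-(\d+))?$/.exec(key)
    if (!match) continue // ignore keys that are not a number or a range
    const start = Number(match[1])
    const end = match[2] === undefined ? start : Number(match[2])
    for (let line = start; line <= end; line++) lines.add(line)
  }
  return [...lines].sort((a, b) => a - b)
}

console.log(expandHighlight({ '1-3': true, '5': true })) // [ 1, 2, 3, 5 ]
```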

## Syntax

The syntax grammar is loosely based on techniques used by various syntax highlighters. The rules are:

- Valid HTML attributes can be used, `attribute`, `data-attribute`, etc.

- Just like in HTML, top-level attribute names are case insensitive

- Attributes without values are assigned as `true`

- Attribute values can be single or double quoted strings, int/float numbers, booleans, objects or arrays

- Non-quoted strings are valid as long as they contain no whitespace or line breaks, `attr=--theme-color`

- Objects can accept valid attributes as children, or valid attributes with a value assigned using `:`, `{1-3, 5, ids: {7}}`

- Arrays are just like JavaScript's arrays

- Objects without attribute keys `{1-3} {7}` are merged and assigned to the `highlight` object

- No trailing commas
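
Two of the rules above, valueless attributes becoming `true` and keyless objects being merged into `highlight`, can be sketched as follows. This is an illustration of the rules only, not the library's implementation: the `{key, value}` entry shape and the `collect` function are assumptions made for the example.

```javascript
// Illustration only: models already-parsed attributes as { key, value }
// entries, with key === null for objects written without an attribute name.
function collect(entries) {
  const result = {}
  for (const { key, value } of entries) {
    if (key === null) {
      // Keyless objects such as {1-3} {7} are merged into `highlight`
      result.highlight = { ...result.highlight, ...value }
    } else {
      // Attributes without a value are assigned `true`
      result[key] = value === undefined ? true : value
    }
  }
  return result
}

// Roughly corresponds to: twoslash {1-3} {7} title="Hi"
const parsed = collect([
  { key: 'twoslash' },
  { key: null, value: { '1-3': true } },
  { key: null, value: { '7': true } },
  { key: 'title', value: 'Hi' },
])
console.log(parsed.twoslash, parsed.title, Object.keys(parsed.highlight).length)
// true Hi 2
```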

## Acknowledgements

1. This project was initially made for use with [Twoslash](https://github.com/shikijs/twoslash).

2. The initial implementations of lexer and parser are based on the examples from the book [Crafting Interpreters](http://craftinginterpreters.com).
