{"id":13483877,"url":"https://github.com/micromark/micromark","last_synced_at":"2025-05-13T18:04:51.112Z","repository":{"id":37405886,"uuid":"157420458","full_name":"micromark/micromark","owner":"micromark","description":"small, safe, and great commonmark (optionally gfm, mdx) compliant markdown parser","archived":false,"fork":false,"pushed_at":"2025-04-02T14:02:48.000Z","size":2121,"stargazers_count":1949,"open_issues_count":4,"forks_count":70,"subscribers_count":13,"default_branch":"main","last_synced_at":"2025-05-06T16:17:42.003Z","etag":null,"topics":["ast","commonmark","compile","cst","gfm","markdown","parse","render","tokenize","unified"],"latest_commit_sha":null,"homepage":"https://unifiedjs.com","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/micromark.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"license","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null},"funding":{"github":"unifiedjs","open_collective":"unified","thanks_dev":"u/gh/micromark"}},"created_at":"2018-11-13T17:34:37.000Z","updated_at":"2025-05-06T08:11:44.000Z","dependencies_parsed_at":"2023-09-27T09:50:02.485Z","dependency_job_id":"730ef76e-c024-4c6e-a527-5f21b9ba7869","html_url":"https://github.com/micromark/micromark","commit_stats":{"total_commits":634,"total_committers":18,"mean_commits":35.22222222222222,"dds":0.05362776025236593,"last_synced_commit":"c8f2e0754ebe3d25482ad9f2fab5d34b76c843a1"},"previous_names":[],"tags_count":120,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/micromark%2Fmicromark","tags_url":"https:
//repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/micromark%2Fmicromark/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/micromark%2Fmicromark/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/micromark%2Fmicromark/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/micromark","download_url":"https://codeload.github.com/micromark/micromark/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252731331,"owners_count":21795452,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ast","commonmark","compile","cst","gfm","markdown","parse","render","tokenize","unified"],"created_at":"2024-07-31T17:01:16.366Z","updated_at":"2025-05-13T18:04:51.099Z","avatar_url":"https://github.com/micromark.png","language":"JavaScript","funding_links":["https://github.com/sponsors/unifiedjs","https://opencollective.com/unified","https://thanks.dev/u/gh/micromark"],"categories":["JavaScript","markdown"],"sub_categories":[],"readme":"\u003ch1 align=\"center\"\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/micromark/micromark/2e476c9/logo.svg?sanitize=true\" alt=\"micromark\" /\u003e\n\u003c/h1\u003e\n\n[![Build][build-badge]][build]\n[![Coverage][coverage-badge]][coverage]\n[![Downloads][downloads-badge]][downloads]\n[![Size][bundle-size-badge]][bundle-size]\n[![Sponsors][sponsors-badge]][opencollective]\n[![Backers][backers-badge]][opencollective]\n[![Chat][chat-badge]][chat]\n\nThe smallest CommonMark compliant markdown parser.\nWith 
positional info and concrete tokens.\n\n## Feature highlights\n\n\u003c!-- Note: this section has to be in sync with the `micromark` readme. --\u003e\n\n* [x] **[compliant][commonmark]** (100% to CommonMark)\n* [x] **[extensions][]** (100% [GFM][], 100% [MDX.js][mdxjs], [directives][],\n  [frontmatter][], [math][])\n* [x] **[safe][security]** (by default)\n* [x] **[robust][test]** (±2k tests, 100% coverage, fuzz testing)\n* [x] **[small][size-debug]** (smallest CM parser at ±14kb)\n\n## Contents\n\n* [When should I use this?](#when-should-i-use-this)\n* [What is this?](#what-is-this)\n* [Install](#install)\n* [Use](#use)\n* [API](#api)\n* [Extensions](#extensions)\n  * [List of extensions](#list-of-extensions)\n  * [`SyntaxExtension`](#syntaxextension)\n  * [`HtmlExtension`](#htmlextension)\n  * [Extending markdown](#extending-markdown)\n  * [Creating a micromark extension](#creating-a-micromark-extension)\n* [Architecture](#architecture)\n  * [Overview](#overview)\n  * [Preprocess](#preprocess)\n  * [Parse](#parse)\n  * [Postprocess](#postprocess)\n  * [Compile](#compile)\n* [Examples](#examples)\n  * [GitHub flavored markdown (GFM)](#github-flavored-markdown-gfm)\n  * [Math](#math)\n  * [Syntax tree](#syntax-tree)\n* [Markdown](#markdown)\n  * [CommonMark](#commonmark)\n  * [Grammar](#grammar)\n* [Project](#project)\n  * [Comparison](#comparison)\n  * [Test](#test)\n  * [Size \u0026 debug](#size--debug)\n  * [Version](#version)\n  * [Security](#security)\n  * [Contribute](#contribute)\n  * [Sponsor](#sponsor)\n  * [Origin story](#origin-story)\n  * [License](#license)\n\n## When should I use this?\n\n\u003c!-- Note: this section has to be in sync with the `micromark` readme. 
--\u003e\n\n* If you *just* want to turn markdown into HTML (with maybe a few extensions)\n* If you want to do *really complex things* with markdown\n\nSee [§ Comparison][comparison] for more info.\n\n## What is this?\n\n\u003c!-- Note: this section has to be in sync with the `micromark` readme. --\u003e\n\n`micromark` is an open source markdown parser written in JavaScript.\nIt’s implemented as a state machine that emits concrete tokens, so that every\nbyte is accounted for, with positional info.\nIt then compiles those tokens directly to HTML, but other tools can take the\ndata and for example build an AST which is easier to work with\n([`mdast-util-from-markdown`][mdast-util-from-markdown]).\n\nWhile most markdown parsers work towards compliance with CommonMark (or GFM),\nthis project goes further by following how the reference parsers (`cmark`,\n`cmark-gfm`) work, which is confirmed with thousands of extra tests.\n\nOther than CommonMark and GFM, micromark also supports common extensions to\nmarkdown such as MDX, math, and frontmatter.\n\nThese npm packages have a sibling project in Rust:\n[`markdown-rs`][markdown-rs].\n\n* to learn markdown, see this [cheatsheet and tutorial][cheat]\n* for more about us, see [`unifiedjs.com`][site]\n* for questions, see [Discussions][chat]\n* to help, see [contribute][] and [sponsor][] below\n\n## Install\n\n\u003c!-- Note: this section has to be in sync with the `micromark` readme. --\u003e\n\nThis package is [ESM only][esm].\nIn Node.js (version 16+), install with [npm][]:\n\n```sh\nnpm install micromark\n```\n\nIn Deno with [`esm.sh`][esmsh]:\n\n```js\nimport {micromark} from 'https://esm.sh/micromark@3'\n```\n\nIn browsers with [`esm.sh`][esmsh]:\n\n```html\n\u003cscript type=\"module\"\u003e\n  import {micromark} from 'https://esm.sh/micromark@3?bundle'\n\u003c/script\u003e\n```\n\n## Use\n\n\u003c!-- Note: this section has to be in sync with the `micromark` readme. 
--\u003e\n\nTypical use (buffering):\n\n```js\nimport {micromark} from 'micromark'\n\nconsole.log(micromark('## Hello, *world*!'))\n```\n\nYields:\n\n```html\n\u003ch2\u003eHello, \u003cem\u003eworld\u003c/em\u003e!\u003c/h2\u003e\n```\n\nYou can pass extensions (in this case [`micromark-extension-gfm`][gfm]):\n\n```js\nimport {micromark} from 'micromark'\nimport {gfmHtml, gfm} from 'micromark-extension-gfm'\n\nconst value = '* [x] contact@example.com ~~strikethrough~~'\n\nconst result = micromark(value, {\n  extensions: [gfm()],\n  htmlExtensions: [gfmHtml()]\n})\n\nconsole.log(result)\n```\n\nYields:\n\n```html\n\u003cul\u003e\n\u003cli\u003e\u003cinput checked=\"\" disabled=\"\" type=\"checkbox\"\u003e \u003ca href=\"mailto:contact@example.com\"\u003econtact@example.com\u003c/a\u003e \u003cdel\u003estrikethrough\u003c/del\u003e\u003c/li\u003e\n\u003c/ul\u003e\n```\n\nStreaming interface:\n\n```js\nimport {createReadStream} from 'node:fs'\nimport {stream} from 'micromark/stream'\n\ncreateReadStream('example.md')\n  .on('error', handleError)\n  .pipe(stream())\n  .pipe(process.stdout)\n\nfunction handleError(error) {\n  // Handle your error here!\n  throw error\n}\n```\n\n## API\n\nSee [§ API][api] in the `micromark` readme.\n\n## Extensions\n\nmicromark supports extensions.\nThere are two types of extensions for micromark:\n[`SyntaxExtension`][syntax-extension],\nwhich change how markdown is parsed, and [`HtmlExtension`][html-extension],\nwhich change how it compiles.\nThey can be passed in [`options.extensions`][api-option-extensions] or\n[`options.htmlExtensions`][api-option-htmlextensions], respectively.\n\nAs a user of extensions, refer to each extension’s readme for more on how to use\nthem.\nAs a (potential) author of extensions, refer to\n[§ Extending markdown][extending-markdown] and\n[§ Creating a micromark extension][create-extension].\n\n### List of extensions\n\n* [`micromark/micromark-extension-directive`][directives]\n  — support directives (generic 
extensions)\n* [`micromark/micromark-extension-frontmatter`][frontmatter]\n  — support frontmatter (YAML, TOML, etc)\n* [`micromark/micromark-extension-gfm`][gfm]\n  — support GFM (GitHub Flavored Markdown)\n* [`micromark/micromark-extension-gfm-autolink-literal`](https://github.com/micromark/micromark-extension-gfm-autolink-literal)\n  — support GFM autolink literals\n* [`micromark/micromark-extension-gfm-footnote`](https://github.com/micromark/micromark-extension-gfm-footnote)\n  — support GFM footnotes\n* [`micromark/micromark-extension-gfm-strikethrough`](https://github.com/micromark/micromark-extension-gfm-strikethrough)\n  — support GFM strikethrough\n* [`micromark/micromark-extension-gfm-table`](https://github.com/micromark/micromark-extension-gfm-table)\n  — support GFM tables\n* [`micromark/micromark-extension-gfm-tagfilter`](https://github.com/micromark/micromark-extension-gfm-tagfilter)\n  — support GFM tagfilter\n* [`micromark/micromark-extension-gfm-task-list-item`](https://github.com/micromark/micromark-extension-gfm-task-list-item)\n  — support GFM tasklists\n* [`micromark/micromark-extension-math`][math]\n  — support math\n* [`micromark/micromark-extension-mdx`](https://github.com/micromark/micromark-extension-mdx)\n  — support MDX\n* [`micromark/micromark-extension-mdxjs`][mdxjs]\n  — support MDX.js\n* [`micromark/micromark-extension-mdx-expression`][mdx-expression]\n  — support MDX (or MDX.js) expressions\n* [`micromark/micromark-extension-mdx-jsx`](https://github.com/micromark/micromark-extension-mdx-jsx)\n  — support MDX (or MDX.js) JSX\n* [`micromark/micromark-extension-mdx-md`](https://github.com/micromark/micromark-extension-mdx-md)\n  — support misc MDX changes\n* [`micromark/micromark-extension-mdxjs-esm`](https://github.com/micromark/micromark-extension-mdxjs-esm)\n  — support MDX.js import/exports\n\n#### Community extensions\n\n* 
[`wataru-chocola/micromark-extension-definition-list`](https://github.com/wataru-chocola/micromark-extension-definition-list)\n  — support definition lists\n\n### `SyntaxExtension`\n\nA syntax extension is an object whose fields are typically the names of hooks,\nreferring to where constructs “hook” into.\nThe fields at such objects are character codes, mapping to constructs as values.\n\nThe built in [constructs][] are an example.\nSee it and [existing extensions][extensions] for inspiration.\n\n### `HtmlExtension`\n\nAn HTML extension is an object whose fields are typically `enter` or `exit`\n(reflecting whether a token is entered or exited).\nThe values at such objects are names of tokens mapping to handlers.\n\nSee [existing extensions][extensions] for inspiration.\n\n### Extending markdown\n\nmicromark lets you change markdown syntax, yes, but there are alternatives.\nThe alternatives are often better.\n\nOver the years, many micromark and remark users have asked about their unique\ngoals for markdown.\nSome exemplary goals are:\n\n1. I want to add `rel=\"nofollow\"` to external links\n2. I want to add links from headings to themselves\n3. I want line breaks in paragraphs to become hard breaks\n4. I want to support embedded music sheets\n5. I want authors to add arbitrary attributes\n6. I want authors to mark certain blocks with meaning, such as tip, warning,\n   etc\n7. I want to combine markdown with JS(X)\n8. I want to support our legacy flavor of markdown-like syntax\n\nThese can be solved in different ways and which solution is best is both\nsubjective and dependent on unique needs.\nOften, there is already a solution in the form of an existing remark or rehype\nplugin.\nRespectively, their solutions are:\n\n1. [`remark-external-links`](https://github.com/remarkjs/remark-external-links)\n2. [`rehype-autolink-headings`](https://github.com/rehypejs/rehype-autolink-headings)\n3. [`remark-breaks`](https://github.com/remarkjs/remark-breaks)\n4. 
custom plugin similar to\n   [`rehype-katex`](https://github.com/remarkjs/remark-math/tree/main/packages/rehype-katex)\n   but integrating [`abcjs`](https://www.abcjs.net)\n5. either [`remark-directive`][remark-directive]\n   and a custom plugin or with\n   [`rehype-attr`](https://github.com/jaywcjlove/rehype-attr)\n6. [`remark-directive`][remark-directive]\n   combined with a custom plugin\n7. combining the existing micromark MDX extensions however you please, such as\n   done by [`mdx-js/mdx`][mdx] or\n   [`xdm`](https://github.com/wooorm/xdm)\n8. Writing a micromark extension\n\nLooking at these from a higher level, they can be categorized:\n\n* **Changing the output by transforming syntax trees**\n  (1 and 2)\n\n  This category is nice as the format remains plain markdown that authors are\n  already familiar with and which will work with existing tools and platforms.\n\n  Implementations will deal with the syntax tree\n  ([`mdast`][mdast]) and the ecosystems\n  **[remark][]** and **[rehype][]**.\n  There are many existing\n  [utilities for working with that tree][utilities].\n  Many [remark plugins][remark-plugins] and\n  [rehype plugins][rehype-plugins] also exist.\n* **Using and abusing markdown to add new meaning**\n  (3, 4, potentially 5)\n\n  This category is similar to *Changing the output by transforming syntax\n  trees*, but adds a new meaning to certain things which already have\n  semantics in markdown.\n\n  Some examples in pseudocode:\n\n  ````markdown\n  *   **A list item with the first paragraph bold**\n\n      And then more content, is turned into `\u003cdl\u003e` / `\u003cdt\u003e` / `\u003cdd\u003e` elements\n\n  Or, the title attributes on links or images is [overloaded](/url 'rel:nofollow')\n  with a new meaning.\n\n  ```csv\n  fenced,code,can,include,data\n  which,is,turned,into,a,graph\n  ```\n\n  ```js data can=\"be\" passed=true\n  // after the code language name\n  ```\n\n  HTML, especially comments, could be used as 
**markers**\u003c!--id=\"markers\"--\u003e\n  ````\n* **Arbitrary extension mechanism**\n  (potentially 5; 6)\n\n  This category is nice when content should contain embedded “components”.\n  Often this means it’s required for authors to have some programming\n  experience.\n  There are three good ways to solve arbitrary extensions.\n\n  **HTML**: Markdown already has an arbitrary extension syntax.\n  It works in most places and authors are already familiar with the syntax,\n  but it’s reasonably hard to implement securely.\n  Certain platforms will remove HTML completely, others sanitize it to varying\n  degrees.\n  HTML also supports custom elements.\n  These could be used and enhanced by client side JavaScript or enhanced when\n  transforming the syntax tree.\n\n  **Generic directives**: although\n  [a proposal][directive-proposal]\n  and not supported on most platforms, directives do work with many tools\n  already.\n  They’re not the easiest to author compared to, say, a heading, but sometimes\n  that’s okay.\n  They do have potential: they nicely solve the need for an infinite number of\n  potential extensions to markdown in a single markdown-esque way.\n\n  **MDX** also adds support for components by swapping HTML out for JS(X).\n  JSX is an extension to JavaScript, so MDX is something along the lines of\n  literate programming.\n  This does require knowledge of React (or Vue) and JavaScript, excluding some\n  authors.\n* **Extending markdown syntax**\n  (7 and 8)\n\n  Extending the syntax of markdown means:\n\n  * Authors won’t be familiar with the syntax\n  * Content won’t work in other places (such as on GitHub)\n  * Defeating the purpose of markdown: being simple to author and looking\n    like what it means\n\n  …and it’s hard to do as it requires some in-depth knowledge of JavaScript\n  and parsing.\n  But it’s possible and in certain cases very powerful.\n\n### Creating a micromark extension\n\nThis section shows how to create an extension for micromark 
that parses\n“variables” (a way to render some data) and one to turn a default construct off.\n\n\u003e Stuck?\n\u003e See [`support.md`][support].\n\n#### Prerequisites\n\n* You should possess an intermediate to high understanding of JavaScript:\n  it’s going to get a bit complex\n* Read the readme of [unified][] (until you hit the API section) to better\n  understand where micromark fits\n* Read the [§ Architecture][architecture] section to understand how micromark\n  works\n* Read the [§ Extending markdown][extending-markdown] section to understand\n  whether it’s a good idea to extend the syntax of markdown\n\n#### Extension basics\n\nmicromark supports two types of extensions.\nSyntax extensions change how markdown is parsed.\nHTML extensions change how it compiles.\n\nHTML extensions are not always needed, as micromark is often used through\n[`mdast-util-from-markdown`][mdast-util-from-markdown] to parse to a markdown\nsyntax tree.\nSo instead of an HTML extension a `from-markdown` utility is needed.\nThen, a [`mdast-util-to-markdown`][mdast-util-to-markdown] utility, which is\nresponsible for serializing syntax trees to markdown, is also needed.\n\nWhen developing something for internal use only, you can pick and choose which\nparts you need.\nWhen open sourcing your extensions, it should probably contain four parts:\nsyntax extension, HTML extension, `from-markdown` utility, and a `to-markdown`\nutility.\n\nOn to our first case!\n\n#### Case: variables\n\nLet’s first outline what we want to make: render some data, similar to how\n[Liquid](https://github.com/Shopify/liquid/wiki/Liquid-for-Designers) and the\nlike work, in our markdown.\nIt could look like this:\n\n```markdown\nHello, {planet}!\n```\n\nTurned into:\n\n```html\n\u003cp\u003eHello, Venus!\u003c/p\u003e\n```\n\nAn opening curly brace, followed by one or more characters, and then a closing\nbrace.\nWe’ll then look up `planet` in some object and replace the variable with its\ncorresponding value, 
to get something like `Venus` out.\n\nIt looks simple enough, but with markdown there are often a couple more things\nto think about.\nFor this case, I can see the following:\n\n* Is there a “block” version too?\n* Are spaces allowed?\n  Line endings?\n  Should initial and final white space be ignored?\n* Balanced nested braces?\n  Superfluous ones such as `{{planet}}` or meaningful ones such as\n  `{a {pla} net}`?\n* Character escapes (`{pla\\}net}`) and character references\n  (`{pla\u0026#x7d;net}`)?\n\nTo keep things as simple as possible, let’s not support a block syntax, see\nspaces as special, support line endings, or support nested braces.\nBut to learn interesting things, we *will* support character escapes and\n-references.\n\nNote that this particular case is already solved quite nicely by\n[`micromark-extension-mdx-expression`][mdx-expression].\nIt’s a bit more powerful and does more things, but it can be used to solve this\ncase and otherwise serve as inspiration.\n\n##### Setup\n\nCreate a new folder, enter it, and set up a new package:\n\n```sh\nmkdir example\ncd example\nnpm init -y\n```\n\nIn this example we’ll use ESM, so add `type: 'module'` to `package.json`:\n\n```diff\n@@ -2,6 +2,7 @@\n   \"name\": \"example\",\n   \"version\": \"1.0.0\",\n   \"description\": \"\",\n+  \"type\": \"module\",\n   \"main\": \"index.js\",\n   \"scripts\": {\n     \"test\": \"echo \\\"Error: no test specified\\\" \u0026\u0026 exit 1\"\n```\n\nAdd a markdown file, `example.md`, with the following text:\n\n```markdown\nHello, {planet}!\n\n{pla\\}net} and {pla\u0026#x7d;net}.\n```\n\nTo check if our extension works, add an `example.js` module, with the following\ncode:\n\n```js\nimport fs from 'node:fs/promises'\nimport {micromark} from 'micromark'\nimport {variables} from './index.js'\n\nconst buf = await fs.readFile('example.md')\nconst out = micromark(buf, {extensions: [variables]})\nconsole.log(out)\n```\n\nWhile working on the extension, run `node example` to see 
whether things work.\nFeel free to add more examples of the variables syntax in `example.md` if\nneeded.\n\nOur extension doesn’t work yet, for one because `micromark` is not installed:\n\n```sh\nnpm install micromark --save-dev\n```\n\n…and we need to write our extension.\nLet’s do that in `index.js`:\n\n```js\nexport const variables = {}\n```\n\nAlthough our extension doesn’t do anything, running `node example` now somewhat\nworks!\n\n##### Syntax extension\n\nMuch in micromark is based on character codes (see [§ Preprocess][preprocess]).\nFor this extension, the relevant codes are:\n\n* `-5`\n  — M-0005 CARRIAGE RETURN (CR)\n* `-4`\n  — M-0004 LINE FEED (LF)\n* `-3`\n  — M-0003 CARRIAGE RETURN LINE FEED (CRLF)\n* `null`\n  — EOF (end of the stream)\n* `92`\n  — U+005C BACKSLASH (`\\`)\n* `123`\n  — U+007B LEFT CURLY BRACE (`{`)\n* `125`\n  — U+007D RIGHT CURLY BRACE (`}`)\n\nAlso relevant are the content types (see [§ Content types][content-types]).\nThis extension is a *text* construct, as it’s parsed alongside links and such.\nThe content inside it (between the braces) is *string*, to support character\nescapes and -references.\n\nLet’s write our extension.\nAdd the following code to `index.js`:\n\n```js\nconst variableConstruct = {name: 'variable', tokenize: variableTokenize}\n\nexport const variables = {text: {123: variableConstruct}}\n\nfunction variableTokenize(effects, ok, nok) {\n  return start\n\n  function start(code) {\n    console.log('start:', effects, code)\n    return nok(code)\n  }\n}\n```\n\nThe above code exports an extension with the identifier `variables`.\nThe extension defines a *text* construct for the character code `123`.\nThe construct has a `name`, so that it can be turned off (optional, see next\ncase), and it has a `tokenize` function that sets up a state machine, which\nreceives `effects` and the `ok` and `nok` states.\n`ok` can be used when successful, `nok` when not, and so constructs are a bit\nsimilar to how promises can 
*resolve* or *reject*.\n`tokenize` returns the initial state, `start`, which itself receives the current\ncharacter code, prints some debugging information, and then returns a call\nto `nok`.\n\nEnsure that things work by running `node example` and see what it prints.\n\nNow we need to define our states and figure out how variables work.\nSome people prefer sketching a diagram of the flow.\nI often prefer writing it down in pseudo-code prose.\nI’ve also found that test driven development works well, where I write unit\ntests for how it should work, then write the state machine, and finally use a\ncode coverage tool to ensure I’ve thought of everything.\n\nIn prose, what we have to code looks like this:\n\n* **start**:\n  Receive `123` as `code`, enter a token for the whole (let’s call it\n  `variable`), enter a token for the marker (`variableMarker`), consume\n  `code`, exit the marker token, enter a token for the contents\n  (`variableString`), switch to *begin*\n* **begin**:\n  If `code` is `125`, reconsume in *nok*.\n  Else, reconsume in *inside*\n* **inside**:\n  If `code` is `-5`, `-4`, `-3`, or `null`, reconsume in `nok`.\n  Else, if `code` is `125`, exit the string token, enter a `variableMarker`,\n  consume `code`, exit the marker token, exit the variable token, and switch\n  to *ok*.\n  Else, consume, and remain in *inside*.\n\nThat should be it!\nReplace `variableTokenize` with the following to include the needed states:\n\n```js\nfunction variableTokenize(effects, ok, nok) {\n  return start\n\n  function start(code) {\n    effects.enter('variable')\n    effects.enter('variableMarker')\n    effects.consume(code)\n    effects.exit('variableMarker')\n    effects.enter('variableString')\n    return begin\n  }\n\n  function begin(code) {\n    return code === 125 ? 
nok(code) : inside(code)\n  }\n\n  function inside(code) {\n    if (code === -5 || code === -4 || code === -3 || code === null) {\n      return nok(code)\n    }\n\n    if (code === 125) {\n      effects.exit('variableString')\n      effects.enter('variableMarker')\n      effects.consume(code)\n      effects.exit('variableMarker')\n      effects.exit('variable')\n      return ok\n    }\n\n    effects.consume(code)\n    return inside\n  }\n}\n```\n\nRun `node example` again and see what it prints!\nThe HTML compiler ignores things it doesn’t know, so variables are now removed.\n\nWe have our first syntax extension, and it sort of works, but we don’t handle\ncharacter escapes and -references yet.\nWe need to do two things to make that work:\na) skip over `\\\\` and `\\}` in our algorithm,\nb) tell micromark to parse them.\n\nChange the code in `index.js` to support escapes like so:\n\n```diff\n@@ -23,6 +23,11 @@ function variableTokenize(effects, ok, nok) {\n       return nok(code)\n     }\n\n+    if (code === 92) {\n+      effects.consume(code)\n+      return insideEscape\n+    }\n+\n     if (code === 125) {\n       effects.exit('variableString')\n       effects.enter('variableMarker')\n@@ -35,4 +40,13 @@ function variableTokenize(effects, ok, nok) {\n     effects.consume(code)\n     return inside\n   }\n+\n+  function insideEscape(code) {\n+    if (code === 92 || code === 125) {\n+      effects.consume(code)\n+      return inside\n+    }\n+\n+    return inside(code)\n+  }\n }\n```\n\nFinally add support for character references and character escapes between\nbraces by adding a special token that defines a content type:\n\n```diff\n@@ -11,6 +11,7 @@ function variableTokenize(effects, ok, nok) {\n     effects.consume(code)\n     effects.exit('variableMarker')\n     effects.enter('variableString')\n+    effects.enter('chunkString', {contentType: 'string'})\n     return begin\n   }\n\n@@ -29,6 +30,7 @@ function variableTokenize(effects, ok, nok) {\n     }\n\n     if 
(code === 125) {\n+      effects.exit('chunkString')\n       effects.exit('variableString')\n       effects.enter('variableMarker')\n       effects.consume(code)\n```\n\nTokens with a `contentType` will be replaced by *postprocess* (see\n[§ Postprocess][postprocess]) by the tokens belonging to that content type.\n\n##### HTML extension\n\nUp next is an HTML extension to replace variables with data.\nChange `example.js` to use one like so:\n\n```diff\n@@ -1,11 +1,12 @@\n import fs from 'node:fs/promises'\n import {micromark} from 'micromark'\n-import {variables} from './index.js'\n+import {variablesHtml, variables} from './index.js'\n\n const buf = await fs.readFile('example.md')\n-const out = micromark(buf, {extensions: [variables]})\n+const html = variablesHtml({planet: '1', 'pla}net': '2'})\n+const out = micromark(buf, {extensions: [variables], htmlExtensions: [html]})\n console.log(out)\n```\n\nAnd add the HTML extension, `variablesHtml`, to `index.js` like so:\n\n```diff\n@@ -52,3 +52,19 @@ function variableTokenize(effects, ok, nok) {\n     return inside(code)\n   }\n }\n+\n+export function variablesHtml(data = {}) {\n+  return {\n+    enter: {variableString: enterVariableString},\n+    exit: {variableString: exitVariableString},\n+  }\n+\n+  function enterVariableString() {\n+    this.buffer()\n+  }\n+\n+  function exitVariableString() {\n+    var id = this.resume()\n+    if (id in data) {\n+      this.raw(this.encode(data[id]))\n+    }\n+  }\n+}\n```\n\n`variablesHtml` is a function that receives an object mapping “variables” to\nstrings and returns an HTML extension.\nThe extension hooks two functions to `variableString`, one when it starts,\nthe other when it ends.\nWe don’t need to do anything to handle the other tokens as they’re already\nignored by default.\n`enterVariableString` calls `buffer`, which is a function that “stashes” what\nwould otherwise be emitted.\n`exitVariableString` calls `resume`, which is the inverse of `buffer` and\nreturns the 
stashed value.\nIf the variable is defined, we ensure it’s made safe (with `this.encode`) and\nfinally output that (with `this.raw`).\n\n##### Further exercises\n\nIt works!\nWe’re done!\nOf course, it can be better, such as with the following potential features:\n\n* Add support for empty variables\n* Add support for spaces between markers and string\n* Add support for line endings in variables\n* Add support for nested braces\n* Add support for blocks\n* Add warnings on undefined variables\n* Use `micromark-build`, and use `devlop`, `debug`, and\n  `micromark-util-symbol` (see [§ Size \u0026 debug][size-debug])\n* Add [`mdast-util-from-markdown`][mdast-util-from-markdown] and\n  [`mdast-util-to-markdown`][mdast-util-to-markdown] utilities to parse and\n  serialize the AST\n\n#### Case: turn off constructs\n\nSometimes it’s needed to turn a default construct off.\nThat’s possible through a syntax extension.\nNote that not everything can be turned off (such as paragraphs) and even if it’s\npossible to turn something off, it could break micromark (such as character\nescapes).\n\nTo disable constructs, refer to them by name in an array at the `disable.null`\nfield of an extension:\n\n```js\nimport {micromark} from 'micromark'\n\nconst extension = {disable: {null: ['codeIndented']}}\n\nconsole.log(micromark('\\ta', {extensions: [extension]}))\n```\n\nYields:\n\n```html\n\u003cp\u003ea\u003c/p\u003e\n```\n\n## Architecture\n\nmicromark is maintained as a monorepo.\nMany of its internals, which are used in `micromark` (core) but also useful for\ndevelopers of extensions or integrations, are available as separate modules.\nEach module maintained here is available in [`packages/`][packages].\n\n### Overview\n\nThe naming scheme in [`packages/`][packages] is as follows:\n\n* `micromark-build`\n  — Small CLI to build dev code into production code\n* `micromark-core-commonmark`\n  — CommonMark constructs used in micromark\n* `micromark-factory-*`\n  — Reusable subroutines 
used to parse parts of constructs\n* `micromark-util-*`\n  — Reusable helpers often needed when parsing markdown\n* `micromark`\n  — Core module\n\nmicromark has two interfaces: buffering (maintained in\n[`micromark/dev/index.js`](https://github.com/micromark/micromark/blob/main/packages/micromark/dev/index.js))\nand streaming (maintained in\n[`micromark/dev/stream.js`](https://github.com/micromark/micromark/blob/main/packages/micromark/dev/stream.js)).\nThe first takes all input at once whereas the last uses a Node.js stream to take\ninput separately.\nThey thinly wrap how data flows through micromark:\n\n```text\n                                            micromark\n+-----------------------------------------------------------------------------------------------+\n|            +------------+         +-------+         +-------------+         +---------+       |\n| -markdown-\u003e+ preprocess +-chunks-\u003e+ parse +-events-\u003e+ postprocess +-events-\u003e+ compile +-html- |\n|            +------------+         +-------+         +-------------+         +---------+       |\n+-----------------------------------------------------------------------------------------------+\n```\n\n### Preprocess\n\nThe **preprocessor**\n([`micromark/dev/lib/preprocess.js`](https://github.com/micromark/micromark/blob/main/packages/micromark/dev/lib/preprocess.js))\ntakes markdown and turns it into chunks.\n\nA **chunk** is either a character code or a slice of a buffer in the form of a\nstring.\nChunks are used because strings are more efficient storage than character codes,\nbut limited in what they can represent.\nFor example, the input `ab\\ncd` is represented as `['ab', -4, 'cd']` in chunks.\n\nA character **code** is often the same as what `String#charCodeAt()` yields but\nmicromark adds meaning to certain other values.\n\nIn micromark, the actual character U+0009 CHARACTER TABULATION (HT) is replaced\nby one M-0002 HORIZONTAL TAB (HT) and between 0 and 3 M-0001 VIRTUAL SPACE 
(VS)\ncharacters, depending on the column at which the tab occurred.\nFor example, the input `\\ta` is represented as `[-2, -1, -1, -1, 97]` and `a\\tb`\nas `[97, -2, -1, -1, 98]` in character codes.\n\nThe characters U+000A LINE FEED (LF) and U+000D CARRIAGE RETURN (CR) are\nreplaced by virtual characters depending on whether they occur together: M-0003\nCARRIAGE RETURN LINE FEED (CRLF), M-0004 LINE FEED (LF), and M-0005 CARRIAGE\nRETURN (CR).\nFor example, the input `a\\r\\nb\\nc\\rd` is represented as\n`[97, -5, 98, -4, 99, -3, 100]` in character codes.\n\nThe `0` (U+0000 NUL) character code is replaced by U+FFFD REPLACEMENT CHARACTER\n(`�`).\n\nThe `null` code represents the end of the input stream (called *eof* for end of\nfile).\n\n### Parse\n\nThe **parser**\n([`micromark/dev/lib/parse.js`](https://github.com/micromark/micromark/blob/main/packages/micromark/dev/lib/parse.js))\ntakes chunks and turns them into events.\n\nAn **event** is the start or end of a token amongst other events.\nTokens can “contain” other tokens, even though they are stored in a flat list,\nby entering before and exiting after them.\n\nA **token** is a span of one or more codes.\nTokens are most of what micromark produces: the built in HTML compiler or other\ntools can turn them into different things.\nTokens are essentially names attached to a slice, such as `lineEndingBlank` for\ncertain line endings, or `codeFenced` for a whole fenced code.\n\nSometimes, more info is attached to tokens, such as `_open` and `_close` by\n`attention` (strong, emphasis) to signal whether the sequence can open or close\nan attention run.\nThese fields have to do with how the parser works, which is complex and not\nalways pretty.\n\nCertain fields (`previous`, `next`, and `contentType`) are used in many cases:\nlinked tokens for subcontent.\nLinked tokens are used because outer constructs are parsed first.\nTake for example:\n\n```markdown\n- *a\n  b*.\n```\n\n1. 
The list marker and the space after it are parsed first\n2. The rest of the line is a `chunkFlow` token\n3. The two spaces on the second line are a `linePrefix` of the list\n4. The rest of the line is another `chunkFlow` token\n\nThe two `chunkFlow` tokens are linked together and the chunks they span are\npassed through the flow tokenizer.\nThere the chunks are seen as `chunkContent` and passed through the content\ntokenizer.\nThere the chunks are seen as a paragraph, treated as `chunkText`, and passed\nthrough the text tokenizer.\nFinally, the attention (emphasis) and data (“raw” characters) are parsed there,\nand we’re done!\n\n#### Content types\n\nThe parser starts out with a document tokenizer.\n*Document* is the top-most content type, which includes containers such as block\nquotes and lists.\nContainers in markdown come from the margin and include more constructs\non the lines that define them.\n\n*Flow* represents the sections (block constructs such as ATX and setext\nheadings, HTML, indented and fenced code, thematic breaks), which like\n*document* are also parsed per line.\nAn example is HTML, which has a certain starting condition (such as `\u003cscript\u003e`\non its own line), then continues for a while, until an end condition is found\n(such as `\u003c/style\u003e`).\nIf that line with an end condition is never found, that flow goes until the end.\n\n*Content* is zero or more definitions, and then zero or one paragraph.\nIt’s a weird one, and needed to make certain edge cases around definitions spec\ncompliant.\nDefinitions are unlike other things in markdown, in that they behave like *text*\nin that they can contain arbitrary line endings, but *have* to end at a line\nending.\nIf they end in something else, the whole definition instead is seen as a\nparagraph.\n\nThe content in markdown first needs to be parsed up to this level to figure out\nwhich things are defined, for the whole document, before continuing on with\n*text*, as whether a link or image 
reference forms or not depends on whether\nit’s defined.\nThis unfortunately prevents a true streaming markdown parser.\n\n*Text* contains phrasing content (rich inline text: autolinks, character escapes\nand -references, code, hard breaks, HTML, images, links, emphasis, strong).\n\n*String* is a limited *text*-like content type which only allows character\nreferences and character escapes.\nIt exists in things such as identifiers (media references, definitions),\ntitles, or URLs and such.\n\n#### Constructs\n\nConstructs are the things that make up markdown.\nSome examples are lists, thematic breaks, or character references.\n\nNote that, as a general rule of thumb, markdown is *really weird*.\nIt’s essentially made up of edge cases rather than logical rules.\nWhen browsing the built in constructs, or venturing to build your own, you’ll\nfind confusing new things and run into complex custom hooks.\n\nOne more reasonable construct is the thematic break\n([see code](https://github.com/micromark/micromark/blob/main/packages/micromark-core-commonmark/dev/lib/thematic-break.js)).\nIt’s an object that defines a `name` and a `tokenize` function.\nMost of what constructs do is defined in their required `tokenize` function,\nwhich sets up a state machine to handle character codes streaming in.\n\n### Postprocess\n\nThe **postprocessor**\n([`micromark/dev/lib/postprocess.js`](https://github.com/micromark/micromark/blob/main/packages/micromark/dev/lib/postprocess.js))\nis a small step that takes events, ensures all their\nnested content is parsed, and returns the modified events.\n\n### Compile\n\nThe **compiler**\n([`micromark/dev/lib/compile.js`](https://github.com/micromark/micromark/blob/main/packages/micromark/dev/lib/compile.js))\ntakes events and turns them into HTML.\nWhile micromark was created mostly to advance markdown parsing irrespective of\ncompiling to HTML, the common case of doing so is built in.\nA built in HTML compiler is useful because it allows us to 
check for compliancy\nto CommonMark, the de facto norm of markdown, specified in roughly 650\ninput/output cases.\nThe parsing parts can still be used separately to build ASTs, CSTs, or many\nother output formats.\n\nThe compiler has an interface that accepts lists of events instead of the whole\nat once, but because markdown can’t truly stream, events are buffered before\ncompiling and outputting the final result.\n\n## Examples\n\n### GitHub flavored markdown (GFM)\n\nTo support GFM (autolink literals, strikethrough, tables, and tasklists) use\n[`micromark-extension-gfm`][gfm].\nSay we have a file like this:\n\n```markdown\n# GFM\n\n## Autolink literals\n\nwww.example.com, https://example.com, and contact@example.com.\n\n## Footnote\n\nA note[^1]\n\n[^1]: Big note.\n\n## Strikethrough\n\n~one~ or ~~two~~ tildes.\n\n## Table\n\n| a | b  |  c |  d  |\n| - | :- | -: | :-: |\n\n## Tag filter\n\n\u003cplaintext\u003e\n\n## Tasklist\n\n* [ ] to do\n* [x] done\n```\n\nThen do something like this:\n\n```js\nimport fs from 'node:fs/promises'\nimport {micromark} from 'micromark'\nimport {gfmHtml, gfm} from 'micromark-extension-gfm'\n\nconst doc = await fs.readFile('example.md')\n\nconsole.log(micromark(doc, {extensions: [gfm()], htmlExtensions: [gfmHtml()]}))\n```\n\n\u003cdetails\u003e\n\u003csummary\u003eShow equivalent HTML\u003c/summary\u003e\n\n```html\n\u003ch1\u003eGFM\u003c/h1\u003e\n\u003ch2\u003eAutolink literals\u003c/h2\u003e\n\u003cp\u003e\u003ca href=\"http://www.example.com\"\u003ewww.example.com\u003c/a\u003e, \u003ca href=\"https://example.com\"\u003ehttps://example.com\u003c/a\u003e, and \u003ca href=\"mailto:contact@example.com\"\u003econtact@example.com\u003c/a\u003e.\u003c/p\u003e\n\u003ch2\u003eFootnote\u003c/h2\u003e\n\u003cp\u003eA note\u003csup\u003e\u003ca href=\"#user-content-fn-1\" id=\"user-content-fnref-1\" data-footnote-ref=\"\" 
aria-describedby=\"footnote-label\"\u003e1\u003c/a\u003e\u003c/sup\u003e\u003c/p\u003e\n\u003ch2\u003eStrikethrough\u003c/h2\u003e\n\u003cp\u003e\u003cdel\u003eone\u003c/del\u003e or \u003cdel\u003etwo\u003c/del\u003e tildes.\u003c/p\u003e\n\u003ch2\u003eTable\u003c/h2\u003e\n\u003ctable\u003e\n\u003cthead\u003e\n\u003ctr\u003e\n\u003cth\u003ea\u003c/th\u003e\n\u003cth align=\"left\"\u003eb\u003c/th\u003e\n\u003cth align=\"right\"\u003ec\u003c/th\u003e\n\u003cth align=\"center\"\u003ed\u003c/th\u003e\n\u003c/tr\u003e\n\u003c/thead\u003e\n\u003c/table\u003e\n\u003ch2\u003eTag filter\u003c/h2\u003e\n\u0026lt;plaintext\u0026gt;\n\u003ch2\u003eTasklist\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cinput disabled=\"\" type=\"checkbox\"\u003e to do\u003c/li\u003e\n\u003cli\u003e\u003cinput checked=\"\" disabled=\"\" type=\"checkbox\"\u003e done\u003c/li\u003e\n\u003c/ul\u003e\n\u003csection data-footnotes=\"\" class=\"footnotes\"\u003e\u003ch2 id=\"footnote-label\" class=\"sr-only\"\u003eFootnotes\u003c/h2\u003e\n\u003col\u003e\n\u003cli id=\"user-content-fn-1\"\u003e\n\u003cp\u003eBig note. 
\u003ca href=\"#user-content-fnref-1\" data-footnote-backref=\"\" class=\"data-footnote-backref\" aria-label=\"Back to content\"\u003e↩\u003c/a\u003e\u003c/p\u003e\n\u003c/li\u003e\n\u003c/ol\u003e\n\u003c/section\u003e\n```\n\n\u003c/details\u003e\n\n### Math\n\nTo support math, use [`micromark-extension-math`][math].\nSay we have a file like this:\n\n```markdown\nLift($L$) can be determined by Lift Coefficient ($C_L$) like the following equation.\n\n$$\nL = \\frac{1}{2} \\rho v^2 S C_L\n$$\n```\n\nThen do something like this:\n\n```js\nimport fs from 'node:fs/promises'\nimport {micromark} from 'micromark'\nimport {mathHtml, math} from 'micromark-extension-math'\n\nconst doc = await fs.readFile('example.md')\n\nconsole.log(micromark(doc, {extensions: [math()], htmlExtensions: [mathHtml()]}))\n```\n\n\u003cdetails\u003e\n\u003csummary\u003eShow equivalent HTML\u003c/summary\u003e\n\n```html\n\u003cp\u003eLift(\u003cspan class=\"math math-inline\"\u003e\u003cspan class=\"katex\"\u003e…\u003c/span\u003e\u003c/span\u003e) can be determined by Lift Coefficient (\u003cspan class=\"math math-inline\"\u003e\u003cspan class=\"katex\"\u003e…\u003c/span\u003e\u003c/span\u003e) like the following equation.\u003c/p\u003e\n\u003cdiv class=\"math math-display\"\u003e\u003cspan class=\"katex-display\"\u003e\u003cspan class=\"katex\"\u003e…\u003c/span\u003e\u003c/span\u003e\u003c/div\u003e\n```\n\n\u003c/details\u003e\n\n### Syntax tree\n\nA higher level project, [`mdast-util-from-markdown`][mdast-util-from-markdown],\ncan give you an AST.\n\n```js\nimport {fromMarkdown} from 'mdast-util-from-markdown' // This wraps micromark.\n\nconst result = fromMarkdown('## Hello, *world*!')\n\nconsole.log(result.children[0])\n```\n\nYields:\n\n```js\n{\n  type: 'heading',\n  depth: 2,\n  children: [\n    {type: 'text', value: 'Hello, ', position: [Object]},\n    {type: 'emphasis', children: [Array], position: [Object]},\n    {type: 'text', value: '!', position: [Object]}\n  ],\n  position: {\n   
 start: {line: 1, column: 1, offset: 0},\n    end: {line: 1, column: 19, offset: 18}\n  }\n}\n```\n\nAnother level up is [**remark**][remark], which provides a nice interface and\nhundreds of plugins.\n\n## Markdown\n\n### CommonMark\n\nThe first definition of “Markdown” gave several examples of how it worked,\nshowing input Markdown and output HTML, and came with a reference implementation\n(`Markdown.pl`).\nWhen new implementations followed, they mostly followed the first definition,\nbut deviated from the first implementation, and added extensions, thus making\nthe format a family of formats.\n\nSome years later, an attempt was made to standardize the differences between\nimplementations, by specifying how several edge cases should be handled, through\nmore input and output examples.\nThis is known as [CommonMark][commonmark-spec], and many implementations now\nwork towards some degree of CommonMark compliancy.\nStill, CommonMark describes what the output in HTML should be given some\ninput, which leaves many edge cases up for debate, and does not answer what\nshould happen for other output formats.\n\nmicromark passes all tests from CommonMark and has many more tests to match the\nCommonMark reference parsers.\nFinally, it comes with [CMSM][], which describes how to parse markup, instead\nof documenting input and output examples.\n\n### Grammar\n\nThe syntax of markdown can be described in Backus–Naur form (BNF) as:\n\n```abnf\nmarkdown ::= .*\n```\n\nNo, that’s [not a typo](http://trevorjim.com/a-specification-for-markdown/):\nmarkdown has no syntax errors; anything thrown at it renders *something*.\n\n## Project\n\n### Comparison\n\nThere are many other markdown parsers out there and maybe they’re better suited\nto your use case!\nHere is a short comparison of a couple in JavaScript.\nNote that this list is made by the folks who make `micromark` and `remark`, so\nthere is some bias.\n\n**Note**: these are, in fact, not really comparable: micromark (and 
remark)\nfocus on completely different things than other markdown parsers do.\nSure, you can generate HTML from markdown with them, but micromark (and remark)\nare created for (abstract or concrete) syntax trees—to inspect, transform, and\ngenerate content, so that you can make things like [MDX][], [Prettier][], or\n[Astro][].\n\n###### `micromark`\n\nmicromark can be used in two different ways.\nIt can either be used, optionally with existing extensions, to get HTML easily.\nOr, it can give tremendous power, such as access to all tokens with positional\ninfo, at the cost of being hard to get into.\nIt’s super small, pretty fast, and has 100% CommonMark compliance.\nIt has syntax extensions, such as supporting 100% GFM compliance (with\n`micromark-extension-gfm`), but they’re rather complex to write.\nIt’s the newest parser on the block, which means it’s fresh and well suited for\ncontemporary markdown needs, but it’s also battle-tested, and already the 3rd\nmost popular markdown parser in JavaScript.\n\nIf you’re looking for fine grained control, use micromark.\nIf you just want HTML from markdown, use micromark.\n\n###### `remark`\n\n[remark][] is the most popular markdown parser.\nIt’s built on top of `micromark` and boasts syntax trees.\nFor an analogy, it’s like if Babel, ESLint, and more, were one project.\nIt supports the syntax extensions that micromark has (so it’s 100% CM compliant\nand can be 100% GFM compliant), but most of the work is done in plugins that\ntransform or inspect the tree, and there’s *tons* of them.\nTransforming the tree is relatively easy: it’s a JSON object that can be\nmanipulated directly.\nremark is stable, widely used, and extremely powerful for handling complex data.\n\nYou probably should use [remark][].\n\n###### `marked`\n\n[marked][] is the oldest markdown parser on the block.\nIt’s been around for ages, is battle tested, small, popular, and has a bunch of\nextensions, but doesn’t match CommonMark or GFM, and is unsafe by 
default.\n\nIf you have markdown you trust and want to turn it into HTML without a fuss, and\ndon’t care about perfect compatibility with CommonMark or GFM, but do appreciate\na small bundle size and stability, use [marked][].\n\n###### `markdown-it`\n\n[markdown-it][] is a good, stable, and essentially CommonMark compliant markdown\nparser, with (optional) support for some GFM features as well.\nIt’s used a lot as a direct dependency in packages, but is rather big.\nIt shines at syntax extensions, where you want to support not just markdown, but\n*your* (company’s) version of markdown.\n\nIf you need a couple of custom syntax extensions to your otherwise\nCommonMark-compliant markdown, and want to get HTML out, use [markdown-it][].\n\n###### Others\n\nThere are lots of other markdown parsers!\nSome say they’re small, or fast, or that they’re CommonMark compliant—but\nthat’s not always true.\nThis list is not supposed to be exhaustive (but it’s the most relevant ones).\nThis list of markdown parsers is a snapshot in time of why (not) to use\n(alternatives to) `micromark`: they’re all good choices, depending on what your\ngoals are.\n\n### Test\n\nmicromark is tested with the \\~650 CommonMark tests and more than 1.2k extra\ntests confirmed with CM reference parsers.\nThese tests reach all branches in the code, which means that this project has\n100% code coverage.\nFinally, we use fuzz testing to ensure micromark is stable, reliable, and\nsecure.\n\nTo build, format, and test the codebase, use `$ npm test` after clone and\ninstall.\nThe `$ npm run test-api` and `$ npm run test-coverage` scripts check either the\nunit tests, or both them and their coverage, respectively.\n\nThe `$ npm run test-fuzz` script does fuzz testing for 30 minutes.\n\n### Size \u0026 debug\n\nmicromark is really small.\nA ton of time went into making sure it minifies well, by the way code is written\nbut also through custom build scripts to pre-evaluate certain expressions.\nFurthermore, 
care went into making it compress well with gzip and brotli.\n\nNormally, you’ll use the pre-evaluated version of micromark.\nWhile developing, debugging, or testing your code, you *should* switch to use\ncode instrumented with assertions and debug messages:\n\n```sh\nnode --conditions development module.js\n```\n\nTo see debug messages, set the `DEBUG` env variable to `micromark`, or to `*`\n(as below) to see messages from every package that uses `debug`:\n\n```sh\nDEBUG=\"*\" node --conditions development module.js\n```\n\n### Version\n\nmicromark has adhered to [semver](https://semver.org) since 3.0.0.\n\n### Security\n\nThe typical security aspect discussed for markdown is [cross-site scripting\n(XSS)][xss] attacks.\nMarkdown itself is safe if it does not include embedded HTML or dangerous\nprotocols in links/images (such as `javascript:` or `data:`).\nmicromark makes any markdown safe by default, even if HTML is embedded or\ndangerous protocols are used, as it encodes or drops them.\nTurning on the `allowDangerousHtml` or `allowDangerousProtocol` options for\nuser-provided markdown opens you up to XSS attacks.\n\nAnother security aspect is denial-of-service (DoS) attacks.\nFor example, an attacker could throw a 100 MB file at micromark, in which case\nthe JavaScript engine will run out of memory and crash.\nIt is also possible to crash micromark with smaller payloads, notably when\nthousands of links, images, emphasis, or strong are opened but not closed.\nIt is wise to cap the accepted size of input (500 KB can hold a big book) and to\nprocess content in a different thread or worker so that it can be stopped when\nneeded.\n\nUsing extensions might also be unsafe; refer to their documentation for more\ninformation.\n\nFor more information on markdown sanitation, see\n[`improper-markup-sanitization.md`][improper] by [**@chalker**][chalker].\n\nSee [`security.md`][securitymd] in [`micromark/.github`][health] for how to\nsubmit a security report.\n\n### Contribute\n\nSee [`contributing.md`][contributing] in [`micromark/.github`][health] for ways\nto get 
started.\nSee [`support.md`][support] for ways to get help.\n\nThis project has a [code of conduct][coc].\nBy interacting with this repository, organisation, or community you agree to\nabide by its terms.\n\n### Sponsor\n\n\u003c!-- Note: this section has to be in sync with the `micromark` readme. --\u003e\n\nSupport this effort and give back by sponsoring on [OpenCollective][]!\n\n\u003ctable\u003e\n\u003ctr valign=\"middle\"\u003e\n\u003ctd width=\"100%\" align=\"center\" colspan=\"10\"\u003e\n  \u003cbr\u003e\n  \u003ca href=\"https://www.salesforce.com\"\u003eSalesforce\u003c/a\u003e 🏅\u003cbr\u003e\u003cbr\u003e\n  \u003ca href=\"https://www.salesforce.com\"\u003e\u003cimg src=\"https://images.opencollective.com/salesforce/ca8f997/logo/512.png\" width=\"256\"\u003e\u003c/a\u003e\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr valign=\"middle\"\u003e\n\u003ctd width=\"20%\" align=\"center\" rowspan=\"2\" colspan=\"2\"\u003e\n  \u003ca href=\"https://vercel.com\"\u003eVercel\u003c/a\u003e\u003cbr\u003e\u003cbr\u003e\n  \u003ca href=\"https://vercel.com\"\u003e\u003cimg src=\"https://avatars1.githubusercontent.com/u/14985020?s=256\u0026v=4\" width=\"128\"\u003e\u003c/a\u003e\n\u003c/td\u003e\n\u003ctd width=\"20%\" align=\"center\" rowspan=\"2\" colspan=\"2\"\u003e\n  \u003ca href=\"https://motif.land\"\u003eMotif\u003c/a\u003e\u003cbr\u003e\u003cbr\u003e\n  \u003ca href=\"https://motif.land\"\u003e\u003cimg src=\"https://avatars1.githubusercontent.com/u/74457950?s=256\u0026v=4\" width=\"128\"\u003e\u003c/a\u003e\n\u003c/td\u003e\n\u003ctd width=\"20%\" align=\"center\" rowspan=\"2\" colspan=\"2\"\u003e\n  \u003ca href=\"https://www.hashicorp.com\"\u003eHashiCorp\u003c/a\u003e\u003cbr\u003e\u003cbr\u003e\n  \u003ca href=\"https://www.hashicorp.com\"\u003e\u003cimg src=\"https://avatars1.githubusercontent.com/u/761456?s=256\u0026v=4\" width=\"128\"\u003e\u003c/a\u003e\n\u003c/td\u003e\n\u003ctd width=\"20%\" align=\"center\" rowspan=\"2\" colspan=\"2\"\u003e\n  
\u003ca href=\"https://www.gitbook.com\"\u003eGitBook\u003c/a\u003e\u003cbr\u003e\u003cbr\u003e\n  \u003ca href=\"https://www.gitbook.com\"\u003e\u003cimg src=\"https://avatars1.githubusercontent.com/u/7111340?s=256\u0026v=4\" width=\"128\"\u003e\u003c/a\u003e\n\u003c/td\u003e\n\u003ctd width=\"20%\" align=\"center\" rowspan=\"2\" colspan=\"2\"\u003e\n  \u003ca href=\"https://www.gatsbyjs.org\"\u003eGatsby\u003c/a\u003e\u003cbr\u003e\u003cbr\u003e\n  \u003ca href=\"https://www.gatsbyjs.org\"\u003e\u003cimg src=\"https://avatars1.githubusercontent.com/u/12551863?s=256\u0026v=4\" width=\"128\"\u003e\u003c/a\u003e\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr valign=\"middle\"\u003e\n\u003c/tr\u003e\n\u003ctr valign=\"middle\"\u003e\n\u003ctd width=\"20%\" align=\"center\" rowspan=\"2\" colspan=\"2\"\u003e\n  \u003ca href=\"https://www.netlify.com\"\u003eNetlify\u003c/a\u003e\u003cbr\u003e\u003cbr\u003e\n  \u003c!--OC has a sharper image--\u003e\n  \u003ca href=\"https://www.netlify.com\"\u003e\u003cimg src=\"https://images.opencollective.com/netlify/4087de2/logo/256.png\" width=\"128\"\u003e\u003c/a\u003e\n\u003c/td\u003e\n\u003ctd width=\"10%\" align=\"center\"\u003e\n  \u003ca href=\"https://www.coinbase.com\"\u003eCoinbase\u003c/a\u003e\u003cbr\u003e\u003cbr\u003e\n  \u003ca href=\"https://www.coinbase.com\"\u003e\u003cimg src=\"https://avatars1.githubusercontent.com/u/1885080?s=256\u0026v=4\" width=\"64\"\u003e\u003c/a\u003e\n\u003c/td\u003e\n\u003ctd width=\"10%\" align=\"center\"\u003e\n  \u003ca href=\"https://themeisle.com\"\u003eThemeIsle\u003c/a\u003e\u003cbr\u003e\u003cbr\u003e\n  \u003ca href=\"https://themeisle.com\"\u003e\u003cimg src=\"https://avatars1.githubusercontent.com/u/58979018?s=128\u0026v=4\" width=\"64\"\u003e\u003c/a\u003e\n\u003c/td\u003e\n\u003ctd width=\"10%\" align=\"center\"\u003e\n  \u003ca href=\"https://expo.io\"\u003eExpo\u003c/a\u003e\u003cbr\u003e\u003cbr\u003e\n  \u003ca href=\"https://expo.io\"\u003e\u003cimg 
src=\"https://avatars1.githubusercontent.com/u/12504344?s=128\u0026v=4\" width=\"64\"\u003e\u003c/a\u003e\n\u003c/td\u003e\n\u003ctd width=\"10%\" align=\"center\"\u003e\n  \u003ca href=\"https://boostnote.io\"\u003eBoost Note\u003c/a\u003e\u003cbr\u003e\u003cbr\u003e\n  \u003ca href=\"https://boostnote.io\"\u003e\u003cimg src=\"https://images.opencollective.com/boosthub/6318083/logo/128.png\" width=\"64\"\u003e\u003c/a\u003e\n\u003c/td\u003e\n\u003ctd width=\"10%\" align=\"center\"\u003e\n  \u003ca href=\"https://markdown.space\"\u003eMarkdown Space\u003c/a\u003e\u003cbr\u003e\u003cbr\u003e\n  \u003ca href=\"https://markdown.space\"\u003e\u003cimg src=\"https://images.opencollective.com/markdown-space/e1038ed/logo/128.png\" width=\"64\"\u003e\u003c/a\u003e\n\u003c/td\u003e\n\u003ctd width=\"10%\" align=\"center\"\u003e\n  \u003ca href=\"https://www.holloway.com\"\u003eHolloway\u003c/a\u003e\u003cbr\u003e\u003cbr\u003e\n  \u003ca href=\"https://www.holloway.com\"\u003e\u003cimg src=\"https://avatars1.githubusercontent.com/u/35904294?s=128\u0026v=4\" width=\"64\"\u003e\u003c/a\u003e\n\u003c/td\u003e\n\u003ctd width=\"10%\"\u003e\u003c/td\u003e\n\u003ctd width=\"10%\"\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr valign=\"middle\"\u003e\n\u003ctd width=\"100%\" align=\"center\" colspan=\"8\"\u003e\n  \u003cbr\u003e\n  \u003ca href=\"https://opencollective.com/unified\"\u003e\u003cstrong\u003eYou?\u003c/strong\u003e\u003c/a\u003e\n  \u003cbr\u003e\u003cbr\u003e\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n### Origin story\n\nOver the summer of 2018, micromark was planned, and the idea shared in August\nwith a couple of friends and potential sponsors.\nThe problem I (**[@wooorm][]**) had was that issues were piling up in remark and\nother repos, but my day job (teaching) was fun, fulfilling, and deserved time\ntoo.\nIt was getting hard to combine the two.\nThe thought was to feed two birds with one scone: fix the issues in remark with\na new markdown 
parser (codename marydown) while being financially supported by\nsponsors building fancy stuff on top, such as Gatsby, Contentful, and Vercel\n(ZEIT at the time).\n**[@johno][]** was making MDX on top of remark at the time (important historical\nnote: several other folks were working on JSX + markdown too).\nWe bundled our strengths: MDX was getting some traction and we thought together\nwe could perhaps make something sustainable.\n\nIn November 2018, we launched with the idea for micromark to solve all existing\nbugs, sustaining the existing hundreds of projects, and furthering the exciting\nhigh-level project MDX.\nWe pushed a single name: unified (which back then was a small but essential\npart of the chain).\nGatsby and Vercel were immediate sponsors.\nWe didn’t know whether it would work, and it worked.\nBut now you have a new problem: you are getting some financial support (much\nmore than other open source projects) but it’s not enough money for rent, and\ntoo much money to print stickers with.\nYou still have your job and issues are still piling up.\n\nAt the start of summer 2019, after a couple months of saving up donations, I\nquit my job and worked on unified through fall.\nThat got the number of open issues down significantly and set up a strong\ngovernance and maintenance system for the collective.\nBut when the time came to work on micromark, the money was gone again, so I\ncontracted through winter 2019, and in spring 2020 I could do about half open\nsource, half contracting.\nOne of the contracting gigs was to write a new MDX parser, for which I also\ndocumented how to do that with a state machine [in prose][mdx-cmsm].\nThat gave me the insight into how the same could be done for markdown: I drafted\n[CMSM][], which was some of the core ideas for micromark, but in prose.\n\nIn May 2020, Salesforce reached out: they saw the bugs in remark, how micromark\ncould help, and the initial work on CMSM.\nAnd they had thousands of Markdown files.\nIn a for 
open source uncharacteristic move, they decided to fund my work on\nmicromark.\nA large part of what maintaining open source means, is putting out fires,\ntriaging issues, and making sure users and sponsors are happy, so it was\namazing to get several months to just focus and make something new.\nI remember feeling that this project would probably be the hardest thing I’d\nwork on: yeah, parsers are pretty difficult, but markdown is on another level.\nMarkdown is such a giant stack of edge cases on edge cases on even more\nweirdness, what a mess.\nOn August 20, 2020, I released [2.0.0][200], the first working version of\nmicromark.\nAnd it’s hard to describe how that moment felt.\nIt was great.\n\nIn 2022, Vercel paid me to make a Rust version: [`markdown-rs`][markdown-rs].\nSuper cool that I got to continue this work and bring it to a new language.\n\n### License\n\n[MIT][license] © [Titus Wormer][author]\n\n\u003c!-- Definitions --\u003e\n\n[200]: https://github.com/micromark/micromark/releases/tag/2.0.0\n\n[@johno]: https://github.com/johno\n\n[@wooorm]: https://github.com/wooorm\n\n[api]: https://github.com/micromark/micromark/blob/main/packages/micromark/readme.md#api\n\n[api-option-extensions]: https://github.com/micromark/micromark/blob/main/packages/micromark/readme.md#extensions\n\n[api-option-htmlextensions]: https://github.com/micromark/micromark/blob/main/packages/micromark/readme.md#htmlextensions\n\n[architecture]: #architecture\n\n[astro]: https://github.com/withastro/astro\n\n[author]: https://wooorm.com\n\n[backers-badge]: https://opencollective.com/unified/backers/badge.svg\n\n[build]: https://github.com/micromark/micromark/actions\n\n[build-badge]: https://github.com/micromark/micromark/workflows/main/badge.svg\n\n[bundle-size]: https://bundlejs.com/?q=micromark\n\n[bundle-size-badge]: https://img.shields.io/badge/dynamic/json?label=minzipped%20size\u0026query=$.size.compressedSize\u0026url=https://deno.bundlejs.com/?q=micromark\n\n[chalker]: 
https://github.com/ChALkeR\n\n[chat]: https://github.com/micromark/micromark/discussions\n\n[chat-badge]: https://img.shields.io/badge/chat-discussions-success.svg\n\n[cheat]: https://commonmark.org/help/\n\n[cmsm]: https://github.com/micromark/common-markup-state-machine\n\n[coc]: https://github.com/micromark/.github/blob/main/code-of-conduct.md\n\n[commonmark]: #commonmark\n\n[commonmark-spec]: https://commonmark.org\n\n[comparison]: #comparison\n\n[constructs]: /packages/micromark/dev/lib/constructs.js\n\n[content-types]: #content-types\n\n[contribute]: #contribute\n\n[contributing]: https://github.com/micromark/.github/blob/main/contributing.md\n\n[coverage]: https://codecov.io/github/micromark/micromark\n\n[coverage-badge]: https://img.shields.io/codecov/c/github/micromark/micromark.svg\n\n[create-extension]: #creating-a-micromark-extension\n\n[directive-proposal]: https://talk.commonmark.org/t/generic-directives-plugins-syntax/444\n\n[directives]: https://github.com/micromark/micromark-extension-directive\n\n[downloads]: https://www.npmjs.com/package/micromark\n\n[downloads-badge]: https://img.shields.io/npm/dm/micromark.svg\n\n[esm]: https://gist.github.com/sindresorhus/a39789f98801d908bbc7ff3ecc99d99c\n\n[esmsh]: https://esm.sh\n\n[extending-markdown]: #extending-markdown\n\n[extensions]: #list-of-extensions\n\n[frontmatter]: https://github.com/micromark/micromark-extension-frontmatter\n\n[gfm]: https://github.com/micromark/micromark-extension-gfm\n\n[health]: https://github.com/micromark/.github\n\n[html-extension]: #htmlextension\n\n[improper]: https://github.com/ChALkeR/notes/blob/master/Improper-markup-sanitization.md\n\n[license]: https://github.com/micromark/micromark/blob/main/license\n\n[markdown-it]: https://github.com/markdown-it/markdown-it\n\n[markdown-rs]: https://github.com/wooorm/markdown-rs\n\n[marked]: https://github.com/markedjs/marked\n\n[math]: https://github.com/micromark/micromark-extension-math\n\n[mdast]: 
https://github.com/syntax-tree/mdast\n\n[mdast-util-from-markdown]: https://github.com/syntax-tree/mdast-util-from-markdown\n\n[mdast-util-to-markdown]: https://github.com/syntax-tree/mdast-util-to-markdown\n\n[mdx]: https://github.com/mdx-js/mdx\n\n[mdx-cmsm]: https://github.com/micromark/mdx-state-machine\n\n[mdx-expression]: https://github.com/micromark/micromark-extension-mdx-expression\n\n[mdxjs]: https://github.com/micromark/micromark-extension-mdxjs\n\n[npm]: https://docs.npmjs.com/cli/install\n\n[opencollective]: https://opencollective.com/unified\n\n[packages]: packages/\n\n[postprocess]: #postprocess\n\n[preprocess]: #preprocess\n\n[prettier]: https://github.com/prettier/prettier\n\n[rehype]: https://github.com/rehypejs/rehype\n\n[rehype-plugins]: https://github.com/rehypejs/rehype/blob/main/doc/plugins.md#list-of-plugins\n\n[remark]: https://github.com/remarkjs/remark\n\n[remark-directive]: https://github.com/remarkjs/remark-directive\n\n[remark-plugins]: https://github.com/remarkjs/remark/blob/main/doc/plugins.md#list-of-plugins\n\n[security]: #security\n\n[securitymd]: https://github.com/micromark/.github/blob/main/security.md\n\n[site]: https://unifiedjs.com\n\n[size-debug]: #size--debug\n\n[sponsor]: #sponsor\n\n[sponsors-badge]: https://opencollective.com/unified/sponsors/badge.svg\n\n[support]: https://github.com/micromark/.github/blob/main/support.md\n\n[syntax-extension]: #syntaxextension\n\n[test]: #test\n\n[unified]: https://github.com/unifiedjs/unified\n\n[utilities]: https://github.com/syntax-tree/mdast#list-of-utilities\n\n[xss]: https://en.wikipedia.org/wiki/Cross-site_scripting\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmicromark%2Fmicromark","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmicromark%2Fmicromark","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmicromark%2Fmicromark/lists"}