{"id":13570032,"url":"https://github.com/jgm/djot","last_synced_at":"2025-04-11T23:55:24.443Z","repository":{"id":44471103,"uuid":"512726650","full_name":"jgm/djot","owner":"jgm","description":"A light markup language","archived":false,"fork":false,"pushed_at":"2025-02-14T17:34:46.000Z","size":507,"stargazers_count":1779,"open_issues_count":101,"forks_count":51,"subscribers_count":29,"default_branch":"main","last_synced_at":"2025-04-11T23:55:17.796Z","etag":null,"topics":["commonmark","lua","markdown","markup-language","pandoc"],"latest_commit_sha":null,"homepage":"https://djot.net","language":"Emacs Lisp","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jgm.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":["jgm"]}},"created_at":"2022-07-11T11:21:03.000Z","updated_at":"2025-04-11T14:56:16.000Z","dependencies_parsed_at":"2023-02-09T04:17:18.364Z","dependency_job_id":"8ef74e72-1695-4bc5-8d99-39cc8f24d897","html_url":"https://github.com/jgm/djot","commit_stats":{"total_commits":488,"total_committers":26,"mean_commits":18.76923076923077,"dds":0.0840163934426229,"last_synced_commit":"f7e12e96ed36c0c0ea2d27b36081da6f0715e352"},"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jgm%2Fdjot","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jgm%2Fdjot/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jgm%2Fdjot/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jgm%2Fdjot/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jgm","download_url":"https://codeload.github.com/jgm/djot/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248497812,"owners_count":21113984,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["commonmark","lua","markdown","markup-language","pandoc"],"created_at":"2024-08-01T14:00:47.442Z","updated_at":"2025-04-11T23:55:24.420Z","avatar_url":"https://github.com/jgm.png","language":"Emacs Lisp","funding_links":["https://github.com/sponsors/jgm"],"categories":["Emacs Lisp","Lua","Markup languages","HTML","lua","Official Resources"],"sub_categories":["TypeScript"],"readme":"# Djot\n\nDjot is a light markup syntax. It derives most of its features\nfrom [commonmark](https://spec.commonmark.org), but it fixes\na few things that make commonmark's syntax complex and difficult\nto parse efficiently. It is also much fuller-featured than\ncommonmark, with support for definition lists, footnotes,\ntables, several new kinds of inline formatting (insert, delete,\nhighlight, superscript, subscript), math, smart punctuation,\nattributes that can be applied to any element, and generic\ncontainers for block-level, inline-level, and raw content.\n\nThe project began as an attempt to implement some of the\nideas I suggested in my essay [Beyond Markdown](https://johnmacfarlane.net/beyond-markdown.html). (See [Rationale](#rationale), below.)\n\nThis repository contains a\n[Syntax Description](https://htmlpreview.github.io/?https://github.com/jgm/djot/blob/master/doc/syntax.html),\na [Cheatsheet](doc/cheatsheet.md), and a\n[Quick Start for Markdown Users](doc/quickstart-for-markdown-users.md)\nthat outlines the main differences between djot and Markdown.\n\nYou can try djot on the [djot playground](https://djot.net/playground/)\nwithout installing anything locally.\n\n## Rationale\n\nHere are some design goals:\n\n1. It should be possible to parse djot markup in linear time,\n    with no backtracking.\n\n2. Parsing of inline elements should be \"local\" and not depend\n    on what references are defined later. This is not the case\n    in commonmark:  `[foo][bar]` might be \"[foo]\" followed by\n    a link with text \"bar\", or \"[foo][bar]\", or a link with\n    text \"foo\", or a link with text \"foo\" followed by\n    \"[bar]\", depending on whether the references `[foo]` and\n    `[bar]` are defined elsewhere (perhaps later) in the\n    document. This non-locality makes accurate syntax highlighting\n    nearly impossible.\n\n3. Rules for emphasis should be simpler. The fact that doubled\n    characters are used for strong emphasis in commonmark leads to\n    many potential ambiguities, which are resolved by a daunting\n    list of 17 rules. It is hard to form a good mental model\n    of these rules. Most of the time they interpret things the\n    way a human would most naturally interpret them---but not always.\n\n4. Expressive blind spots should be avoided. In commonmark,\n    you're out of luck if you want to produce the HTML\n    `a\u003cem\u003e?\u003c/em\u003eb`, because the flanking rules classify\n    the first asterisk in `a*?*b` as right-flanking. There is a\n    way around this, but it's ugly (using a numerical entity instead\n    of `a`). In djot there should not be expressive blind spots of\n    this kind.\n\n5. Rules for what content belongs to a list item should be simple.\n    In commonmark, content under a list item must be indented as far\n    as the first non-space content after the list marker (or five\n    spaces after the marker, in case the list item begins with indented\n    code). Many people get confused when their indented content is\n    not indented far enough and does not get included in the list item.\n\n6. Parsers should not be forced to recognize unicode character classes,\n    HTML tags, or entities, or perform unicode case folding.\n    That adds a lot of complexity.\n\n7. The syntax should be friendly to hard-wrapping: hard-wrapping\n    a paragraph should not lead to different interpretations, e.g.\n    when a number followed by a period ends up at the beginning of\n    a line. (I anticipate that many will ask, why hard-wrap at\n    all?  Answer:  so that your document is readable just as it\n    is, without conversion to HTML and without special editor\n    modes that soft-wrap long lines. Remember that source readability\n    was one of the prime goals of Markdown and Commonmark.)\n\n8. The syntax should compose uniformly, in the following sense:\n    if a sequence of lines has a certain meaning outside a list\n    item or block quote, it should have the same meaning inside it.\n    This principle is [articulated in the commonmark \n    spec](https://spec.commonmark.org/0.30/#principle-of-uniformity),\n    but the spec doesn't completely abide by it (see\n    commonmark/commonmark-spec#634).\n\n9. It should be possible to attach arbitrary attributes to any\n    element.\n\n10. There should be generic containers for text, inline content,\n    and block-level content, to which arbitrary attributes can be applied.\n    This allows for extensibility using AST transformations.\n\n11. The syntax should be kept as simple as possible, consistent with\n    these goals. Thus, for example, we don't need two different\n    styles of headings or code blocks.\n\nThese goals motivated the following decisions:\n\n\n- Block-level elements can't interrupt paragraphs (or headings),\n  because of goal 7. So in djot the following is a single paragraph, not\n  (as commonmark sees it) a paragraph followed by an ordered list\n  followed by a block quote followed by a section heading:\n\n  ```\n  My favorite number is probably the number\n  1. It's the smallest natural number that is\n  \u003e 0. With pencils, though, I prefer a\n  # 2.\n  ```\n\n  Commonmark does make some concessions to goal 7, by forbidding\n  lists beginning with markers other than `1.` to interrupt paragraphs.\n  But this is a compromise and a sacrifice of regularity and\n  predictability in the syntax. Better just to have a general rule.\n\n- An implication of the last decision is that, although \"tight\"\n  lists are still possible (without blank lines between items),\n  a *sublist* must always be preceded by a blank line. Thus,\n  instead of\n\n  ```\n  - Fruits\n    - apple\n    - orange\n  ```\n\n  you must write\n\n  ```\n  - Fruits\n\n    - apple\n    - orange\n  ```\n\n  (This blank line doesn't count against \"tightness.\")\n  reStructuredText makes the same design decision.\n\n- Also to promote goal 7, we allow headings to \"lazily\"\n  span multiple lines:\n\n  ```\n  ## My excessively long section heading is too\n  long to fit on one line.\n  ``` \n\n  While we're at it, we'll simplify by removing setext-style\n  (underlined) headings. We don't really need two heading\n  syntaxes (goal 11).\n\n- To meet goal 5, we have a very simple rule: anything that is\n  indented beyond the start of the list marker belongs in\n  the list item.\n\n  ```\n  1. list item\n\n    \u003e block quote inside item 1\n\n  2. second item\n  ```\n\n  In commonmark, this would be parsed as two separate lists with\n  a block quote between them, because the block quote is not\n  indented far enough. What kept us from using this simple rule\n  in commonmark was indented code blocks. If list items are\n  going to contain an indented code block, we need to know at\n  what column to start counting the indentation, so we fixed on\n  the column that makes the list look best (the first column of\n  non-space content after the marker):\n\n  ```\n  1.  A commonmark list item with an indented code block in it.\n\n          code!\n  ```\n\n  In djot, we just get rid of indented code blocks. Most people\n  prefer fenced code blocks anyway, and we don't need two\n  different ways of writing code blocks (goal 11).\n\n- To meet goal 6 and to avoid the complex rules commonmark\n  adopted for handling raw HTML, we simply do not allow raw HTML,\n  except in explicitly marked contexts, e.g.\n  `` `\u003ca id=\"foo\"\u003e`{=html} `` or\n\n  ````\n  ``` =html\n  \u003ctable\u003e\n  \u003ctr\u003e\u003ctd\u003efoo\u003c/td\u003e\u003c/tr\u003e\n  \u003c/table\u003e\n  ```\n  ````\n\n  Unlike Markdown, djot is not HTML-centric. Djot documents\n  might be rendered to a variety of different formats, so although\n  we want to provide the flexibility to include raw content in\n  any output format, there is no reason to privilege HTML. For\n  similar reasons we do not interpret HTML entities, as\n  commonmark does.\n\n- To meet goal 2, we make reference link parsing local.\n  Anything that looks like `[foo][bar]` or `[foo][]` gets\n  treated as a reference link, regardless of whether `[foo]`\n  is defined later in the document. A corollary is that we\n  must get rid of shortcut link syntax, with just a single\n  bracket pair, `[like this]`. It must always be clear what is a\n  link without needing to know the surrounding context.\n\n- In support of goal 6, reference links are no longer\n  case-insensitive. Supporting this beyond an ASCII context\n  would require building in unicode case folding to every\n  implementation, and it doesn't seem necessary.\n\n- A space or newline is required after `\u003e` in block quotes,\n  to avoid the violations of the principle of uniformity \n  noted in goal 8:\n\n  ```\n  \u003eThis is not a\n  \u003eblock quote in djot.\n  ```\n\n- To meet goal 3, we avoid using doubled characters for\n  strong emphasis. Instead, we use `_` for emphasis and `*` for\n  strong emphasis. Emphasis can begin with one of these\n  characters, as long as it is not followed by a space,\n  and will end when a similar character is encountered,\n  as long as it is not preceded by a space and some\n  different characters have occurred in between. In the case\n  of overlap, the first one to be closed takes precedence.\n  (This simple rule also avoids the need we had in commonmark to\n  determine unicode character classes---goal 6.)\n\n- Taken just by itself, this last change would introduce a\n  number of expressive blind spots. For example, given the\n  simple rule,\n  ```\n  _(_foo_)_\n  ```\n  parses as\n  ``` html\n  \u003cem\u003e(\u003c/em\u003efoo\u003cem\u003e)\u003c/em\u003e\n  ```\n  rather than\n  ``` html\n  \u003cem\u003e(\u003cem\u003efoo\u003c/em\u003e)\u003c/em\u003e\n  ```\n  If you want the latter\n  interpretation, djot allows you to use the syntax\n  ```\n  _({_foo_})_\n  ```\n  The `{_` is a `_` that can only open emphasis, and the `_}` is\n  a `_` that can only close emphasis. The same can be done with\n  `*` or any other inline formatting marker that is ambiguous\n  between an opener and closer. These curly braces are\n  *required* for certain inline markup, e.g. `{=highlighting=}`,\n  `{+insert+}`, and `{-delete-}`, since the characters `=`, `+`,\n  and `-` are found often in ordinary text.\n\n- In support of goal 1, code span parsing does not backtrack.\n  So if you open a code span and don't close it, it extends to\n  the end of the paragraph. That is similar to the way fenced\n  code blocks work in commonmark.\n\n  ```\n  This is `inline code.\n  ```\n\n- In support of goal 9, a generic attribute syntax is\n  introduced. Attributes can be attached to any block-level\n  element by putting them on the line before it, and to any\n  inline-level element by putting them directly after it.\n\n  ```\n  {#introduction}\n  This is the introductory paragraph, with\n  an identifier `introduction`.\n\n             {.important color=\"blue\" #heading}\n  ## heading\n\n  The word *atelier*{weight=\"600\"} is French.\n  ```\n\n- Since we are going to have generic attributes, we no longer\n  support quoted titles in links. One can add a title\n  attribute if needed, but this isn't very common, so we don't\n  need a special syntax for it:\n\n  ```\n  [Link text](url){title=\"Click me!\"}\n  ```\n\n- Fenced divs and bracketed spans are introduced in order to\n  allow attributes to be attached to arbitrary sequences of\n  block-level or inline-level elements. For example,\n\n  ```\n  {#warning .sidebar}\n  ::: Warning\n  This is a warning.\n  Here is a word in [français]{lang=fr}.\n  :::\n  ```\n\n## Syntax\n\nFor a full syntax reference, see the\n[syntax description](https://htmlpreview.github.io/?https://github.com/jgm/djot/blob/master/doc/syntax.html).\n\nA vim syntax highlighting definition for djot is provided in\n`editors/vim/`.\n\n## Implementations\n\nThere are currently six djot implementations:\n\n- [djot.js (JavaScript/TypeScript)](https://github.com/jgm/djot.js)\n- [djot.lua (Lua)](https://github.com/jgm/djot.lua)\n- [djota (Prolog)](https://github.com/aarroyoc/djota)\n- [jotdown (Rust)](https://github.com/hellux/jotdown)\n- [godjot (Go)](https://github.com/sivukhin/godjot)\n- [djoths (Haskell)](https://github.com/jgm/djoths)\n\n[Here](https://github.com/dcampbell24/djot-implementations) are some benchmarks of these implementations.\n\ndjot.lua was the original reference implementation, but\ncurrent development is focused on djot.js, and it is possible\nthat djot.lua will not be kept up to date with the latest syntax\nchanges.\n\n## Tooling\n\n- [Vim](./editors/vim/) tooling is in this repository\n- [Emacs](./editors/emacs/) tooling is in this repository and requires the tree-sitter grammar\n- [Helix](https://github.com/helix-editor/helix) has built-in syntax highlighting\n- Visual Studio Code\n  - [djot-vscode](https://github.com/ryanabx/djot-vscode)\n  - [Djot-Marker](https://github.com/wisim3000/Djot-Marker)\n- [Treesitter grammar](https://github.com/treeman/tree-sitter-djot)\n- [Djockey](https://steveasleep.com/djockey/) is a static site generator\n  for technical writing and project documentation.\n\n## File extension\n\nThe extension `.dj` may be used to indicate that the contents\nof a file are djot-formatted text.\n\n## License\n\nThe code and documentation are released under the MIT license.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjgm%2Fdjot","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjgm%2Fdjot","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjgm%2Fdjot/lists"}