{"id":19604141,"url":"https://github.com/sinclairzx81/parsebox","last_synced_at":"2025-04-06T11:10:47.066Z","repository":{"id":261627628,"uuid":"884856923","full_name":"sinclairzx81/parsebox","owner":"sinclairzx81","description":"Parser Combinators in the TypeScript Type System","archived":false,"fork":false,"pushed_at":"2025-02-19T13:29:20.000Z","size":1271,"stargazers_count":74,"open_issues_count":0,"forks_count":2,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-30T05:03:46.860Z","etag":null,"topics":["combinators","parser","type-system","typescript"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sinclairzx81.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"license","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-07T14:11:05.000Z","updated_at":"2025-03-29T06:42:11.000Z","dependencies_parsed_at":"2024-11-22T09:26:00.006Z","dependency_job_id":"85619923-277b-4be2-9882-a263cf4bd7dd","html_url":"https://github.com/sinclairzx81/parsebox","commit_stats":null,"previous_names":["sinclairzx81/parsebox"],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sinclairzx81%2Fparsebox","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sinclairzx81%2Fparsebox/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sinclairzx81%2Fparsebox/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sinclairzx81%2Fparsebox/manifests","owner_url":"https://rep
os.ecosyste.ms/api/v1/hosts/GitHub/owners/sinclairzx81","download_url":"https://codeload.github.com/sinclairzx81/parsebox/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247305930,"owners_count":20917207,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["combinators","parser","type-system","typescript"],"created_at":"2024-11-11T09:35:01.714Z","updated_at":"2025-04-06T11:10:47.050Z","avatar_url":"https://github.com/sinclairzx81.png","language":"TypeScript","readme":"\u003cdiv align='center'\u003e\n\n\u003ch1\u003eParseBox\u003c/h1\u003e\n\n\u003cp\u003eParser Combinators in the TypeScript Type System\u003c/p\u003e\n\n\u003cimg src=\"https://raw.githubusercontent.com/sinclairzx81/parsebox/refs/heads/main/parsebox.png\" /\u003e\n\n\u003cbr /\u003e\n\u003cbr /\u003e\n\n[![npm version](https://badge.fury.io/js/%40sinclair%2Fparsebox.svg)](https://badge.fury.io/js/%40sinclair%2Fparsebox)\n[![Build](https://github.com/sinclairzx81/parsebox/actions/workflows/build.yml/badge.svg)](https://github.com/sinclairzx81/parsebox/actions/workflows/build.yml) \n[![License](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n\n\u003c/div\u003e\n\n## Install\n\n```bash\n$ npm install @sinclair/parsebox\n```\n\n## Example\n\nParseBox provides combinators for parsing in Runtime and Static environments.\n\n### Runtime\n\n```typescript\nimport { Runtime } from '@sinclair/parsebox'\n\nconst T = Runtime.Tuple([\n  Runtime.Const('X'),\n  Runtime.Const('Y'),\n  Runtime.Const('Z')\n])\n\nconst R = 
Runtime.Parse(T, 'X Y Z W')               // const R = [['X', 'Y', 'Z'], ' W']\n```\n\n### Static\n\n```typescript\nimport { Static } from '@sinclair/parsebox'\n\ntype T = Static.Tuple\u003c[\n  Static.Const\u003c'X'\u003e,\n  Static.Const\u003c'Y'\u003e,\n  Static.Const\u003c'Z'\u003e\n]\u003e\n\ntype R = Static.Parse\u003cT, 'X Y Z W'\u003e                 // type R = [['X', 'Y', 'Z'], ' W']\n```\n\n\n## Overview\n\nParseBox is a parsing library designed to embed domain-specific languages (DSLs) within the TypeScript type system. It provides a set of runtime and type-level combinators that enable EBNF notation to be encoded as TypeScript types. These combinators can then be used to parse content at runtime or interactively in editor via static type inference.\n\nThis project was developed as a generalized parsing solution for the [TypeBox](https://github.com/sinclairzx81/typebox) project, where it is currently used to parse TypeScript syntax into runtime types. This project seeks to provide a robust foundation for parsing a variety of domain-specific languages, with information encoded in each language able to be reconciled with TypeScript's type system.\n\nLicense: MIT\n\n## Contents\n\n- [Combinators](#Combinators)\n  - [Const](#Const)\n  - [Tuple](#Tuple)\n  - [Union](#Union)\n  - [Array](#Array)\n  - [Optional](#Optional)\n  - [Epsilon](#Epsilon)\n- [Terminals](#Terminals)\n  - [Number](#Number)\n  - [String](#String)\n  - [Ident](#Ident)\n- [Mapping](#Mapping)\n- [Context](#Context)\n- [Modules](#Modules)\n- [Advanced](#Advanced)\n- [Contribute](#Contribute)\n\n## Combinators\n\nParseBox offers combinators for runtime and static environments, with each combinator based on EBNF constructs. These combinators produce schema fragments that define parse operations, which ParseBox interprets during parsing. As schematics, the fragments can also be reflected as EBNF or remapped to other tools. 
The following section examines the Runtime combinators and their relation to EBNF.\n\n### Const\n\nThe Const combinator parses the next occurrence of a specified string, ignoring whitespace and newline characters unless explicitly specified as parameters.\n\n**BNF**\n\n```bnf\n\u003cT\u003e ::= \"X\"\n```\n\n**TypeScript**\n\n```typescript\nconst T = Runtime.Const('X')                        // const T = {\n                                                    //   type: 'Const',\n                                                    //   value: 'X'\n                                                    // }\n\nconst R = Runtime.Parse(T, 'X Y Z')                 // const R = ['X', ' Y Z']\n```\n\n### Tuple\n\nThe Tuple combinator matches a sequence of parsers, with an empty tuple representing Epsilon (the empty production).\n\n**BNF**\n\n```bnf\n\u003cT\u003e ::= \"X\" \"Y\" \"Z\"\n```\n\n**TypeScript**\n\n```typescript\nconst T = Runtime.Tuple([                           // const T = {\n  Runtime.Const('X'),                               //   type: 'Tuple',\n  Runtime.Const('Y'),                               //   parsers: [\n  Runtime.Const('Z'),                               //     { type: 'Const', value: 'X' },\n])                                                  //     { type: 'Const', value: 'Y' },\n                                                    //     { type: 'Const', value: 'Z' },\n                                                    //   ]\n                                                    // }\n\n\nconst R = Runtime.Parse(T, 'X Y Z W')               // const R = [['X', 'Y', 'Z'], ' W']\n```\n\n### Union\n\nThe Union combinator tries each interior parser in sequence until one matches.\n\n**BNF**\n\n```bnf\n\u003cT\u003e ::= \"X\" | \"Y\" | \"Z\"\n```\n\n**TypeScript**\n\n```typescript\nconst T = Runtime.Union([                           // const T = {\n  Runtime.Const('X'),                               //   type: 'Union',\n  Runtime.Const('Y'),                 
              //   parsers: [\n  Runtime.Const('Z')                                //     { type: 'Const', value: 'X' },\n])                                                  //     { type: 'Const', value: 'Y' },\n                                                    //     { type: 'Const', value: 'Z' }\n                                                    //   ]\n                                                    // }\n\nconst R1 = Runtime.Parse(T, 'X E')                  // const R1 = ['X', ' E']\n\nconst R2 = Runtime.Parse(T, 'Y E')                  // const R2 = ['Y', ' E']\n\nconst R3 = Runtime.Parse(T, 'Z E')                  // const R3 = ['Z', ' E']\n```\n\n### Array\n\nThe Array combinator parses zero or more occurrences of the interior parser, returning an empty array if there are no matches.\n\n**EBNF**\n\n```\n\u003cT\u003e ::= { \"X\" }\n```\n\n**TypeScript**\n\n```typescript\nconst T = Runtime.Array(                             // const T = {\n  Runtime.Const('X')                                 //   type: 'Array',\n)                                                    //   parser: { type: 'Const', value: 'X' } \n                                                     // }\n\nconst R1 = Runtime.Parse(T, 'X Y Z')                 // const R1 = [['X'], ' Y Z']\n\nconst R2 = Runtime.Parse(T, 'X X X Y Z')             // const R2 = [['X', 'X', 'X'], ' Y Z']\n\nconst R3 = Runtime.Parse(T, 'Y Z')                   // const R3 = [[], 'Y Z']\n```\n\n### Optional\n\nThe Optional combinator parses zero or one occurrence of the interior parser, returning a tuple with one element or an empty tuple if there is no match.\n\n**EBNF**\n\n```\n\u003cT\u003e ::= [ \"X\" ]\n```\n\n**TypeScript**\n\n```typescript\nconst T = Runtime.Optional(                         // const T = {\n  Runtime.Const('X')                                //   type: 'Optional',\n)                                                   //   parser: { type: 'Const', value: 'X' }\n                               
                     // }\n\nconst R1 = Runtime.Parse(T, 'X Y Z')                // const R1 = [['X'], ' Y Z']\n\nconst R2 = Runtime.Parse(T, 'Y Z')                  // const R2 = [[], 'Y Z']\n```\n\n### Epsilon\n\nParseBox does not have a dedicated combinator for Epsilon; instead, it can be represented using an empty Tuple combinator. Epsilon is typically used as a fall-through case in sequence matching.\n\n**EBNF**\n\n```\n\u003cT\u003e ::= \"X\" \"Y\" | ε\n```\n\n**TypeScript**\n\n```typescript\nconst T = Runtime.Union([\n\n  Runtime.Tuple([Runtime.Const('X'), Runtime.Const('Y')]),\n\n  Runtime.Tuple([])                                 // ε - fall-through case\n\n])\nconst R1 = Runtime.Parse(T, 'X Y Z')                // const R1 = [['X', 'Y'], ' Z']\n\nconst R2 = Runtime.Parse(T, 'Y Z')                  // const R2 = [[], 'Y Z']\n```\n\n## Terminals\n\nParseBox provides combinators for parsing common lexical tokens, such as numbers, identifiers, and strings, enabling static, optimized parsing of typical JavaScript constructs.\n\n### Number\n\nParses numeric literals, including integers, decimals, and floating-point numbers. Invalid formats, like leading zeroes, are not matched.\n\n```typescript\nconst T = Runtime.Number()\n\n// ...\n\nconst R1 = Runtime.Parse(T, '1')                    // const R1 = ['1', '']\n\nconst R2 = Runtime.Parse(T, '3.14')                 // const R2 = ['3.14', '']\n\nconst R3 = Runtime.Parse(T, '.1')                   // const R3 = ['.1', '']\n\nconst E = Runtime.Parse(T, '01')                    // const E = []\n```\n\n### String\n\nThe String combinator parses quoted string literals, accepting an array of permissible quote characters. The result is the interior string.\n\n```typescript\nconst T = Runtime.String(['\"'])\n\n// ...\n\nconst R = Runtime.Parse(T, '\"hello\"')              // const R = ['hello', '']\n```\n\n### Ident\n\nParses valid JavaScript identifiers, typically used to extract variable or function names. 
The following example demonstrates parsing a `let` statement.\n\n```bnf\n\u003clet\u003e ::= \"let\" \u003cident\u003e \"=\" \u003cnumber\u003e\n```\n\n```typescript\nconst Expression = Runtime.Number()                 //  const Expression = { type: 'Number' }\n\nconst Let = Runtime.Tuple([                         //  const Let = {\n  Runtime.Const('let'),                             //    type: 'Tuple',\n  Runtime.Ident(),                                  //    parsers: [\n  Runtime.Const('='),                               //      { type: 'Const', value: 'let' },\n  Expression                                        //      { type: 'Ident' },\n])                                                  //      { type: 'Const', value: '=' },\n                                                    //      { type: 'Number' },\n                                                    //    ]\n                                                    //  }\n\nconst R = Runtime.Parse(Let, 'let n = 10')          // const R = [[ 'let', 'n', '=', '10'], '' ]\n\n```\n\n\n## Mapping\n\nParseBox supports semantic actions (i.e., mappings) for both static and runtime parsing, enabling parsed content to be transformed into complex structures like abstract syntax trees (ASTs). Below is an explanation of how mapping works in both environments.\n\n### Runtime\n\nRuntime combinators can accept an optional callback as their last argument, which receives the parsed elements and maps them to arbitrary return values. 
The following example shows how a let statement is parsed and mapped into a syntax node.\n\n```typescript\nconst LetMapping = (_0: 'let', _1: string, _2: '=', _3: string) =\u003e {\n  return {\n    type: 'Let',\n    ident: _1,\n    value: parseFloat(_3)\n  }\n}\nconst Let = Runtime.Tuple([                           \n  Runtime.Const('let'), // _0\n  Runtime.Ident(),      // _1\n  Runtime.Const('='),   // _2\n  Runtime.Number()      // _3\n], values =\u003e LetMapping(...values)) \n\nconst R = Runtime.Parse(Let, 'let n = 10')          // const R = [{\n                                                    //   type: 'Let',\n                                                    //   ident: 'n',\n                                                    //   value: 10\n                                                    // }, '' ]\n```\n\n### Static\n\nStatic combinators accept an optional higher-kinded type, IMapping, as the last generic argument. Static mapping uses the `this['input']` property to read input values, assigning the mapping to the `output` property. The following example demonstrates implementing the Let parser using static actions.\n\n```typescript\ntype ParseFloat\u003cValue extends string\u003e = (\n  Value extends `${infer Value extends number}` ? Value : never\n)\ninterface LetMapping extends Static.IMapping {\n  output: this['input'] extends ['let', infer Ident, '=', infer Value extends string] ? 
{\n    type: 'Let',\n    ident: Ident,\n    value: ParseFloat\u003cValue\u003e\n  } : never\n}\n\ntype Let = Static.Tuple\u003c[\n  Static.Const\u003c'let'\u003e, \n  Static.Ident,     \n  Static.Const\u003c'='\u003e, \n  Static.Number\n], LetMapping\u003e\n\ntype R = Static.Parse\u003cLet, 'let n = 10'\u003e            // type R = [{\n                                                    //   type: 'Let',\n                                                    //   ident: 'n',\n                                                    //   value: 10\n                                                    // }, '' ]\n```\n\n## Context\n\nParseBox allows exterior values to be passed into and referenced within semantic actions. A context is passed as the last argument to the Static and Runtime parse types/functions, and is propagated into each action. The following demonstrates its usage.\n\n### Runtime\n\nThe Runtime Parse function accepts a context as the last argument, which is received as the second argument to the OptionMapping function.\n\n```typescript\nimport { Runtime } from '@sinclair/parsebox'\n\n// Context Received as Second Argument\nconst OptionMapping = (input: 'A' | 'B' | 'C', context: Record\u003cPropertyKey, string\u003e) =\u003e {\n  return (\n    input in context \n      ? 
context[input] \n      : undefined\n  )\n}\nconst Option = Runtime.Union([\n  Runtime.Const('A'),\n  Runtime.Const('B'),\n  Runtime.Const('C')\n], OptionMapping)\n\nconst R = Runtime.Parse(Option, 'A', {              // const R = ['Matched Foo', '']\n  A: 'Matched Foo',\n  B: 'Matched Bar',\n  C: 'Matched Baz',\n})\n```\n\n### Static\n\nThe Static Parse type accepts a context as the last generic argument, which is received via the `this['context']` property on the OptionMapping type.\n\n```typescript\nimport { Static } from '@sinclair/parsebox'\n\n// Context Received on Context Property\ninterface OptionMapping extends Static.IMapping {\n  output: (\n    this['input'] extends keyof this['context'] \n      ? this['context'][this['input']] \n      : undefined\n  )\n}\ntype Option = Static.Union\u003c[\n  Static.Const\u003c'A'\u003e,\n  Static.Const\u003c'B'\u003e,\n  Static.Const\u003c'C'\u003e\n], OptionMapping\u003e\n\ntype R = Static.Parse\u003cOption, 'A', {                // type R = ['Matched Foo', '']\n  A: 'Matched Foo',\n  B: 'Matched Bar',\n  C: 'Matched Baz',\n}\u003e\n```\n\n## Modules\n\nParseBox modules act as containers for Runtime parsers, enabling recursion and mutual recursion by allowing parsers to reference each other via string keys. Modules are needed only for Runtime parsers; Static parsers have no ordering issues because TypeScript’s type declarations are not order-dependent.\n\n### List Parsing\n\nIn this example, we define a List parser that recursively parses sequences of Item elements. The List parser is either a tuple of an Item followed by another List (recursive) or an empty tuple (base case). 
Recursion is achieved by referencing both Item and List parsers within the same module.\n\n```typescript\nimport { Runtime } from '@sinclair/parsebox'\n\n// Item ::= \"X\" | \"Y\" | \"Z\"\n\nconst Item = Runtime.Union([\n  Runtime.Const('X'),\n  Runtime.Const('Y'),\n  Runtime.Const('Z'),\n])\n\n// List ::= Item List | ε\n\nconst List = Runtime.Union([\n  Runtime.Tuple([Runtime.Ref('Item'), Runtime.Ref('List')]), // Recursive Self\n  Runtime.Tuple([])                                          // Epsilon\n], values =\u003e values.flat())\n\n// Embed inside Module\n\nconst Module = new Runtime.Module({ \n  Item, \n  List \n})\n\n// Use Module.Parse \n\nconst R = Module.Parse('List', 'X Y Z Y X E')       // const R = [['X', 'Y', 'Z', 'Y', 'X'], ' E']\n```\n\n## Advanced\n\nThe following example demonstrates using ParseBox to parse a mathematical expression with LL(1) parsing techniques, avoiding left recursion and respecting operator precedence rules.\n\n```typescript\nimport { Static } from '@sinclair/parsebox'\n\n// Static Mapping Actions to remap Productions\n\ntype BinaryReduce\u003cLeft extends unknown, Right extends unknown[]\u003e = (\n  Right extends [infer Operator, infer Right, infer Rest extends unknown[]]\n    ? { left: Left, operator: Operator, right: BinaryReduce\u003cRight, Rest\u003e }\n    : Left\n)\ninterface BinaryMapping extends Static.IMapping {\n  output: this['input'] extends [infer Left, infer Right extends unknown[]]\n    ? BinaryReduce\u003cLeft, Right\u003e\n    : never\n}\ninterface FactorMapping extends Static.IMapping {\n  output: ( \n    this['input'] extends ['(', infer Expr, ')'] ? Expr :\n    this['input'] extends [infer Operand] ? 
Operand :\n    never\n  )\n}\n\n// Expression Grammar\n\ntype Operand = Static.Ident\n\ntype Factor = Static.Union\u003c[\n  Static.Tuple\u003c[Static.Const\u003c'('\u003e, Expr, Static.Const\u003c')'\u003e]\u003e,\n  Static.Tuple\u003c[Operand]\u003e,\n], FactorMapping\u003e\n\ntype TermTail = Static.Union\u003c[\n  Static.Tuple\u003c[Static.Const\u003c'*'\u003e, Factor, TermTail]\u003e,\n  Static.Tuple\u003c[Static.Const\u003c'/'\u003e, Factor, TermTail]\u003e,\n  Static.Tuple\u003c[]\u003e,\n]\u003e\n\ntype ExprTail = Static.Union\u003c[\n  Static.Tuple\u003c[Static.Const\u003c'+'\u003e, Term, ExprTail]\u003e,\n  Static.Tuple\u003c[Static.Const\u003c'-'\u003e, Term, ExprTail]\u003e,\n  Static.Tuple\u003c[]\u003e,\n]\u003e\n\ntype Term = Static.Tuple\u003c[Factor, TermTail], BinaryMapping\u003e\n\ntype Expr = Static.Tuple\u003c[Term, ExprTail], BinaryMapping\u003e\n\n// Parse!\n\ntype Result = Static.Parse\u003cExpr, 'x * (y + z)'\u003e     // type Result = [{\n                                                    //   left: \"x\";\n                                                    //   operator: \"*\";\n                                                    //   right: {\n                                                    //       left: \"y\";\n                                                    //       operator: \"+\";\n                                                    //       right: \"z\";\n                                                    //   };\n                                                    // }, \"\"]\n```\n\n## Contribute\n\nParseBox is open to community contribution. Please ensure you submit an open issue before submitting your pull request. 
The ParseBox project prefers open community discussion before accepting new features.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsinclairzx81%2Fparsebox","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsinclairzx81%2Fparsebox","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsinclairzx81%2Fparsebox/lists"}