{"id":30180921,"url":"https://github.com/streamich/jit-parser","last_synced_at":"2025-08-12T08:06:16.427Z","repository":{"id":307569752,"uuid":"815371367","full_name":"streamich/jit-parser","owner":"streamich","description":null,"archived":false,"fork":false,"pushed_at":"2025-08-10T12:43:43.000Z","size":390,"stargazers_count":2,"open_issues_count":14,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-08-10T14:39:47.271Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/streamich.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null},"funding":{"github":"streamich"}},"created_at":"2024-06-15T01:19:29.000Z","updated_at":"2025-08-01T09:40:12.000Z","dependencies_parsed_at":null,"dependency_job_id":"dfa63cfd-a379-4737-b05a-bf8a8afd45b0","html_url":"https://github.com/streamich/jit-parser","commit_stats":null,"previous_names":["streamich/jit-parser"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/streamich/jit-parser","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/streamich%2Fjit-parser","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/streamich%2Fjit-parser/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/streamich%2Fjit-parser/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/streamich%2Fjit-parser/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/Git
Hub/owners/streamich","download_url":"https://codeload.github.com/streamich/jit-parser/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/streamich%2Fjit-parser/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270024697,"owners_count":24514054,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-12T02:00:09.011Z","response_time":80,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-08-12T08:06:06.902Z","updated_at":"2025-08-12T08:06:16.418Z","avatar_url":"https://github.com/streamich.png","language":"TypeScript","funding_links":["https://github.com/sponsors/streamich"],"categories":[],"sub_categories":[],"readme":"# JIT Parser\n\nTop-down recursive descent backtracking PEG scanner-less JIT parser combinator generator.\n\nA high-performance parser library that compiles grammar definitions into efficient JavaScript parsing functions at runtime. 
It generates both Concrete Syntax Trees (CST) and Abstract Syntax Trees (AST) from textual input.\n\n## Table of Contents\n\n- [Installation](#installation)\n- [Quick Start](#quick-start)\n- [Grammar Node Types](#grammar-node-types)\n- [Tree Types](#tree-types)\n- [Grammar Compilation](#grammar-compilation)\n- [Debug Mode](#debug-mode)\n- [Examples](#examples)\n- [API Reference](#api-reference)\n\n## Installation\n\n```bash\nnpm install jit-parser\n```\n\n## Quick Start\n\n```typescript\nimport {CodegenGrammar, ParseContext} from 'jit-parser';\n\n// Define a simple grammar\nconst grammar = {\n  start: 'Value',\n  cst: {\n    Value: 'hello'\n  }\n};\n\n// Compile the grammar to JavaScript\nconst parser = CodegenGrammar.compile(grammar);\n\n// Parse input (the second argument is the `ast` flag of ParseContext)\nconst ctx = new ParseContext('hello', false);\nconst cst = parser(ctx, 0);\nconsole.log(cst); // CST node representing the parse result\n```\n\n## Grammar Node Types\n\nJIT Parser supports five main grammar node types for defining parsing rules. Grammar rules can be fully defined in JSON, making them language-agnostic and easy to serialize.\n\n### 1. RefNode (Reference Node)\n\nReferences a named node defined elsewhere in the grammar.\n\n**Interface:**\n```typescript\ntype RefNode\u003cName extends string = string\u003e = {r: Name};\n```\n\n**Syntax:**\n```typescript\n{r: 'NodeName'}\n```\n\n**Example:**\n```typescript\nconst grammar = {\n  start: 'Program',\n  cst: {\n    Program: {r: 'Statement'},\n    Statement: 'return;'\n  }\n};\n```\n\n### 2. TerminalNode (Terminal Node)\n\nMatches literal strings, regular expressions, or arrays of strings. 
Terminal nodes are leaf nodes in the parse tree.\n\n**Interface:**\n```typescript\ninterface TerminalNode {\n  type?: string;                           // Type name (default: \"Text\")\n  t: RegExp | string | '' | string[];      // Pattern(s) to match\n  repeat?: '*' | '+';                      // Repetition (only for string arrays)\n  sample?: string;                         // Sample text for generation\n  ast?: AstNodeExpression;                 // AST transformation\n}\n\n// Shorthand: string, RegExp, or empty string\ntype TerminalNodeShorthand = RegExp | string | '';\n```\n\n**Syntax:**\n```typescript\n// String literal\n'hello'\n\n// Regular expression  \n/[a-z]+/\n\n// Array of alternatives\n{t: ['true', 'false']}\n\n// With repetition\n{t: [' ', '\\t', '\\n'], repeat: '*'}\n\n// Full terminal node\n{\n  t: /\\d+/,\n  type: 'Number',\n  sample: '123'\n}\n```\n\n**Examples:**\n```typescript\n// Simple string terminal\nValue: 'null'\n\n// RegExp terminal  \nNumber: /\\-?\\d+(\\.\\d+)?/\n\n// Alternative strings\nBoolean: {t: ['true', 'false']}\n\n// Repeating whitespace\nWS: {t: [' ', '\\t', '\\n'], repeat: '*'}\n```\n\n### 3. ProductionNode (Production Node)\n\nMatches a sequence of grammar nodes in order. 
All nodes in the sequence must match for the production to succeed.\n\n**Interface:**\n```typescript\ninterface ProductionNode {\n  p: GrammarNode[];                    // Sequence of nodes to match\n  type?: string;                       // Type name (default: \"Production\")\n  children?: Record\u003cnumber, string\u003e;   // Child index to property mapping\n  ast?: AstNodeExpression;             // AST transformation\n}\n\n// Shorthand: array of grammar nodes\ntype ProductionNodeShorthand = GrammarNode[];\n```\n\n**Syntax:**\n```typescript\n// Shorthand array\n['{', {r: 'Content'}, '}']\n\n// Full production node\n{\n  p: ['{', {r: 'Content'}, '}'],\n  type: 'Block',\n  children: {\n    1: 'content'  // Maps index 1 to 'content' property\n  }\n}\n```\n\n**Examples:**\n```typescript\n// Function call: func()\nFunctionCall: ['func', '(', ')']\n\n// Object with named children\nObject: {\n  p: ['{', {r: 'Members'}, '}'],\n  children: {\n    1: 'members'\n  }\n}\n```\n\n### 4. UnionNode (Union Node)\n\nMatches one of several alternative patterns. The first matching alternative is selected (ordered choice).\n\n**Interface:**\n```typescript\ninterface UnionNode {\n  u: GrammarNode[];           // Array of alternative nodes\n  type?: string;              // Type name (default: \"Union\")\n  ast?: AstNodeExpression;    // AST transformation\n}\n```\n\n**Syntax:**\n```typescript\n{\n  u: [pattern1, pattern2, pattern3]\n}\n```\n\n**Examples:**\n```typescript\n// Literal values\nLiteral: {\n  u: ['null', 'true', 'false', {r: 'Number'}, {r: 'String'}]\n}\n\n// Statement types\nStatement: {\n  u: [\n    {r: 'IfStatement'},\n    {r: 'ReturnStatement'}, \n    {r: 'ExpressionStatement'}\n  ]\n}\n```\n\n### 5. 
ListNode (List Node)\n\nMatches zero or more repetitions of a pattern.\n\n**Interface:**  \n```typescript\ninterface ListNode {\n  l: GrammarNode;              // Node to repeat\n  type?: string;               // Type name (default: \"List\")\n  ast?: AstNodeExpression;     // AST transformation\n}\n```\n\n**Syntax:**\n```typescript\n{\n  l: pattern\n}\n```\n\n**Examples:**\n```typescript\n// Zero or more statements\nStatements: {\n  l: {r: 'Statement'}\n}\n\n// Comma-separated list\nArguments: {\n  l: {\n    p: [',', {r: 'Expression'}],\n    ast: ['$', '/children/1']  // Extract the expression, ignore comma\n  }\n}\n```\n\n## Tree Types\n\nJIT Parser works with four types of tree structures:\n\n### 1. Grammar Nodes\n\nThe grammar definition that describes the parsing rules. These are the node types described above that define how to parse input text.\n\n### 2. CST (Concrete Syntax Tree)\n\nThe parse tree that contains every matched token and maintains the complete structure of the parsed input.\n\n**Interface:**\n```typescript\ninterface CstNode {\n  ptr: Pattern;         // Reference to grammar pattern\n  pos: number;          // Start position in input\n  end: number;          // End position in input  \n  children?: CstNode[]; // Child nodes\n}\n```\n\n**Example CST:**\n```typescript\n// For input: '{\"foo\": 123}' (12 characters)\n{\n  ptr: ObjectPattern,\n  pos: 0,\n  end: 12,\n  children: [\n    {ptr: TextPattern, pos: 0, end: 1},      // '{'\n    {ptr: MembersPattern, pos: 1, end: 11,   // '\"foo\": 123'\n      children: [...]\n    },\n    {ptr: TextPattern, pos: 11, end: 12}     // '}'\n  ]\n}\n```\n\n### 3. 
AST (Abstract Syntax Tree)\n\nA simplified tree structure derived from the CST, typically containing only semantically meaningful nodes.\n\n**Default AST Interface:**\n```typescript\ninterface CanonicalAstNode {\n  type: string;                                    // Node type\n  pos: number;                                     // Start position\n  end: number;                                     // End position\n  raw?: string;                                    // Raw matched text\n  children?: (CanonicalAstNode | unknown)[];      // Child nodes\n}\n```\n\n**Example AST:**\n```typescript\n// For input: '{\"foo\": 123}' (12 characters)\n{\n  type: 'Object',\n  pos: 0,\n  end: 12,\n  children: [\n    {\n      type: 'Entry',\n      key: {type: 'String', value: 'foo'},\n      value: {type: 'Number', value: 123}\n    }\n  ]\n}\n```\n\n### CST to AST Conversion Rules\n\n1. **Default Conversion**: Each CST node becomes an AST node with `type`, `pos`, `end`, and `children` properties.\n\n2. **AST Expressions**: Use the `ast` property in grammar nodes to customize AST generation:\n   - `ast: null` - Skip this node in the AST\n   - `ast: ['$', '/children/0']` - Use the first child's AST\n   - `ast: {...}` - Custom JSON expression for transformation\n\n3. **Children Mapping**: Use the `children` property to map CST child indices to AST properties:\n   ```typescript\n   {\n     children: {\n       0: 'key',      // CST child 0 -\u003e AST property 'key'\n       2: 'value'     // CST child 2 -\u003e AST property 'value'  \n     }\n   }\n   ```\n\n4. **Type Override**: Specify a custom `type` property to replace the default node type name.\n\n### 4. Debug Trace Tree\n\nIf debug mode is enabled during compilation, the parser captures all grammar node tree paths that were attempted during parsing. 
This debug trace tree is useful for debugging parser behavior and improving parser performance by understanding which rules were tried and failed.\n\n**Interface:**\n```typescript\ninterface TraceNode {\n  type: string;         // Grammar rule name that was attempted\n  pos: number;          // Start position where rule was tried\n  end?: number;         // End position if rule succeeded  \n  children?: TraceNode[]; // Nested rule attempts\n  success: boolean;     // Whether the rule matched successfully\n}\n```\n\nBecause the trace records failed attempts as well as successful matches, it shows where the parser backtracks, which helps when diagnosing complex parsing scenarios and optimizing grammar rules.\n\n## Grammar Compilation\n\nGrammars are compiled to efficient JavaScript functions that can parse input strings rapidly.\n\n### Basic Compilation\n\n```typescript\nimport {CodegenGrammar} from 'jit-parser';\n\nconst grammar = {\n  start: 'Value',\n  cst: {\n    Value: {r: 'Number'},\n    Number: /\\d+/\n  }\n};\n\n// Compile to parser function  \nconst parser = CodegenGrammar.compile(grammar);\n```\n\n### Compilation Options\n\n```typescript\nimport {CodegenContext, CodegenGrammar} from 'jit-parser';\n\nconst ctx = new CodegenContext(\n  true,  // positions: Include pos/end in AST\n  true,  // astExpressions: Process AST transformations\n  false  // debug: Generate debug trace code\n);\n\n// `grammar` is the object defined under Basic Compilation\nconst parser = CodegenGrammar.compile(grammar, ctx);\n```\n\n### Viewing Compiled Grammar\n\nYou can print the grammar structure by converting it to a string:\n\n```typescript\nimport {GrammarPrinter} from 'jit-parser';\n\nconst grammarString = GrammarPrinter.print(grammar);\nconsole.log(grammarString);\n```\n\n**Example output:**\n```\nValue (reference)\n└─ Number (terminal): /\\d+/\n```\n\n### Complex Grammar Example\n\n```typescript\nimport {CodegenGrammar, GrammarPrinter} from 'jit-parser';\n\nconst jsonGrammar = {\n  start: 'Value',\n  cst: {\n    WOpt: {t: [' ', '\\n', '\\t', '\\r'], repeat: '*', ast: null},\n    Value: [{r: 'WOpt'}, {r: 'TValue'}, {r: 'WOpt'}],\n  
  TValue: {\n      u: ['null', {r: 'Boolean'}, {r: 'Number'}, {r: 'String'}, {r: 'Object'}, {r: 'Array'}]\n    },\n    Boolean: {t: ['true', 'false']},  \n    Number: /\\-?\\d+(\\.\\d+)?([eE][\\+\\-]?\\d+)?/,\n    String: /\"[^\"\\\\]*(?:\\\\.[^\"\\\\]*)*\"/,\n    Object: ['{', {r: 'Members'}, '}'],\n    Members: {\n      u: [\n        {\n          p: [{r: 'Entry'}, {l: {p: [',', {r: 'Entry'}], ast: ['$', '/children/1']}}],\n          ast: ['concat', ['push', [[]], ['$', '/children/0']], ['$', '/children/1']]\n        },\n        {r: 'WOpt'}\n      ]\n    },\n    Entry: {\n      p: [{r: 'String'}, ':', {r: 'Value'}],\n      children: {0: 'key', 2: 'value'}\n    },\n    Array: ['[', {r: 'Elements'}, ']']\n    // ... more rules\n  },\n  ast: {\n    Value: ['$', '/children/1'],      // Extract middle child (TValue)  \n    Boolean: ['==', ['$', '/raw'], 'true'],  // Convert to boolean\n    Number: ['num', ['$', '/raw']]    // Convert to number\n  }\n};\n\nconst parser = CodegenGrammar.compile(jsonGrammar);\nconsole.log(GrammarPrinter.print(jsonGrammar));\n```\n\n## Debug Mode\n\nDebug mode captures a trace of the parsing process, showing which grammar rules were attempted at each position.\n\n### Enabling Debug Mode\n\n```typescript\nimport {CodegenContext, CodegenGrammar, ParseContext, printTraceNode} from 'jit-parser';\n\n// Enable debug during compilation\nconst debugCtx = new CodegenContext(true, true, true); // debug = true\nconst parser = CodegenGrammar.compile(grammar, debugCtx);\n\n// Create trace collection  \nconst rootTrace = {pos: 0, children: []};\nconst parseCtx = new ParseContext('input text', false, [rootTrace]);\n\n// Parse with debug trace\nconst cst = parser(parseCtx, 0);\n\n// Print debug trace\nconsole.log(printTraceNode(rootTrace, '', 'input text'));\n```\n\n### Debug Trace Output\n\nThe debug trace shows:\n- Which grammar rules were attempted\n- At what positions in the input\n- Whether each attempt succeeded or failed\n- The 
hierarchical structure of rule attempts\n\n**Example trace output:**\n```\nRoot\n└─ Value 0:22 → ' {\"foo\": [\"bar\", 123]}'\n   ├─ WOpt 0:1 → \" \"\n   ├─ TValue 1:22 → '{\"foo\": [\"bar\", 123]}'\n   │  ├─ Null\n   │  ├─ Boolean  \n   │  ├─ String\n   │  └─ Object 1:22 → '{\"foo\": [\"bar\", 123]}'\n   │     ├─ Text 1:2 → \"{\"\n   │     ├─ Members 2:21 → '\"foo\": [\"bar\", 123]'\n   │     │  └─ Production 2:21 → '\"foo\": [\"bar\", 123]'\n   │     │     ├─ Entry 2:21 → '\"foo\": [\"bar\", 123]'\n   │     │     │  ├─ String 2:7 → '\"foo\"'\n   │     │     │  ├─ Text 7:8 → \":\"\n   │     │     │  └─ Value 8:21 → ' [\"bar\", 123]' \n   │     │     │     └─ ...\n   │     │     └─ List 21:21 → \"\"\n   │     └─ Text 21:22 → \"}\"\n   └─ WOpt 22:22 → \"\"\n```\n\n## Examples\n\n### 1. Simple Expression Parser\n\n```typescript\nconst exprGrammar = {\n  start: 'Expression',\n  cst: {\n    Expression: {r: 'Number'},\n    Number: {\n      t: /\\d+/,\n      type: 'Number'\n    }\n  }\n};\n\nconst parser = CodegenGrammar.compile(exprGrammar);\nconst ctx = new ParseContext('42', true);\nconst cst = parser(ctx, 0);\nconst ast = cst.ptr.toAst(cst, '42');\nconsole.log(ast); // {type: 'Number', pos: 0, end: 2, raw: '42'}\n```\n\n### 2. JSON Parser\n\n```typescript\nimport {grammar as jsonGrammar} from 'jit-parser/lib/grammars/json';\n\nconst parser = CodegenGrammar.compile(jsonGrammar);\nconst json = '{\"name\": \"John\", \"age\": 30}';\nconst ctx = new ParseContext(json, true);\nconst cst = parser(ctx, 0);\nconst ast = cst.ptr.toAst(cst, json);\nconsole.log(ast);\n```\n\n### 3. 
Custom AST Transformation\n\n```typescript\nconst grammar = {\n  start: 'KeyValue', \n  cst: {\n    KeyValue: {\n      p: [{r: 'Key'}, '=', {r: 'Value'}],\n      children: {0: 'key', 2: 'value'},\n      type: 'Assignment'\n    },\n    Key: /[a-zA-Z]+/,\n    Value: /\\d+/\n  },\n  ast: {\n    KeyValue: {\n      type: 'Assignment',\n      key: ['$', '/children/0/raw'],\n      value: ['num', ['$', '/children/2/raw']]\n    }\n  }\n};\n```\n\n### 4. List Parsing\n\n```typescript\nconst listGrammar = {\n  start: 'List',\n  cst: {\n    List: ['[', {r: 'Items'}, ']'],\n    Items: {\n      u: [\n        {\n          p: [{r: 'Item'}, {l: {p: [',', {r: 'Item'}], ast: ['$', '/children/1']}}],\n          ast: ['concat', ['push', [[]], ['$', '/children/0']], ['$', '/children/1']]\n        },\n        ''  // Empty list\n      ]\n    },\n    Item: /\\w+/\n  }\n};\n```\n\n## API Reference\n\n### Core Classes\n\n#### `CodegenGrammar`\n- `static compile(grammar: Grammar, ctx?: CodegenContext): Parser`\n- `compileRule(ruleName: string): Pattern`\n\n#### `ParseContext`  \n- `constructor(str: string, ast: boolean, trace?: RootTraceNode[])`\n\n#### `CodegenContext`\n- `constructor(positions: boolean, astExpressions: boolean, debug: boolean)`\n\n#### `GrammarPrinter`\n- `static print(grammar: Grammar, tab?: string): string`\n\n### Utility Functions\n\n#### `printCst(cst: CstNode, tab: string, src: string): string`\nPrint a formatted CST tree\n\n#### `printTraceNode(trace: RootTraceNode | ParseTraceNode, tab: string, src: string): string`  \nPrint a formatted debug trace\n\n### Type Definitions\n\nSee the [Grammar Node Types](#grammar-node-types) section for complete interface definitions.\n\n---\n\nThis parser generator provides a powerful and efficient way to build custom parsers with minimal code while maintaining high performance through JIT 
compilation.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstreamich%2Fjit-parser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstreamich%2Fjit-parser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstreamich%2Fjit-parser/lists"}