{"id":21915558,"url":"https://github.com/terotests/rangerparser","last_synced_at":"2025-03-22T09:27:12.627Z","repository":{"id":44069427,"uuid":"210322263","full_name":"terotests/RangerParser","owner":"terotests","description":"Opinionated tokenizer and parser for common and custom languages","archived":false,"fork":false,"pushed_at":"2024-11-28T19:02:21.000Z","size":436,"stargazers_count":0,"open_issues_count":11,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-01-27T09:25:25.535Z","etag":null,"topics":["ast","generic","parser","tokenizer"],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/terotests.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-09-23T09:59:42.000Z","updated_at":"2024-11-28T19:00:48.000Z","dependencies_parsed_at":"2023-02-02T09:46:15.470Z","dependency_job_id":null,"html_url":"https://github.com/terotests/RangerParser","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/terotests%2FRangerParser","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/terotests%2FRangerParser/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/terotests%2FRangerParser/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/terotests%2FRangerParser/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/terotests","download_url":"https://codeload.github.com/terotests/RangerParser/tar.gz/refs/heads/master","host"
:{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244935474,"owners_count":20534811,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ast","generic","parser","tokenizer"],"created_at":"2024-11-28T19:12:49.040Z","updated_at":"2025-03-22T09:27:12.606Z","avatar_url":"https://github.com/terotests.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# RangerParser - Language Agnostic DSL parser\n\nRangerParser is an opinionated, zero-configuration parser for typical language syntaxes. Unlike\nmany tokenizers and parsers, which require defining a grammar first, RangerParser does not\nrequire any kind of setup to parse the most common language structures. You can use it to parse\nsimple GraphQL, SQL, JavaScript or similar syntaxes without the configuration usually associated\nwith generating parsers.\n\nSo how is that possible? The trick is that the parser supports some common language elements and structures\nout of the box:\n\n```typescript\n{ // 1. curly brackets {} are parsed into block nodes. Actually, an empty file is an invisible block\n\n  if // 2. all tokens are parsed into expression lists\n\n  if () // 3. all parentheses () are parsed as expression nodes\n\n  1.456 // 4. numbers are parsed as double or int nodes\n\n  \"hello\" // 5. 
string literals are parsed as string nodes\n  'hello'\n  `hello`\n}\n```\n\nNewlines inside a block will start a new expression, but iterators ignore that.\n\nSurprisingly, those simple rules are just enough to transform most common language syntaxes into an AST tree\nwhich can be used as a basis for a language or configuration files.\n\nThus, the parser only has six (6) different main types:\n\n1. **Block** like `{}`\n2. **Expression** like `()`\n3. **Int** like `1` or `-123`\n4. **Double** like `4.5`, `-.5` or `1e-10`\n5. **String** like `\"hello\\n\"` supporting escape chars too\n6. **Token** like `if` or `while` or `+` (operators are also parsed as tokens)\n\nAdditionally, specific operators are detected and separated. For example, `x+y` would otherwise be parsed as a\nsingle token, but since `+` is detected as an operator, `x`, `+` and `y` are parsed as separate tokens.\n\n# Parser and Iterator\n\nThe parser has two basic components:\n\n1. Parser, which is used to transform a string into an AST tree\n2. Iterator, which is used to walk the AST tree according to language rules\n\nFor example, to match a simple `if` statement like this\n\n```typescript\nif (x + y) {\n} else {\n}\n```\n\nYou can create a definition which will match the `if`, `expression`, `block`, `else`, `block`\nstructure like this\n\n```typescript\n[T(\"if\"), E, Bl, T(\"else\"), Bl];\n```\n\nCode example\n\n```typescript\nimport { T, E, Bl, iterator, parse } from \"ranger-parser\";\n\ndescribe(\"Jest test example\", () =\u003e {\n  test(\"Documentation example\", () =\u003e {\n    const IF_THEN_ELSE = [T(\"if\"), E, Bl, T(\"else\"), Bl];\n    const iter = iterator(\n      parse(`\n  if( x + y ) {\n  \n  } else {\n  \n  }`)\n    );\n    let didMatch = false;\n    iter.match(IF_THEN_ELSE, ([, condition, block, , elseBlock]) =\u003e {\n      // if we have a match, this callback is called and the iterator moves forward\n      const [x, plus, y] = condition.peek(3);\n      expect(x.token).to.equal(\"x\");\n      
expect(plus.token).to.equal(\"+\");\n      expect(y.token).to.equal(\"y\");\n      didMatch = true;\n    });\n    // .... the iterator has now consumed the if statement and is ready to consume more data\n    expect(didMatch).to.be.true;\n  });\n});\n```\n\n# RangerType\n\nDefines the type of a parsed value.\n\n```typescript\nexport enum RangerType {\n  Double = 1,\n  Int = 2,\n  String = 3,\n  Token = 4\n}\n```\n\n# CodeNode\n\nCodeNode is the primitive building block of the AST\n\n```typescript\nexport class CodeNode {\n  code: SourceCode;     // source code\n  sp: number;           // start position of parsed node in source code string\n  ep: number;           // end position of parsed node in source code string\n  nodeType: RangerType;   // type of node, Double, Int, String, Token\n  isExpression: boolean;  // Expression type, like LISP syntax ( + 4 5)\n  isBlock: boolean;       // Block type like { }\n  token: string;          // parsed token\n  doubleValue: number;    // parsed value of double type\n  stringValue: string;    // parsed string value\n  intValue: number;       // parsed int value\n  children: Array\u003cCodeNode\u003e = []; // possible child nodes if expression or block\n  parent: CodeNode;       // parent node\n}\n```\n\n# Matching iterators\n\nIterators have some defined matchers to match against common types like\n\n```typescript\niterator(`1234`).m([IsInt()], ([num]) =\u003e {\n  // num.int() should now equal 1234\n});\n```\n\nEach call to `iter.match()` will return a boolean indicating whether the iterator did match or not.\nThe callback will have all matched positions filled with corresponding iterators. 
In case of a match, the iterator will consume the matched positions.\n\nHere is a list of defined trivial matchers\n\n- `T` or `IsToken` will match a token or list of tokens\n- `D` or `IsDouble` will match a double literal\n- `S` or `IsString` will match a defined string literal or any string literal\n- `I` or `IsInt` will match an integer literal\n- `E` or `IsExpression` will match an expression like `()`\n- `Bl` or `IsBlock` will match a block expression like `{}`\n- `A` or `IsAny` will match anything\n\nThere are also some grouping matchers like\n\n- `Optional` which returns the match or an empty iterator\n- `OneOf` which matches the first one of the given matchers\n- `Sequence` which matches if the given sequence is matched; the returned type is the first iterator\n\nCurrently the usage is not well documented; see the test folder for examples.\n\n# Usage and examples\n\nSee the `test/` directory\n\n# Limitations\n\nThe set of operators is fixed; an idea would be to make the operator set dynamic.\n\nKeywords could be optionally separated from operators to support better control over the structure.\n\nComments in the syntax are not supported.\n\n# Test Coverage\n\nStill needs some work to get to 100%\n\n```\n-----------------|----------|----------|----------|----------|-------------------|\nFile             |  % Stmts | % Branch |  % Funcs |  % Lines | Uncovered Line #s |\n-----------------|----------|----------|----------|----------|-------------------|\nAll files        |     83.5 |    72.06 |    85.94 |    83.26 |                   |\n CodeNode.js     |      100 |      100 |      100 |      100 |                   |\n NodeIterator.js |    83.33 |    67.06 |    80.43 |    83.02 |... 10,413,418,430 |\n RangerParser.js |    79.93 |    74.37 |      100 |    79.93 |... 
39,449,453,454 |\n RangerType.js   |      100 |      100 |      100 |      100 |                   |\n SourceCode.js   |      100 |      100 |      100 |      100 |                   |\n-----------------|----------|----------|----------|----------|-------------------|\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fterotests%2Frangerparser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fterotests%2Frangerparser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fterotests%2Frangerparser/lists"}