{"id":13529916,"url":"https://github.com/ara3d/parakeet","last_synced_at":"2026-01-16T10:06:59.704Z","repository":{"id":79180798,"uuid":"159062335","full_name":"ara3d/parakeet","owner":"ara3d","description":"A fast and simple .NET parsing library ","archived":false,"fork":false,"pushed_at":"2025-06-01T17:15:56.000Z","size":14360,"stargazers_count":81,"open_issues_count":1,"forks_count":7,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-07-24T04:28:17.970Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"C#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ara3d.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-11-25T18:34:37.000Z","updated_at":"2025-07-09T22:26:35.000Z","dependencies_parsed_at":null,"dependency_job_id":"f4cb82ee-1414-4d27-8a3f-d4c9915f0da0","html_url":"https://github.com/ara3d/parakeet","commit_stats":null,"previous_names":["ara3d/parakeet"],"tags_count":6,"template":false,"template_full_name":null,"purl":"pkg:github/ara3d/parakeet","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ara3d%2Fparakeet","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ara3d%2Fparakeet/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ara3d%2Fparakeet/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ara3d%2Fparakeet/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ara3d","download_url":"https://codeload.github.com/ara3d/parakeet/tar.gz/refs/h
eads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ara3d%2Fparakeet/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28478049,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-16T06:30:42.265Z","status":"ssl_error","status_checked_at":"2026-01-16T06:30:16.248Z","response_time":107,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T07:00:40.737Z","updated_at":"2026-01-16T10:06:59.695Z","avatar_url":"https://github.com/ara3d.png","language":"C#","funding_links":[],"categories":["Methodology","Parser Library","Identifiers"],"sub_categories":["GUI - other"],"readme":"# Parakeet\n\n[![NuGet Version](https://img.shields.io/nuget/v/Ara3D.Parakeet)](https://www.nuget.org/packages/Ara3D.Parakeet)\n\n**Parakeet** is a text parsing library written in C#. Parakeet is the parsing library being used by the \n[Plato programming language project](https://github.com/cdiggins/plato) to Parse both Plato and C# source code. 
\n\n![Parakeet1](https://user-images.githubusercontent.com/1759994/222930131-4edeb2ce-757f-4471-8905-8c24ecbc67f8.png)\n\n[Image by Hugo Wai](https://unsplash.com/photos/MEborZA-3Ps)\n\n## Overview \n\nParakeet is a [recursive descent](https://en.wikipedia.org/wiki/Recursive_descent_parser) (RD) parsing library based on the \n[parsing expression grammar](https://en.wikipedia.org/wiki/Parsing_expression_grammar) (PEG) formalization\nintroduced by [Bryan Ford](https://bford.info/pub/lang/peg.pdf). Parakeet parsers are defined directly in C# using operator overloading. \nParakeet combines both lexical analysis (aka tokenization) and syntactic analysis in a single pass. \n\nSee this [CodeProject article](https://www.codeproject.com/Articles/5379232/Introduction-to-Text-Parsing-in-Csharp-using-Parak)\nfor an introduction to the core concepts of Parakeet. \n\n## More Details and Features\n\nParakeet was designed primarily for the challenge of parsing programming languages. It can be used in different contexts of course. \n\nParakeet supports:\n\n* Parsing error recovery  \n* Run-time detection of stuck parsers \n* Line number and column number reporting \n* Immutable data structures \n* Operator overloading\n* Fluent API syntax (aka method chaining)\n* Automated creation of untyped parse trees\n* Code generation for converting untyped parse trees into a strongly typed concrete syntax tree (CST). \n\n## Steps \n\n1. Define a grammar in code: a class with a set of properties that map to rules \n1. Convert the input text into a parser input object \n1. Choose the starting rule of the grammar \n1. Call the match function of the starting rule \n1. Examine the resulting `ParserState` object\n1. If the result is `null` the parser failed to match, and failed to recover \n\t* Consider adding `OnError` to your grammar\n1. Convert  \n\n## Primary Classes\n\n* `ParserInput` - Wraps a string with convenience functions to retrieve row/column, and potentially the source file name. 
\nCan be implicitly converted from a string. \n* `ParserState` - Represents a position in the input and a pointer to the most recently created parse node. \n* `ParserRange` - Contains two `ParserState` objects, one representing the beginning of a range of input and the other the end.\n* `ParserNode` - A named node in the parse node list.  \n* `ParserTree` - A tree structure created from a linked list of `ParserNode` objects.\n* `ParserCache` - Stores parser errors and successful parse results to accelerate future lookups. \n* `Rule` - Base class of a parser; provides a match function that accepts a `ParserState` and a `ParserCache`.\n* `Grammar` - Base class of a collection of parsing rules, usually defined as properties. \n* `CstNode` - Base class of typed parse trees generated from `Grammar` objects. \n* `CstClassBuilder` - Static class with functions for generating `AstNode` classes and factory functions from `ParserTree` objects. \n* `ParserError` - An error created when a `Sequence` fails to match a child rule after an `OnError` rule.\n* `ParserException` - Represents an internal parser error, which usually results from a grammar mistake.  \n\n## Grammar\n\nA `Grammar` is the base class for collections of parsing rules. Usually each parse rule in a grammar is defined \nas a computed property that returns either a `NodeRule` or a `TokenRule`. By using the functions `Node` and `Token`\nto wrap rule definitions, names are automatically assigned to each rule. \n\nThe `GrammarExtensions.cs` file contains a number of helper functions for outputting grammar definitions \nor simplifying parsing rules. \n\nWhen defining rules, it is important that any cyclical reference from a rule to itself uses at least one `RecursiveRule`\nin the relationship chain. This prevents stack overflow errors from occurring.\n \n## Rules\n\nA Parakeet parser is defined by a class deriving from [`Rule`](https://github.com/cdiggins/parakeet/blob/master/Parakeet/Rule.cs). 
Some rules are defined by combining other rules. \nThese combining rules are called \"combinators\". \n\nEvery rule has a single function:\n\n```csharp\npublic ParserState Match(ParserState state, ParserCache cache)\n```\n\n`Match` returns `null` if the rule fails to match, or a `ParserState` object if it succeeds.\n\n### Fluent Syntax for Rules\n\nRules can be combined using a fluent syntax (aka method chaining). \n\n* `rule.At()` =\u003e `new At(rule)`\n* `rule.NotAt()` =\u003e `new NotAt(rule)`\n* `rule1.Then(rule2)` =\u003e `new Sequence(rule1, rule2)`\n* `rule1.ThenNot(rule2)` =\u003e `new Sequence(rule1, rule2.NotAt())`\n* `rule.Optional()` =\u003e `new Optional(rule)`\n* `rule1.Or(rule2)` =\u003e `new Choice(rule1, rule2)`\n* `rule1.Except(rule2)` =\u003e `new Sequence(rule2.NotAt(), rule1)`\n* `rule.ZeroOrMore()` =\u003e `new ZeroOrMore(rule)`\n* `rule.OneOrMore()` =\u003e `new Sequence(rule, rule.ZeroOrMore())`\n* `char1.To(char2)` =\u003e `new CharRangeRule(char1, char2)`\n\n### Overloaded Operators for Rules\n\nRules can be combined using the following overloaded operators.\n\n* `rule1 + rule2` =\u003e `new SequenceRule(rule1, rule2)`\n* `rule1 | rule2` =\u003e `new ChoiceRule(rule1, rule2)`\n* `!rule` =\u003e `new NotAt(rule)`\n\n### Implicit Casts\n\nThe following implicit casts are defined for Rules: \n\n* `Rule(string s)` =\u003e `new StringMatchRule(s)`\n* `Rule(char c)` =\u003e `new CharMatchRule(c)`\n* `Rule(char[] cs)` =\u003e `new CharSetRule(cs)`\n* `Rule(string[] xs)` =\u003e `new Choice(xs.Select(x =\u003e (Rule)x))`\n\n### Primitive Rules\n\n* `StringMatchRule` - Matches a string\n* `AnyCharRule` - Matches any character, but fails at the end of the file \n* `CharRangeRule` - Matches any character within a range  \n* `CharSetRule` - Matches any character within a set\n* `CharMatchRule` - Matches a specific character\n\n### Rule Combinators\n\nRule combinators combine zero or more rules. 
\n\n* `ZeroOrMore` - Tries to match a child rule as many times as possible, returning the original `ParserState` if not successful at least once. \n* `Optional` - Tries to match a child rule exactly once, returning the original `ParserState` if the child rule fails. \n* `Sequence` - Matches a sequence of rules one by one, returning `null` if any of them fails.\n* `Choice` - Tries a collection of rules one by one until one succeeds, returning `null` if none does.\n* `RecursiveRule` - Matches a child rule defined by a lambda, thus allowing Rule definitions to have cycles.\n* `TokenRule` - A rule that just matches its child without creating a node. Used for defining grammars. \n* `NodeRule` - Creates a new `ParserNode` and adds it to the `ParserState`. May also eat whitespace (true by default). \n\n### Assertion Rules\n\nSeveral rules never advance the parser state:\n\n* `At` - Returns the original parser state if the child rule succeeds, or null otherwise \n* `NotAt` - Returns the original parser state if the child rule fails, or null otherwise \n* `EndOfInput` - Returns the parser state if at the end of input, or null otherwise \n\n### Error Handling Rules\n\n* `OnFail` - Used only within `Sequence` rules. Contains a child rule called the recovery rule. \nWill always succeed and return the `ParserState` when matched. If a sequence encounters an `OnFail` rule and \none of the subsequent child rules fails, the parser will then use the recovery rule to try to advance \nto a place where it is likely to be able to continue parsing successfully (e.g. the end of a statement).\n\n## Parse Trees \n\nParse trees generated from Parakeet are untyped. A parse node is created whenever a `NodeRule` rule \nsuccessfully matches its child rule against the input. After parsing, the list of parse nodes \nis converted into a tree structure. \n\n## Typed Parse Tree (CST)\n\nA set of classes representing a strongly typed parse tree can be created automatically from a Parakeet grammar. 
This is called the \nConcrete Syntax Tree (CST). Concrete syntax trees are generated from the `Ara3D.Parakeet.Grammars` project using one of the \nfunctions in the `Ara3d.Parakeet.Tests` project. \n\n## Examples of Using Parakeet \n\nThe following projects use Parakeet:\n\n* \u003chttps://github.com/ara3d/ara3d/tree/main/src/Ara3D.Parsing.Markdown\u003e\n* \u003chttps://github.com/cdiggins/Plato\u003e\n\n## History \n\nParakeet evolved from the [Jigsaw parser](https://www.codeproject.com/Articles/272494/Implementing-Programming-Languages-using-Csharp) \nand applies lessons learned when writing the [Myna parsing library in TypeScript](https://cdiggins.github.io/myna-parser/) \nas well as my first parsing library, [YARD](https://www.codeproject.com/Articles/9121/Parsing-XML-in-C-using-the-YARD-Parser). \nParakeet is designed to be as fast as possible while retaining a clean and elegant grammar description. \n\n## Related Work\n\n### C# and F# Parsing Libraries \n\n* https://github.com/benjamin-hodgson/Pidgin\n* https://github.com/sprache/Sprache\n* https://github.com/plioi/parsley\n* https://github.com/datalust/superpower\n* https://github.com/IronyProject/Irony\n* https://github.com/teo-tsirpanis/Farkle\n* https://github.com/takahisa/parseq\n* https://github.com/picoe/Eto.Parse\n* https://github.com/b3b00/CSLY\n* https://github.com/stephan-tolksdorf/fparsec\n\n### Parser Generator Tools\n\n* https://github.com/dbremner/peg-sharp  \n* https://github.com/SickheadGames/TinyPG \n* https://github.com/otac0n/Pegasus\n* https://github.com/qwertie/LLLPG-Samples\n* https://github.com/antlr\n\n## References\n\n* https://en.wikipedia.org/wiki/Parser_combinator\n* https://en.wikipedia.org/wiki/Parsing_expression_grammar\n* https://pdos.csail.mit.edu/~baford/packrat/icfp02/packrat-icfp02.pdf\n\n## FAQ \n\nQ: Aren't a parse tree and a concrete syntax tree (CST) the same thing? \n\n\u003e Yes. 
Parakeet uses the term parse tree to refer to the untyped parse tree, and CST to refer to the typed parse tree.  \n\nQ: What is the difference between a CST and an AST? \n\n\u003e The CST is generated from parsing the input text as is. \n\u003e An AST is the result of transforming the CST into a form that has the same semantic meaning but is presumably \n\u003e simpler and easier to work with. \n\nQ: Why isn't Parakeet a code generator from the beginning? \n\n\u003e I find it easier to learn, understand, use, and debug libraries that aren't generated. \n\u003e Writing an extension to Parakeet that generates parser code would not be very hard.  \n\nQ: So why is the CST code generated? \n\n\u003e Having a strongly typed CST is very beneficial when writing analysis and transformation tools, \n\u003e especially for non-trivial grammars like those of programming languages. \n\u003e I don't know of any other way in C# to generate strongly typed libraries other than generating code. \n\nQ: Isn't parsing library X faster? \n\n\u003e Maybe. I designed Parakeet to be fast enough for my needs, but I prioritized making it robust and (hopefully) easy to use. \n\u003e See the related work section for other parsing libraries to consider. \n\nQ: Can you provide some benchmarks? Or implement grammar X? \n\n\u003e I'm kind of busy getting work done. \n\u003e If you are willing to fund this project, then we should talk: email me at \u003ccdiggins@gmail.com\u003e.   \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fara3d%2Fparakeet","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fara3d%2Fparakeet","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fara3d%2Fparakeet/lists"}