{"id":13496099,"url":"https://github.com/zesterer/chumsky","last_synced_at":"2025-05-13T15:08:13.794Z","repository":{"id":37779992,"uuid":"384787992","full_name":"zesterer/chumsky","owner":"zesterer","description":"Write expressive, high-performance parsers with ease.","archived":false,"fork":false,"pushed_at":"2025-04-29T17:52:12.000Z","size":4102,"stargazers_count":4022,"open_issues_count":97,"forks_count":171,"subscribers_count":23,"default_branch":"main","last_synced_at":"2025-05-06T14:55:25.826Z","etag":null,"topics":["context-free-grammar","errors","lexing","parser","parser-combinators","parsing","peg","recursive-descent-parser"],"latest_commit_sha":null,"homepage":"https://crates.io/crates/chumsky","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zesterer.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null},"funding":{"github":["zesterer"]}},"created_at":"2021-07-10T20:48:36.000Z","updated_at":"2025-05-05T19:14:23.000Z","dependencies_parsed_at":"2023-10-12T02:59:43.236Z","dependency_job_id":"8d881561-5193-418b-9f74-92ecec3efde8","html_url":"https://github.com/zesterer/chumsky","commit_stats":{"total_commits":615,"total_committers":40,"mean_commits":15.375,"dds":"0.44552845528455287","last_synced_commit":"033df6c5c925346df6070af5e0990cf57155f5ce"},"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zesterer%2Fchumsky","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories
/zesterer%2Fchumsky/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zesterer%2Fchumsky/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zesterer%2Fchumsky/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zesterer","download_url":"https://codeload.github.com/zesterer/chumsky/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253969231,"owners_count":21992262,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["context-free-grammar","errors","lexing","parser","parser-combinators","parsing","peg","recursive-descent-parser"],"created_at":"2024-07-31T19:01:42.257Z","updated_at":"2025-05-13T15:08:13.674Z","avatar_url":"https://github.com/zesterer.png","language":"Rust","funding_links":["https://github.com/sponsors/zesterer"],"categories":["Rust","parsing"],"sub_categories":[],"readme":"[![crates.io](https://img.shields.io/crates/v/chumsky.svg)](https://crates.io/crates/chumsky)\n[![crates.io](https://docs.rs/chumsky/badge.svg)](https://docs.rs/chumsky)\n[![License](https://img.shields.io/crates/l/chumsky.svg)](https://github.com/zesterer/chumsky)\n[![actions-badge](https://github.com/zesterer/chumsky/workflows/Rust/badge.svg?branch=master)](https://github.com/zesterer/chumsky/actions)\n\nChumsky is a parser library for Rust that makes writing expressive, high-performance parsers easy.\n\n\u003ca href = \"https://www.github.com/zesterer/tao\"\u003e\n    \u003cimg 
src=\"https://raw.githubusercontent.com/zesterer/chumsky/master/misc/example.png\" alt=\"Example usage with my own language, Tao\"/\u003e\n\u003c/a\u003e\n\n*Note: Error diagnostic rendering in this example is performed by [Ariadne](https://github.com/zesterer/ariadne)*\n\nAlthough chumsky is designed primarily for user-facing parsers such as compilers, chumsky is just as much at home\nparsing binary protocols at the networking layer, configuration files, or any other form of complex input validation\nthat you may need. It also has `no_std` support, making it suitable for embedded environments.\n\n## Features\n\n- 🪄 **Expressive combinators** that make writing your parser a joy\n- 🎛️ **Fully generic** across input, token, output, span, and error types\n- 📑 **Zero-copy parsing** minimises allocation by having outputs hold references/slices of the input\n- 🚦 **Flexible error recovery** strategies out of the box\n- ☑️ **Check-only mode** for fast verification of inputs, automatically supported\n- 🚀 **Internal optimiser** leverages the power of [GATs](https://smallcultfollowing.com/babysteps/blog/2022/06/27/many-modes-a-gats-pattern/) to optimise your parser for you\n- 📖 **Text-oriented parsers** for text inputs (i.e: `\u0026[u8]` and `\u0026str`)\n- 👁️‍🗨️ **Context-free grammars** are fully supported, with support for context-sensitivity\n- 🔄 **Left recursion and memoization** have opt-in support\n- 🪺 **Nested inputs** such as token trees are fully supported both as inputs and outputs\n- 🏷️ **Pattern labelling** for dynamic, user-friendly error messages\n- 🗃️ **Caching** allows parsers to be created once and reused many times\n- ↔️ **Pratt parsing** support for simple yet flexible expression parsing\n- 🪛 **no_std** support, allowing chumsky to run in embedded environments\n\n## Example\n\nSee [`examples/brainfuck.rs`](https://github.com/zesterer/chumsky/blob/main/examples/brainfuck.rs) for a full\n[Brainfuck](https://en.wikipedia.org/wiki/Brainfuck) 
interpreter\n(`cargo run --example brainfuck -- examples/sample.bf`).\n\n```rust,ignore\nuse chumsky::prelude::*;\n\n/// An AST (Abstract Syntax Tree) for Brainfuck instructions\n#[derive(Clone)]\nenum Instr {\n    Left, Right,\n    Incr, Decr,\n    Read, Write,\n    Loop(Vec\u003cSelf\u003e), // In Brainfuck, `[...]` loop instructions contain any number of instructions\n}\n\n/// A function that generates a Brainfuck parser\nfn brainfuck\u003c'a\u003e() -\u003e impl Parser\u003c'a, \u0026'a str, Vec\u003cInstr\u003e\u003e {\n    // Brainfuck syntax is recursive: each instruction can contain many sub-instructions (via `[...]` loops)\n    recursive(|bf| choice((\n        // All of the basic instructions are just single characters\n        just('\u003c').to(Instr::Left),\n        just('\u003e').to(Instr::Right),\n        just('+').to(Instr::Incr),\n        just('-').to(Instr::Decr),\n        just(',').to(Instr::Read),\n        just('.').to(Instr::Write),\n        // Loops are strings of Brainfuck instructions, delimited by square brackets\n        bf.delimited_by(just('['), just(']')).map(Instr::Loop),\n    ))\n        // Brainfuck instructions appear sequentially, so parse as many as we need\n        .repeated()\n        .collect())\n}\n\n// Parse some Brainfuck with our parser\nbrainfuck().parse(\"--[\u003e---\u003e-\u003e-\u003e++\u003e-\u003c\u003c\u003c\u003c\u003c-------]\u003e--.\u003e---------.\u003e--..+++.\u003e----.\u003e+++++++++.\u003c\u003c.+++.------.\u003c-.\u003e\u003e+.\")\n```\n\nYou can find more examples [here](https://github.com/zesterer/chumsky/tree/main/examples).\n\n## Guide and documentation\n\nChumsky has an extensive [guide](https://docs.rs/chumsky/latest/chumsky/guide) that walks you through the library: all\nthe way from setting up and basic theory to advanced uses of the crate. 
It includes technical details of chumsky's\nbehaviour, examples of uses, a handy index for all of the combinators, and even a\ntutorial that leads you through the development of a fully-functioning interpreter for a simple programming language.\n\nThe crate docs should also be similarly useful: most important functions include at least one contextually-relevant\nexample, and all crate items are fully documented.\n\nIn addition, chumsky comes with a suite of fully-fledged\n[example projects](https://github.com/zesterer/chumsky/tree/main/examples). These include:\n\n- Parsers for existing syntaxes like Brainfuck and JSON\n- Integration demos for third-party crates, like [`logos`](https://crates.io/crates/logos)\n- Parsers for new toy programming languages: a Rust-like language and a full-on lexer, parser, type-checker, and\n  interpreter for a miniature ML-like language.\n- Examples of parsing non-trivial inputs like token trees, `impl Read`ers, and zero-copy, zero-alloc parsing.\n\n## Cargo features\n\nChumsky contains several optional features that extend the crate's functionality.\n\n- `bytes`: adds support for parsing types from the [`bytes`](https://docs.rs/bytes/) crate.\n\n- `either`: implements `Parser` for `either::Either`, allowing dynamic configuration of parsers at run-time\n\n- `extension`: enables the extension API, allowing you to write your own first-class combinators that integrate with\n  and extend chumsky\n\n- `lexical-numbers`: enables use of the `Number` parser for parsing various numeric formats\n\n- `memoization`: enables [memoization](https://en.wikipedia.org/wiki/Memoization#Parsers) features\n\n- `nightly`: enables support for features only supported by the nightly Rust compiler\n\n- `pratt`: enables the [Pratt parsing](https://matklad.github.io/2020/04/13/simple-but-powerful-pratt-parsing.html)\n  combinator\n\n- `regex`: enables the regex combinator\n\n- `serde`: enables `serde` (de)serialization support 
for several types\n\n- `stacker` (enabled by default): avoid stack overflows by spilling stack data to the heap via the `stacker` crate\n\n- `std` (enabled by default): support for standard library features\n\n- `unstable`: enables experimental chumsky features (API features enabled by `unstable` are NOT considered to fall\n  under the semver guarantees of chumsky!)\n\n## *What* is a parser combinator?\n\nParser combinators are a technique for implementing parsers by defining them in terms of other parsers. The resulting\nparsers use a [recursive descent](https://en.wikipedia.org/wiki/Recursive_descent_parser) strategy to transform a stream\nof tokens into an output. Using parser combinators to define parsers is roughly analogous to using Rust's\n[`Iterator`](https://doc.rust-lang.org/std/iter/trait.Iterator.html) trait to define iterative algorithms: the\ntype-driven API of `Iterator` makes it more difficult to make mistakes and easier to encode complicated iteration logic\nthan if one were to write the same code by hand. The same is true of parser combinators.\n\n## *Why* use parser combinators?\n\nWriting parsers with good error recovery is conceptually difficult and time-consuming. It requires understanding the\nintricacies of the recursive descent algorithm, and then implementing recovery strategies on top of it. If you're\ndeveloping a programming language, you'll almost certainly change your mind about syntax in the process, leading to some\nslow and painful parser refactoring. 
Parser combinators solve both problems by providing an ergonomic API that allows\nfor rapidly iterating upon a syntax.\n\nParser combinators are also a great fit for domain-specific languages for which no parser exists yet.\nWriting a reliable, fault-tolerant parser for such situations can go from being a multi-day task to a half-hour task\nwith the help of a decent parser combinator library.\n\n## Classification\n\nChumsky's parsers are [recursive descent](https://en.wikipedia.org/wiki/Recursive_descent_parser) parsers and are\ncapable of parsing [parsing expression grammars (PEGs)](https://en.wikipedia.org/wiki/Parsing_expression_grammar), which\nincludes all known context-free languages. However, chumsky doesn't stop there: it also supports context-sensitive\ngrammars via a set of dedicated combinators that integrate cleanly with the rest of the library. This allows it to\nadditionally parse a number of context-sensitive syntaxes like Rust-style raw strings, Python-style semantic\nindentation, and much more.\n\n## Error recovery\n\nChumsky supports error recovery: it can encounter a syntax error, report it, and then\nattempt to recover into a state from which it can continue parsing. This means multiple errors can be reported at once\nand a partial [AST](https://en.wikipedia.org/wiki/Abstract_syntax_tree) can still be generated from the input for future\ncompilation stages to consume.\n\n## Performance\n\nChumsky allows you to choose your priorities. When needed, it can be configured for high-quality parser errors. It can\nalso be configured for *performance*.\n\nIt's difficult to produce general benchmark results for parser libraries. By its nature, the performance of a parser\nis intimately tied to exactly how the grammar it implements has been specified. That said, here are some numbers for a\nfairly routine JSON parsing benchmark implemented idiomatically in various libraries. 
As you can see, chumsky ranks\nquite well!\n\n| Ranking | Library                                              | Time (smaller is better) | Throughput |\n|---------|------------------------------------------------------|--------------------------|------------|\n| 1       | `chumsky` (check-only)                               | 140.77 µs                | 797 MB/s   |\n| 2       | [`winnow`](https://github.com/winnow-rs/winnow)      | 178.91 µs                | 627 MB/s   |\n| 3       | `chumsky`                                            | 210.43 µs                | 533 MB/s   |\n| 4       | [`sn`](https://github.com/Jacherr/sn) (hand-written) | 237.94 µs                | 472 MB/s   |\n| 5       | [`serde_json`](https://github.com/serde-rs/json)     | 477.41 µs                | 235 MB/s   |\n| 6       | [`nom`](https://github.com/rust-bakery/nom)          | 526.52 µs                | 213 MB/s   |\n| 7       | [`pest`](https://github.com/pest-parser/pest)        | 1.9706 ms                | 57 MB/s    |\n| 8       | [`pom`](https://github.com/J-F-Liu/pom)              | 13.730 ms                | 8 MB/s     |\n\nWhat should you take from this? It's difficult to say. 'Chumsky is faster than X' or 'chumsky is slower than Y' is too\nstrong a statement: this is just one particular benchmark with one particular set of implementations and one\nparticular workload.\n\nThat said, there is something you can take: chumsky isn't going to be your bottleneck. In this benchmark, chumsky is\nwithin 20% of the performance of the 'pack leader' and has performance comparable to a hand-written parser. 
The\nperformance standards for Rust libraries are already far above most language ecosystems, so you can be sure that\nchumsky will keep pace with your use-case.\n\nBenchmarks were performed on a single core of an AMD Ryzen 7 3700x.\n\n## Notes\n\nMy apologies to Noam for choosing such an absurd name.\n\n## License\n\nChumsky is licensed under the MIT license (see `LICENSE` in the main repository).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzesterer%2Fchumsky","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzesterer%2Fchumsky","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzesterer%2Fchumsky/lists"}