{"id":23002510,"url":"https://github.com/swiiz/autoparser","last_synced_at":"2025-06-12T23:04:08.154Z","repository":{"id":268152558,"uuid":"903478002","full_name":"Swiiz/autoparser","owner":"Swiiz","description":"🦀 Generate Recursive Descent Parser using Rust macros.","archived":false,"fork":false,"pushed_at":"2024-12-23T17:14:52.000Z","size":92,"stargazers_count":6,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-22T20:44:01.966Z","etag":null,"topics":["parser","rust"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Swiiz.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-12-14T17:47:09.000Z","updated_at":"2025-01-08T21:12:28.000Z","dependencies_parsed_at":"2024-12-14T18:38:41.622Z","dependency_job_id":"1b0eb410-40c1-414e-84fb-80a04d449fa0","html_url":"https://github.com/Swiiz/autoparser","commit_stats":null,"previous_names":["swiiz/autoparser"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Swiiz/autoparser","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Swiiz%2Fautoparser","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Swiiz%2Fautoparser/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Swiiz%2Fautoparser/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Swiiz%2Fautoparser/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owner
s/Swiiz","download_url":"https://codeload.github.com/Swiiz/autoparser/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Swiiz%2Fautoparser/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259546417,"owners_count":22874561,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["parser","rust"],"created_at":"2024-12-15T07:11:16.010Z","updated_at":"2025-06-12T23:04:08.131Z","avatar_url":"https://github.com/Swiiz.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ch1 align=center\u003e🤖 Autoparser 💬\u003c/h1\u003e\n\n## Rust library to easily generate *([Recursive Descent](https://en.wikipedia.org/wiki/Recursive_descent_parser))* Parsers using macros.\n\nThis can be used to easily generate parsers for any kind of language: programming, markup, etc.\nAll the parser generation logic (+regex building) is executed at **compile time, ensuring zero runtime overhead**.\n\n## Some context\n\nWhile reading [Crafting Interpreters](https://craftinginterpreters.com) and implementing my own programming language in Rust, I explored ways to make parser creation more efficient as outlined in the book. 
Writing parsers can be quite repetitive, especially for verbose languages, so automating parts of this process can save significant effort.\n\nIn the book, the author uses Java and demonstrates how to write code that generates Java parser code: a form of [metaprogramming](https://en.wikipedia.org/wiki/Metaprogramming#:~:text=Metaprogramming%20is%20a%20computer%20programming,even%20modify%20itself%2C%20while%20running.). Since I’m working with Rust, I’ve opted to leverage **Rust’s powerful macro and type systems** to achieve a similar result. To that end, I’ve developed a library that uses macros to generate parsers.\n\nHowever, I didn’t stop at replicating the simple approach of the book: I went for a solution that automatically generates the Scanner, [Abstract-Syntax-Tree](https://en.wikipedia.org/wiki/Abstract_syntax_tree), and Parser logic.\n\n## Usage (Read the [documentation](https://swiiz.github.io/autoparser/) for more information)\n\n- Add the dependency to your `Cargo.toml` file:\n```toml\n[dependencies]\nautoparser = { git = \"https://github.com/Swiiz/autoparser\" }\n```\n\n- Define your tokens. This generates a `Scanner` struct and a `Token` enum.\n```rust\nautoparser::impl_scanner! {\n  Whitespace @regex =\u003e \"^(?\u003c__\u003e\\\\s)\", // All regexes need to have a named capture group.\n\n  Minus =\u003e \"-\",\n  Plus =\u003e \"+\",\n\n  NumberLiteral { number: u32 } @regex =\u003e \"^(?\u003cnumber\u003e(\\\\d)+)\", // Named capture groups can be used as data in the Token.\n}\n```\n\n- Define your grammar. Each rule defines a struct representing a node in the Abstract Syntax Tree. Rules are declared in [Order of precedence](https://en.wikipedia.org/wiki/Order_of_operations).\n```rust\nautoparser::impl_rules! {\n// For performance reasons you don't want to parse (.., \u003cToken\u003e, \u003cRule\u003e, ..) as the Rule. 
The parser can stop early at the \"type level\"; you may use either only Rules or only Tokens in the pattern.\n  AddOperator =\u003e Token::Plus,\n  SubOperator =\u003e Token::Minus,\n\n// When not in enum mode, on the right side of the `=\u003e` you can use any match pattern. The type of the provided pattern will be parsed.\n// Then your pattern will be tested. You can use @, if and more... for data manipulation, see the Rust match pattern docs.\n  Literal { number: u32 } =\u003e Token::NumberLiteral { number },\n\n// Rules can also be a union of rules using the `enum` keyword and the `|` operator.\n  enum Expr =\u003e AddOperation | SubOperation | Unary,\n// Rules can be recursive, however you need to use the Box\u003cT\u003e type to avoid infinite AST node size.\n  AddOperation { left: Unary, right: Box\u003cExpr\u003e } =\u003e (left, AddOperator {}, right),\n  SubOperation { left: Unary, right: Box\u003cExpr\u003e } =\u003e (left, SubOperator {}, right),\n\n  enum Unary =\u003e InverseOperation | Literal,\n  InverseOperation { literal: Literal } =\u003e (SubOperator {}, literal),\n  Literal { number: u32 } =\u003e Token::NumberLiteral { number },\n}\n```\n\nThe `Token` enum, `Scanner` struct and each AST Node can now be used together:\n```rust\n  let source = autoparser::Source {\n      name: None,\n      content: \"1 + 2 - 3\".into(),\n  };\n  let scanner = Scanner::new();\n\n  let scan = scanner\n      .scan(source)\n      .into_iter()\n      .filter(|t| t != \u0026Token::Whitespace)\n      .collect::\u003cVec\u003c_\u003e\u003e();\n\n  let mut tokens = autoparser::TokenStream::new(\u0026scan);\n  println!(\"{:#?}\", Expr::try_parse(\u0026mut tokens));\n```\n\n## Example(s)\n\n- ### [Calculator](https://github.com/Swiiz/autoparser/tree/master/examples/calculator.rs)\n  **Supporting -, +, \\*, /, parentheses, variables and operator priority in \u003c100 LOC.**\n\n  Run the example:\n  ```\n  cargo run --example calculator\n  ```\n  See the generated code 
documentation:\n  ```\n  cargo doc --example calculator --no-deps --open\n  ```\n- ### [JSON](https://github.com/Swiiz/autoparser/tree/master/examples/json.rs)\n  **Supporting tables and arrays (with iterators), int, bool and string values in \u003c100 LOC.**\n  Using Vec\\\u003cT\\\u003e and Option\\\u003cT\\\u003e utils to generate a compact AST.\n\n  Run the example:\n  ```\n  cargo run --example json\n  ```\n  See the generated code documentation:\n  ```\n  cargo doc --example json --no-deps --open\n  ```\n\n## How does it work?\n\n- **impl_scanner!()**\n   - The scanner scans the source code for the token strings and regexes. It always tries to match static strings before trying to match regexes, and returns a `Vec` of tokens.\n   - Each token is defined as a variant on the Token enum.\n   - The `Parse\u003cSelf\u003e` trait is implemented for the newly created Token enum.\n  \n- **impl_rules!()**\n    - Generates a struct/enum for each rule with its data (representing the AST node).\n    - The `Parse\u003cToken\u003e` trait is implemented for each node struct/enum; it first matches the rule, then the data.\n    - Composite rules can be defined using tuples, such as (A, B, C, ...). Unlike enums, composite rules require all child rules to match for the composite rule to succeed.\n\n****************************\n\n\u003e [!NOTE]\n\u003e This is a work in progress. Error reporting needs to be improved.\n\u003e \n\u003e Feel free to open an issue or contribute!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fswiiz%2Fautoparser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fswiiz%2Fautoparser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fswiiz%2Fautoparser/lists"}