{"id":13822574,"url":"https://github.com/matklad/fall","last_synced_at":"2025-04-12T13:52:44.122Z","repository":{"id":38334648,"uuid":"83023949","full_name":"matklad/fall","owner":"matklad","description":null,"archived":false,"fork":false,"pushed_at":"2022-06-06T19:17:36.000Z","size":1922,"stargazers_count":135,"open_issues_count":1,"forks_count":8,"subscribers_count":17,"default_branch":"master","last_synced_at":"2025-04-04T05:37:38.777Z","etag":null,"topics":["ide","parser-generator","syntax-tree"],"latest_commit_sha":null,"homepage":null,"language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/matklad.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-02-24T09:19:11.000Z","updated_at":"2024-11-18T02:50:25.000Z","dependencies_parsed_at":"2022-07-25T20:31:06.345Z","dependency_job_id":null,"html_url":"https://github.com/matklad/fall","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matklad%2Ffall","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matklad%2Ffall/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matklad%2Ffall/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matklad%2Ffall/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/matklad","download_url":"https://codeload.github.com/matklad/fall/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248576313,"owners_count":21127369,"icon_url":"https://github.c
om/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ide","parser-generator","syntax-tree"],"created_at":"2024-08-04T08:02:06.794Z","updated_at":"2025-04-12T13:52:44.099Z","avatar_url":"https://github.com/matklad.png","language":"Rust","readme":"# Fall: Not Yet Another Parser Generator\n\nThis is a work in progress hobby project. If you are looking for a production ready parser generator for Rust,\nconsider [pest](https://github.com/pest-parser/pest), [lalrpop](https://github.com/nikomatsakis/lalrpop) or\n[nom](https://github.com/Geal/nom). If you are looking for a production grade IDE-ready parser generator, take a look\nat [Grammar Kit](https://github.com/JetBrains/Grammar-Kit) or [Papa Carlo](https://github.com/Eliah-Lakhin/papa-carlo).\nYou might also find [tree-sitter](https://github.com/tree-sitter/tree-sitter) to be interesting.\n\n## Scope\n\nThe ambitious goal is to create a parsing framework, suitable for tools interacting with the source code, such as\neditors, IDEs, refactoring tools or code formatters.\n\n## Design constraints\n\nThe syntax tree must not be abstract, it should include all the whitespace and comments and be a lossless representation\nof the original text.\n\nAll the languages should share the same syntax tree data structure. That is, it should be possible to write non-generic\ncode for processing syntax of any language. It should also be possible to provide a single C API to interact with a\nsyntax tree of any language.\n\nParser should be able to deal with incomplete input gracefully. 
It should be able to recognize syntactic constructs\neven if some required elements are missing and it should attempt to resynchronize input after an error.\n\n## Non goals\n\nParser need not guarantee that the input grammar is unambiguous.\n\nParser need not guarantee sane worst-case performance for any grammar. Nevertheless, it is expected that most sane\nprogramming languages could be parsed efficiently.\n\n## Nice to haves\n\nImplementing parsers should be interactive: the user should see the grammar, the example input and the parse tree\nsimultaneously.\n\nParsing should be incremental: changing something inside the code block should cause only the block to be reparsed.\n\nParsing should be fast: even with incremental parsing, there are certain bad cases (unclosed quote), where one has to reparse\nthe whole input.\n\n## Code structure\n\n\n### Tree Model\n\nThe entry point is `fall/tree/src/node/mod.rs`. It defines the structure of the syntax tree which roughly looks like this:\n\n```rust\ntype NodeType = u32;\n\nstruct File { ... }\n\n#[derive(Clone, Copy)]\nstruct Node\u003c'f\u003e {\n    file: \u0026'f File,\n    ...\n}\n\nimpl\u003c'f\u003e Node\u003c'f\u003e {\n    fn ty(\u0026self) -\u003e NodeType { ... }\n    fn parent(\u0026self) -\u003e Node\u003c'f\u003e { ... }\n    fn children(\u0026self) -\u003e impl Iterator\u003cItem=Node\u003c'f\u003e\u003e { ... }\n    fn text_range(\u0026self) -\u003e (usize, usize) { ... }\n    fn text(\u0026self) -\u003e \u0026str { ... }\n}\n```\n\nThe main element is a non-generic `Node` which is a `Copy` handle representing some range in the input text, together\nwith its type (which is just an integer constant) and subranges. It is the main API that the consumers of the syntax\ntree would use.\n\nWhile having an untyped API is needed for working with several different languages together, for each particular\nlanguage a typed API is easier to work with. 
You can layer a typed API on top of Nodes easily, using the following\npattern:\n\n```rust\n\nstruct RustFunction {\n    node: Node\n}\n\nimpl RustFunction {\n    fn new(node: Node) -\u003e RustFunction {\n        assert_eq!(node.ty(), RUST_FUNCTION);\n        RustFunction { node: node }\n    }\n\n    fn name(\u0026self) -\u003e \u0026str {\n        let ident_child = child_of_type_exn(self.node, IDENT);\n        ident_child.text()\n    }\n}\n```\n\nSuch typed wrappers are generated automatically. See `fall/tree/src/ast.rs` and `fall/tree/visitor.rs` for a generic\nimplementation of this pattern and how it can be used to traverse trees in a type-safe manner (imo, this is the most\nbeautiful piece of code here so far:) ). It's also interesting that you can create a single typed wrapper around\n*several* node types, which makes it possible to express an arbitrary [non-]hierarchy of node types. See `AstClass` for details.\n\n\n### Parsing\n\nBy itself, `fall_tree` does not impose any particular way of constructing trees. It should be possible to connect it to\na hand written, a generated or an external parser. Currently a specific parser generator is the main way to create\ntrees. `fall/parse` contains the runtime for the parser (currently, the parser is mostly interpreted), and `fall_/gen`\ncontains the corresponding generator, which generates a lexer, a parser and the AST. The parser is roughly a\n\"hand-written recursive descent\" plus a (to be implemented) Pratt parser for expressions. Some call this style\nof parsing PEG.\n\n### Grammar\n\nTo learn the details of the grammar spec, it's best to read the (literalish) [grammar of the fall language itself](./lang/fall/syntax/src/fall.fall).\nOther examples are also in the `lang` subdirectory, look for the `*.fall` files.\n\nHere are some interesting highlights of the grammar.\n\nThe `\u003ccommit\u003e` specifier allows the parser to recognize incomplete syntactic constructs. 
For example, for the\n\n```\nrule function {\n  'fn' ident \u003ccommit\u003e '(' fn_args ')' '-\u003e' type expr\n}\n```\n\nthe parser would recognize `fn foo` as an incomplete function, and would give the following tree:\n\n```\nFUNCTION\n  FN\n  IDENT \"foo\"\n  ERROR '(' expected\n```\n\nThe `\u003cwith_skip to_skip rule\u003e` function allows skipping some tokens to resynchronize input. For example,\n`\u003cwith_skip 'fn' function\u003e` would skip the tokens (creating an error node) until the `fn` keyword, and then launch\nthe `function` parser.\n\nThe `\u003clayer cover contents\u003e` rule allows \"approximately\" parsing a fragment of input, which helps with error recovery\nand incremental and lazy reparsing. Let's look at a concrete example:\n\n```\npub rule block_expr {\n  '{' \u003clayer block_body {seq_expr? {'|' seq_expr}*}\u003e '}'\n}\n\nrule block_body { \u003crep balanced\u003e }\nrule balanced {\n  '{' \u003ccommit\u003e block_body '}'\n| \u003cnot '}'\u003e\n}\n```\n\nHere, `block_body` parses an arbitrary sequence of tokens with the sole restriction that `{` and `}` are balanced. When\nparsing the innards of `block_expr`, the parser would first find the borders of the `block_body`, and then it would parse\nthe contents of the `block_body` with the more detailed `{\u003copt seq_expr\u003e \u003crep {'|' seq_expr}\u003e}`. Crucially, if the\ndetailed rule fails, then all the remaining tokens inside the block body will be marked as errors, but the parsing\noutside of the blocks will continue as usual. Moreover, if the user types anything inside the block, the parser will\ncheck if the block's borders do not change (this would be the case unless `{` or `}` is typed) and if it is the case,\nit will only reparse the block itself.\n\nThe `test` blocks allow getting quick feedback about the current grammar. 
You can write something like\n\n```\npub rule struct_def {\n  \u003copt 'pub'\u003e 'struct' \u003ccommit\u003e ident\n  '{' \u003clayer block_body struct_field*\u003e'}'\n}\n\ntest r\"\n  struct Foo {\n    a: A,\n    pub b: B,\n  }\n\"\n```\n\nand then run `cargo run --bin gen --example rust.fall` to render the syntax tree of the example block. `watch.sh`\nwraps this into a convenient \"rerender example on save\" script. In the VS Code plugin, you can place the cursor on the example\nand run a Quick Fix (`Ctrl+.` by default) to render the syntax tree of the test.\n\n### VS Code plugin\n\nThere is a VS Code plugin in the `code` directory, which demonstrates how `fall` can be used from an editor. The plugin\ncurrently supports only the `fall` language itself. All features are implemented in Rust in an editor agnostic way in\n`lang/fall/src/editor_api.rs`. It should be possible to hook up this code with any editor, by either dynamically or\nstatically linking in the Rust crate, or by wrapping it into an RPC.\n\n## Current status\n\nSomething works :)\n\nUsing fall, I've implemented a more-or-less complete Rust parser (see `lang/rust/syntax`) and a library with various IDE\nfeatures implemented (see `lang/rust`). This library is then used to implement a VS Code plugin for Rust (see `code/rust`,\ninstall by running `just code-rust`). Features include:\n\n  * extend selection (pressing `ctrl+shift+right` will expand selection precisely, covering larger syntactic structures,\n    and not just braced blocks)\n  * parse-tree based syntax highlighting\n  * breadcrumbs (at the bottom of the screen, current function/impl/mod etc are shown)\n  * file structure (`ctrl+shift+o` shows a list of symbols in files)\n  * navigate to symbol (`ctrl+T` shows the list of symbols in the current project. This is CTAGS done right, with a parser\n    instead of regex, and with incremental update on editing. 
Indexing the `rust-lang/rust` repo takes about 30 seconds,\n    using a single core).\n  * rudimentary postfix templates (`foo().pd$` expands to `eprintln!(\"foo() = {:?}\", foo)`)\n  * rudimentary code-actions support (`ctrl+.` on a struct definition suggests adding an `impl` with all generics and\n    lifetimes filled-in)\n\nIn general the plugin is definitely unpolished, but is workable. Reliable symbol navigation, breadcrumbs and extend\nselection are particularly useful features! However, if you like them, just use the IntelliJ Rust plugin ;)\n\n\nAnd of course the VS Code plugin for `fall` is implemented in `fall` itself. See `lang/fall/syntax` for the parser,\n`lang/fall/src/analysis` for \"brains\", `lang/fall/src/editor` for the IDE library and `code/fall` for the actual plugin.\n`just code-fall` installs the plugin.\n\nHere's a screenshot showing the [Rust grammar](https://github.com/matklad/fall/blob/master/lang/rust/src/rust.fall),\nan inline test and the resulting syntax tree.\n\n![Rust grammar](https://user-images.githubusercontent.com/1711539/28753615-abc20a4e-753f-11e7-886d-6f1c7ddea2db.png)\n\n\n## Contributing\n\nAt the moment, there's no clear plan and set of issues to work on, however there are a lot of interesting projects to do :)\n\n* Writing grammars and tests for more languages\n\n* Actually exposing a C-API and integrating the parser with Emacs and Vim\n\n* Using xi-rope instead of string\n\n* Implementing incremental relexing\n\n* Improving the VS Code plugin\n\nWe use [just](https://github.com/casey/just) to automate code generation tasks:\n\n* `generate-parsers` -- updates the generated parser code\n\n* `update-test-data` -- fixes expected syntax trees in tests after grammar update\n\n* `code-rust`, `code-fall` -- builds VS Code 
extension\n\n","funding_links":[],"categories":["Rust"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmatklad%2Ffall","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmatklad%2Ffall","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmatklad%2Ffall/lists"}