{"id":13880542,"url":"https://github.com/lib-ruby-parser/lib-ruby-parser","last_synced_at":"2025-07-16T17:30:40.553Z","repository":{"id":37439785,"uuid":"298071274","full_name":"lib-ruby-parser/lib-ruby-parser","owner":"lib-ruby-parser","description":"Ruby parser written in Rust","archived":true,"fork":false,"pushed_at":"2024-10-02T05:55:59.000Z","size":3312,"stargazers_count":242,"open_issues_count":4,"forks_count":10,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-07-09T12:51:52.852Z","etag":null,"topics":["parser","ruby","rust"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lib-ruby-parser.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-09-23T19:17:44.000Z","updated_at":"2025-04-05T20:55:31.000Z","dependencies_parsed_at":"2023-01-21T07:16:03.511Z","dependency_job_id":"9c37d5ba-be97-42f8-a755-83ee9e8e2e75","html_url":"https://github.com/lib-ruby-parser/lib-ruby-parser","commit_stats":{"total_commits":749,"total_committers":4,"mean_commits":187.25,"dds":0.03204272363150873,"last_synced_commit":"fc1f845c4a7f5211d4909c6e2b77b69ef2633328"},"previous_names":[],"tags_count":45,"template":false,"template_full_name":null,"purl":"pkg:github/lib-ruby-parser/lib-ruby-parser","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lib-ruby-parser%2Flib-ruby-parser","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lib-ruby-parser%2Flib-ruby-parser/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/
repositories/lib-ruby-parser%2Flib-ruby-parser/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lib-ruby-parser%2Flib-ruby-parser/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lib-ruby-parser","download_url":"https://codeload.github.com/lib-ruby-parser/lib-ruby-parser/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lib-ruby-parser%2Flib-ruby-parser/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265527541,"owners_count":23782480,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["parser","ruby","rust"],"created_at":"2024-08-06T08:03:08.885Z","updated_at":"2025-07-16T17:30:40.224Z","avatar_url":"https://github.com/lib-ruby-parser.png","language":"Rust","readme":"# lib-ruby-parser\n\n[![test](https://github.com/lib-ruby-parser/lib-ruby-parser/actions/workflows/test.yml/badge.svg)](https://github.com/lib-ruby-parser/lib-ruby-parser/actions/workflows/test.yml)\n[![unsafe forbidden](https://img.shields.io/badge/unsafe-forbidden-success.svg)](https://github.com/rust-secure-code/safety-dance/)\n[![Crates.io](https://img.shields.io/crates/v/lib-ruby-parser?color=orange)](https://crates.io/crates/lib-ruby-parser)\n[![codecov](https://codecov.io/gh/lib-ruby-parser/lib-ruby-parser/branch/master/graph/badge.svg)](https://codecov.io/gh/lib-ruby-parser/lib-ruby-parser)\n[![MIT Licence](https://badges.frapsoft.com/os/mit/mit.svg?v=103)](https://opensource.org/licenses/mit-license.php)\n[![dependency 
status](https://deps.rs/repo/github/lib-ruby-parser/lib-ruby-parser/status.svg)](https://deps.rs/repo/github/lib-ruby-parser/lib-ruby-parser)\n[![Docs](https://img.shields.io/docsrs/lib-ruby-parser)](https://docs.rs/lib-ruby-parser)\n\n\n`lib-ruby-parser` is a Ruby parser written in Rust.\n\nBasic usage:\n\n```rust\nuse lib_ruby_parser::{Parser, ParserOptions};\n\nfn main() -\u003e Result\u003c(), Box\u003cdyn std::error::Error\u003e\u003e {\n    let options = ParserOptions {\n        buffer_name: \"(eval)\".to_string(),\n        ..Default::default()\n    };\n    let mut parser = Parser::new(b\"2 + 2\".to_vec(), options);\n\n    println!(\"{:#?}\", parser.do_parse());\n\n    Ok(())\n}\n```\n\n[Full documentation](https://docs.rs/lib-ruby-parser)\n\n## Features\n\nTL;DR: it's fast, it's precise, and it has a beautiful interface.\n\nComparison with `Ripper`/`RubyVM::AST`:\n1. It's based on MRI's `parse.y`, and so it returns **exactly** the same sequence of tokens.\n2. It's been tested on the top 300 gems (by total downloads, that's about 4M LOC), `rubyspec` and `ruby/ruby` repos and there's no difference with `Ripper.lex`.\n3. It's ~5 times faster than `Ripper`: Ripper parses 4M LOC in ~24s, while `lib-ruby-parser` does it in ~4.5s. That's ~950K LOC/s. You can find benchmarks in the `bench/` directory; they don't include any IO or GC.\n4. It has a much, much better interface. The AST is strongly typed and well documented.\n5. It doesn't throw away information about tokens. All nodes have information about their source locations.\n\nComparison with [whitequark/parser](https://github.com/whitequark/parser):\n1. It's much faster (`whitequark/parser` needs ~245s to parse the same corpus of 4M LOC on the same machine)\n2. It has a very similar interface (both in terms of AST structure and error reporting)\n3. However, the AST is strongly typed, and so if something is nullable it's explicitly defined and documented.\n4. 
Importantly, it doesn't depend on Ruby.\n\nThe testing corpus has `4,176,379` LOC and `170,114,575` bytes, so the approximate parsing speed on my local machine is:\n\n| Parser            | Total time | Bytes per second | Lines per second |\n| ----------------- | ---------- | ---------------- | ---------------- |\n| lib-ruby-parser   | ~4.4s      | ~38,000,000      | ~950,000         |\n| ripper            | ~24s       | ~7,000,000       | ~175,000         |\n| whitequark/parser | ~245s      | ~700,000         | ~17,000          |\n\n## Grammar versioning\n\n`lib-ruby-parser` follows MRI/master. There are no plans to support multiple versions as `whitequark/parser` does.\n\n## Library versioning\n\n| Ruby version | lib-ruby-parser version |\n|--------------|-------------------------|\n| 3.0.0        | 3.0.0+                  |\n| 3.1.0        | 4.0.0+ruby-3.1.0        |\n\nStarting from `4.0.0`, lib-ruby-parser follows SemVer. The base version is incremented according to API changes,\nwhile the build metadata matches the current Ruby version, i.e. `X.Y.Z+ruby-A.B.C` means:\n\n+ `X.Y.Z` base version\n+ that parses Ruby `A.B.C`\n\nThe two versions are bumped independently.\n\n## Encodings\n\nBy default `lib-ruby-parser` can only parse source files encoded in `UTF-8` or `ASCII-8BIT/BINARY`.\n\nIt's possible to pass a `decoder` function in `ParserOptions` that takes a recognized (by the library) encoding and a byte array. 
It must return a UTF-8 encoded byte array or an error:\n\n```rust\nuse lib_ruby_parser::source::{InputError, Decoder, DecoderResult};\nuse lib_ruby_parser::{Parser, ParserOptions, ParserResult, LocExt};\n\nfn decode(encoding: String, input: Vec\u003cu8\u003e) -\u003e DecoderResult {\n    if \"US-ASCII\" == encoding.to_uppercase() {\n        // reencode and return Ok(result)\n        return DecoderResult::Ok(b\"# encoding: us-ascii\\ndecoded\".to_vec());\n    }\n    DecoderResult::Err(InputError::DecodingError(\n        \"only us-ascii is supported\".to_string(),\n    ))\n}\n\nlet options = ParserOptions {\n    decoder: Some(Decoder::new(Box::new(decode))),\n    ..Default::default()\n};\nlet mut parser = Parser::new(b\"# encoding: us-ascii\\n3 + 3\".to_vec(), options);\nlet ParserResult { ast, input, .. } = parser.do_parse();\n\nassert_eq!(ast.unwrap().expression().source(\u0026input).unwrap(), \"decoded\".to_string())\n```\n\n## Invalid string values\n\nRuby doesn't require string literals to be valid in their encodings. This is why the following code is valid:\n\n```ruby\n# encoding: utf-8\n\n\"\\xFF\"\n```\n\nThe byte `255` (`0xFF`) is invalid in UTF-8, but MRI ignores it.\n\nNot all languages support this, which is why string and symbol nodes encapsulate a custom `StringValue` instead of a plain `String`.\n\nIf your language supports invalid strings, you can use the raw `.bytes` of this `StringValue`. For example, a Ruby wrapper for this library could do that.\n\nIf your language doesn't support them, it is better to call `.to_string_lossy()`, which replaces all unsupported chars with a special `U+FFFD REPLACEMENT CHARACTER (�)`.\n\n## Regexes\n\nRuby constructs regexes from literals during parsing to:\n1. validate them\n2. 
declare local variables if regex is used for matching AND it contains named captures\n\nTo mirror this behavior `lib-ruby-parser` uses Oniguruma to compile, validate and parse regex literals.\n\nThis feature is disabled by default, but you can add it by enabling the `\"onig\"` feature.\n\n## Bison\n\nThe grammar of `lib-ruby-parser` is built using a [custom bison skeleton](https://github.com/iliabylich/rust-bison-skeleton) that was written for this project.\n\nFor development you need the latest version of Bison installed locally. Of course, it's not necessary for release builds from crates.io (because the compiled `parser.rs` is included in the release build AND the `build.rs` that converts it is excluded).\n\nIf you use it from GitHub directly you also need Bison (because `parser.rs` is under gitignore).\n\n## Bindings for other languages\n\n+ [C](https://github.com/lib-ruby-parser/c-bindings)\n+ [C++](https://github.com/lib-ruby-parser/cpp-bindings)\n+ [Node.js](https://github.com/lib-ruby-parser/node-bindings)\n+ [Ruby](https://github.com/lib-ruby-parser/ruby-bindings)\n+ [WASM](https://github.com/lib-ruby-parser/wasm-bindings) (with live demo)\n\n## Profiling\n\nYou can use the `parse` example:\n\n```sh\n$ cargo run --bin parse --features=bin-parse -- --print=N --run-profiler --glob \"blob/**/*.rb\"\n```\n\n## Benchmarking\n\nA codebase of 4M LOCs can be generated using a `download.rb` script:\n\n```sh\n$ ruby gems/download.rb\n```\n\nThen, run a script that compares `Ripper` and `lib-ruby-parser` (attached results are from Mar 2024):\n\n```sh\n$ ./scripts/bench.sh\nRunning lib-ruby-parser\nRun 1:\nTime taken: 4.4287733330 (total files: 17895)\nRun 2:\nTime taken: 4.4292764170 (total files: 17895)\nRun 3:\nTime taken: 4.4460961250 (total files: 17895)\nRun 4:\nTime taken: 4.4284508330 (total files: 17895)\nRun 5:\nTime taken: 4.4695665830 (total files: 17895)\n--------\nRunning MRI/ripper\nRun 1:\nTime taken: 24.790103999897838 (total files: 17894)\nRun 2:\nTime taken: 
23.145863000303507 (total files: 17894)\nRun 3:\nTime taken: 25.50493900012225 (total files: 17894)\nRun 4:\nTime taken: 24.570900999940932 (total files: 17894)\nRun 5:\nTime taken: 26.0963700003922 (total files: 17894)\n```\n\n## Fuzz testing\n\nFirst, make sure to switch to nightly:\n\n```sh\n$ rustup default nightly\n```\n\nThen install `cargo-fuzz`:\n\n```sh\n$ cargo install cargo-fuzz\n```\n\nAnd run the fuzzer (change the number of `--jobs` as you need or remove it to run only 1 parallel process):\n\n```sh\n$ RUST_BACKTRACE=1 cargo fuzz run parse --jobs=8 -- -max_len=50\n```\n","funding_links":[],"categories":["Rust"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flib-ruby-parser%2Flib-ruby-parser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flib-ruby-parser%2Flib-ruby-parser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flib-ruby-parser%2Flib-ruby-parser/lists"}