{"id":25668014,"url":"https://github.com/DiscreteTom/whitehole","last_synced_at":"2025-02-24T10:02:47.305Z","repository":{"id":275079124,"uuid":"742829953","full_name":"DiscreteTom/whitehole","owner":"DiscreteTom","description":"A simple, fast, intuitive parser combinator framework for Rust.","archived":false,"fork":false,"pushed_at":"2025-02-23T12:43:24.000Z","size":2175,"stargazers_count":21,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-23T13:36:47.120Z","etag":null,"topics":["grammar","parse","parser","rust"],"latest_commit_sha":null,"homepage":"https://docs.rs/whitehole/","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DiscreteTom.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-01-13T13:53:15.000Z","updated_at":"2025-02-21T08:25:02.000Z","dependencies_parsed_at":"2025-01-31T04:23:41.275Z","dependency_job_id":"206ae023-48d2-4117-b2f1-65c3db0c9fb0","html_url":"https://github.com/DiscreteTom/whitehole","commit_stats":null,"previous_names":["discretetom/whitehole"],"tags_count":9,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DiscreteTom%2Fwhitehole","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DiscreteTom%2Fwhitehole/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DiscreteTom%2Fwhitehole/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DiscreteTom%2Fwhitehole/manifests"
,"owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DiscreteTom","download_url":"https://codeload.github.com/DiscreteTom/whitehole/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240457966,"owners_count":19804489,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["grammar","parse","parser","rust"],"created_at":"2025-02-24T10:02:04.807Z","updated_at":"2025-02-24T10:02:47.292Z","avatar_url":"https://github.com/DiscreteTom.png","language":"Rust","funding_links":[],"categories":["Rust"],"sub_categories":[],"readme":"# whitehole\n\n![license](https://img.shields.io/github/license/DiscreteTom/whitehole?style=flat-square)\n[![Crates.io Version](https://img.shields.io/crates/v/whitehole?style=flat-square)](https://crates.io/crates/whitehole)\n[![docs.rs](https://img.shields.io/docsrs/whitehole?style=flat-square)](https://docs.rs/whitehole/)\n[![Codecov](https://img.shields.io/codecov/c/github/DiscreteTom/whitehole?style=flat-square)](https://codecov.io/gh/DiscreteTom/whitehole)\n\nA simple, fast, intuitive parser combinator framework for Rust.\n\n## Features\n\n- Simple: only a handful of combinators to remember: `eat`, `take`, `next`, `till`, `wrap`, `recur`.\n- Operator overloading: use `+` and `|` to compose combinators, use `*` to repeat a combinator.\n- Almost zero heap allocation: this framework only uses stack memory, except `recur` which uses some pointers for recursion.\n- Re-usable heap memory: store accumulated values in a parser-managed heap, instead of re-allocation for each iteration.\n- 
Stateful: control the parsing flow with an optional custom state.\n- Safe by default, with `unsafe` variants for performance.\n- Supports both string (`\u0026str`) and byte (`\u0026[u8]`) input.\n\n## Installation\n\n```bash\ncargo add whitehole\n```\n\n## Examples\n\nSee the [examples](./examples) directory for more examples.\n\nHere is a simple example that parses [hexadecimal color codes](./examples/hex_color.rs):\n\n```rust\nuse whitehole::{\n  combinator::{eat, next},\n  parser::Parser,\n};\n\nlet double_hex = || {\n  // Repeat a combinator with `*`.\n  (next(|c| c.is_ascii_hexdigit()) * 2)\n    // Convert the matched content to `u8`.\n    .select(|accept, _| u8::from_str_radix(accept.content(), 16).unwrap())\n    // Wrap `u8` into `(u8,)`; this is required by `+` below.\n    .tuple()\n};\n\n// Concatenate multiple combinators with `+`.\n// Tuple values will be concatenated into a single tuple.\n// Here `() + (u8,) + (u8,) + (u8,)` will be `(u8, u8, u8)`.\nlet entry = eat('#') + double_hex() + double_hex() + double_hex();\n\nlet mut parser = Parser::builder().entry(entry).build(\"#FFA500\");\nlet output = parser.next().unwrap();\nassert_eq!(output.digested, 7);\nassert_eq!(output.value, (0xFF, 0xA5, 0x00));\n```\n\n## How to Debug\n\n### With Logging\n\nThe easiest way is to apply `.log(name)` to any combinator you need to inspect.\n\n\u003cdetails\u003e\n\n\u003csummary\u003e\nExample\n\u003c/summary\u003e\n\n```rust\nuse whitehole::{\n  combinator::{eat, next},\n  parser::Parser,\n};\n\nlet double_hex = || {\n  (next(|c| c.is_ascii_hexdigit()).log(\"hex\") * 2)\n    .log(\"double_hex\")\n    .select(|accept, _| u8::from_str_radix(accept.content(), 16).unwrap())\n    .tuple()\n};\n\nlet entry =\n  (eat('#').log(\"hash\") + double_hex().log(\"R\") + double_hex().log(\"G\") + double_hex().log(\"B\"))\n    .log(\"entry\");\n\nlet mut parser = Parser::builder().entry(entry).build(\"#FFA500\");\nparser.next().unwrap();\n```\n\nOutput:\n\n```text\n(entry) input: 
\"#FFA500\"\n| (hash) input: \"#FFA500\"\n| (hash) output: Some(\"#\")\n| (R) input: \"FFA500\"\n| | (double_hex) input: \"FFA500\"\n| | | (hex) input: \"FFA500\"\n| | | (hex) output: Some(\"F\")\n| | | (hex) input: \"FA500\"\n| | | (hex) output: Some(\"F\")\n| | (double_hex) output: Some(\"FF\")\n| (R) output: Some(\"FF\")\n| (G) input: \"A500\"\n| | (double_hex) input: \"A500\"\n| | | (hex) input: \"A500\"\n| | | (hex) output: Some(\"A\")\n| | | (hex) input: \"500\"\n| | | (hex) output: Some(\"5\")\n| | (double_hex) output: Some(\"A5\")\n| (G) output: Some(\"A5\")\n| (B) input: \"00\"\n| | (double_hex) input: \"00\"\n| | | (hex) input: \"00\"\n| | | (hex) output: Some(\"0\")\n| | | (hex) input: \"0\"\n| | | (hex) output: Some(\"0\")\n| | (double_hex) output: Some(\"00\")\n| (B) output: Some(\"00\")\n(entry) output: Some(\"#FFA500\")\n```\n\n\u003c/details\u003e\n\nIf you need to inspect your custom state and heap, you can use combinator decorators or write your own combinator extensions to achieve this.\n\n### With Breakpoints\n\nBecause of the high level abstraction, it's hard to set breakpoints to combinators.\n\nOne workaround is to use `wrap` to wrap your combinator in a closure or function and manually call `Action::exec`.\n\n\u003cdetails\u003e\n\n\u003csummary\u003e\nExample\n\u003c/summary\u003e\n\n```rust\nuse whitehole::{\n  combinator::{eat, next},\n  parser::Parser,\n};\n\nlet double_hex = || {\n  (next(|c| c.is_ascii_hexdigit()) * 2)\n    .select(|accept, _| u8::from_str_radix(accept.content(), 16).unwrap())\n    .tuple()\n};\n// wrap the original combinator\nlet double_hex = || {\n  use whitehole::{action::Action, combinator::wrap};\n  let c = double_hex();\n  wrap(move |instant, ctx| {\n    // set a breakpoint here\n    c.exec(instant, ctx)\n  })\n};\n\nlet entry = eat('#') + double_hex() + double_hex() + double_hex();\n\nlet mut parser = Parser::builder().entry(entry).build(\"#FFA500\");\nparser.next().unwrap();\n```\n\n\u003c/details\u003e\n\n## 
[Documentation](https://docs.rs/whitehole/)\n\n## [Benchmarks](https://github.com/DiscreteTom/whitehole-bench)\n\n## Related\n\n- [`in_str`](https://github.com/DiscreteTom/in_str/): a procedural macro to generate a closure that checks if a character is in the provided literal string.\n\n## Credits\n\nThis project is inspired by:\n\n- [nom](https://github.com/rust-bakery/nom)\n- [combine](https://github.com/Marwes/combine)\n- [tree-sitter](https://github.com/tree-sitter/tree-sitter)\n- [retsac](https://github.com/DiscreteTom/retsac)\n\n## [CHANGELOG](./CHANGELOG.md)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FDiscreteTom%2Fwhitehole","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FDiscreteTom%2Fwhitehole","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FDiscreteTom%2Fwhitehole/lists"}