{"id":18742721,"url":"https://github.com/j-f-liu/pom","last_synced_at":"2025-05-14T06:12:10.163Z","repository":{"id":39908872,"uuid":"77516069","full_name":"J-F-Liu/pom","owner":"J-F-Liu","description":"PEG parser combinators using operator overloading without macros.","archived":false,"fork":false,"pushed_at":"2025-01-24T13:02:20.000Z","size":363,"stargazers_count":510,"open_issues_count":12,"forks_count":34,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-04-12T22:16:43.773Z","etag":null,"topics":["parser-combinators","parsing","peg","rust"],"latest_commit_sha":null,"homepage":null,"language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/J-F-Liu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-12-28T08:22:51.000Z","updated_at":"2025-04-03T06:37:20.000Z","dependencies_parsed_at":"2024-01-20T18:04:43.989Z","dependency_job_id":"b88bbc46-dd70-4298-b6f7-04986c6960c7","html_url":"https://github.com/J-F-Liu/pom","commit_stats":{"total_commits":122,"total_committers":19,"mean_commits":6.421052631578948,"dds":"0.23770491803278693","last_synced_commit":"10a49c508d39bd8ca90775c37e54c7c6757ac904"},"previous_names":[],"tags_count":18,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/J-F-Liu%2Fpom","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/J-F-Liu%2Fpom/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/J-F-Liu%2Fpom/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/
GitHub/repositories/J-F-Liu%2Fpom/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/J-F-Liu","download_url":"https://codeload.github.com/J-F-Liu/pom/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248637787,"owners_count":21137538,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["parser-combinators","parsing","peg","rust"],"created_at":"2024-11-07T16:09:04.788Z","updated_at":"2025-04-12T22:16:49.109Z","avatar_url":"https://github.com/J-F-Liu.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# pom\n\n[![Crates.io](https://img.shields.io/crates/v/pom.svg)](https://crates.io/crates/pom)\n[![Build Status](https://travis-ci.org/J-F-Liu/pom.png)](https://travis-ci.org/J-F-Liu/pom)\n[![Docs](https://docs.rs/pom/badge.svg)](https://docs.rs/pom)\n[![Discord](https://img.shields.io/badge/discord-pom-red.svg)](https://discord.gg/CVy85pg)\n\nPEG parser combinators created using operator overloading without macros.\n\n## Document\n\n- [Tutorial](https://github.com/J-F-Liu/pom/blob/master/doc/article.md)\n- [API Reference](https://docs.rs/crate/pom/)\n- [Learning Parser Combinators With Rust](https://bodil.lol/parser-combinators/) - By Bodil Stokke\n\n## What is PEG?\n\nPEG stands for parsing expression grammar, is a type of analytic formal grammar, i.e. 
it describes a formal language in terms of a set of rules for recognizing strings in the language.\nUnlike CFGs, PEGs cannot be ambiguous; if a string parses, it has exactly one valid parse tree.\nEach parsing function conceptually takes an input string as its argument, and yields one of the following results:\n- success, in which the function may optionally move forward or consume one or more characters of the input string supplied to it, or\n- failure, in which case no input is consumed.\n\nRead more on [Wikipedia](https://en.wikipedia.org/wiki/Parsing_expression_grammar).\n\n## What is a parser combinator?\n\nA parser combinator is a higher-order function that accepts several parsers as input and returns a new parser as its output.\nParser combinators enable a recursive descent parsing strategy that facilitates modular piecewise construction and testing.\n\nParsers built using combinators are straightforward to construct, readable, modular, well-structured and easily maintainable.\nWith operator overloading, a parser combinator can take the form of an infix operator, used to glue different parsers to form a complete rule.\nParser combinators thereby enable parsers to be defined in an embedded style, in code which is similar in structure to the rules of the formal grammar.\nThe resulting code is also easier to debug than macro-based parsers.\n\nThe main advantage is that you don't need to go through any kind of code generation step; you're always using the vanilla language underneath.\nAside from build issues (and the usual issues around error messages and debuggability, which in fairness are about as bad with macros as with code generation), it's usually easier to freely intermix grammar expressions and plain code.\n\n## List of predefined parsers and combinators\n\n| Basic Parsers    | Description                                                     |\n|------------------|-----------------------------------------------------------------|\n| empty()          | Always succeeds, consume no 
input.                              |\n| end()            | Match end of input.                                             |\n| any()            | Match any symbol and return the symbol.                         |\n| sym(t)           | Match a single terminal symbol _t_.                             |\n| seq(s)           | Match sequence of symbols.                                      |\n| list(p,s)        | Match list of _p_, separated by _s_.                            |\n| one_of(set)      | Success when current input symbol is one of the set.            |\n| none_of(set)     | Success when current input symbol is none of the set.           |\n| is_a(predicate)  | Success when predicate return true on current input symbol.     |\n| not_a(predicate) | Success when predicate return false on current input symbol.    |\n| take(n)          | Read _n_ symbols.                                               |\n| skip(n)          | Skip _n_ symbols.                                               |\n| call(pf)         | Call a parser factory, can be used to create recursive parsers. |\n\n| Parser Combinators | Description                                                                                                                                                                                    |\n|--------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| p \u0026#124; q         | Match p or q, return result of the first success.                                                                                                                                              |\n| p + q              | Match p and q, if both succeed return a pair of results.                                                                                                                                       
|\n| p - q              | Match p and q, if both succeed return result of p.                                                                                                                                             |\n| p \\* q             | Match p and q, if both succeed return result of q.                                                                                                                                             |\n| p \u003e\u003e q             | Parse p and get result P, then parse q and return result of q(P).                                                                                                                              |\n| -p                 | Success when p succeeds, doesn't consume input.                                                                                                                                                |\n| !p                 | Success when p fails, doesn't consume input.                                                                                                                                                   |\n| p.opt()            | Make parser optional. Returns an `Option`.                                                                                                                                                     |\n| p.repeat(m..n)     | `p.repeat(0..)` repeat p zero or more times\u003cbr\u003e`p.repeat(1..)` repeat p one or more times\u003cbr\u003e`p.repeat(1..4)` match p at least 1 and at most 3 times\u003cbr\u003e`p.repeat(5)` repeat p exactly 5 times |\n| p.map(f)           | Convert parser result to desired value.                                                                                                                                                        |\n| p.convert(f)       | Convert parser result to desired value, fails in case of conversion error.                                                                                                                     
|\n| p.pos()            | Get input position after matching p.                                                                                                                                                           |\n| p.collect()        | Collect all matched input symbols.                                                                                                                                                             |\n| p.discard()        | Discard parser output.                                                                                                                                                                         |\n| p.name(\\_)         | Give parser a name to identify parsing errors.\u003cbr\u003eIf the `trace` feature is enabled then a basic trace for the parse and parse result is made to stderr.                                       |\n| p.expect(\\_)       | Mark parser as expected, abort early when failed in ordered choice.                                                                                                                            |\n\nThe choice of operators is established by their operator precedence, arity and \"meaning\".\nUse `*` to ignore the result of first operand on the start of an expression, `+` and `-` can fulfill the need on the rest of the expression.\n\nFor example, `A * B * C - D + E - F` will return the results of C and E as a pair.\n\n## Example code\n\n```rust\nuse pom::parser::*;\n\nlet input = b\"abcde\";\nlet parser = sym(b'a') * none_of(b\"AB\") - sym(b'c') + seq(b\"de\");\nlet output = parser.parse(input);\nassert_eq!(output, Ok( (b'b', vec![b'd', b'e'].as_slice()) ) );\n```\n\n### Example JSON parser\n\n```rust\nextern crate pom;\nuse pom::parser::*;\nuse pom::Parser;\n\nuse std::collections::HashMap;\nuse std::str::{self, FromStr};\n\n#[derive(Debug, PartialEq)]\npub enum JsonValue 
{\n\tNull,\n\tBool(bool),\n\tStr(String),\n\tNum(f64),\n\tArray(Vec\u003cJsonValue\u003e),\n\tObject(HashMap\u003cString,JsonValue\u003e)\n}\n\nfn space() -\u003e Parser\u003cu8, ()\u003e {\n\tone_of(b\" \\t\\r\\n\").repeat(0..).discard()\n}\n\nfn number() -\u003e Parser\u003cu8, f64\u003e {\n\tlet integer = one_of(b\"123456789\") - one_of(b\"0123456789\").repeat(0..) | sym(b'0');\n\tlet frac = sym(b'.') + one_of(b\"0123456789\").repeat(1..);\n\tlet exp = one_of(b\"eE\") + one_of(b\"+-\").opt() + one_of(b\"0123456789\").repeat(1..);\n\tlet number = sym(b'-').opt() + integer + frac.opt() + exp.opt();\n\tnumber.collect().convert(str::from_utf8).convert(|s|f64::from_str(\u0026s))\n}\n\nfn string() -\u003e Parser\u003cu8, String\u003e {\n\tlet special_char = sym(b'\\\\') | sym(b'/') | sym(b'\"')\n\t\t| sym(b'b').map(|_|b'\\x08') | sym(b'f').map(|_|b'\\x0C')\n\t\t| sym(b'n').map(|_|b'\\n') | sym(b'r').map(|_|b'\\r') | sym(b't').map(|_|b'\\t');\n\tlet escape_sequence = sym(b'\\\\') * special_char;\n\tlet string = sym(b'\"') * (none_of(b\"\\\\\\\"\") | escape_sequence).repeat(0..) 
- sym(b'\"');\n\tstring.convert(String::from_utf8)\n}\n\nfn array() -\u003e Parser\u003cu8, Vec\u003cJsonValue\u003e\u003e {\n\tlet elems = list(call(value), sym(b',') * space());\n\tsym(b'[') * space() * elems - sym(b']')\n}\n\nfn object() -\u003e Parser\u003cu8, HashMap\u003cString, JsonValue\u003e\u003e {\n\tlet member = string() - space() - sym(b':') - space() + call(value);\n\tlet members = list(member, sym(b',') * space());\n\tlet obj = sym(b'{') * space() * members - sym(b'}');\n\tobj.map(|members|members.into_iter().collect::\u003cHashMap\u003c_,_\u003e\u003e())\n}\n\nfn value() -\u003e Parser\u003cu8, JsonValue\u003e {\n\t( seq(b\"null\").map(|_|JsonValue::Null)\n\t| seq(b\"true\").map(|_|JsonValue::Bool(true))\n\t| seq(b\"false\").map(|_|JsonValue::Bool(false))\n\t| number().map(|num|JsonValue::Num(num))\n\t| string().map(|text|JsonValue::Str(text))\n\t| array().map(|arr|JsonValue::Array(arr))\n\t| object().map(|obj|JsonValue::Object(obj))\n\t) - space()\n}\n\npub fn json() -\u003e Parser\u003cu8, JsonValue\u003e {\n\tspace() * value() - end()\n}\n\nfn main() {\n\tlet input = br#\"\n\t{\n        \"Image\": {\n            \"Width\":  800,\n            \"Height\": 600,\n            \"Title\":  \"View from 15th Floor\",\n            \"Thumbnail\": {\n                \"Url\":    \"http://www.example.com/image/481989943\",\n                \"Height\": 125,\n                \"Width\":  100\n            },\n            \"Animated\" : false,\n            \"IDs\": [116, 943, 234, 38793]\n        }\n    }\"#;\n\n\tprintln!(\"{:?}\", json().parse(input));\n}\n```\n\nYou can run this example with the following command:\n\n```\ncargo run --example json\n```\n\n## Benchmark\n\n| Parser                                               | Time to parse the same JSON file |\n|------------------------------------------------------|----------------------------------|\n| pom: json_byte                                       | 621,319 ns/iter (+/- 20,318)     |\n| pom: json_char   
                                    | 627,110 ns/iter (+/- 11,463)     |\n| [pest](https://github.com/dragostis/pest): json_char | 13,359 ns/iter (+/- 811)         |\n\n### Lifetimes and files\n\nString literals have a `'static` lifetime, so they work with the static version of `Parser`\nimported from `pom::Parser`. Input read from a file has a shorter lifetime. In that case,\nimport `pom::parser::Parser` instead and declare lifetimes on your parser functions. For example,\n\n```rust\nfn space() -\u003e Parser\u003cu8, ()\u003e {\n    one_of(b\" \\t\\r\\n\").repeat(0..).discard()\n}\n```\n\nwould become\n\n```rust\nfn space\u003c'a\u003e() -\u003e Parser\u003c'a, u8, ()\u003e {\n    one_of(b\" \\t\\r\\n\").repeat(0..).discard()\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fj-f-liu%2Fpom","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fj-f-liu%2Fpom","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fj-f-liu%2Fpom/lists"}