{"id":17053698,"url":"https://github.com/kareman/patterns","last_synced_at":"2025-08-09T01:44:36.025Z","repository":{"id":63913758,"uuid":"189468567","full_name":"kareman/Patterns","owner":"kareman","description":"A Swift PEG parser","archived":false,"fork":false,"pushed_at":"2022-08-11T21:15:43.000Z","size":2507,"stargazers_count":27,"open_issues_count":5,"forks_count":1,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-03-17T22:34:03.822Z","etag":null,"topics":["grammar","grammars","parsing-expression-grammar","parsing-expression-grammars","peg","peg-parser","regexes","regexps","regular-expression","swift"],"latest_commit_sha":null,"homepage":"","language":"Swift","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kareman.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-05-30T19:10:48.000Z","updated_at":"2024-11-09T06:31:29.000Z","dependencies_parsed_at":"2023-01-14T13:31:13.403Z","dependency_job_id":null,"html_url":"https://github.com/kareman/Patterns","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kareman%2FPatterns","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kareman%2FPatterns/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kareman%2FPatterns/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kareman%2FPatterns/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kareman","download_url":"https://codeload.github.com/kareman/Patterns/tar.gz/refs/heads/mas
ter","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244995142,"owners_count":20544293,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["grammar","grammars","parsing-expression-grammar","parsing-expression-grammars","peg","peg-parser","regexes","regexps","regular-expression","swift"],"created_at":"2024-10-14T10:13:00.085Z","updated_at":"2025-03-22T17:31:22.983Z","avatar_url":"https://github.com/kareman.png","language":"Swift","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n   \u003ca href=\"https://github.com/apple/swift-package-manager\"\u003e\n      \u003cimg src=\"https://img.shields.io/badge/Swift%20Package%20Manager-compatible-brightgreen.svg\" alt=\"SPM\"\u003e\n   \u003c/a\u003e\n   \u003cimg src=\"https://img.shields.io/badge/Linux-compatible-brightgreen\" alt=\"Linux\"\u003e\n\u003c/p\u003e \n\n\n# Patterns\n\nPatterns is a Swift library for Parsing Expression Grammars (PEGs). It can be used to create expressions similar to regular expressions (regexes) and grammars (for parsers).\n\nFor general information about PEGs, see [the original paper](https://dl.acm.org/doi/10.1145/982962.964011) or [Wikipedia](https://en.wikipedia.org/wiki/Parsing_expression_grammar).\n\n## Example\n\n```swift\nlet text = \"This is a point: (43,7), so is (0, 5). 
But my final point is (3,-1).\"\n\nlet number = (\"+\" / \"-\" / \"\") • digit+\nlet point = \"(\" • Capture(name: \"x\", number)\n\t• \",\" • \" \"¿ • Capture(name: \"y\", number) • \")\"\n\nstruct Point: Codable {\n\tlet x, y: Int\n}\n\nlet points = try Parser(search: point).decode([Point].self, from: text)\n// points == [Point(x: 43, y: 7), Point(x: 0, y: 5), Point(x: 3, y: -1)]\n```\n\nSee also:\n- [Parsing Unicode property data files](https://nottoobadsoftware.com/blog/textpicker/patterns/parsing_unicode_property_data_files/)\n\n## Usage\n\nPatterns are defined directly in code, instead of in a text string.\n\n**Note**: Long patterns can give the Swift type checker a lot to think about, especially long series of `a / b / c / etc...`. To improve build times, try to split a long pattern into multiple shorter ones.\n\n### Standard PEG\n\n##### `\"text\"`\n\nText within double quotes matches that exact text, no need to escape special letters with `\\`. If you want to turn a string variable `s` into a pattern, use `Literal(s)`.\n\n##### `OneOf(...)`\n\nThis is like character classes (`[...]`) from regular expressions, and matches 1 character. `OneOf(\"aeiouAEIOU\")` matches any single character in that string, and `OneOf(\"a\"...\"e\")` matches any of \"abcde\". They can also be combined, like `OneOf(\"aeiou\", punctuation, \"x\"...\"z\")`. To match any character _except_ ..., use `OneOf(not: ...)`.\n\n\nYou can also implement one yourself:\n\n```swift\nOneOf(description: \"ten\") { character in\n\tcharacter.wholeNumberValue == 10\n}\n```\n\nIt takes a closure `@escaping (Character) -\u003e Bool` and matches any character for which the closure returns `true`. The description parameter is only used when creating a textual representation of the pattern.\n\n##### `a • b • c`\n\nThe • operator (Option-8 on U.S. keyboards, Option-Q on Norwegian ones) first matches `a`, then `b` and then `c`. 
It is used to create a pattern from a sequence of other patterns.\n\n##### `a*`  \n\nMatches 0 or more, as many as it can. It is greedy and never gives back what it has matched, like the possessive regex quantifier `a*+`. So a pattern like `a* • a` will never match anything, because `a*` always matches all it can and leaves nothing for the last `a`.\n\n##### `a+`\n\nMatches 1 or more, also as many as it can (like the possessive regex quantifier `a++`).\n\n##### `a¿`\n\nMakes `a` optional, but it always matches if it can (the `¿` character is Option-Shift-TheKeyWith?OnIt on most keyboards).\n\n##### `a / b`\n\nThis first tries the pattern on the left. If that fails, it tries the pattern on the right. This is _ordered choice_: once `a` has matched it will never go back and try `b` if a later part of the expression fails. This is the main difference between PEGs and most other grammars and regexes.\n\n##### `\u0026\u0026a • b`\n\nThe \"and predicate\" first verifies that `a` matches, then moves the position in the input back to where `a` began and continues with `b`. In other words, it verifies that both `a` and `b` match from the same position. So to match one ASCII letter you can use `\u0026\u0026ascii • letter`.\n\n##### `!a • b`\n\nThe \"not predicate\" verifies that `a` does _not_ match, then just like above it moves the position in the input back to where `a` began and continues with `b`. You can read it like \"b and not a\".\n\n#### Grammars\n\nThe main advantage of PEGs over regular expressions is that they support recursive expressions. These expressions can contain themselves, or other expressions that in turn contain them. 
Here is how you can parse simple arithmetic expressions:\n\n```swift\nlet arithmetic = Grammar { g in\n\tg.all     \u003c- g.expr • !any\n\tg.expr    \u003c- g.sum\n\tg.sum     \u003c- g.product • ((\"+\" / \"-\") • g.product)*\n\tg.product \u003c- g.power • ((\"*\" / \"/\") • g.power)*\n\tg.power   \u003c- g.value • (\"^\" • g.power)¿\n\tg.value   \u003c- digit+ / \"(\" • g.expr • \")\"\n}\n```\n\nThis will parse expressions like \"1+2-3^(4*3)/2\".\n\nThe top expression is called first. `• !any` means it must match the entire string, because only at the end of the string are there no characters left. If you want to match multiple arithmetic expressions in a string, comment out the first expression. Grammars use dynamic properties, so there is no auto-completion for the expression names.\n\n### Additions\n\nThere are predefined OneOf patterns for all the boolean `is...` properties of Swift's `Character`: `letter`, `lowercase`, `uppercase`, `punctuation`, `whitespace`, `newline`, `hexDigit`, `digit`, `ascii`, `symbol`, `mathSymbol`, `currencySymbol`.\n\nThey all have the same name as the last part of the property, except for `wholeNumber`, which is renamed to `digit` because `wholeNumber` sounds more like an entire number than a single digit.\n\nThere is also `alphanumeric`, which is a `letter` or a `digit`.\n\n##### `any`\n\nMatches any character. `!any` matches only the end of the text.\n\n##### `Line()` \n\nMatches a single line, not including the newline characters. So `Line() • Line()` will never match anything, but `Line() • \"\\n\" • Line()` matches 2 lines.\n\n`Line.Start()` matches at the beginning of the text, and after any newline characters. `Line.End()` matches at the end of the text, and right before any newline characters. They both have a length of 0, which means the next pattern will start at the same position in the text.\n\n##### `Word.Boundary()` \n\nMatches the position right before or right after a word. 
Like `Line.Start()` and `Line.End()` it also has a length of 0.\n\n##### `a.repeat(...)`\n\n`a.repeat(2)` matches 2 of that pattern in a row. `a.repeat(...2)` matches 0, 1 or 2, `a.repeat(2...)` matches 2 or more and `a.repeat(3...6)` between 3 and 6. \n\n##### `Skip() • a • b`\n\nFinds the first match of `a • b` from the current position.\n\n### Parsing\n\nTo actually use a pattern, pass it to a Parser:\n\n```swift\nlet parser = try Parser(search: a)\nfor match in parser.matches(in: text) {\n\t// ...\n}\n```\n\n`Parser(search: a)` searches for the first match for `a`. It is the same as `Parser(Skip() • a)`.\n\nThe `.matches(in: String)` method returns a lazy sequence of `Match` instances.\n\nOften we are only interested in parts of a pattern. You can use the `Capture` pattern to assign a name to those parts:\n\n```swift\nlet text = \"This is a point: (43,7), so is (0, 5). But my final point is (3,-1).\"\n\nlet number = (\"+\" / \"-\" / \"\") • digit+\nlet point = \"(\" • Capture(name: \"x\", number)\n\t• \",\" • \" \"¿ • Capture(name: \"y\", number) • \")\"\n\nstruct Point: Codable {\n\tlet x, y: Int\n}\n\nlet parser = try Parser(search: point)\nlet points = try parser.decode([Point].self, from: text)\n```\n\nOr you can use subscripting:\n\n```swift\nlet pointsAsSubstrings = parser.matches(in: text).map { match in\n\t(text[match[one: \"x\"]!], text[match[one: \"y\"]!])\n}\n```\n\nYou can also use `match[multiple: name]` to get an array if captures with that name may be matched multiple times. `match[one: name]` only returns the first capture of that name.\n\n### Inputs\n\nBy default, patterns have `String` as their input type. But you can use any `BidirectionalCollection` with `Hashable` elements for input. Just explicitly specify the input type of the first pattern, and the rest should get it automatically:\n\n```swift\nlet text = \"This is a point: (43,7), so is (0, 5). 
But my final point is (3,-1).\".utf8\n\nlet digit = OneOf\u003cString.UTF8View\u003e(UInt8(ascii: \"0\")...UInt8(ascii: \"9\"))\nlet number = (\"+\" / \"-\" / \"\") • digit+\nlet point = \"(\" • Capture(name: \"x\", number)\n\t• \",\" • \" \"¿ • Capture(name: \"y\", number) • \")\"\n\nstruct Point: Codable {\n\tlet x, y: Int\n}\n\nlet parser = try Parser(search: point)\nlet pointsAsSubstrings = parser.matches(in: text).map { match in\n\t(text[match[one: \"x\"]!], text[match[one: \"y\"]!])\n}\n```\n\n`Parser.decode` can (currently) only take String as input, but `.matches` handles all types.\n\n## Setup\n\n### [Swift Package Manager](https://swift.org/package-manager/)\n\nAdd this to your `Package.swift` file:\n\n```swift\ndependencies: [\n    .package(url: \"https://github.com/kareman/Patterns.git\", from: \"0.1.0\"),\n]\n```\n\nor choose “Add Package Dependency” from within Xcode.\n\n## Implementation\n\nPatterns is implemented using a virtual parsing machine, similar to how [LPEG](http://www.inf.puc-rio.br/~roberto/lpeg/) is [implemented](http://www.inf.puc-rio.br/~roberto/docs/peg.pdf), and the `backtrackingvm` function described [here](https://swtch.com/~rsc/regexp/regexp2.html).\n\n## Contributing\n\nContributions are most welcome 🙌.\n\n## License\n\nMIT\n\n```text\nPatterns\nCopyright © 2019\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in\nall copies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT 
NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\nTHE SOFTWARE.\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkareman%2Fpatterns","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkareman%2Fpatterns","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkareman%2Fpatterns/lists"}