{"id":22964885,"url":"https://github.com/susji/mre","last_synced_at":"2025-04-02T04:22:45.546Z","repository":{"id":45988398,"uuid":"430088559","full_name":"susji/mre","owner":"susji","description":"Toy regular expression library","archived":false,"fork":false,"pushed_at":"2021-11-22T17:31:14.000Z","size":32,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-07T19:13:48.124Z","etag":null,"topics":["regex","regular-expression"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/susji.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-11-20T11:49:16.000Z","updated_at":"2021-12-27T08:46:04.000Z","dependencies_parsed_at":"2022-08-23T14:31:07.057Z","dependency_job_id":null,"html_url":"https://github.com/susji/mre","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/susji%2Fmre","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/susji%2Fmre/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/susji%2Fmre/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/susji%2Fmre/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/susji","download_url":"https://codeload.github.com/susji/mre/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246753370,"owners_count":20828137,"icon_url":"https://github.com/github.png","ver
sion":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["regex","regular-expression"],"created_at":"2024-12-14T20:12:47.384Z","updated_at":"2025-04-02T04:22:45.532Z","avatar_url":"https://github.com/susji.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# MRE\n\n`MRE` is a simple regular expression library. It provides basic capabilities\nfor matching and extracting string contents. We assume that our input is\nUTF-8-encoded.\n\n## Technical details\n\nOur regular expressions support only regular languages, that is, we do not\nsupport backreferences. All matchers are greedy, that is, there is no\nnon-greedy `?` suffix.\n\nSubexpressions (`(..)`) imply capturing.\n\nRanges in set expressions are treated directly with their `uint32` codepoint\nvalues.\n\nIf a `regexp` does not begin with `^`, it will be evaluated as containing an\nimplicit `.*?` at the very beginning. Similarly, if a `regexp` does not end with\n`$`, it will be understood as containing an implicit `.*?` at the very end.\n\nBy default, if any of the special characters are to be used for matching\nliteral runes outside bracketed expressions (sets), they must be escaped with\n`\\`. Runes within set expressions (`[..]`) are treated literally, with the\nexception of rune ranges (`-`) and negations (`^`) -- to match those literally,\nplace them accordingly in bracketed expressions.\n\nXXX Add `]` like POSIX ERE to set matching, i.e. 
for it to be matched as a rune,\nit needs to be placed right after `[` or `[^`.\n\nWe want alternation (`|`) to bind very loosely and thus we use the traditional\nprecedence-via-nonterminal-levels approach. The grammar below deals only with\nparsing; escapes are not included because they are handled in the\nlexing phase.\n\nOur grammar is roughly the following:\n\n```ebnf\nregexp  = [ \"^\" ], { or-expr }, [ \"$\" ]\nor-expr = atoms, { \"|\", atoms }\natoms   = { atom, [ times ] }\natom    = subexpr\n        | set\n        | \".\"\n        | rune\nsubexpr = \"(\", expr, \")\"\nset     = \"[\", [ \"^\" ], { rune, [ \"-\", rune ] }, \"]\"\ntimes   = \"+\"\n        | \"*\"\n        | \"?\"\n        | \"{\", posnum, \"}\"\n        | \"{\", posnum, \",\", [ posnum ], \"}\"\nposnum  = \"0\" | digit, { digit }\ndigit   = \"0\" | ... | \"9\"\nrune    = any-unicode-codepoint\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsusji%2Fmre","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsusji%2Fmre","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsusji%2Fmre/lists"}