{"id":15629008,"url":"https://github.com/pwwang/regexr","last_synced_at":"2026-01-31T08:01:51.682Z","repository":{"id":48522049,"uuid":"516924216","full_name":"pwwang/regexr","owner":"pwwang","description":"Regular expressions for humans","archived":false,"fork":false,"pushed_at":"2022-07-26T05:11:50.000Z","size":511,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-06-20T23:43:21.059Z","etag":null,"topics":["regex","regular-expression","regular-expressions"],"latest_commit_sha":null,"homepage":"https://pwwang.github.io/regexr","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pwwang.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-07-23T00:57:50.000Z","updated_at":"2024-10-26T20:11:53.000Z","dependencies_parsed_at":"2022-08-31T19:21:13.737Z","dependency_job_id":null,"html_url":"https://github.com/pwwang/regexr","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/pwwang/regexr","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pwwang%2Fregexr","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pwwang%2Fregexr/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pwwang%2Fregexr/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pwwang%2Fregexr/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pwwang","download_url":"https://codeload.github.com/pwwang/regexr/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/host
s/GitHub/repositories/pwwang%2Fregexr/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28934612,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-31T07:49:44.436Z","status":"ssl_error","status_checked_at":"2026-01-31T07:49:34.274Z","response_time":128,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["regex","regular-expression","regular-expressions"],"created_at":"2024-10-03T10:25:19.013Z","updated_at":"2026-01-31T08:01:51.640Z","avatar_url":"https://github.com/pwwang.png","language":"Python","readme":"# regexr\n\nRegular expressions for humans\n\nInstead of writing a regular expression to match an URL:\n\n```python\n# need to be compiled with re.X\nregex = r'''\n    ^(?P\u003cprotocol\u003ehttp|https|ftp|mailto|file|data|irc)://\n    (?P\u003cdomain\u003e[A-Za-z0-9-]{0,63}(?:\\.[A-Za-z0-9-]{0,63})+)\n    (?::(?P\u003cport\u003e\\d{1,4}))?\n    (?P\u003cpath\u003e/*(?:/*[A-Za-z0-9\\-._]+/*)*)\n    (?:\\?(?P\u003cquery\u003e.*?))?\n    (?:\\#(?P\u003cfragment\u003e.*))?$\n'''\n```\n\nYou can write:\n\n```python\nregexr = Regexr(\n    START,\n    ## match the protocol\n    Or('http', 'https', 'ftp', 'mailto', 'file', 'data', 'irc', capture=\"protocol\"),\n    '://',\n    ## match the domain\n    Capture(\n        Repeat(OneOfChars('A-Z', 'a-z', '0-9', '-'), m=0, n=63),\n        OneOrMore(DOT, Repeat(OneOfChars('A-Z', 'a-z', '0-9', '-'), m=0, 
n=63)),\n        name=\"domain\",\n    ),\n    ## match the port\n    Maybe(':', Capture(Repeat(DIGIT, m=1, n=4), name=\"port\")),\n    ## match the path\n    Capture(\n        ZeroOrMore('/'),\n        ZeroOrMore(\n            ZeroOrMore('/'),\n            OneOrMore(OneOfChars('A-Z', 'a-z', '0-9', r'\\-._')),\n            ZeroOrMore('/'),\n        ),\n        name=\"path\",\n    ),\n    ## match the query\n    Maybe(\"?\", Capture(Lazy(MAYBE_ANYCHARS), name=\"query\")),\n    ## and finally the fragment\n    Maybe(\"#\", Capture(MAYBE_ANYCHARS, name=\"fragment\")),\n    END,\n)\n```\n\nInspired by [rex](https://github.com/r-lib/rex) for R and [Regularity](https://github.com/andrewberls/regularity) for Ruby.\n\n## Why?\n\nWe have `re.X` to compile a regular expression in verbose mode, but even then a pattern can be difficult to read and write, and it remains error-prone.\n\n- Easy to read/write regular expressions\n\n  - For example, `[]]` might take a second to understand. But we can write it as `OneOfChars(\"]\")`, which is easier to read.\n\n- Easy to write regular expressions with autocompletions from IDEs\n\n  - When we write raw regex, we can't get any hints from IDEs\n\n- Non-capturing groups wherever possible\n\n  - For example, with `Maybe(Maybe(\"a\", \"b\"))` we get `(?:(?:ab)?)?`\n\n- Easy to avoid unintentional errors\n\n  - For example, sometimes it's difficult to debug `r\"(?P\u003ca\u003e\u003e\\d+)\\D+\\a\"` when we accidentally put one more `\u003e` after the capturing name.\n\n- Easy to avoid ambiguity\n\n  - For example, `?` could be a quantifier meaning `0` or `1` match. It could also be a non-greedy (lazy) modifier for quantifiers. The two are easy to distinguish with `Maybe(...)` and `Lazy(...)` (or quantifiers with `lazy=True`).\n\n- Easily avoid unbalanced parentheses/brackets/braces\n\n  - Especially when we want to match them. 
For example, `Capture(\"(\")` instead of `(\\()`.\n\n## Usage\n### More examples\n\n- Matching a phone number like `XXX-XXX-XXXX` or `(XXX) XXX XXXX`\n\n    ```python\n    Regexr(\n        START,\n        # match the first part\n        Maybe(Capture('(', name=\"open_paren\")),\n        RepeatExact(DIGIT, m=3),\n        Conditional(\"open_paren\", yes=\")\"),\n\n        Maybe(OneOfChars('- ')),\n\n        # match the second part\n        RepeatExact(DIGIT, m=3),\n\n        Maybe(OneOfChars('- ')),\n\n        # match the third part\n        RepeatExact(DIGIT, m=4),\n        END,\n    )\n\n    # compiles to\n    # ^(?P\u003copen_paren\u003e\\()?\\d{3}(?(open_paren)\\))[- ]?\\d{3}[- ]?\\d{4}$\n    ```\n\n- Matching an IP address\n\n    ```python\n    # Define the pattern for one part of xxx.xxx.xxx.xxx\n    ip_part = Or(\n        # Use Concat instead of NonCapture to avoid brackets\n        # 250-255\n        Concat(\"25\", OneOfChars('0-5')),\n        # 200-249\n        Concat(\"2\", OneOfChars('0-4'), DIGIT),\n        # 000-199\n        Concat(Or(\"0\", \"1\"), RepeatExact(DIGIT, m=2)),\n        # 00-99\n        Repeat(DIGIT, m=1, n=2),\n    )\n\n    Regexr(\n        START,\n        ip_part,\n        RepeatExact(DOT, ip_part, m=3),\n        END,\n    )\n    # compiles to\n    # ^(?:25[0-5]|2[0-4]\\d|(?:0|1)\\d{2}|\\d{1,2})(?:\\.(?:25[0-5]|2[0-4]\\d|(?:0|1)\\d{2}|\\d{1,2})){3}$\n    ```\n\n- Matching an HTML tag roughly (without attributes)\n\n    ```python\n    Regexr(\n        START,\n        \"\u003c\", Capture(WORDS, name=\"tag\"), \"\u003e\",\n        Lazy(ANYCHARS),\n        \"\u003c/\", Captured(\"tag\"), \"\u003e\",\n        END,\n    )\n    # compiles to\n    # ^\u003c(?P\u003ctag\u003e\\w+)\u003e.+?\u003c/(?P=tag)\u003e$\n    ```\n\n### Pretty print a `Regexr` object\n\nWith the example at the very beginning (matching an URL), we can pretty print it:\n\n```\n# print(regexr.pretty())\n# 
prints:\n\n^\n(?P\u003cprotocol\u003ehttp|https|ftp|mailto|file|data|irc)\n://\n(?P\u003cdomain\u003e\n  [A-Za-z0-9-]{0,63}\n  (?:\\.[A-Za-z0-9-]{0,63})+\n)\n(?::(?P\u003cport\u003e\\d{1,4}))?\n(?P\u003cpath\u003e\n  /*\n  (?:/*[A-Za-z0-9\\-._]+/*)*\n)\n(?:\\?(?P\u003cquery\u003e.*?))?\n(?:\\#(?P\u003cfragment\u003e.*))?\n$\n```\n\n### Compile a `Regexr` directly\n\n```python\nRegexr(\"a\").compile(re.I).match(\"A\")\n# \u003cre.Match object; span=(0, 1), match='A'\u003e\n```\n\n## API documentation\n\n\u003chttps://pwwang.github.io/regexr/\u003e\n\n## TODO\n\n- Support bytes\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpwwang%2Fregexr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpwwang%2Fregexr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpwwang%2Fregexr/lists"}