{"id":28834722,"url":"https://github.com/gruhn/regex-utils","last_synced_at":"2026-04-19T18:04:39.910Z","repository":{"id":291969968,"uuid":"918955067","full_name":"gruhn/regex-utils","owner":"gruhn","description":"TypeScript library for regex intersection, complement and other utilities that go beyond string matching.","archived":false,"fork":false,"pushed_at":"2025-05-28T21:50:41.000Z","size":303,"stargazers_count":0,"open_issues_count":2,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-06-16T10:58:23.315Z","etag":null,"topics":["javascript","regex","regexp","regular-expression","regular-expressions","typescript"],"latest_commit_sha":null,"homepage":"https://gruhn.github.io/regex-utils/","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gruhn.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-01-19T10:28:22.000Z","updated_at":"2025-05-28T21:50:07.000Z","dependencies_parsed_at":"2025-05-07T13:41:19.593Z","dependency_job_id":"90d3cae1-7cd2-412f-addd-533bd634ccdf","html_url":"https://github.com/gruhn/regex-utils","commit_stats":null,"previous_names":["gruhn/regex-utils"],"tags_count":6,"template":false,"template_full_name":null,"purl":"pkg:github/gruhn/regex-utils","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gruhn%2Fregex-utils","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gruhn%2Fregex-utils/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gruhn%2Fregex-utils/releases","man
ifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gruhn%2Fregex-utils/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gruhn","download_url":"https://codeload.github.com/gruhn/regex-utils/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gruhn%2Fregex-utils/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260726757,"owners_count":23053232,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["javascript","regex","regexp","regular-expression","regular-expressions","typescript"],"created_at":"2025-06-19T09:41:52.794Z","updated_at":"2026-04-19T18:04:39.902Z","avatar_url":"https://github.com/gruhn.png","language":"TypeScript","funding_links":[],"categories":["JavaScript regex libraries"],"sub_categories":["Regex processors, utilities, and more"],"readme":"# Regex Utils\n\nZero-dependency TypeScript library for regex utilities that go beyond string matching.\nThese are surprisingly hard to come by for any programming language. 
✨\n\n- [Documentation](https://gruhn.github.io/regex-utils/interfaces/RegexBuilder.html)\n- Online demos:\n  - [RegExp Equivalence Checker](https://gruhn.github.io/regex-utils/equiv-checker.html)\n  - [Random Password Generator](https://gruhn.github.io/regex-utils/password-generator.html?constraints=%5E%5B%5Cx21-%5Cx7E%5D%7B16%7D%24%0A%5BA-Z%5D%0A%5Ba-z%5D%0A%5B0-9%5D)\n\n## API Overview 🚀\n\n- 🔗 Set-style operations:\n  - [.and(...)](https://gruhn.github.io/regex-utils/interfaces/RegexBuilder.html#and) - Compute intersection of two regex.\n  - [.not()](https://gruhn.github.io/regex-utils/interfaces/RegexBuilder.html#not) - Compute the complement of a regex.\n  - [.without(...)](https://gruhn.github.io/regex-utils/interfaces/RegexBuilder.html#without) - Compute the difference of two regex.\n- ✅ Set-style predicates:\n  - [.isEquivalent(...)](https://gruhn.github.io/regex-utils/interfaces/RegexBuilder.html#isEquivalent) - Check whether two regex match the same strings.\n  - [.isSubsetOf(...)](https://gruhn.github.io/regex-utils/interfaces/RegexBuilder.html#isSubsetOf)\n  - [.isSupersetOf(...)](https://gruhn.github.io/regex-utils/interfaces/RegexBuilder.html#isSupersetOf)\n  - [.isDisjointFrom(...)](https://gruhn.github.io/regex-utils/interfaces/RegexBuilder.html#isDisjointFrom)\n  - [.isEmpty()](https://gruhn.github.io/regex-utils/interfaces/RegexBuilder.html#isEmpty) - Check whether a regex matches no strings.\n- 📜 Generate strings:\n  - [.sample(...)](https://gruhn.github.io/regex-utils/interfaces/RegexBuilder.html#sample) - Generate random strings matching a regex.\n  - [.enumerate()](https://gruhn.github.io/regex-utils/interfaces/RegexBuilder.html#enumerate) - Exhaustively enumerate strings matching a regex.\n- 🔧 Miscellaneous:\n  - [.size()](https://gruhn.github.io/regex-utils/interfaces/RegexBuilder.html#size) - Count the number of strings that a regex matches.\n  - [.derivative(...)](https://gruhn.github.io/regex-utils/interfaces/RegexBuilder.html#derivative) 
- Compute a Brzozowski derivative of a regex.\n- and others...\n\n## Installation 📦\n\n```bash\nnpm install @gruhn/regex-utils\n```\n```typescript\nimport { RB } from '@gruhn/regex-utils'\n```\n\n## Syntax Support\n\n| Feature | Support | Examples |\n|---------|---------|-------------|\n| Quantifiers | ✅ | `a*`, `a+`, `a{3,10}`, `a?` |\n| Lazy Quantifiers | ✅ | `a*?`, `a+?`, `a{3,10}?`, `a??` |\n| Alternation | ✅ | `a\\|b` |\n| Character classes | ✅ | `.`, `\\w`, `[a-zA-Z]`, ... |\n| Escaping | ✅ | `\\$`, `\\.`, ... |\n| (Non-)capturing groups | ✅ | `(?:...)`, `(...)` |\n| Start/end anchors | ⚠️\u003csup\u003e1\u003c/sup\u003e | `^`, `$` |\n| Lookahead | ⚠️\u003csup\u003e2\u003c/sup\u003e | `(?=...)`, `(?!...)` |\n| Lookbehind | ⚠️\u003csup\u003e2\u003c/sup\u003e | `(?\u003c=...)`, `(?\u003c!...)` |\n| Word boundary | ❌ | `\\b`, `\\B` |\n| Unicode property escapes | ❌ | `\\p{...}`, `\\P{...}` |\n| Backreferences | ❌ | `\\1` `\\2` ... |\n| `dotAll` flag | ✅ | `/.../s`, `(?s:...)` |\n| `global` flag | ✅ | `/.../g` |\n| `hasIndices` flag | ✅ | `/.../d` |\n| `ignoreCase` flag | ❌ | `/.../i` `(?i:...)` |\n| `multiline` flag | ❌ | `/.../m` `(?m:...)` |\n| `unicode` flag | ❌ | `/.../u` |\n| `unicodeSets` flag | ❌ | `/.../v` |\n| `sticky` flag | ❌ | `/.../y` |\n\n1. Some complex patterns are not supported like anchors inside quantifiers `(^a)+` or anchors inside lookaheads `(?=^a)`.\n2. 
Nested lookaheads/lookbehinds like `(?=a(?=b))` and lookahead/lookbehind combinations like `(?=a)b(?\u003c=c)` are not supported.\n\nAn `UnsupportedSyntaxError` is thrown when unsupported patterns are detected.\nThe library **SHOULD ALWAYS** either throw an error or respect the regex specification exactly.\nPlease report a bug if the library silently uses a faulty interpretation.\n\nHandling syntax-related errors:\n```typescript\nimport { RB, ParseError, UnsupportedSyntaxError } from '@gruhn/regex-utils'\n\ntry {\n  RB(/^[a-z]*$/)\n} catch (error) {\n  if (error instanceof SyntaxError) {\n    // Invalid regex syntax! Native error, not emitted by this library.\n    // E.g. this will also throw a `SyntaxError`: new RegExp(')')\n  } else if (error instanceof ParseError) {\n    // The regex syntax is valid but the internal parser could not handle it.\n    // If this happens it's a bug in this library.\n  } else if (error instanceof UnsupportedSyntaxError) {\n    // Regex syntax is valid but not supported by this library.\n  }\n}\n```\n\n## Example use cases 💡\n\n### Generate test data from regex 📜\n\nGenerate 5 random email addresses:\n```typescript\nconst email = RB(/^[a-z]+@[a-z]+\\.[a-z]{2,3}$/)\nfor (const str of email.sample().take(5)) {\n  console.log(str)\n}\n```\n```\nky@e.no\ncc@gg.gaj\nz@if.ojk\nvr@y.ehl\ne@zx.hzq\n```\n\nGenerate 5 random email addresses that have exactly 20 characters:\n```typescript\nconst emailLength20 = email.and(/^.{20}$/)\nfor (const str of emailLength20.sample().take(5)) {\n  console.log(str)\n}\n```\n```\nkahragjijttzyze@i.mv\ngnpbjzll@cwoktvw.hhd\nknqmyotxxblh@yip.ccc\nkopfpstjlnbq@lal.nmi\nvrskllsvblqb@gemi.wc\n```\n\n### Refactor regex then check equivalence 🔄\n\n[**ONLINE DEMO**](https://gruhn.github.io/regex-utils/equiv-checker.html?regexp1=%5Ea%7Cb%24\u0026regexp2=%5E%5Bab%5D%24)\n\nSay we found this incredibly complicated regex somewhere in the codebase:\n```typescript\nconst oldRegex = /^a|b$/\n```\n\nThis can be 
simplified, right?\n```typescript\nconst newRegex = /^[ab]$/\n```\n\nTo double-check, we can use `.isEquivalent` to verify that the new version matches exactly the same strings as the old version.\nThat is, whether `oldRegex.test(str) === newRegex.test(str)` for every possible input string:\n\n```typescript\nRB(oldRegex).isEquivalent(newRegex) // false\n```\n\nLooks like we made a mistake.\nWe can generate counterexamples using `.without(...)` and `.sample(...)`.\nFirst, we derive two new regex: one that matches exactly the strings that `newRegex` matches but `oldRegex` does not, and one for the reverse:\n```typescript\nconst onlyNew = RB(newRegex).without(oldRegex)\nconst onlyOld = RB(oldRegex).without(newRegex)\n```\n`onlyNew` turns out to be empty (`onlyNew.isEmpty() === true`) but `onlyOld` has some matches:\n```typescript\nfor (const str of onlyOld.sample().take(5)) {\n  console.log(str)\n}\n```\n```\naaba\naa\naba\nbab\naababa\n```\nWhy does `oldRegex` match all these strings with multiple characters?\nShouldn't it only match \"a\" or \"b\" like `newRegex`?\nIt turns out we assumed that `oldRegex` is the same as `^(a|b)$`,\nbut in reality it's the same as `(^a)|(b$)`:\nalternation has the lowest precedence, so `^` applies only to `a` and `$` only to `b`.\n\n### Comment regex using complement 💬\n\nHow do you write a regex that matches HTML comments like:\n```\n\u003c!-- This is a comment --\u003e\n```\nA straightforward attempt would be:\n```\n\u003c!--.*--\u003e\n```\nThe problem is that `.*` also matches the end marker `--\u003e`,\nso this is also a match:\n```\n\u003c!-- This is a comment --\u003e and this shouldn't be part of it --\u003e\n```\nWe need to specify that the inner part can be any string that does not contain `--\u003e`.\nWith `.not()` (aka. 
regex complement) this is easy:\n\n```typescript\nimport { RB } from '@gruhn/regex-utils'\n\nconst commentStart = RB('\u003c!--')\nconst commentInner = RB(/^.*--\u003e.*$/).not()\nconst commentEnd = RB('--\u003e')\n\nconst comment = commentStart.concat(commentInner).concat(commentEnd)\n```\n\nWith `.toRegExp()` we can convert back to a native JavaScript regex:\n```typescript\ncomment.toRegExp()\n```\n```\n/^\u003c!--(---*[^-\u003e]|-?[^-])*---*\u003e$/\n```\n\n### Password regex using intersections 🔐\n\n[**ONLINE DEMO**](https://gruhn.github.io/regex-utils/password-generator.html?constraints=%5E.%7B16%2C32%7D%24%0A%5E%5B%5Cx21-%5Cx7E%5D*%24%0A%5B0-9%5D%0A%5Ba-z%5D%0A%5BA-Z%5D)\n\nIt's difficult to write a single regex for multiple independent constraints.\nFor example, to specify a valid password.\nBut with regex intersections it's very natural:\n\n```typescript\nimport { RB } from '@gruhn/regex-utils'\n\nconst passwordRegex = RB(/^[a-zA-Z0-9]{12,32}$/) // 12-32 alphanumeric characters\n  .and(/[0-9]/) // contains a number\n  .and(/[A-Z]/) // contains an upper case letter\n  .and(/[a-z]/) // contains a lower case letter\n```\n\nWe can convert this back to a native JavaScript RegExp with:\n```typescript\npasswordRegex.toRegExp()\n```\n\u003e [!NOTE]\n\u003e The output `RegExp` can be very large.\n\nWe can also use other utilities like `.size()` to determine how many potential passwords match this regex:\n```typescript\nconsole.log(passwordRegex.size())\n```\n```\n2301586451429392354821768871006991487961066695735482449920n\n```\n\nWith `.sample()` we can generate some of these matches:\n```typescript\nfor (const str of passwordRegex.sample().take(10)) {\n  console.log(str)\n}\n```\n```\nNEWJIAXQISWT0Wwm\nlxoegadrzeynezkmtfcIBzzQ9e\nypzvhvtwpWk4u6\nMSZXXKIKEKWKXLQ8HQ7Ds\nBCBSFBSMNOLKlgQN5L\n8950244600709IW1pg\nUOTQBLVOTZQWFSAJYBXZNQBEeom0l\n520302447164378435bv4dp4ysC\n71073970686490eY2Jt4\nafgpnxqwUK5B\n```\n\n### Solve _Advent Of Code 2023 - Day 12_ 🎄\n\nIn the 
coding puzzle [Advent Of Code 2023 - Day 12](https://adventofcode.com/2023/day/12)\nyou are given pairs of string patterns.\nAn example pair is `.??..??...?##.` and `1,1,3`.\nBoth patterns describe a class of strings and the task is to count the number of strings that match both patterns.\n\nIn the first pattern, `.` and `#` stand for the literal characters \"dot\" and \"hash\".\nThe `?` stands for either `.` or `#`.\nThis can be written as a regular expression:\n\n - For `#` we simply write `#`.\n - For `.` we write `o` (since `.` is a reserved symbol in regular expressions).\n - For `?` we write `(o|#)`.\n\nSo the pattern `.??..??...?##.` would be written as:\n```typescript\nconst firstRegex = /^o(o|#)(o|#)oo(o|#)(o|#)ooo(o|#)##o$/\n```\n\nIn the second pattern, each number gives the length of a run of consecutive `#`, and runs are separated by at least one `o`.\nThis can also be written as a regular expression:\n\n - For a number like `3` we write `#{3}`.\n - Between numbers we write `o+`.\n - Additionally, arbitrarily many `o` are allowed at the start and end,\n   so we add `o*` at the start and end.\n\nThus, `1,1,3` would be written as:\n```typescript\nconst secondRegex = /^o*#{1}o+#{1}o+#{3}o*$/\n```\n\nTo solve the task and find the number of strings that match both regex,\nwe can use `.and(...)` and `.size()` from `regex-utils`.\n`.and(...)` computes the intersection of two regular expressions.\nThat is, it creates a new regex that matches exactly the strings matched by both input regex.\n```typescript\nconst intersection = RB(firstRegex).and(secondRegex)\n```\nWith `.size()` we can then determine the number of matched strings:\n```typescript\nconsole.log(intersection.size())\n```\n```\n4n\n```\n\nWhile we're at it, we can also try `.enumerate()` to list all these matches:\n```typescript\nfor (const str of intersection.enumerate()) {\n  console.log(str)\n}\n```\n```\noo#ooo#ooo###o\no#oooo#ooo###o\noo#oo#oooo###o\no#ooo#oooo###o\n```\n\nFor a full solution, check out: 
[./benchmark/aoc2023-day12.ts](./benchmark/aoc2023-day12.ts).\n\n## References 📖\n\nHeavily informed by these papers:\n- https://www.khoury.northeastern.edu/home/turon/re-deriv.pdf\n- https://courses.grainger.illinois.edu/cs374/fa2017/extra_notes/01_nfa_to_reg.pdf\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgruhn%2Fregex-utils","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgruhn%2Fregex-utils","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgruhn%2Fregex-utils/lists"}