{"id":13747308,"url":"https://github.com/jaynetics/js_regex","last_synced_at":"2025-04-15T03:49:30.698Z","repository":{"id":1759557,"uuid":"44188065","full_name":"jaynetics/js_regex","owner":"jaynetics","description":"Converts Ruby regexes to JavaScript regexes.","archived":false,"fork":false,"pushed_at":"2025-01-27T19:01:36.000Z","size":472,"stargazers_count":75,"open_issues_count":9,"forks_count":12,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-04-15T03:49:26.941Z","etag":null,"topics":["javascript","regular-expression","ruby"],"latest_commit_sha":null,"homepage":"","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jaynetics.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2015-10-13T16:00:43.000Z","updated_at":"2025-03-19T17:59:29.000Z","dependencies_parsed_at":"2024-06-18T16:41:37.134Z","dependency_job_id":"ee161ab0-5652-4e34-80e0-d8fd9454ef89","html_url":"https://github.com/jaynetics/js_regex","commit_stats":{"total_commits":240,"total_committers":6,"mean_commits":40.0,"dds":"0.15416666666666667","last_synced_commit":"1b70a23e7c79e26392503770bf76b80644f9b501"},"previous_names":[],"tags_count":32,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jaynetics%2Fjs_regex","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jaynetics%2Fjs_regex/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jaynetics%2Fjs_regex/releases","manifests_url":"https://repos.ecosyste.ms/ap
i/v1/hosts/GitHub/repositories/jaynetics%2Fjs_regex/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jaynetics","download_url":"https://codeload.github.com/jaynetics/js_regex/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249003942,"owners_count":21196794,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["javascript","regular-expression","ruby"],"created_at":"2024-08-03T06:01:24.584Z","updated_at":"2025-04-15T03:49:30.682Z","avatar_url":"https://github.com/jaynetics.png","language":"Ruby","readme":"# JsRegex\r\n\r\n[![Gem Version](https://badge.fury.io/rb/js_regex.svg)](http://badge.fury.io/rb/js_regex)\r\n[![Build Status](https://github.com/jaynetics/js_regex/workflows/tests/badge.svg)](https://github.com/jaynetics/js_regex/actions)\r\n[![Build Status](https://github.com/jaynetics/js_regex/workflows/gouteur/badge.svg)](https://github.com/jaynetics/js_regex/actions)\r\n[![Coverage](https://codecov.io/gh/jaynetics/js_regex/branch/main/graph/badge.svg?token=jYoA3bnAKY)](https://codecov.io/gh/jaynetics/js_regex)\r\n\r\nThis is a Ruby gem that translates Ruby's regular expressions to various JavaScript flavors.\r\n\r\nIt can handle [almost all of Ruby's regex features](#SF), unlike a [search-and-replace approach](https://github.com/rails/rails/blob/b67043393b5ed6079989513299fe303ec3bc133b/actionpack/lib/action_dispatch/routing/inspector.rb#L42). 
If any incompatibilities remain, it returns [helpful warnings](#HW) to indicate them.\r\n\r\n## Installation\r\n\r\nAdd it to your Gemfile or run\r\n\r\n    gem install js_regex\r\n\r\n## Usage\r\n\r\n### Basic usage\r\n\r\nIn Ruby:\r\n\r\n```ruby\r\nrequire 'js_regex'\r\n\r\nruby_hex_regex = /0x\\h+/i\r\n\r\njs_regex = JsRegex.new(ruby_hex_regex)\r\n\r\njs_regex.warnings # =\u003e []\r\njs_regex.source # =\u003e '0x[0-9A-F]+'\r\njs_regex.options # =\u003e 'i'\r\n```\r\n\r\nTo inject the result directly into JavaScript, use `#to_s` or String interpolation. E.g. in inline JavaScript in HAML or SLIM you can simply do:\r\n\r\n```javascript\r\nvar regExp = #{js_regex};\r\n```\r\n\r\nUse `#to_json` if you want to send it as JSON or `#to_h` to include it as a data attribute of a DOM element.\r\n\r\n```ruby\r\nrender json: js_regex\r\n\r\njs_regex.to_h # =\u003e { source: '0x[0-9A-F]+', options: 'i' }\r\n```\r\n\r\nTo turn the data attribute or parsed JSON back into a RegExp in JavaScript, use the `new RegExp()` constructor:\r\n\r\n```javascript\r\nvar regExp = new RegExp(jsonObj.source, jsonObj.options);\r\n```\r\n\r\n\u003ca name='HW'\u003e\u003c/a\u003e\r\n### Heed the Warnings\r\n\r\nYou might have noticed the empty `warnings` array in the example above:\r\n\r\n```ruby\r\njs_regex = JsRegex.new(ruby_hex_regex)\r\njs_regex.warnings # =\u003e []\r\n```\r\n\r\nIf this array isn't empty, that means that your Ruby regex contained some stuff that can't be carried over to JavaScript. You can still use the result, but this is not recommended. 
Most likely it won't match the same strings as your Ruby regex.\r\n\r\n```ruby\r\nadvanced_ruby_regex = /(?\u003c!fizz)buzz/\r\n\r\njs_regex = JsRegex.new(advanced_ruby_regex)\r\njs_regex.warnings # =\u003e [\"Dropped unsupported negated lookbehind '(?\u003c!fizz)' at index 0 (requires at least `target: 'ES2018'`)\"]\r\njs_regex.source # =\u003e 'buzz'\r\n```\r\n\r\nThere is also a strict initializer, `JsRegex::new!`, which raises a `JsRegex::Error` if there are incompatibilities. This is particularly useful if you use JsRegex to convert regex-like strings, e.g. strings entered by users, as a `JsRegex::Error` might also occur if the given regex is invalid:\r\n\r\n```ruby\r\nbegin\r\n  user_input = '('\r\n  JsRegex.new(user_input)\r\nrescue JsRegex::Error =\u003e e\r\n  e.message # =\u003e \"Premature end of pattern (missing group closing parenthesis)\"\r\nend\r\n```\r\n\r\n### Modifying RegExp options/flags\r\n\r\nAn `options:` argument lets you append options (a.k.a. \"flags\") to the output:\r\n\r\n```ruby\r\nJsRegex.new(/x/i, options: 'g').to_h\r\n# =\u003e { source: 'x', options: 'gi' }\r\n```\r\n\r\nSet the [g flag](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/global) like this if you want to use the regex to find or replace multiple matches per string.\r\n\r\n### Converting for modern JavaScript\r\n\r\nA `target:` argument can be given to target more recent versions of JS and unlock extra features or nicer output. `'ES2009'` is the default target. 
`'ES2015'` and `'ES2018'` are also available.\r\n\r\n```ruby\r\n# ES2015 and greater use the u-flag to avoid lengthy escape sequences\r\nJsRegex.new(/😋/, target: 'ES2009').to_s # =\u003e \"/(?:\\\\uD83D\\\\uDE0B)/\"\r\nJsRegex.new(/😋/, target: 'ES2015').to_s # =\u003e \"/😋/u\"\r\nJsRegex.new(/😋/, target: 'ES2018').to_s # =\u003e \"/😋/u\"\r\n\r\n# ES2018 adds support for lookbehinds, properties etc.\r\nJsRegex.new(/foo\\K\\p{ascii}/, target: 'ES2015').to_s # =\u003e \"/foo[\\x00-\\x7f]/\"\r\nJsRegex.new(/foo\\K\\p{ascii}/, target: 'ES2018').to_s # =\u003e \"/(?\u003c=foo)\\p{ASCII}/\"\r\n```\r\n\r\n\u003ca name='SF'\u003e\u003c/a\u003e\r\n## Supported Features\r\n\r\nThese are the supported features by target.\r\n\r\nUnsupported features are at the bottom of this list.\r\n\r\nWhen converting a Regexp that contains unsupported features, corresponding parts of the pattern are dropped from the result and warnings are emitted.\r\n\r\n\r\n| Description                 | Example              | ES2009 | ES2015 | ES2018 |\r\n|-----------------------------|----------------------|--------|--------|--------|\r\n| anchors                     | \\A, \\z, ^, $         | ✓ [6]  | ✓ [6]  | ✓      |\r\n| escaped meta chars          | \\\\\\A                 | ✓      | ✓      | ✓      |\r\n| dot matching astral chars   | /./ =~ '😋'          | ✓      | ✓      | ✓      |\r\n| Ruby's multiline mode [1]   | /.+/m                | ✓      | ✓      | ✓      |\r\n| Ruby's free-spacing mode    | / http (s?) /x       | ✓      | ✓      | ✓      |\r\n| possessive quantifiers [2]  | ++, *+, ?+           | ✓      | ✓      | ✓      |\r\n| atomic groups [2]           | a(?\u003ebc\\|b)c          | ✓      | ✓      | ✓      |\r\n| conditionals [2]            | (?('a')b\\|c)         | ✓      | ✓      | ✓      |\r\n| option groups/switches      | (?i-m:..), (?x)..    
| ✓      | ✓      | ✓      |\r\n| local encoding options      | (?u:\\w)              | ✓      | ✓      | ✓      |\r\n| absence groups              | /\\\\\\*(?~\\\\\\*/)\\\\\\*/  | ✓      | ✓      | ✓      |\r\n| chained quantifiers         | /A{2}{4}/ =~ 'A' * 8 | ✓      | ✓      | ✓      |\r\n| hex types \\h and \\H         | \\H\\h{6}              | ✓      | ✓      | ✓      |\r\n| bell and escape shortcuts   | \\a, \\e               | ✓      | ✓      | ✓      |\r\n| all literals, including \\n  | eval(\"/\\n/\")         | ✓      | ✓      | ✓      |\r\n| newline-ready anchor \\Z     | last word\\Z          | ✓      | ✓      | ✓      |\r\n| generic linebreak \\R        | data.split(/\\R/)     | ✓      | ✓      | ✓      |\r\n| meta and control escapes    | /\\M-\\C-X/            | ✓      | ✓      | ✓      |\r\n| numeric backreferences      | \\1, \\k\u0026lt;1\u0026gt;      | ✓      | ✓      | ✓      |\r\n| relative backreferences     | \\k\u0026lt;-1\u0026gt;         | ✓      | ✓      | ✓      |\r\n| named backreferences        | \\k\u0026lt;foo\u0026gt;        | ✓      | ✓      | ✓      |\r\n| numeric subexp calls        | \\g\u0026lt;1\u0026gt;          | ✓      | ✓      | ✓      |\r\n| relative subexp calls       | \\g\u0026lt;-1\u0026gt;         | ✓      | ✓      | ✓      |\r\n| named subexp calls          | \\g\u0026lt;foo\u0026gt;        | ✓      | ✓      | ✓      |\r\n| recursive subexp calls [3]  | \\g\u003c0\u003e                | ✓      | ✓      | ✓      |\r\n| nested sets                 | [a-z[A-Z]]           | ✓      | ✓      | ✓      |\r\n| types in sets               | [a-z\\h]              | ✓      | ✓      | ✓      |\r\n| properties in sets          | [a-z\\p{sc}]          | ✓      | ✓      | ✓      |\r\n| set intersections           | [\\w\u0026amp;\u0026amp;[^a]]   | ✓      | ✓      | ✓      |\r\n| recursive set negation      | [^a[^b]]             | ✓      | ✓      | ✓      |\r\n| posix types                 | [[:alpha:]]          | ✓      | ✓ 
     | ✓      |\r\n| posix negations             | [[:^alpha:]]         | ✓      | ✓      | ✓      |\r\n| codepoint lists             | \\u{61 63 1F601}      | ✓      | ✓      | ✓      |\r\n| unicode properties          | \\p{Dash}, \\p{Thai}   | ✓      | ✓      | ✓      |\r\n| unicode abbreviations       | \\p{Mong}, \\p{Sc}     | ✓      | ✓      | ✓      |\r\n| unicode negations           | \\p{^L}, \\P{L}        | ✓      | ✓      | ✓      |\r\n| astral plane properties [2] | \\p{emoji}            | ✓      | ✓      | ✓      |\r\n| astral plane literals [2]   | 😁                   | ✓      | ✓      | ✓      |\r\n| astral plane ranges [2]     | [😁-😲]              | ✓      | ✓      | ✓      |\r\n| capturing group names [4]   | (?\u0026lt;a\u0026gt;, (?'a'   | X      | X      | ✓      |\r\n| extended grapheme type      | \\X                   | X      | X      | ✓      |\r\n| lookbehinds                 | (?\u003c=a), (?\u003c!a)       | X      | X      | ✓      |\r\n| keep marks                  | \\K                   | X      | X      | ✓      |\r\n| sane word boundaries        | \\b, \\B               | X [5]  | X [5]  | ✓      |\r\n| nested keep mark            | /a(b\\Kc)d/           | X      | X      | X      |\r\n| backref by recursion level  | \\k\u003c1+1\u003e              | X      | X      | X      |\r\n| previous match anchor       | \\G                   | X      | X      | X      |\r\n| variable length absence     | (?~(a+\\|bar))        | X      | X      | X      |\r\n| comment groups [4]          | (?#comment)          | X      | X      | X      |\r\n| inline comments [4]         | /[a-z] # comment/x   | X      | X      | X      |\r\n\r\n[1] Keep in mind that [Ruby's multiline mode](http://ruby-doc.org/core-2.1.1/Regexp.html#class-Regexp-label-Options) is more of a \"dot-all mode\" and totally different from [JavaScript's multiline mode](http://javascript.info/regexp-multiline-mode).\r\n\r\n[2] See [here](#EX) for information about how this is 
achieved.\r\n\r\n[3] Limited to 5 levels of depth.\r\n\r\n[4] These are dropped without warning because they can be removed without affecting the matching behavior.\r\n\r\n[5] When targeting ES2018, \\b and \\B are replaced with a lookbehind/lookahead solution. For other targets, they are carried over as is, but generate a warning. They only recognize ASCII word chars in JavaScript, and neither the `u` nor the `v` flag makes them behave correctly.\r\n\r\n[6] `^` only matches at the beginning of the string for the targets ES2009 and ES2015. See https://github.com/jaynetics/js_regex/issues/30\r\n\r\n\u003ca name='EX'\u003e\u003c/a\u003e\r\n## How it Works\r\n\r\nJsRegex uses the gem [regexp_parser](https://github.com/ammar/regexp_parser) to parse a Ruby Regexp.\r\n\r\nIt traverses the AST returned by `regexp_parser` depth-first, and converts it to its own tree of equivalent JavaScript RegExp tokens, marking some nodes for treatment in a second pass.\r\n\r\nThe second pass then carries out all modifications that require knowledge of the complete tree.\r\n\r\nAfter the second pass, JsRegex flat-maps the final tree into a new source string.\r\n\r\nMany Regexp tokens work in JavaScript just as they do in Ruby, or allow for a straightforward replacement, but some conversions are a little more involved.\r\n\r\n**Atomic groups and possessive quantifiers** are missing in JavaScript, so the only way to emulate their behavior is by substituting them with [backreferenced lookahead groups](http://instanceof.me/post/52245507631/regex-emulate-atomic-grouping-with-lookahead).\r\n\r\n**Astral plane characters** convert to ranges of [surrogate pairs](https://dmitripavlutin.com/what-every-javascript-developer-should-know-about-unicode/#24surrogatepairs) when targeting ES2009 (which doesn't support astral plane chars).\r\n\r\n**Properties and posix classes** expand to equivalent character sets, or surrogate pair alternations if necessary. 
The gem [regexp_property_values](https://github.com/jaynetics/regexp_property_values) helps by reading out their codepoints from Onigmo.\r\n\r\n**Character sets a.k.a. bracket expressions** offer many more features in Ruby compared to JavaScript. To work around this, JsRegex calls on the gem [character_set](https://github.com/jaynetics/character_set) to calculate the matched codepoints of the whole set and build a completely new set string for all except the most simple cases.\r\n\r\n**Conditionals** expand to equivalent alternations in the second pass, e.g. `(\u003c)?foo(?(1)\u003e)` expands to `(?:\u003cfoo\u003e|foo)` (simplified example).\r\n\r\n**Subexpression calls** are replaced with the conversion result of their target, e.g. `(.{3})\\g\u003c1\u003e` expands to `(.{3})(.{3})`.\r\n\r\nThe tricky bit here is that these expressions may be nested, and that their expansions may increase the capturing group count. This means that any following backreferences need an update. E.g. \u003ccode\u003e(.{3})\\g\u003c1\u003e(.)\u003cb\u003e\\2\u003c/b\u003e\u003c/code\u003e (which matches strings like \"FooBarXX\") converts to \u003ccode\u003e(.{3})(.{3})(.)\u003cb\u003e\\3\u003c/b\u003e\u003c/code\u003e.\r\n\r\n## Contributions\r\n\r\nFeel free to send suggestions, point out issues, or submit pull requests.\r\n\r\n## Outlook\r\n\r\nThe gem is pretty feature-complete at this point. The remaining unsupported features listed above are either impossible or impractical to replicate in JavaScript. The generated output could still be made more concise in some cases, through usage of the newer `s` or `v` flags. Finally, `ES2018` might become the default target at some point.\r\n\r\n## Similar projects\r\n\r\n- [Oniguruma-To-ES](https://github.com/slevithan/oniguruma-to-es) is an Oniguruma to JavaScript regex transpiler, written in JavaScript. 
Please note that it is not fully compatible with Ruby regexes as Ruby uses Onigmo, a fork of Oniguruma.\r\n- [regex-translator](https://github.com/Anadian/regex-translator) is a regex transpiler written in JavaScript that covers various formats, not including Ruby.\r\n","funding_links":[],"categories":["Ruby"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjaynetics%2Fjs_regex","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjaynetics%2Fjs_regex","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjaynetics%2Fjs_regex/lists"}