{"id":15673995,"url":"https://github.com/sshine/regex-benchmarks","last_synced_at":"2025-10-16T18:18:47.536Z","repository":{"id":63107102,"uuid":"565296581","full_name":"sshine/regex-benchmarks","owner":"sshine","description":null,"archived":false,"fork":false,"pushed_at":"2022-11-12T23:52:43.000Z","size":3,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-02-05T08:09:32.618Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sshine.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-11-12T23:52:29.000Z","updated_at":"2022-11-12T23:52:47.000Z","dependencies_parsed_at":"2022-11-13T01:01:02.448Z","dependency_job_id":null,"html_url":"https://github.com/sshine/regex-benchmarks","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sshine%2Fregex-benchmarks","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sshine%2Fregex-benchmarks/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sshine%2Fregex-benchmarks/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sshine%2Fregex-benchmarks/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sshine","download_url":"https://codeload.github.com/sshine/regex-benchmarks/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246281217,"owners_count":2
0752208,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-03T15:43:21.200Z","updated_at":"2025-10-16T18:18:42.516Z","avatar_url":"https://github.com/sshine.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# regex-benchmarks\n\nThis is a benchmark of two regular expression quantifiers:\n\n- The [possessive quantifier][possessive], the `+` in e.g. `.*+`\n- The [non-greedy quantifier][non-greedy], the `?` in e.g. `.*?`\n\n[possessive]: https://www.regular-expressions.info/possessive.html\n[non-greedy]: https://www.regular-expressions.info/repeat.html\n\nIn particular, I'd like to know if using them makes regex matching faster or\nslower.\n\nDisclaimer: We're testing with Rust's [`regex` crate][regex-crate], which is\nloosely based on the RE2 library. The conclusion likely only applies to this\ncrate and `#notallregexlibraries`! 
Different regex libraries may apply vastly\ndifferent optimizations and therefore have different costs.\n\n[regex-crate]: https://docs.rs/regex/latest/regex/\n\n## The experiment\n\nThree regular expressions are compared:\n\n- `'([^']*)'`\n- `'([^']*+)'`\n- `'(.*?)'`\n\nagainst two types of quoted strings with `...` being 500 random alphanumeric bytes:\n\n- `'...'` which all three regexes match\n- `'...` which none of the three regexes match\n\nThe crux of these six benchmarks is the expression `regex.captures(\u0026input)`.\n\nAs a control experiment, three non-capturing variants are also benchmarked:\n\n- `'[^']*'`\n- `'[^']*+'`\n- `'.*?'`\n\nThese are tested both as successful and failing matches. This helps assess\nwhether the performance difference between the three is less than the cost of\ncapturing the result.\n\nThe crux of the six control benchmarks is the expression `regex.is_match(\u0026input)`.\n\n## Questions\n\n- The possessive quantifier is expected to be faster, but by how much?\n- The non-greedy quantifier is expected to be slower, but by how much?\n- Is the speed difference above or below the cost of capturing groups?\n\n## Answers\n\nCondensing the [`cargo bench`](#appendix-cargo-bench) results:\n\n- The possessive quantifier is not faster: it stays within about 2% of the plain\n  greedy quantifier in every benchmark.\n- The non-greedy quantifier is nearly twice as slow when a capturing match succeeds\n  (8.80 µs vs. 4.79 µs), but within about 3% of the plain quantifier in the other three cases.\n- On a successful match, capturing costs roughly a factor of 7 (about 4.8 µs vs. 0.68 µs),\n  far more than the choice of quantifier.\n\n```\ncapturing regexes that succeed/normal/500 time:     [4.7930 µs 4.7942 µs 4.7956 µs]\ncapturing regexes that succeed/possessive/500 time: [4.8243 µs 4.8256 µs 4.8269 µs]\ncapturing regexes that succeed/non-greedy/500 time: [8.7937 µs 8.7961 µs 8.7986 µs]\n\ncapturing regexes that fail/normal/500 time:        [676.01 ns 676.15 ns 676.29 ns]\ncapturing regexes that fail/possessive/500 time:    [684.92 ns 685.06 ns 685.20 ns]\ncapturing regexes 
that fail/non-greedy/500 time:    [695.49 ns 695.65 ns 695.82 ns]\n\nnon-capturing regexes that succeed/normal/500 time:     [679.22 ns 679.41 ns 679.59 ns]\nnon-capturing regexes that succeed/possessive/500 time: [683.18 ns 683.34 ns 683.51 ns]\nnon-capturing regexes that succeed/non-greedy/500 time: [685.60 ns 685.78 ns 685.97 ns]\n\nnon-capturing regexes that fail/normal/500 time:        [703.49 ns 703.69 ns 703.90 ns]\nnon-capturing regexes that fail/possessive/500 time:    [692.07 ns 692.25 ns 692.44 ns]\nnon-capturing regexes that fail/non-greedy/500 time:    [694.48 ns 694.64 ns 694.80 ns]\n```\n\n## Conclusions\n\n- Don't bother writing `'[^']*+'` instead of `'[^']*'`: Slightly slower and less readable.\n- If you think `'.*?'` is more readable, it's probably not slower at a scale that matters.\n- Remember to use [non-capturing groups][non-capture] if you don't actually need to extract the contents.\n\n[non-capture]: https://www.regular-expressions.info/brackets.html\n\n## Appendix: `cargo bench`\n\n```\n     Running benches/regex.rs (target/release/deps/regex-84f60dd518617320)\ncapturing regexes that succeed/normal/500\n                        time:   [4.7930 µs 4.7942 µs 4.7956 µs]\nFound 132 outliers among 10000 measurements (1.32%)\n  97 (0.97%) high mild\n  35 (0.35%) high severe\ncapturing regexes that succeed/possessive/500\n                        time:   [4.8243 µs 4.8256 µs 4.8269 µs]\nFound 143 outliers among 10000 measurements (1.43%)\n  141 (1.41%) high mild\n  2 (0.02%) high severe\ncapturing regexes that succeed/non-greedy/500\n                        time:   [8.7937 µs 8.7961 µs 8.7986 µs]\nFound 17 outliers among 10000 measurements (0.17%)\n  14 (0.14%) high mild\n  3 (0.03%) high severe\n\ncapturing regexes that fail/normal/500\n                        time:   [676.01 ns 676.15 ns 676.29 ns]\nFound 1912 outliers among 10000 measurements (19.12%)\n  733 (7.33%) high mild\n  1179 (11.79%) high severe\ncapturing regexes that 
fail/possessive/500\n                        time:   [684.92 ns 685.06 ns 685.20 ns]\nFound 1172 outliers among 10000 measurements (11.72%)\n  720 (7.20%) low mild\n  432 (4.32%) high mild\n  20 (0.20%) high severe\ncapturing regexes that fail/non-greedy/500\n                        time:   [695.49 ns 695.65 ns 695.82 ns]\nFound 13 outliers among 10000 measurements (0.13%)\n  13 (0.13%) high mild\n\nnon-capturing regexes that succeed/normal/500\n                        time:   [679.22 ns 679.41 ns 679.59 ns]\nFound 4488 outliers among 10000 measurements (44.88%)\n  2096 (20.96%) low severe\n  30 (0.30%) low mild\n  422 (4.22%) high mild\n  1940 (19.40%) high severe\nnon-capturing regexes that succeed/possessive/500\n                        time:   [683.18 ns 683.34 ns 683.51 ns]\nFound 2279 outliers among 10000 measurements (22.79%)\n  27 (0.27%) low severe\n  1 (0.01%) low mild\n  142 (1.42%) high mild\n  2109 (21.09%) high severe\nnon-capturing regexes that succeed/non-greedy/500\n                        time:   [685.60 ns 685.78 ns 685.97 ns]\nFound 137 outliers among 10000 measurements (1.37%)\n  137 (1.37%) high mild\n\nnon-capturing regexes that fail/normal/500\n                        time:   [703.49 ns 703.69 ns 703.90 ns]\nFound 283 outliers among 10000 measurements (2.83%)\n  279 (2.79%) high mild\n  4 (0.04%) high severe\nnon-capturing regexes that fail/possessive/500\n                        time:   [692.07 ns 692.25 ns 692.44 ns]\nFound 4801 outliers among 10000 measurements (48.01%)\n  2380 (23.80%) low severe\n  21 (0.21%) low mild\n  654 (6.54%) high mild\n  1746 (17.46%) high severe\nnon-capturing regexes that fail/non-greedy/500\n                        time:   [694.48 ns 694.64 ns 694.80 ns]\nFound 2516 outliers among 10000 measurements (25.16%)\n  846 (8.46%) low severe\n  22 (0.22%) low mild\n  334 (3.34%) high mild\n  1314 (13.14%) high severe\n```\n\n## 
\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsshine%2Fregex-benchmarks","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsshine%2Fregex-benchmarks","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsshine%2Fregex-benchmarks/lists"}