{"id":13646318,"url":"https://github.com/mariomka/regex-benchmark","last_synced_at":"2026-02-16T06:36:12.669Z","repository":{"id":40365361,"uuid":"103036720","full_name":"mariomka/regex-benchmark","owner":"mariomka","description":"It's just a simple regex benchmark of different programming languages.","archived":false,"fork":false,"pushed_at":"2024-04-12T14:30:51.000Z","size":2361,"stargazers_count":321,"open_issues_count":12,"forks_count":56,"subscribers_count":13,"default_branch":"master","last_synced_at":"2025-06-16T10:04:45.664Z","etag":null,"topics":["bench","benchmark","languages","regex","regexp"],"latest_commit_sha":null,"homepage":"","language":"Dockerfile","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mariomka.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2017-09-10T14:41:21.000Z","updated_at":"2025-06-10T22:44:31.000Z","dependencies_parsed_at":"2024-08-02T01:38:31.221Z","dependency_job_id":null,"html_url":"https://github.com/mariomka/regex-benchmark","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/mariomka/regex-benchmark","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mariomka%2Fregex-benchmark","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mariomka%2Fregex-benchmark/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mariomka%2Fregex-benchmark/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mariomka%2Fregex-benchm
ark/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mariomka","download_url":"https://codeload.github.com/mariomka/regex-benchmark/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mariomka%2Fregex-benchmark/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29501919,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-16T05:57:17.024Z","status":"ssl_error","status_checked_at":"2026-02-16T05:56:49.929Z","response_time":115,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bench","benchmark","languages","regex","regexp"],"created_at":"2024-08-02T01:02:52.786Z","updated_at":"2026-02-16T06:36:12.654Z","avatar_url":"https://github.com/mariomka.png","language":"Dockerfile","readme":"# Languages Regex Benchmark\n\nIt's just a simple regex benchmark for different programming languages.\n\nMeasures how long it takes to find and count non-overlapping occurrences with **default settings**.\n\n\u003e All benchmarks are wrong, but some are useful - [Szilard](https://github.com/szilard), [benchm-ml](https://github.com/szilard/benchm-ml)\n\nI hope this benchmark can be helpful, but it's not only about performance, but each language also has its engine and offers different features (like UTF support, backreferences, capturing 
groups ...)\n\n## Input text\n\nThe [input text](input-text.txt) is a concatenation of the [Learn X in Y minutes](https://github.com/adambard/learnxinyminutes-docs) repository.\n\n*It may not be the most representative text. I'm looking for other texts to add to the benchmark.*\n\n## Regex patterns\n\n- Email: ``[\\w\\.+-]+@[\\w\\.-]+\\.[\\w\\.-]+``\n- URI: ``[\\w]+://[^/\\s?#]+[^\\s?#]+(?:\\?[^\\s#]*)?(?:#[^\\s]*)?``\n- IPv4: ``(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9])\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9])``\n\nThe above regex patterns aren't the best or optimal ones. The focus is the benchmark, not the matching.\n\nThe patterns are applied to the whole file.\n\n## Measure\n\nMeasuring is done inside the programs to avoid including startup, reading, and writing times in the results.\n\nElapsed time includes pattern compilation and finding and counting occurrences.\n\n## Performance\n\nThe Docker image was run on: MacBook Pro (16-inch, 2019), 2.4 GHz Intel Core i9, 32 GB 2667 MHz DDR4 with macOS Big Sur 11.2.3.\n\nLanguage | Email(ms) | URI(ms) | IP(ms) | Total(ms)\n--- | ---: | ---: | ---: | ---:\n**Nim Regex** | 1.32 | 26.92 | 7.84 | 36.09\n**Nim** | 22.70 | 21.49 | 6.75 | 50.94\n**Rust** | 26.66 | 25.70 | 5.28 | 57.63\n**PHP** | 42.87 | 46.30 | 5.17 | 94.33\n**C++ Boost** | 44.97 | 44.13 | 15.13 | 104.23\n**Javascript** | 59.00 | 47.23 | 1.50 | 107.73\n**Perl** | 94.92 | 63.96 | 20.37 | 179.25\n**Julia** | 104.58 | 86.55 | 5.01 | 196.14\n**C PCRE2** | 126.10 | 112.17 | 13.10 | 251.37\n**Crystal** | 128.19 | 112.70 | 13.18 | 254.07\n**C# .Net Core** | 115.05 | 106.05 | 42.71 | 263.81\n**Dart** | 104.10 | 107.64 | 76.51 | 288.25\n**D ldc** | 165.46 | 165.20 | 4.85 | 335.51\n**D dmd** | 187.94 | 189.92 | 5.32 | 383.18\n**Ruby** | 233.88 | 208.85 | 43.14 | 485.86\n**Python PyPy2** | 158.34 | 139.70 | 253.77 | 551.81\n**Dart Native** | 278.54 | 307.54 | 5.77 | 591.85\n**Python 2** | 197.92 | 131.74 | 294.42 | 624.08\n**Kotlin** | 186.20 | 223.05 | 287.49 | 696.74\n**Java** | 198.33 | 
221.87 | 287.81 | 708.01\n**Python PyPy3** | 258.78 | 221.89 | 257.35 | 738.03\n**Python 3** | 273.86 | 190.79 | 319.13 | 783.78\n**Go** | 248.14 | 241.28 | 360.90 | 850.32\n**C++ STL** | 433.09 | 344.74 | 245.66 | 1023.49\n**C# Mono** | 2859.05 | 2431.87 | 145.82 | 5436.75\n\n### Optimized\n\n\u003e The following results are for the [optimized version](https://github.com/mariomka/regex-benchmark/tree/optimized).\n\nLanguage | Email(ms) | URI(ms) | IP(ms) | Total(ms)\n--- | ---: | ---: | ---: | ---:\n**Rust** | 11.43 | 11.40 | 5.11 | 27.94\n**Nim Regex** | 1.37 | 25.51 | 7.27 | 34.15\n**Nim** | 22.79 | 21.64 | 6.77 | 51.21\n**C PCRE2** | 46.22 | 36.92 | 4.73 | 87.87\n**PHP** | 43.18 | 46.71 | 5.23 | 95.12\n**C++ Boost** | 44.68 | 44.50 | 15.10 | 104.28\n**Javascript** | 59.20 | 47.67 | 1.61 | 108.48\n**C# .Net Core** | 61.76 | 47.86 | 11.63 | 121.25\n**Perl** | 96.00 | 63.39 | 20.59 | 179.99\n**Julia** | 104.31 | 87.98 | 5.16 | 197.45\n**Crystal** | 129.52 | 116.33 | 13.12 | 258.97\n**Dart** | 105.82 | 107.78 | 78.18 | 291.78\n**D ldc** | 167.60 | 165.71 | 5.07 | 338.37\n**D dmd** | 187.66 | 192.16 | 5.55 | 385.37\n**Ruby** | 236.93 | 206.51 | 43.70 | 487.14\n**Python PyPy2** | 161.33 | 143.56 | 258.06 | 562.96\n**Dart Native** | 273.17 | 306.14 | 5.89 | 585.20\n**Python 2** | 200.54 | 132.89 | 290.26 | 623.69\n**Kotlin** | 184.13 | 220.31 | 273.76 | 678.21\n**Java** | 190.74 | 223.77 | 275.24 | 689.75\n**Python PyPy3** | 268.41 | 226.74 | 261.17 | 756.32\n**Python 3** | 273.70 | 194.09 | 322.09 | 789.88\n**Go** | 244.14 | 238.40 | 365.27 | 847.81\n**C++ STL** | 433.18 | 341.07 | 246.85 | 1021.10\n**C# Mono** | 1400.04 | 1189.50 | 145.73 | 2735.28\n\n- **Language**: Indicates the language.\n- **Email(ms)**, **URI(ms)**, **IP(ms)**: Indicates the time elapsed in milliseconds for finding and counting non-overlapping occurrences for the pattern.\n- **Total(ms)**: Indicates the sum of the above times.\n\n### Versions and notes\n\n- **C**: gcc 7.5.0 \u0026 PCRE2 
10.36-2\n- **Crystal**: crystal 0.35.1 - LLVM: 8.0.0\n- **C++**: g++ 7.5.0 | Boost 1.65.1.0\n- **C#**: dotnet 5.0.201 | Mono 6.12.0.122\n- **D**: DMD v2.089.0 | LDC 1.8.0\n- **Dart**: Dart 2.12.2\n- **Go**: go 1.16.2\n- **Java**: OpenJDK 11.0.10\n- **Javascript**: node v15.13.0\n- **Julia**: Julia 1.6.0\n- **Kotlin**: kotlinc-jvm 1.4.32\n- **Nim**: Nim 1.4.4\n- **Perl**: perl v5.26.1\n- **PHP**: PHP 8.0.3\n- **Python**: Python 2.7.17 | Python 3.6.9 | PyPy 7.3.3\n- **Ruby**: ruby 2.5.1p57\n- **Rust**: rustc 1.51.0 \u0026 regex 1.4.5\n\n# How to run\n\nThe easiest way to run the benchmark is by using Docker.\n\n```sh\ngit clone https://github.com/mariomka/regex-benchmark.git\ncd regex-benchmark\ndocker run --rm -v $(pwd):/var/regex mariomka/regex-benchmark:1.6\n```\n\n# Contributing\n\nAll contributions are welcome, from tiny optimizations to new implementations.\n\nThere are only a few requirements:\n- Follow the style of the current implementations\n- Use the default settings for the regex engine\n- Update `Dockerfile` if necessary\n\n# Kudos\n\n- Heng Li, for his work on [Benchmark of Regex Libraries](http://lh3lh3.users.sourceforge.net/reb.shtml).\n- A \"challenge\" in the [Madrid Devs](http://madriddevs.org/) group, which inspired me.\n- The [Programming subreddit](https://www.reddit.com/r/programming/), which helped me improve the benchmark.\n\n# License\n\nMIT © [Mario Juárez](https://github.com/mariomka).\n","funding_links":[],"categories":["Dockerfile","Benchmarks"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmariomka%2Fregex-benchmark","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmariomka%2Fregex-benchmark","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmariomka%2Fregex-benchmark/lists"}