{"id":16260163,"url":"https://github.com/twolodzko/rex","last_synced_at":"2025-07-18T13:35:48.737Z","repository":{"id":177337744,"uuid":"658947515","full_name":"twolodzko/rex","owner":"twolodzko","description":"✂️ Use Regular Expressions to eXtract fields from a string","archived":false,"fork":false,"pushed_at":"2023-10-20T19:07:13.000Z","size":46,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-08T13:52:44.557Z","etag":null,"topics":["command-line","command-line-tool","regex","regular-expression","rust","rust-lang"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/twolodzko.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-06-26T20:42:44.000Z","updated_at":"2023-11-24T09:22:31.000Z","dependencies_parsed_at":null,"dependency_job_id":"0fd254b9-b6da-46c1-95d8-e93b84e63ea9","html_url":"https://github.com/twolodzko/rex","commit_stats":{"total_commits":24,"total_committers":3,"mean_commits":8.0,"dds":0.08333333333333337,"last_synced_commit":"3af6905dfd5e68a9a0bf47eb5d80ea2f8134c191"},"previous_names":["twolodzko/rex"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/twolodzko/rex","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/twolodzko%2Frex","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/twolodzko%2Frex/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/twolodzko%2Frex/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/twolodzko%2Frex/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/twolodzko","download_url":"https://codeload.github.com/twolodzko/rex/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/twolodzko%2Frex/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265769146,"owners_count":23825241,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["command-line","command-line-tool","regex","regular-expression","rust","rust-lang"],"created_at":"2024-10-10T16:06:34.727Z","updated_at":"2025-07-18T13:35:48.633Z","avatar_url":"https://github.com/twolodzko.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# `rex`: use Regular Expressions to eXtract fields from strings\n\n`rex` is a simple command-line tool for extracting fields of strings using regular expressions. It relies on Rust's\n[`Regex`] crate and its syntax for (Perl-style) regular expressions. The same can be achieved by using common\ncommand-line applications like `sed` or `awk`, but `rex` uses a simpler syntax as you only need to define the regular\nexpression to extract the fields. The extracted fields are returned either as columns or JSON entries.\n\nFor example, the command below extracts three fields for permissions, filename, and extension and returns them as\ncolumns.\n\n```shell\n$ ls -la | rex '([rwx-]+) .*(Cargo)\\.([^ ]*)'\n-rw-rw-r--      Cargo   lock\n-rw-rw-r--      Cargo   toml\n```\n\nThe capturing groups can be named and the `-j` flag marks that the output should be returned as JSON entries\n(aka [JSON Lines] format).\n\n```shell\n$ ls -la | rex '(?P\u003cpermissions\u003e[rwx-]+) .*(?P\u003cname\u003eCargo)\\.(?P\u003cextension\u003e[^ ]*)' -j \n{\"extension\":\"lock\",\"name\":\"Cargo\",\"permissions\":\"-rw-rw-r--\"}\n{\"extension\":\"toml\",\"name\":\"Cargo\",\"permissions\":\"-rw-rw-r--\"}\n```\n\nMoreover, as the benchmark using the [IMDB dataset] shows, the code is faster than `sed` and `gawk`.\n\n```shell\n$ hyperfine --warmup 3 \\\n  \"sed -E 's/(199[0-9]|20[0-9]{2})?.*,(positive|negative)/\\1\\t\\2/' IMDB\\ Dataset.csv \u003e /dev/null\" \\\n  \"gawk 'match(\\$0, /(199[0-9]|20[0-9]{2})?.*,(positive|negative)/, arr) { print arr[1], '\\t' arr[2] }' IMDB\\ Dataset.csv \u003e /dev/null\" \\\n  \"rex '(199[0-9]|20[0-9]{2})?.*,(positive|negative)' IMDB\\ Dataset.csv \u003e /dev/null\"\nBenchmark 1: sed -E 's/(199[0-9]|20[0-9]{2})?.*,(positive|negative)/\\1\\t\\2/' IMDB\\ Dataset.csv \u003e /dev/null\n  Time (mean ± σ):      6.818 s ±  0.384 s    [User: 6.751 s, System: 0.065 s]\n  Range (min … max):    6.547 s …  7.877 s    10 runs\n \nBenchmark 2: gawk 'match($0, /(199[0-9]|20[0-9]{2})?.*,(positive|negative)/, arr) { print arr[1], '\\t' arr[2] }' IMDB\\ Dataset.csv \u003e /dev/null\n  Time (mean ± σ):      7.960 s ±  0.522 s    [User: 7.918 s, System: 0.036 s]\n  Range (min … max):    7.349 s …  8.716 s    10 runs\n \nBenchmark 3: rex '(199[0-9]|20[0-9]{2})?.*,(positive|negative)' IMDB\\ Dataset.csv \u003e /dev/null\n  Time (mean ± σ):     934.5 ms ±  47.5 ms    [User: 874.4 ms, System: 60.0 ms]\n  Range (min … max):   895.1 ms … 1049.5 ms    10 runs\n \nSummary\n  rex '(199[0-9]|20[0-9]{2})?.*,(positive|negative)' IMDB\\ Dataset.csv \u003e /dev/null ran\n    7.30 ± 0.55 times faster than sed -E 's/(199[0-9]|20[0-9]{2})?.*,(positive|negative)/\\1\\t\\2/' IMDB\\ Dataset.csv \u003e /dev/null\n    8.52 ± 0.71 times faster than gawk 'match($0, /(199[0-9]|20[0-9]{2})?.*,(positive|negative)/, arr) { print arr[1], '\\t' arr[2] }' IMDB\\ Dataset.csv \u003e /dev/null\n```\n\n\n [`Regex`]: https://docs.rs/regex/latest/regex/\n [IMDB dataset]: https://www.kaggle.com/datasets/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews?resource=download\n [JSON Lines]: https://jsonlines.org/\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftwolodzko%2Frex","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftwolodzko%2Frex","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftwolodzko%2Frex/lists"}