{"id":13591074,"url":"https://github.com/p-ranav/hypergrep","last_synced_at":"2025-08-25T21:12:26.704Z","repository":{"id":172755365,"uuid":"629432084","full_name":"p-ranav/hypergrep","owner":"p-ranav","description":"Recursively search directories for a regex pattern","archived":false,"fork":false,"pushed_at":"2023-06-09T22:52:17.000Z","size":9758,"stargazers_count":204,"open_issues_count":3,"forks_count":7,"subscribers_count":7,"default_branch":"master","last_synced_at":"2024-11-23T08:22:21.848Z","etag":null,"topics":["blazing-fast","command-line-tool","cpp17","directory-traversal","filesystem","git","grep","hyperscan","intel","libgit2","lock-free-queue","mmap","multithreading","pattern-matching","recursive","regex","search","simd","utf8"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/p-ranav.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2023-04-18T09:53:16.000Z","updated_at":"2024-11-15T07:18:48.000Z","dependencies_parsed_at":null,"dependency_job_id":"eabe5f5e-2e2f-4597-abdc-cb6d6bbce3fb","html_url":"https://github.com/p-ranav/hypergrep","commit_stats":null,"previous_names":["p-ranav/hypergrep"],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/p-ranav%2Fhypergrep","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/p-ranav%2Fhypergrep/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/p-ranav%2Fhypergrep/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposi
tories/p-ranav%2Fhypergrep/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/p-ranav","download_url":"https://codeload.github.com/p-ranav/hypergrep/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":230408170,"owners_count":18220974,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["blazing-fast","command-line-tool","cpp17","directory-traversal","filesystem","git","grep","hyperscan","intel","libgit2","lock-free-queue","mmap","multithreading","pattern-matching","recursive","regex","search","simd","utf8"],"created_at":"2024-08-01T16:00:53.314Z","updated_at":"2024-12-19T09:07:31.969Z","avatar_url":"https://github.com/p-ranav.png","language":"C++","readme":"\u003cp align=\"center\"\u003e\n  \u003cimg height=\"100\" src=\"doc/images/logo.png\"/\u003e\n\u003c/p\u003e\n\n## Highlights\n\n* Search recursively for a regex pattern using [Intel Hyperscan](https://github.com/intel/hyperscan).\n* When a git repository is detected, the repository index is searched using [libgit2](https://github.com/libgit2/libgit2).\n* Similar to `grep`, `ripgrep`, `ugrep`, `The Silver Searcher` etc.\n* C++17, Multi-threading, SIMD.\n* [USAGE GUIDE](doc/USAGE.md)\n* Implementation notes [here](doc/NOTES.md).\n* Not cross-platform. Tested in Linux. 
\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"doc/images/ignore_case_ascii.png\"/\u003e\n\u003c/p\u003e\n\n## Performance\n\nThe following tests compare the performance of `hypergrep` against:\n\n* [ripgrep](https://github.com/BurntSushi/ripgrep/) `v13.0.0`\n* [ag 2.2.0 (The Silver Searcher)](https://github.com/ggreer/the_silver_searcher) `v2.2.0`\n* [ugrep](https://github.com/Genivia/ugrep) `v3.11.2`\n\n### System Details\n\n| Type            | Value |\n|:--------------- |:---- |\n| Processor       | [11th Gen Intel(R) Core(TM) i9-11900KF @ 3.50GHz   3.50 GHz](https://ark.intel.com/content/www/us/en/ark/products/212321/intel-core-i911900kf-processor-16m-cache-up-to-5-30-ghz.html) |\n| Instruction Set Extensions | Intel® SSE4.1, Intel® SSE4.2, Intel® AVX2, Intel® AVX-512 |\n| Installed RAM   | 32.0 GB (31.9 GB usable) |\n| SSD             | [ADATA SX8200PNP](https://www.adata.com/upload/downloadfile/Datasheet_XPG%20SX8200%20Pro_EN_20181017.pdf) |\n| OS              | Ubuntu 20.04 LTS |\n| C++ Compiler    | g++ (Ubuntu 11.1.0-1ubuntu1-20.04) 11.1.0 |\n\n### Vcpkg Installed Libraries\n\n[vcpkg](https://github.com/microsoft/vcpkg) commit: [662dbb5](https://github.com/microsoft/vcpkg/commit/662dbb50e63af15baa2909b7eac5b1b87e86a0aa)\n\n| Library | Version | \n|:---|:---|\n| **argparse** | 2.9 |\n| **concurrentqueue** | 1.0.3 |\n| **fmt** | 10.0.0 |\n| **hyperscan** | 5.4.2 |\n| **libgit2** | 1.6.4 |\n\n### Single Large File Search: `OpenSubtitles.raw.en.txt`\n\n The following searches are performed on a single large file cached in memory (~13GB, [`OpenSubtitles.raw.en.gz`](http://opus.nlpl.eu/download.php?f=OpenSubtitles/v2018/mono/OpenSubtitles.raw.en.gz)).\n\n| Regex | Line Count | ag | ugrep | ripgrep | hypergrep |\n| :---| ---:| ---:| ---:| ---:| ---:|\n| Count number of times Holmes did something\u003cbr/\u003e`hgrep -c 'Holmes did \\w'` | 27 | n/a | 1.820 | 1.022 | **0.696**  |\n| Literal with Regex Suffix\u003cbr/\u003e`hgrep -nw 'Sherlock [A-Z]\\w+' 
en.txt` | 7882 | n/a | 1.812 | 1.509 | **0.803** |\n| Simple Literal\u003cbr/\u003e`hgrep -nw 'Sherlock Holmes' en.txt` | 7653 | 15.764 | 1.888 | 1.524 | **0.658** |\n| Simple Literal (case insensitive)\u003cbr/\u003e`hgrep -inw 'Sherlock Holmes' en.txt` | 7871 | 15.599 | 6.945 | 2.162 | **0.650** |\n| Alternation of Literals\u003cbr/\u003e`hgrep -n 'Sherlock Holmes\\|John Watson\\|Irene Adler\\|Inspector Lestrade\\|Professor Moriarty' en.txt` | 10078 | n/a | 6.886 | 1.836 | **0.689** |\n| Alternation of Literals (case insensitive)\u003cbr/\u003e`hgrep -in 'Sherlock Holmes\\|John Watson\\|Irene Adler\\|Inspector Lestrade\\|Professor Moriarty' en.txt` | 10333 | n/a | 7.029 | 3.940 | **0.770** |\n| Words surrounding a literal string\u003cbr/\u003e`hgrep -n '\\w+[\\x20]+Holmes[\\x20]+\\w+' en.txt` | 5020 | n/a | 6m 11s | 1.523 | **0.638** |\n\n### Git Repository Search: `torvalds/linux`\n\nThe following searches are performed on the entire [Linux kernel source tree](https://github.com/torvalds/linux) (after running `make defconfig \u0026\u0026 make -j8`). 
The commit used is [f1fcb](https://github.com/torvalds/linux/commit/f1fcbaa18b28dec10281551dfe6ed3a3ed80e3d6).\n\n| Regex | Line Count | ag | ugrep | ripgrep | hypergrep |\n| :---| ---:| ---:| ---:| ---:| ---:|\n| Simple Literal\u003cbr/\u003e`hgrep -nw 'PM_RESUME'` | 9 | 2.807 | 0.316 | 0.147 | 0.140 |\n| Simple Literal (case insensitive)\u003cbr/\u003e`hgrep -niw 'PM_RESUME'` | 39 | 2.904 | 0.435 | 0.149 | 0.141 |\n| Regex with Literal Suffix\u003cbr/\u003e`hgrep -nw '[A-Z]+_SUSPEND'` | 536 | 3.080 | 1.452 | 0.148 | 0.143 |\n| Alternation of four literals\u003cbr/\u003e`hgrep -nw '(ERR_SYS\\|PME_TURN_OFF\\|LINK_REQ_RST\\|CFG_BME_EVT)'` | 16 | 3.085 | 0.410 | 0.153 | 0.146 |\n| Unicode Greek\u003cbr/\u003e`hgrep -n '\\p{Greek}'` | 111 | 3.762 | 0.484 | 0.345 | **0.146** |\n\n### Git Repository Search: `apple/swift`\n\nThe following searches are performed on the entire [Apple Swift source tree](https://github.com/apple/swift). The commit used is [3865b](https://github.com/apple/swift/commit/3865b5de6f2f56043e21895f65bd0d873e004ed9).\n\n| Regex | Line Count | ag | ugrep | ripgrep | hypergrep |\n| :---| ---:| ---:| ---:| ---:| ---:|\n| Function/Struct/Enum declaration followed by a valid identifier and opening parenthesis\u003cbr/\u003e`hgrep -n '(func\\|struct\\|enum)\\s+[A-Za-z_][A-Za-z0-9_]*\\s*\\('` | 59026 | 1.148 | 0.954 | 0.154 | **0.090** |\n| Words starting with alphabetic characters followed by at least 2 digits\u003cbr/\u003e`hgrep -nw '[A-Za-z]+\\d{2,}'` | 127858 | 1.169 | 1.238 | 0.156 | **0.095** |\n| Words starting with an uppercase letter, followed by alphanumeric characters and/or underscores\u003cbr/\u003e`hgrep -nw '[A-Z][a-zA-Z0-9_]*'` | 2012372 | 3.131 | 2.598 | 0.550 | **0.482** |\n| Guard let statement followed by valid identifier\u003cbr/\u003e`hgrep -n 'guard\\s+let\\s+[a-zA-Z_][a-zA-Z0-9_]*\\s*=\\s*\\w+'` | 839 | 0.828 | 0.174 | 0.054 | 0.047 |\n\n### Directory Search: `/usr`\n\nThe following searches are performed on the `/usr` directory. 
\n\n| Regex | Line Count | ag | ugrep | ripgrep | hypergrep |\n| :---| ---:| ---:| ---:| ---:| ---:|\n| Any HTTP(S) or FTP URL\u003cbr/\u003e`hgrep \"(https?\\|ftp)://[^\\s/$.?#].[^\\s]*\"` | 13682 | 4.597 | 2.894 | 0.305 | **0.171** |\n| Any IPv4 IP address\u003cbr/\u003e`hgrep -w \"(?:\\d{1,3}\\.){3}\\d{1,3}\"` | 12643 | 4.727 | 2.340 | 0.324 | **0.166** |\n| Any E-mail address\u003cbr/\u003e`hgrep -w \"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}\"` | 47509 | 5.477 | 37.209 | 0.494 | **0.220** |\n| Any valid date `MM/DD/YYYY`\u003cbr/\u003e`hgrep \"(0[1-9]\\|1[0-2])/(0[1-9]\\|[12]\\d\\|3[01])/(19\\|20)\\d{2}\"` | 116 | 4.239 | 1.827 | 0.251 | **0.163** |\n| Count the number of HEX values\u003cbr/\u003e`hgrep -cw \"(?:0x)?[0-9A-Fa-f]+\"` | 68042 | 5.765 | 28.691 | 1.439 | **0.611** |\n| Search any C/C++ file for a literal\u003cbr/\u003e`hgrep --filter \"\\.(c\\|cpp\\|h\\|hpp)$\" test` | 7355 | n/a | 0.505 | 0.118 | **0.079** | \n\n## Build\n\n### Install Dependencies with `vcpkg`\n\n```bash\ngit clone https://github.com/microsoft/vcpkg\ncd vcpkg\n./bootstrap-vcpkg.sh\n./vcpkg install concurrentqueue fmt argparse libgit2 hyperscan\n```\n\n### Build `hypergrep` using `cmake` and `vcpkg`\n\n#### Clone the repository\n\n```\ngit clone https://github.com/p-ranav/hypergrep\ncd hypergrep\n```\n\n#### If `cmake` is older than `3.19`\n\n```\nmkdir build\ncd build\ncmake -DCMAKE_TOOLCHAIN_FILE=\u003cpath_to_vcpkg\u003e/scripts/buildsystems/vcpkg.cmake ..\nmake\n```\n\n#### If `cmake` is `3.19` or newer\n\nUse the `release` preset:\n\n```\nexport VCPKG_ROOT=\u003cpath_to_vcpkg\u003e\ncmake -B build -S . --preset release\ncmake --build build\n```\n\n#### Binary Portability\n\nTo build the binary for x86_64 portability, invoke cmake with the `-DBUILD_PORTABLE=on` option. 
This will use `-march=x86-64 -mtune=generic` and `-static-libgcc -static-libstdc++`, and link the C++ standard library and GCC runtime statically into the binary, reducing dependencies on the target system.","funding_links":[],"categories":["\u003ca name=\"text-search\"\u003e\u003c/a\u003eText search (alternatives to grep)","C++"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fp-ranav%2Fhypergrep","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fp-ranav%2Fhypergrep","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fp-ranav%2Fhypergrep/lists"}