{"id":22140883,"url":"https://github.com/stealth/grab","last_synced_at":"2025-08-25T02:05:16.194Z","repository":{"id":6050953,"uuid":"7275944","full_name":"stealth/grab","owner":"stealth","description":"experimental and very fast implementation of a grep","archived":false,"fork":false,"pushed_at":"2023-08-31T06:32:41.000Z","size":234,"stargazers_count":259,"open_issues_count":0,"forks_count":20,"subscribers_count":9,"default_branch":"master","last_synced_at":"2025-04-09T18:20:03.024Z","etag":null,"topics":["fast","grep","hyperscan","parallel-programming","pcre","regex","ripgrep","search","silver-searcher"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/stealth.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2012-12-21T16:22:25.000Z","updated_at":"2024-10-12T10:45:02.000Z","dependencies_parsed_at":"2025-01-01T06:11:38.398Z","dependency_job_id":"3a0724b8-4b6a-46ee-b2d0-9e97f216523b","html_url":"https://github.com/stealth/grab","commit_stats":{"total_commits":26,"total_committers":2,"mean_commits":13.0,"dds":"0.038461538461538436","last_synced_commit":"ba70ff123229c689815c51c395e6190bf2961d4a"},"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stealth%2Fgrab","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stealth%2Fgrab/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stealth%2Fgrab/releases","manifests_url":"https://r
epos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stealth%2Fgrab/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/stealth","download_url":"https://codeload.github.com/stealth/grab/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248085324,"owners_count":21045139,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fast","grep","hyperscan","parallel-programming","pcre","regex","ripgrep","search","silver-searcher"],"created_at":"2024-12-01T21:08:27.607Z","updated_at":"2025-04-09T18:20:20.303Z","avatar_url":"https://github.com/stealth.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"grab - simple, but very fast grep\n=================================\n\n\n![greppin](https://github.com/stealth/grab/blob/greppin/pic/greppin.jpg)\n\n\nThis is my own, experimental, parallel version of _grep_ so I can test\nvarious strategies to speed up access to large directory trees.\nOn Flash storage or SSDs, you can easily outsmart common greps by up to\na factor of 8.\n\nOptions:\n\n```\nUsage: ./greppin [-rIOLlsSH] [-n \u003ccores\u003e] \u003cregex\u003e \u003cpath\u003e\n\n\t-2\t-- use PCRE2 instead of PCRE\n\t-O\t-- print file offset of match\n\t-l\t-- do not print the matching line (Useful if you want\n\t\t   to see _all_ offsets; if you also print the line, only\n\t\t   the first match in the line counts)\n\t-s\t-- single match; dont search file further after first match\n\t\t   (similar to grep on a binary)\n\t-H\t-- use hyperscan lib for scanning\n\t-S\t-- only 
for hyperscan: interpret pattern as string literal instead of regex\n\t-L\t-- machine has low mem; half chunk-size (default 2GB)\n\t\t   may be used multiple times\n\t-I\t-- enable highlighting of matches (useful)\n\t-n\t-- Use multiple cores in parallel (omit for single core)\n\t-r\t-- recurse on directory\n```\n\n\n_grab_ uses the _pcre_ library, so it is basically equivalent to `grep -P -a`.\nThe `-P` is important, since Perl-Compatible Regular Expressions have different\ncharacteristics than basic regexes.\n\n\nBuild\n-----\n\nThere are two branches: `master` and `greppin`. `master` is the 'traditional'\n*grab* that should compile and run on most POSIX systems. `greppin` comes with\nits own optimized and parallelized version of `nftw()` and `readdir()`, which\nagain doubles the speed on top of the speedup that the `master` branch already\nprovides. The `greppin` branch runs on Linux, BSD and OSX. `greppin` also comes\nwith support for Intel's [hyperscan](https://www.hyperscan.io) libraries that try\nto exploit the CPU's SIMD instructions if possible (AVX2, AVX512 etc.) when compiling\nthe regex pattern into JIT code.\n\nYou will most likely want to build the `greppin` branch:\n\n```\n$ git checkout greppin\n[...]\n$ cd src; make\n[...]\n```\n\nMake sure you have the *pcre* and *pcre2* library packages installed.\nOn BSD systems you need `gmake` instead of `make`.\nIf you want to use _greppin's_ cutting-edge multiple regex engine and hyperscan\nsupport, you first need to get and build hyperscan:\n\n```\n$ git clone https://github.com/intel/hyperscan\n[...]\n$ cd hyperscan\n$ mkdir build; cd build\n$ cmake -DFAT_RUNTIME=1 -DBUILD_STATIC_AND_SHARED=1 ..\n[...]\n$ make\n[...]\n```\n\nThis will build the so-called *fat runtime* of the hyperscan libs, which contains support\nfor all CPU families in order to select the right compilation pattern at runtime\nfor best performance. 
Once the build finishes, you build _greppin_ against that:\n\n(inside grab cloned repo)\n```\n$ cd src\n$ HYPERSCAN_BUILD=/path/to/hyperscan/build make -f Makefile.hs\n[...]\n```\n\nThis will produce a `greppin` binary that enables the `-H` option to load\na different engine at runtime, trying to exploit all possible performance bits.\n\nYou could link it against already installed libs, but the API just recently\nadded some functions in the 5.x version and most distros ship with 4.x.\n\n\nWhy is it faster?\n-----------------\n\n_grab_ uses `mmap(2)` and matches the whole file blob\nwithout counting newlines (which _grep_ is doing even if there is no match\n[as of a grep code review of mine in 2012; things may be different today]),\nwhich is a lot faster than `read(2)`-ing the file in small chunks and counting the\nnewlines. If available, _grab_ also uses the PCRE JIT feature.\nHowever, speedups are only measurable on large file trees or fast HDDs or SSDs.\nIn the latter case, the speedup can be really drastic (up to 3 times faster)\nif matching recursively and in parallel. Since storage is the bottleneck,\nparallelizing the search on HDDs makes no sense, as the seeking takes more time\nthan just reading the data linearly.\n\nAdditionally, _grab_ skips files which are too small to contain the\nregular expression. For larger regexes in a recursive search, this can\nskip quite a good number of files without even opening them.\n\nA fairly recent *pcre* lib is required; on some older systems the build can fail\ndue to a missing `PCRE_INFO_MINLENGTH` and `pcre_study()`.\n\nFiles are mmaped and matched in chunks of 1Gig. For files which are larger,\nthe last 4096 bytes (1 page) of a chunk are overlapped, so that matches on a 1 Gig\nboundary can be found. In this case, you see the match doubled (but with the\nsame offset).\n\nIf you measure _grep_ vs. 
_grab_, keep in mind to drop the dentry and page\ncaches between each run: `echo 3 \u003e /proc/sys/vm/drop_caches`\n\nNote that _grep_ will print only 'Binary file matches' if it detects binary\nfiles, while _grab_ will print all matches, unless `-s` is given. So, for a\nspeed test you have to search for an expression that *does not* exist in the data,\nin order to enforce searching of the entire files.\n\n_grab_ was made to quickly grep through large directory trees without indexing.\nThe original _grep_ has a far more complete option set. The speedup\nfor a single file match is very small, if at all measurable.\n\nFor SSDs, the multicore option makes sense. For HDDs it does not, since\nthe head has to be positioned back and forth between the threads, potentially\ndestroying the locality principle and killing performance.\n\nThe `greppin` branch features its own lockfree parallel version of `nftw()`, so the time\nthat N - 1 cores would spend idling while the 1st core builds the directory tree can also\nbe used for work.\n\nWhat's left to note: _grab_ will traverse directories _physically_, i.e. it will not follow\nsymlinks.\n\nspot\n----\n\n`spot` is the parallel version of `find`. It supports the most frequently used options as\nyou know them. 
There's not much more to tell about it, just try it out.\n\n\nExamples\n--------\n\nThis shows the speedup on a 4-core machine with a search on an SSD:\n\n\n```\nroot@linux:~# echo 3 \u003e /proc/sys/vm/drop_caches\nroot@linux:~# time grep -r foobardoesnotexist /source/linux\n\nreal\t0m34.811s\nuser\t0m3.710s\nsys\t0m10.936s\nroot@linux:~# echo 3 \u003e /proc/sys/vm/drop_caches\nroot@linux:~# time grab -r foobardoesnotexist /source/linux\n\nreal\t0m31.629s\nuser\t0m4.984s\nsys\t0m8.690s\nroot@linux:~# echo 3 \u003e /proc/sys/vm/drop_caches\nroot@linux:~# time grab -n 2 -r foobardoesnotexist /source/linux\n\nreal\t0m15.203s\nuser\t0m3.689s\nsys\t0m4.665s\nroot@linux:~# echo 3 \u003e /proc/sys/vm/drop_caches\nroot@linux:~# time grab -n 4 -r foobardoesnotexist /source/linux\n\nreal\t0m13.135s\nuser\t0m4.023s\nsys\t0m5.581s\n```\n\nWith the `greppin` branch:\n\n```\nroot@linux:~# echo 3 \u003e /proc/sys/vm/drop_caches\nroot@linux:~# time grep -a -P -r linus /source/linux/|wc -l\n16918\n\nreal    1m12.470s\nuser    0m49.548s\nsys     0m6.162s\nroot@linux:~# echo 3 \u003e /proc/sys/vm/drop_caches\nroot@linux:~# time greppin -n 4 -r linus /source/linux/|wc -l\n16918\n\nreal    0m8.773s\nuser    0m4.670s\nsys     0m5.837s\nroot@linux:~#\n```\n\nYes! ~ 9s vs. ~ 72s! 
That's 8x as fast on a 4-core SSD machine as the traditional grep.\n\nJust to prove that it resulted in the same output:\n\n```\nroot@linux:~# echo 3 \u003e /proc/sys/vm/drop_caches\nroot@linux:~# greppin -n 4 -r linus /source/linux/|sort|md5sum\na1f9fe635bd22575a4cce851e79d26a0  -\nroot@linux:~# echo 3 \u003e /proc/sys/vm/drop_caches\nroot@linux:~# grep -P -a -r linus /source/linux/|sort|md5sum\na1f9fe635bd22575a4cce851e79d26a0  -\nroot@linux:~#\n```\n\n\nIn the single core comparison, speedup also depends on which CPU the kernel\nactually schedules the _grep_ on, so a _grab_ may or may not be faster (mostly it is).\nIf the load is equal among the single-core tests, _grab_ will see a speedup if\nsearching on large file trees. On multi-core setups, _grab_ can benefit, of course.\n\n\nripgrep comparison\n------------------\n\nThe project can be found [here](https://github.com/BurntSushi/ripgrep).\n\nThe main speedup that's inside their benchmark tables stems from the fact that _ripgrep_\nignores a lot of files (notably dotfiles) when invoked without special options as well\nas treating binary files as a single-match target (similar to _grep_). In order to have\ncomparable results, keep in mind to (4 is the number of cores):\n\n* `echo 3 \u003e /proc/sys/vm/drop_caches` between each run\n* Add `-j 4 -a --no-unicode --no-pcre2-unicode -uuu --mmap` to _ripgrep_, since\n  it will by default match Unicode which is 3 times slower, and tries to compensate for\n  the speed loss by skipping 'ignore'-based files. `-e` is faster than `-P`,\n  so better choose `-e`, but that's not as powerful as a PCRE\n* redirect the output to `/dev/null` to avoid tty based effects\n* add `-H -n 4` to _greppin_ if you want best performance. `-H` is PCRE compatible\n  with only very few exceptions (according to the hyperscan documentation)\n* `setfattr -n user.pax.flags -v \"m\" /path/to/binary` if you run on grsec systems\n  and require rwx JIT mappings\n\nThen just go ahead and check the timings. 
Even when not using hyperscan, `greppin`\nis significantly faster than `rg` when using PCRE2 expressions (PCRE2 vs. PCRE2)\nand still faster when comparing the fastest expressions (-e vs. hyperscan).\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstealth%2Fgrab","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstealth%2Fgrab","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstealth%2Fgrab/lists"}