{"id":26518683,"url":"https://github.com/davidesantangelo/krep","last_synced_at":"2026-02-18T17:04:40.176Z","repository":{"id":281610467,"uuid":"944991825","full_name":"davidesantangelo/krep","owner":"davidesantangelo","description":"Fast text search tool with advanced algorithms, SIMD acceleration, multi-threading, and regex support. Designed for rapid, large-scale pattern matching with memory-mapped I/O and hardware optimizations.","archived":false,"fork":false,"pushed_at":"2026-01-23T10:12:55.000Z","size":602,"stargazers_count":432,"open_issues_count":4,"forks_count":22,"subscribers_count":6,"default_branch":"main","last_synced_at":"2026-01-24T02:59:13.155Z","etag":null,"topics":["c","cli","hardware-acceleration","search-algorithm","searching"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/davidesantangelo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"github":"davidesantangelo","patreon":null,"open_collective":null,"ko_fi":null,"tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"lfx_crowdfunding":null,"polar":null,"buy_me_a_coffee":null,"thanks_dev":null,"custom":null}},"created_at":"2025-03-08T12:01:29.000Z","updated_at":"2026-01-23T23:29:04.000Z","dependencies_parsed_at":"2025-03-10T08:06:33.097Z","dependency_job_id":"6f955cf7-47e4-449e-8c91-da41ceb23af4","html_url":"https://github.com/davidesantangelo/krep","commit_stats":nul
l,"previous_names":["davidesantangelo/krep"],"tags_count":39,"template":false,"template_full_name":null,"purl":"pkg:github/davidesantangelo/krep","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davidesantangelo%2Fkrep","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davidesantangelo%2Fkrep/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davidesantangelo%2Fkrep/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davidesantangelo%2Fkrep/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/davidesantangelo","download_url":"https://codeload.github.com/davidesantangelo/krep/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/davidesantangelo%2Fkrep/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29587066,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-18T16:55:40.614Z","status":"ssl_error","status_checked_at":"2026-02-18T16:55:37.558Z","response_time":162,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["c","cli","hardware-acceleration","search-algorithm","searching"],"created_at":"2025-03-21T10:01:31.225Z","updated_at":"2026-02-18T17:04:40.168Z","avatar_url":"https://github.com/davidesantangelo.png","language":"C","readme":"# k(r)ep - A high-performance string search utility\n\n![Version](https://img.shields.io/badge/version-2.2.0-blue)\n![License](https://img.shields.io/badge/license-BSD-green)\n\n`krep` is an optimized string search utility designed for maximum throughput and efficiency when processing large files and directories. It is built with performance in mind, offering multiple search algorithms and SIMD acceleration when available.\n\n\u003e **Note:**  \n\u003e Krep is not intended to be a full replacement or direct competitor to feature-rich tools like `grep` or `ripgrep`. Instead, it aims to be a minimal, efficient, and pragmatic tool focused on speed and simplicity.\n\u003e\n\u003e Krep provides the essential features needed for fast searching, without the extensive options and complexity of more comprehensive search utilities. Its design philosophy is to deliver the fastest possible search for the most common use cases, with a clean and minimal interface.\n\n## The Story Behind the Name\n\nThe name \"krep\" has an interesting origin. 
It is inspired by the Icelandic word \"kreppan,\" which means \"to grasp quickly\" or \"to catch firmly.\" I came across this word while researching efficient techniques for pattern recognition.\n\nJust as skilled fishers identify patterns in the water to locate fish quickly, I designed \"krep\" to find patterns in text with maximum efficiency. The name is also short and easy to remember—perfect for a command-line utility that users might type hundreds of times per day.\n\n## Key Features\n\n- **Multiple search algorithms**: Boyer-Moore-Horspool, KMP, Aho-Corasick for optimal performance across different pattern types\n- **Algorithm selection**: Automatic smart selection with optional `--algo` override for fine-tuning\n- **SIMD acceleration**: Uses SSE4.2, AVX2, or NEON instructions when available for blazing-fast searches\n- **Memory-mapped I/O**: Maximizes throughput when processing large files\n- **Multi-threaded search**: Automatically parallelizes searches across available CPU cores\n- **Regex support**: POSIX Extended Regular Expression searching\n- **Multiple pattern search**: Efficiently search for multiple patterns simultaneously using Aho-Corasick\n- **Recursive directory search**: Skip binary files and common non-code directories\n- **Gitignore support**: Respect `.gitignore` files during recursive search with `--gitignore`\n- **Stdin pattern input**: Read patterns from stdin with `-f -` for seamless pipeline integration\n- **Colored output**: Highlights matches for better readability\n- **UI/UX refresh (v2.2)**: New terminal color palette, clearer `-o` line index styling, and a redesigned help screen\n- **Specialized algorithms**: Optimized handling for single-character and short patterns\n- **Match Limiting**: Stop searching a file after a specific number of matching lines are found.\n\n## What's New in v2.2.0\n\n- Refined terminal-first UI with a cleaner, more legible color theme\n- Better visual hierarchy in `-o` mode (filename, line index, match 
highlight)\n- Improved `--help` layout with grouped sections and clearer scanning\n\n## Installation\n\n### Using Homebrew (macOS)\n\nIf you are on macOS and have Homebrew installed, you can install `krep` easily:\n\n```bash\nbrew install krep\n```\n\n### Building from Source\n\n```bash\n# Clone the repository\ngit clone https://github.com/davidesantangelo/krep.git\ncd krep\n\n# Build and install\nmake\nsudo make install\n\n# uninstall\nsudo make uninstall\n```\n\nThe binary will be installed to `/usr/local/bin/krep` by default.\n\n### Requirements\n\n- GCC or compatible C compiler\n- POSIX-compliant system (Linux, macOS, BSD)\n- pthread support\n\n### Build Options\n\nOverride default optimization settings in the Makefile:\n\n```bash\n# Disable architecture-specific optimizations\nmake ENABLE_ARCH_DETECTION=0\n```\n\n## Usage\n\n```bash\nkrep [OPTIONS] PATTERN [FILE | DIRECTORY]\nkrep [OPTIONS] -e PATTERN [FILE | DIRECTORY]\nkrep [OPTIONS] -f FILE [FILE | DIRECTORY]\nkrep [OPTIONS] -s PATTERN STRING_TO_SEARCH\nkrep [OPTIONS] PATTERN \u003c FILE\ncat FILE | krep [OPTIONS] PATTERN\necho 'pattern' | krep -f - [FILE | DIRECTORY]\n```\n\n## Usage Examples\n\nSearch for a fixed string in a file:\n\n```bash\nkrep -F \"value: 100%\" config.ini\n```\n\nSearch recursively:\n\n```bash\nkrep -r \"function\" ./project\n```\n\nSearch recursively respecting `.gitignore`:\n\n```bash\nkrep -r --gitignore \"TODO\" ./project\n```\n\nRead patterns from stdin (pipe-friendly):\n\n```bash\necho 'pattern' | krep -f - target.txt\n```\n\nWhole word search (matches only complete words):\n\n```bash\nkrep -w 'cat' samples/text.en\n```\n\nUse with piped input:\n\n```bash\ncat krep.c | krep 'c'\n```\n\n## Command Line Options\n\n- `-i, --ignore-case` Case-insensitive search\n- `-c, --count` Count matching lines only\n- `-o, --only-matching` Print only the matched parts of lines\n- `-e PATTERN, --pattern=PATTERN` Specify pattern(s). 
Can be used multiple times.\n- `-f FILE, --file=FILE` Read patterns from FILE, one per line. Use `-` for stdin.\n- `-m NUM, --max-count=NUM` Stop searching each file after finding NUM matching lines.\n- `-E, --extended-regexp` Use POSIX Extended Regular Expressions\n- `-F, --fixed-strings` Interpret pattern as fixed string(s) (default unless -E is used)\n- `-r, --recursive` Recursively search directories\n- `--gitignore` Respect `.gitignore` files during recursive search\n- `--algo=ALGO` Force search algorithm: `auto` (default), `bm` (Boyer-Moore), `kmp` (KMP)\n- `-t NUM, --threads=NUM` Use NUM threads for file search (default: auto)\n- `-s STRING, --string=STRING` Search in the provided STRING instead of file(s)\n- `-w, --word-regexp` Match only whole words\n- `--color[=WHEN]` Control color output ('always', 'never', 'auto')\n- `--no-simd` Explicitly disable SIMD acceleration\n- `-v, --version` Show version information\n- `-h, --help` Show help message\n\n## Performance Benchmarks\n\nBenchmarks are run with the official dataset:\n\n```bash\ncurl -LO 'https://burntsushi.net/stuff/subtitles2016-sample.en.gz'\ngzip -dk subtitles2016-sample.en.gz\n```\n\nYou can reproduce the `krep` vs `ripgrep` comparison with:\n\n```bash\nmake bench-rg\n# optional: RUNS=7 bash test/benchmark_krep_vs_rg.sh Sherlock\n```\n\n### krep v2.1.0 vs ripgrep (warm cache, 7 runs average baseline)\n\n| Pattern | krep avg real (s) | ripgrep avg real (s) | Speedup |\n| --- | ---: | ---: | ---: |\n| `the` | 0.175714 | 0.330000 | 1.88x |\n| `Sherlock` | 0.041429 | 0.080000 | 1.93x |\n\n_Measured on macOS ARM64 with `test/benchmark_krep_vs_rg.sh`. Results vary by CPU, storage and cache state._\n\n## How Krep Works\n\nKrep achieves its high performance through several key techniques:\n\n### 1. 
Smart Algorithm Selection\n\nKrep automatically selects the optimal search algorithm based on the pattern and available hardware:\n\n- **Boyer-Moore-Horspool** for most literal string searches\n- **Knuth-Morris-Pratt (KMP)** for very short patterns and repetitive patterns\n- **memchr optimization** for single-character patterns\n- **SIMD Acceleration** (SSE4.2, AVX2, or NEON) for compatible hardware\n- **Regex Engine** for regular expression patterns\n- **Aho-Corasick** for efficient multiple pattern matching (auto-selected with multiple `-e` patterns)\n\nUse `--algo=bm` or `--algo=kmp` to override the automatic selection for single-pattern literal searches.\n\n### 2. Multi-threading Architecture\n\nKrep utilizes parallel processing to dramatically speed up searches:\n\n- Automatically detects available CPU cores\n- Divides large files into chunks for parallel processing\n- Implements thread pooling for maximum efficiency\n- Optimized thread count selection based on file size\n- Careful boundary handling to ensure no matches are missed\n\n### 3. Memory-Mapped I/O\n\nInstead of traditional read operations:\n\n- Memory maps files for direct access by the CPU\n- Significantly reduces I/O overhead\n- Enables CPU cache optimization\n- Progressive prefetching for larger files\n\n### 4. Optimized Data Structures\n\n- Zero-copy architecture where possible\n- Efficient match position tracking\n- Lock-free aggregation of results\n\n### 5. Skipping Non-Relevant Content\n\nWhen using recursive search (`-r`), Krep automatically:\n\n- Skips common binary file types\n- Ignores version control directories (`.git`, `.svn`)\n- Bypasses dependency directories (`node_modules`, `venv`)\n- Detects binary content to avoid searching non-text files\n\n## Contributing\n\nContributions are welcome! 
Please feel free to submit a Pull Request.\n\n## Author\n\n- **Davide Santangelo** - [GitHub](https://github.com/davidesantangelo)\n\n## License\n\nThis project is licensed under the BSD-2-Clause License - see the LICENSE file for details.\n\nCopyright © 2025 Davide Santangelo\n","funding_links":["https://github.com/sponsors/davidesantangelo"],"categories":["C","\u003ca name=\"text-search\"\u003e\u003c/a\u003eText search (alternatives to grep)"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdavidesantangelo%2Fkrep","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdavidesantangelo%2Fkrep","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdavidesantangelo%2Fkrep/lists"}