{"id":13395193,"url":"https://github.com/phiresky/ripgrep-all","last_synced_at":"2025-05-13T11:01:57.146Z","repository":{"id":38305911,"uuid":"190254663","full_name":"phiresky/ripgrep-all","owner":"phiresky","description":"rga: ripgrep, but also search in PDFs, E-Books, Office documents, zip, tar.gz, etc.","archived":false,"fork":false,"pushed_at":"2025-04-27T08:05:48.000Z","size":9728,"stargazers_count":8722,"open_issues_count":48,"forks_count":187,"subscribers_count":44,"default_branch":"master","last_synced_at":"2025-05-05T20:49:50.088Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/phiresky.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2019-06-04T18:08:31.000Z","updated_at":"2025-05-05T18:38:59.000Z","dependencies_parsed_at":"2023-02-06T01:16:07.037Z","dependency_job_id":"ed176fb5-a95b-416f-8003-bb1ad6cd3343","html_url":"https://github.com/phiresky/ripgrep-all","commit_stats":{"total_commits":359,"total_committers":27,"mean_commits":"13.296296296296296","dds":0.245125348189415,"last_synced_commit":"4a73df37c9fe0f116cc99fe711d32ef614271586"},"previous_names":["phiresky/rga"],"tags_count":13,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/phiresky%2Fripgrep-all","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/phiresky%2Fripgrep-all/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/phiresky%2Fri
pgrep-all/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/phiresky%2Fripgrep-all/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/phiresky","download_url":"https://codeload.github.com/phiresky/ripgrep-all/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253621332,"owners_count":21937505,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-30T17:01:45.756Z","updated_at":"2025-05-13T11:01:56.666Z","avatar_url":"https://github.com/phiresky.png","language":"Rust","funding_links":[],"categories":["Rust","Literaturrecherche \u0026 Wissensaufbau","Applications","CLI","应用程序 Applications","Productivity","应用","others","Catalog","\u003ca name=\"core\"\u003e\u003c/a\u003ecore","\u003ca name=\"text-search\"\u003e\u003c/a\u003eText search (alternatives to grep)","Text Search","Command Line Tools","Other","Fuzzy Finding \u0026 Search","Table of Contents"],"sub_categories":["Text processing","文本处理 Text processing","Kubernetes","文本处理","search","Other","Tools"],"readme":"# rga: ripgrep, but also search in PDFs, E-Books, Office documents, zip, tar.gz, etc.\n\nrga is a line-oriented search tool that allows you to look for a regex in a multitude of file types. 
rga wraps the awesome [ripgrep] and enables it to search in pdf, docx, sqlite, jpg, movie subtitles (mkv, mp4), etc.\n\n[ripgrep]: https://github.com/BurntSushi/ripgrep\n\n[![github repo](https://img.shields.io/badge/repo-github.com%2Fphiresky%2Fripgrep--all-informational.svg)](https://github.com/phiresky/ripgrep-all)\n[![Crates.io](https://img.shields.io/crates/v/ripgrep-all.svg)](https://crates.io/crates/ripgrep-all)\n[![fearless concurrency](https://img.shields.io/badge/concurrency-fearless-success.svg)](https://www.reddit.com/r/rustjerk/top/?sort=top\u0026t=all)\n\nFor more detail, see this introductory blogpost: https://phiresky.github.io/blog/2019/rga--ripgrep-for-zip-targz-docx-odt-epub-jpg/\n\nrga will recursively descend into archives and match text in every file type it knows.\n\nHere is an [example directory](https://github.com/phiresky/ripgrep-all/tree/master/exampledir/demo) with different file types:\n\n```\ndemo/\n├── greeting.mkv\n├── hello.odt\n├── hello.sqlite3\n└── somearchive.zip\n├── dir\n│ ├── greeting.docx\n│ └── inner.tar.gz\n│ └── greeting.pdf\n└── greeting.epub\n```\n\n![rga output](doc/demodir.png)\n\n## Integration with fzf\n\n![rga-fzf](doc/rga-fzf.gif)\n\nSee [the wiki](https://github.com/phiresky/ripgrep-all/wiki/fzf-Integration) for instructions of integrating rga with fzf.\n\n## INSTALLATION\n\nLinux x64, macOS and Windows binaries are available [in GitHub Releases][latestrelease].\n\n[latestrelease]: https://github.com/phiresky/ripgrep-all/releases/latest\n\n### Linux\n\n#### Arch Linux\n\n`pacman -S ripgrep-all`\n\n#### Gentoo Linux\n\n`emerge sys-apps/ripgrep-all`\n\n#### Nix\n\n`nix-env -iA nixpkgs.ripgrep-all`\n\n#### Debian-based\n\ndownload the [rga binary][latestrelease] and get the dependencies like this:\n\n`apt install ripgrep pandoc poppler-utils ffmpeg`\n\nIf ripgrep is not included in your package sources, get it from [here](https://github.com/BurntSushi/ripgrep/releases).\n\nrga will search for all binaries it calls 
in \\$PATH and in the directory the rga binary itself is in.\n\n### Windows\n\nNote that installing via [chocolatey](https://chocolatey.org/packages/ripgrep-all) or [scoop](https://github.com/ScoopInstaller/Main/blob/master/bucket/rga.json) is the only supported download method. If you download the binary from releases manually, you will not get the dependencies (for example pdftotext from poppler).\n\nIf you get an error like `VCRUNTIME140.DLL could not be found`, you need to install [vc_redist.x64.exe](https://support.microsoft.com/en-us/help/2977003/the-latest-supported-visual-c-downloads).\n\n#### Chocolatey\n\n```\nchoco install ripgrep-all\n```\n\n#### Scoop\n\n```\nscoop install rga\n```\n\n### Homebrew/Linuxbrew\n\n`rga` can be installed with [Homebrew](https://formulae.brew.sh/formula/ripgrep-all#default):\n\n`brew install rga`\n\nTo install the optional but very useful dependencies:\n\n`brew install pandoc poppler ffmpeg`\n\n### MacPorts\n\n`rga` can also be installed on macOS via [MacPorts](https://ports.macports.org/port/ripgrep-all/):\n\n`sudo port install ripgrep-all`\n\n### Compile from source\n\nrga should compile with stable Rust (v1.75.0+, check with `rustc --version`). To build it, run the following (or the equivalent in your OS):\n\n```\n~$ apt install build-essential pandoc poppler-utils ffmpeg ripgrep cargo\n~$ cargo install --locked ripgrep_all\n~$ rga --version    # this should work now\n```\n\n## Available Adapters\n\nrga works with _adapters_ that adapt various file formats. It comes with a few adapters integrated:\n\n```\nrga --rga-list-adapters\n```\n\nYou can also add **custom adapters**. 
See [the wiki](https://github.com/phiresky/ripgrep-all/wiki) for more information.\n\n\u003c!-- this part generated by update-readme.sh --\u003e\n\nAdapters:\n\n- **pandoc**\n  Uses pandoc to convert binary/unreadable text documents to plain markdown-like text\n  Runs: pandoc --from= --to=plain --wrap=none --markdown-headings=atx  \n   Extensions: .epub, .odt, .docx, .fb2, .ipynb, .html, .htm\n\n- **poppler**\n  Uses pdftotext (from poppler-utils) to extract plain text from PDF files\n  Runs: pdftotext - -  \n   Extensions: .pdf  \n   Mime Types: application/pdf\n\n- **postprocpagebreaks**\n  Adds the page number to each line for an input file that specifies page breaks as ascii page break character.\n  Mainly to be used internally by the poppler adapter.  \n   Extensions: .asciipagebreaks\n\n- **ffmpeg**\n  Uses ffmpeg to extract video metadata/chapters, subtitles, lyrics, and other metadata  \n   Extensions: .mkv, .mp4, .avi, .mp3, .ogg, .flac, .webm\n\n- **zip**\n  Reads a zip file as a stream and recurses down into its contents  \n   Extensions: .zip, .jar  \n   Mime Types: application/zip\n\n- **decompress**\n  Reads compressed file as a stream and runs a different extractor on the contents.  \n   Extensions: .als, .bz2, .gz, .tbz, .tbz2, .tgz, .xz, .zst  \n   Mime Types: application/gzip, application/x-bzip, application/x-xz, application/zstd\n\n- **tar**\n  Reads a tar file as a stream and recurses down into its contents  \n   Extensions: .tar\n\n- **sqlite**\n  Uses sqlite bindings to convert sqlite databases into a simple plain text format  \n   Extensions: .db, .db3, .sqlite, .sqlite3  \n   Mime Types: application/x-sqlite3\n\nThe following adapters are disabled by default, and can be enabled using '--rga-adapters=+foo,bar':\n\n- **mail**\n  Reads mailbox/mail files and runs extractors on the contents and attachments.  
\n   Extensions: .mbox, .mbx, .eml  \n   Mime Types: application/mbox, message/rfc822\n\n## USAGE:\n\n\u003e rga \\[RGA OPTIONS\\] \\[RG OPTIONS\\] PATTERN \\[PATH \\...\\]\n\n\n## FLAGS:\n\n**\\--rga-accurate**\n\n\u003e Use more accurate but slower matching by mime type\n\n\u003e By default, rga will match files using file extensions. Some programs,\n\u003e such as sqlite3, don\\'t care about the file extension at all, so users\n\u003e sometimes use any or no extension at all. With this flag, rga will try\n\u003e to detect the mime type of input files using the magic bytes (similar\n\u003e to the \\`file\\` utility), and use that to choose the adapter.\n\u003e Detection is only done on the first 8KiB of the file, since we can\\'t\n\u003e always seek on the input (in archives).\n\n**\\--rga-no-cache**\n\n\u003e Disable caching of results\n\n\u003e By default, rga caches the extracted text, if it is small enough, to a\n\u003e database in \\${XDG_CACHE_DIR-\\~/.cache}/ripgrep-all on Linux,\n\u003e _\\~/Library/Caches/ripgrep-all_ on macOS, or\n\u003e C:\\\\Users\\\\username\\\\AppData\\\\Local\\\\ripgrep-all on Windows. This way,\n\u003e repeated searches on the same set of files will be much faster. If you\n\u003e pass this flag, all caching will be disabled.\n\n**-h**, **\\--help**\n\n\u003e Prints help information\n\n**\\--rga-list-adapters**\n\n\u003e List all known adapters\n\n**\\--rga-print-config-schema**\n\n\u003e Print the JSON Schema of the configuration file\n\n**\\--rg-help**\n\n\u003e Show help for ripgrep itself\n\n**\\--rg-version**\n\n\u003e Show version of ripgrep itself\n\n**-V**, **\\--version**\n\n\u003e Prints version information\n\n## OPTIONS:\n\n**\\--rga-adapters=**\\\u003cadapters\\\u003e\\...\n\n\u003e Change which adapters to use and in which priority order (descending)\n\n\u003e \\\"foo,bar\\\" means use only adapters foo and bar. \\\"-bar,baz\\\" means\n\u003e use all default adapters except for bar and baz. 
\\\"+bar,baz\\\" means\n\u003e use all default adapters and also bar and baz.\n\n**\\--rga-cache-compression-level=**\\\u003ccompression-level\\\u003e\n\n\u003e ZSTD compression level to apply to adapter outputs before storing in\n\u003e cache db\n\n\u003e Ranges from 1 - 22 \\[default: 12\\]\n\n**\\--rga-config-file=**\\\u003cconfig-file-path\\\u003e\n\n**\\--rga-max-archive-recursion=**\\\u003cmax-archive-recursion\\\u003e\n\n\u003e Maximum nestedness of archives to recurse into \\[default: 5\\]\n\n**\\--rga-cache-max-blob-len=**\\\u003cmax-blob-len\\\u003e\n\n\u003e Max compressed size to cache\n\n\u003e Longest byte length (after compression) to store in cache. Longer\n\u003e adapter outputs will not be cached and recomputed every time.\n\n\u003e Allowed suffixes on command line: k M G \\[default: 2000000\\]\n\n**\\--rga-cache-path=**\\\u003cpath\\\u003e\n\n\u003e Path to store cache db \\[default: /home/phire/.cache/ripgrep-all\\]\n\n**-h** shows a concise overview, **\\--help** shows more detail and\nadvanced options.\n\nAll other options not shown here are passed directly to rg, especially\n\\[PATTERN\\] and \\[PATH \\...\\]\n\n\u003c!-- end of part generated by update-readme.sh --\u003e\n\n## Config\nThe config file location leverage the mechanisms defined by\n- the [XDG base directory](https://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html) and\n  the [XDG user directory](https://www.freedesktop.org/wiki/Software/xdg-user-dirs/) specifications on Linux (ex: `~/.config/ripgrep-all/config.jsonc`)\n- the [Known Folder](https://msdn.microsoft.com/en-us/library/windows/desktop/dd378457.aspx) API on Windows (ex:  `C:\\Users\\Alice\\AppData\\Roaming\\ripgrep-all/config.jsonc`)\n- the [Standard Directories](https://developer.apple.com/library/content/documentation/FileManagement/Conceptual/FileSystemProgrammingGuide/FileSystemOverview/FileSystemOverview.html#//apple_ref/doc/uid/TP40010672-CH2-SW6)\n  guidelines on macOS (ex: `~/Library/Application 
Support/ripgrep-all/config.jsonc`)\n\n\n## Development\n\nTo enable debug logging:\n\n```bash\nexport RUST_LOG=debug\nexport RUST_BACKTRACE=1\n```\n\nAlso remember to disable caching with `--rga-no-cache` or clear the cache\n(`~/Library/Caches/rga` on macOS, `~/.cache/rga` on other Unixes,\nor `C:\\Users\\username\\AppData\\Local\\rga` on Windows)\nto debug the adapters.\n\n### Nix and Direnv\n\nYou can use the provided [`flake.nix`](./flake.nix) to set up all build- and\nrun-time dependencies:\n\n1. Enable [Flakes](https://wiki.nixos.org/wiki/Flakes) in your Nix configuration.\n1. Add [`direnv`](https://direnv.net/) to your profile:\n   `nix profile install nixpkgs#direnv`\n1. `cd` into the directory where you have cloned this repository.\n1. Allow use of [`.envrc`](./.envrc): `direnv allow`\n1. After the dependencies have been installed, your shell will now have all of\n   the necessary development dependencies.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fphiresky%2Fripgrep-all","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fphiresky%2Fripgrep-all","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fphiresky%2Fripgrep-all/lists"}