https://github.com/dkorunic/minifind
minimal find reimplementation with the emphasis on speed
https://github.com/dkorunic/minifind
command-line-tool filesystem find linux macos os performance performance-analysis system-administration unix utility
Last synced: about 1 year ago
JSON representation
minimal find reimplementation with the emphasis on speed
- Host: GitHub
- URL: https://github.com/dkorunic/minifind
- Owner: dkorunic
- License: mit
- Created: 2024-10-14T16:35:02.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-03-12T07:43:05.000Z (over 1 year ago)
- Last Synced: 2025-04-17T19:18:13.525Z (about 1 year ago)
- Topics: command-line-tool, filesystem, find, linux, macos, os, performance, performance-analysis, system-administration, unix, utility
- Language: Rust
- Homepage:
- Size: 45.9 KB
- Stars: 5
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# minifind
[](https://github.com/dkorunic/minifind/blob/master/LICENSE.txt)
[](https://github.com/dkorunic/minifind/releases/latest)
[](https://rust-reportcard.xuri.me/report/github.com/dkorunic/minifind)
[](https://github.com/dkorunic/minifind/actions/workflows/release.yml)
## About
`minifind` is a barebones Un\*x `find` tool implementation in Rust, meant just to list directory entries as fast as possible and little else. For filename or path matching, it is possible to use `--name` or `--regex` options, toggling case insensitivity with `--case-insensitive` or not. Additionally to narrow down matches, it is possible to use `--file-type` option and filter by file type (`b` for block device, `c` for character device, `d` for directory, `p` for named FIFO, `f` for file, `l` for symlink, `s` for socket or `e` for empty file/directory).
It will not follow filesystem symlinks and it will not cross filesystem boundaries by default. Number of threads used is set to the number of available CPU cores in the system.
## Related projects
Let us also mention other notable projects dealing with this task:
- [sharkdp/fd](https://github.com/sharkdp/fd) which is a much more featured find alternative but with excellent performance,
- [LyonSyonII/hunt-rs](https://github.com/LyonSyonII/hunt-rs), a very similar high performance-oriented tool,
- [BurntSushi/ripgrep](https://github.com/BurntSushi/ripgrep) which also houses [globset](https://github.com/BurntSushi/ripgrep/tree/master/crates/globset) and [ignore](https://github.com/BurntSushi/ripgrep/tree/master/crates/ignore) crates that are used in this project,
- Rust [findutils](https://github.com/uutils/findutils) reimplementation that can be used as drop-in replacement.
## Usage
```shell
Usage: minifind [OPTIONS] ...
Arguments:
... Paths to check for large directories
Options:
-f, --follow-symlinks Follow symlinks [default: false] [short aliases: L] [possible values: true, false]
-o, --one-filesystem Do not cross mount points [default: true] [aliases: xdev] [possible values: true, false]
-x, --threads Number of threads to use when calibrating and scanning [default: 20]
-d, --max-depth Maximum depth to traverse
-n, --name Base of the file name matching globbing pattern
-r, --regex File name (full path) matching regular expression pattern
-i, --case-insensitive Case-insensitive matching for globbing and regular expression patterns [default: false] [possible values: true, false]
-t, --file-type Filter matches by type. Also accepts 'b', 'c', 'd', 'p', 'f', 'l', 's' and 'e' aliases [default: directory file symlink]
[possible values: empty, block-device, char-device, directory, pipe, file, symlink, socket]
-h, --help Print help
-V, --version Print version
```
### Regular expressions
`--regex` option uses Rust [regex syntax](https://docs.rs/regex/latest/regex/#syntax) that is very similar to other engines but without support for look-around and backreferences.
### Glob expressions
`--name` option uses Unix-style [glob syntax](https://docs.rs/globset/latest/globset/#syntax).
## Minifind vs GNU find
Hardware: 8-core Xeon E5-1630 with 4-drive SATA RAID-10
Benchmark setup:
```shell
$ cat bench1.sh
#!/bin/dash
exec /usr/bin/find / -xdev
$ cat bench2.sh
#!/bin/dash
exec /usr/local/sbin/minifind /
```
```shell
Benchmark 1: ./bench1.sh
Time (mean ± σ): 4.655 s ± 0.160 s [User: 1.287 s, System: 3.366 s]
Range (min … max): 4.525 s … 5.016 s 10 runs
Benchmark 2: ./bench2.sh
Time (mean ± σ): 1.244 s ± 0.020 s [User: 3.921 s, System: 5.908 s]
Range (min … max): 1.199 s … 1.271 s 10 runs
Summary
./bench2.sh ran
3.74 ± 0.14 times faster than ./bench1.sh
```