{"id":50766781,"url":"https://github.com/dkorunic/bench_walk","last_synced_at":"2026-06-11T14:30:31.056Z","repository":{"id":362974116,"uuid":"871266587","full_name":"dkorunic/bench_walk","owner":"dkorunic","description":"benchmark suite for Rust filesystem walking crates","archived":false,"fork":false,"pushed_at":"2026-06-06T19:02:56.000Z","size":201,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-06-06T21:05:37.261Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dkorunic.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-10-11T15:43:14.000Z","updated_at":"2026-06-06T19:03:00.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/dkorunic/bench_walk","commit_stats":null,"previous_names":["dkorunic/bench_walk"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/dkorunic/bench_walk","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dkorunic%2Fbench_walk","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dkorunic%2Fbench_walk/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dkorunic%2Fbench_walk/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dkorunic%2Fbench_walk/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dkorunic","download_url":"https://codeload.github.com/dkorunic/bench_walk/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dkorunic%2Fbench_walk/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34204177,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-11T02:00:06.485Z","response_time":57,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-06-11T14:30:29.445Z","updated_at":"2026-06-11T14:30:31.049Z","avatar_url":"https://github.com/dkorunic.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# bench_walk\n\n## About\n\nThis is a small benchmarking project that compares different Rust directory-walking crates, namely:\n\n- [fts_walkdir](https://crates.io/crates/fts)\n- [walkdir](https://crates.io/crates/walkdir)\n- [walkdir_minimal](https://crates.io/crates/walkdir_minimal)\n- [isideload-walkdir](https://crates.io/crates/isideload-walkdir)\n- [walker](https://crates.io/crates/walker)\n- [ignore](https://crates.io/crates/ignore)\n- [jwalk](https://crates.io/crates/jwalk)\n- [fs-walk](https://crates.io/crates/fs-walk)\n- [scandir](https://crates.io/crates/scandir)\n- [swdir](https://crates.io/crates/swdir)\n- [fsindex](https://crates.io/crates/fsindex)\n- [async-walkdir](https://crates.io/crates/async-walkdir)\n\nThe system `find` command (GNU findutils) is also benchmarked as a non-crate baseline.\n\nAll crates are tested against the Linux kernel Git repository, which is checked out locally during the benchmark run. Each crate is exercised in single-threaded mode, and additionally in multi-threaded mode where supported (`ignore`, `jwalk`, `scandir`, `swdir` and `fsindex` offer parallel traversal; `scandir` and `swdir` traverse in parallel internally and so appear only in the multi-threaded group).\n\nEvery crate is configured for the fastest possible *full-tree* traversal: sorting is disabled (`jwalk`, `fs-walk` and `swdir` default to unsorted; `swdir` is pinned to raw filesystem order), optional per-entry metadata is not requested (`fts_walkdir` uses `no_metadata`, `scandir` runs without extended metadata and without retaining entries), symlink-following is left off, and any default filtering that would skip part of the tree is removed so all crates walk the same set of files. Concretely: the default `skip_hidden` is turned off for both `jwalk` and `scandir`, `swdir`'s default hidden-file filter is cleared, and `fsindex` is told to ignore `.gitignore`, include hidden files and skip reading file contents (so it measures traversal rather than I/O). Note that `fsindex` is itself built on top of `ignore` and yields only files (not directories), while computing per-file metadata.\n\nThe benchmark suite uses [Criterion](https://github.com/bheisler/criterion.rs) for statistically rigorous measurement, and is intended as a real-life comparison between the different walking implementations.\n\n## Results\n\n### Hardware\n\nLow-end server, 8-core Xeon E5-1630 v3, 4-drive SATA RAID-10 with ext4 filesystem.\n\nMeasured with Criterion (80 s warm-up, 400 s measurement per benchmark) against a shallow (`--depth 1`) clone of the mainline Linux kernel tree. The filesystem cache is warm, so these numbers reflect in-memory traversal cost rather than cold-cache disk seeks.\n\n### Duration report\n\nBenchmarks are split into two groups for comparison: `bench_serial` (single-threaded) and `bench_parallel` (multi-threaded).\n\nNumbers below are the best estimate plus the 95% confidence interval reported by Criterion, sorted from fastest to slowest within each group.\n\n| crate                                      | lower bound | best estimate | upper bound |\n| ------------------------------------------ | ----------- | ------------- | ----------- |\n| bench_serial/walkdir                       | 68.161 ms   | 68.393 ms     | 68.628 ms   |\n| bench_serial/walkdir_minimal               | 69.529 ms   | 69.769 ms     | 70.022 ms   |\n| bench_serial/ignore (serial unsorted)      | 73.172 ms   | 73.457 ms     | 73.754 ms   |\n| bench_serial/jwalk (serial unsorted)       | 76.746 ms   | 77.011 ms     | 77.300 ms   |\n| bench_serial/fts_walkdir                   | 81.297 ms   | 81.623 ms     | 81.963 ms   |\n| bench_serial/find                          | 90.423 ms   | 90.621 ms     | 90.829 ms   |\n| bench_serial/walker                        | 219.13 ms   | 219.61 ms     | 220.13 ms   |\n| bench_serial/fsindex (serial)              | 222.58 ms   | 223.38 ms     | 224.26 ms   |\n| bench_serial/isideload-walkdir             | 340.98 ms   | 341.86 ms     | 342.79 ms   |\n| bench_serial/fs_walk (serial unsorted)     | 446.17 ms   | 446.82 ms     | 447.57 ms   |\n| bench_serial/async-walkdir (block_on)      | 1.5417 s    | 1.5435 s      | 1.5452 s    |\n| bench_parallel/ignore (n threads unsorted) | 24.714 ms   | 24.728 ms     | 24.742 ms   |\n| bench_parallel/swdir (n threads)           | 25.209 ms   | 25.228 ms     | 25.247 ms   |\n| bench_parallel/jwalk (n threads, unsorted) | 26.332 ms   | 26.349 ms     | 26.366 ms   |\n| bench_parallel/scandir (n threads)         | 64.342 ms   | 64.382 ms     | 64.423 ms   |\n| bench_parallel/fsindex (parallel)          | 123.56 ms   | 123.96 ms     | 124.38 ms   |\n\n### Analysis\n\n**Single-threaded.** The results fall into three tiers. The `readdir`-based walkers cluster tightly at the top: `walkdir` (68 ms) and `walkdir_minimal` (70 ms) are fastest, followed closely by `ignore` (73 ms) and `jwalk` in serial mode (77 ms). All four read each directory once and use the `d_type` field returned by `readdir(3)` to tell files from directories, so they recurse without a single extra `stat`. `fts_walkdir` (82 ms) wraps the C `fts(3)` routines through FFI and pays a small crossing cost, and the `find` baseline (91 ms) additionally pays process-spawn and output-formatting overhead.\n\nThe second tier is several times slower because each crate issues a syscall *per entry* that the leaders avoid. `walker` (220 ms) calls `std::fs::metadata()` on every path to decide whether to recurse, turning one `readdir` per directory into one `readdir` plus one `stat` per entry. `fsindex` (223 ms) is built on top of `ignore`, but it `stat`s every file to populate its metadata struct and yields only files. `isideload-walkdir` (342 ms) is a `walkdir` fork that routes all I/O through the `isideload_vfs` virtual-filesystem abstraction and takes per-entry metadata — portability traded for raw speed. `fs_walk` (447 ms) calls both `is_symlink()` and `is_dir()` (up to two `stat`s) per entry and allocates a `PathBuf` for each.\n\n`async-walkdir` (1.54 s) is in a tier of its own. It is an async `Stream`, so driving it from synchronous benchmark code via `block_on` adds per-item executor scheduling and blocking-I/O thread-pool hand-off on top of the traversal itself; the number reflects that harness cost, not pure walking.\n\n**Multi-threaded.** On 8 physical cores the three work-stealing walkers are essentially tied at the top — `ignore` (24.7 ms), `swdir` (25.2 ms) and `jwalk` (26.3 ms) — each roughly **2.7× faster** than the quickest serial walker. The speedup is sublinear because directory traversal is dominated by I/O and syscall latency rather than CPU, so cores spend much of their time waiting. `scandir` (64 ms) also walks in parallel (via a `jwalk` fork) but streams results through a channel to a background thread and aggregates them into a table-of-contents, and that coordination overhead leaves it ~2.5× behind the leaders. `fsindex` (124 ms) parallelizes only the per-file metadata work; its actual directory walk is the serial `ignore` traversal, and it still `stat`s every file, so it benefits least from the extra cores.\n\n**Takeaway.** For plain recursive traversal, prefer a `readdir`/`d_type` walker (`walkdir` serially, `ignore` or `jwalk` in parallel). Crates that `stat` every entry, add an abstraction layer, or read file contents pay for it linearly in the number of entries — useful features when you need them, but measurable overhead when you only want the paths.\n\n### Graphs\n\n![](bench_serial_violin.svg)\n![](bench_parallel_violin.svg)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdkorunic%2Fbench_walk","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdkorunic%2Fbench_walk","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdkorunic%2Fbench_walk/lists"}