{"id":19783241,"url":"https://github.com/psychedelicshayna/jw","last_synced_at":"2025-09-04T13:36:27.070Z","repository":{"id":255802064,"uuid":"853633965","full_name":"PsychedelicShayna/jw","owner":"PsychedelicShayna","description":" Blazingly fast CLI filesystem traverser and multithreaded mass file hasher / hash index generator, with diff support to validate hashes and track changes, powered by jwalk and xxh3, and of course, Rust! ","archived":false,"fork":false,"pushed_at":"2024-11-14T12:56:21.000Z","size":188,"stargazers_count":37,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-31T12:09:24.408Z","etag":null,"topics":["cli","command-line","diff","directory-traversal","file-integrity","files","hashing","minimal","terminal","walker","xxhash"],"latest_commit_sha":null,"homepage":"https://crates.io/crates/jw","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/PsychedelicShayna.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-09-07T05:02:15.000Z","updated_at":"2024-12-30T12:39:05.000Z","dependencies_parsed_at":"2024-09-07T06:47:50.549Z","dependency_job_id":"6b2c869e-43fe-45d0-9fa6-0ec8491a53de","html_url":"https://github.com/PsychedelicShayna/jw","commit_stats":{"total_commits":50,"total_committers":2,"mean_commits":25.0,"dds":"0.020000000000000018","last_synced_commit":"aa25979ea6bf9cfe31dccef0c55d319156aa3f10"},"previous_names":["psychedelicshayna/jw"],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/repositories/PsychedelicShayna%2Fjw","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PsychedelicShayna%2Fjw/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PsychedelicShayna%2Fjw/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PsychedelicShayna%2Fjw/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/PsychedelicShayna","download_url":"https://codeload.github.com/PsychedelicShayna/jw/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247666015,"owners_count":20975788,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cli","command-line","diff","directory-traversal","file-integrity","files","hashing","minimal","terminal","walker","xxhash"],"created_at":"2024-11-12T06:07:44.941Z","updated_at":"2025-04-07T14:15:41.683Z","avatar_url":"https://github.com/PsychedelicShayna.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# jw - Jwalk CLI Frontend\n\nAre you frustrated with tools like `find`, `fd`, `erd`, `lsd`, `legdur` and others that seem to excel in some areas but fall short in others? I was too, so I built a solution that prioritizes speed and simplicity above all else. The design philosophy of modern tools have a tendency to stray away from the original Linux philosophy of each command doing a single thing, and doing it very well, instead opting to cram as many features in as possible. 
\n\nThis isn't necessarily a bad thing, I enjoy those features, but there are many times where I simply want to grep every single path from the root of my drive, and that's when those abstractions start backfiring. All the additional rendering tanks performance, the colorized output sometimes messes up your regex, you pipe it to Neovim and are met with a clusterfuck of ANSI escape codes. Higher level languages that are easier to make pretty CLI/TUIs with being single threaded, the creator never anticipating that someone would feed a terabyte of data to it, and output immediately starts getting dumped to the terminal creating massive I/O bottlenecks... **enough**\n\nSometimes you just need to take a page out of the Sesto Elemento's book.\n\n## What is jw exactly?\njw is a command line frontend for [jwalk](https://github.com/byron/jwalk), a blazingly fast filesystem traversal library. While jwalk itself provides unparalleled performance in recursively traversing directories, it lacks a CLI, so I created jw to fill that gap. 
This utility leverages the power of jwalk to allow you to efficiently sift through directories containing a massive number of files, with a focus on raw performance and minimal abstraction.\n\nIt also doubles as a way to hash a very large number of files, thanks to the insanely fast [xxHash](https://github.com/Cyan4973/xxHash) algorithm; jwalk and xxh3 go together like bread and butter.\n\nRather than fancy colorized outputs, TUIs, gathering statistics, etc, jw sticks to the essentials, providing the raw performance without any of the bloat.\n\nIt simply gives you the raw output as fast as possible, for you to pipe to other utilities, such as ripgrep/grep, xargs, fzf, and the like, with no additional nonsense.\n\n\nhttps://github.com/user-attachments/assets/9f4a3cf5-4dfa-4a57-845b-a26ded3f660a\n\n\n\nhttps://github.com/user-attachments/assets/f27bda63-a97f-441f-be86-2514fdc64d37\n\n\n## Performance\n\nTo give you a rough idea of the performance, JWalk was capable of traversing through 492 GB worth of files in **3 seconds**. That's all it takes, three seconds and you can already grep for file paths.\n\nAs for Xxh3 combined with JWalk, it was capable of hashing 7.2GB across more than 10,000 files, in **500 milliseconds**. Yes, it's that fast. Stupid fast.\n\nThe SHA2 family and MD5 are also supported but that's only there for compatibility.\n\n### A Personal Request\nMaking Rust go fast is a different beast than making C++ go fast. A lot of the techniques that came to mind when trying to squeeze even more performance out of this utility simply don't apply to Rust without breaking the spirit of the language. I'm not a Rust wizard, there's a lot I still don't know. However, I know for a fact that `jw` could run even faster. This [article proving that an optimization \"impossible\" in Rust, is possible in Rust](https://tunglevo.com/note/an-optimization-thats-impossible-in-rust/) is a prime example of how Rust has its own flavor of black magic I've yet to grasp. 
I welcome any and all PRs, it's a much appreciated learning experience. By all means, if you spot a way to make it faster, don't hesitate to make a PR, I'd love to learn, even if it's just shaving off a few milliseconds.\n\nThe main aspiration I have for `jw` is **speed** above all else, both traversal and hashing, but especially hashing.\n\n\nhttps://github.com/user-attachments/assets/2db684a0-a6f6-4416-a2fc-4b65c0da5963\n\n\n\nhttps://github.com/user-attachments/assets/1ecdfc70-8233-4fdb-b75d-00d3c7ca22a5\n\n\n\nhttps://github.com/user-attachments/assets/9d959641-2fcd-41bc-b397-2d7098d59174\n\n\n\n\n## Usage\n\n```\nA CLI frontend to jwalk for blazingly fast filesystem traversal!\n\nUsage: jw [OPTIONS] [directories]...\n\nArguments:\n  [directories]...\n          The target directories to traverse, can be multiple. Use -- to read paths from stdin.\n\n          [default: .]\n\nOptions:\n  -l, --live\n          Display results in realtime, rather than collecting first and displaying later.\n          This will result in a significant drop in performance due to the constant terminal output.\n\n  -c, --checksum\n          Generate an index of file hashes and their associated file names, and print it.\n          The algorithm used by default is Xxh3, which is the recommended choice. Though\n          if you want to use a different algorithm, use --checksum-with (-C) instead.\n\n  -C, --checksum-with \u003calgorithm\u003e\n          Performs --checksum but with the specified hashing algorithm.\n          If another argument changes the operating mode of the program, e.g. 
--diff, then\n          the algorithm specified will only be stored, and no checksum will be performed.\n          Stick to Xxh3 and just use -c unless you have a reason to use a different one.\n\n          [default: xxh3]\n          [possible values: xxh3, sha224, sha256, sha384, sha512, md5]\n\n  -D, --diff \u003cfile1\u003e \u003cfile2\u003e...\n          Validate hashes from two or more files containing output from `jw --checksum`\n          The first file will be treated as the \"correct\" one; any discrepant hashes\n          in the subseqeunt files will be reported. If entries from the first file are\n          missing in the subsequent files, or if the subsequent files have entries not\n          present in the first file, that will be reported as well.\n\n          The hash length must be known for -D to parse the input files and separate\n          hashes from file paths. A length of 16 is assumed by default as that's how\n          long Xxh3 hashes are. If you used a different algorithm however, then you\n          must specify the algorithm before -D, e.g. `jw -C sha256 -D file1 file2`\n\n          If you stuck with defaults: `jw -c`, then you can just `jw -D file1 file2`\n\n  -d, --depth \u003climit\u003e\n          The recursion depth limit. Setting this to 1 effectively disables recursion.\n\n          [default: 0]\n\n  -x, --exclude [\u003ct1,t2\u003e...]\n          Exclude one more types of entries, separated by coma.\n\n          [possible values: files, dirs, dot, other]\n\n  -S, --silent\n          Suppress output, useful for benchmarking, or just counting files via --stats\n\n  -s, --stats\n          Count the number of files, dirs, and other entries, and print at the end.\n          This will decrease performance. This will cause a significant slowdown\n          and is primarily here for debugging or benchmarking. 
A more efficient\n          method to do this will be implemented in the future.\n\n  -h, --help\n          Print help (see a summary with '-h')\n\n  -V, --version\n          Print version\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpsychedelicshayna%2Fjw","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpsychedelicshayna%2Fjw","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpsychedelicshayna%2Fjw/lists"}