https://github.com/noahgift/rdedupe
A Rust-based deduplication tool
- Host: GitHub
- URL: https://github.com/noahgift/rdedupe
- Owner: noahgift
- License: cc0-1.0
- Created: 2022-12-24T10:29:54.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2025-01-10T23:01:26.000Z (4 months ago)
- Last Synced: 2025-03-18T20:49:23.113Z (about 1 month ago)
- Topics: clap, command-line, deduplication, filesystem, multithreading, rust, rust-lang
- Language: Rust
- Homepage:
- Size: 51.8 KB
- Stars: 33
- Watchers: 2
- Forks: 23
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
README
[Tests](https://github.com/noahgift/rdedupe/actions/workflows/tests.yml)
[Release](https://github.com/noahgift/rdedupe/actions/workflows/release.yml)
[Lint](https://github.com/noahgift/rdedupe/actions/workflows/lint.yml)
[Rustfmt](https://github.com/noahgift/rdedupe/actions/workflows/rustfmt.yml)

## 🎓 Pragmatic AI Labs | Join 1M+ ML Engineers
### 🔥 Hot Course Offers:
* 🤖 [Master GenAI Engineering](https://ds500.paiml.com/learn/course/0bbb5/) - Build Production AI Systems
* 🦀 [Learn Professional Rust](https://ds500.paiml.com/learn/course/g6u1k/) - Industry-Grade Development
* 📊 [AWS AI & Analytics](https://ds500.paiml.com/learn/course/31si1/) - Scale Your ML in Cloud
* ⚡ [Production GenAI on AWS](https://ds500.paiml.com/learn/course/ehks1/) - Deploy at Enterprise Scale
* 🛠️ [Rust DevOps Mastery](https://ds500.paiml.com/learn/course/ex8eu/) - Automate Everything

### 🚀 Level Up Your Career:
* 💼 [Production ML Program](https://paiml.com) - Complete MLOps & Cloud Mastery
* 🎯 [Start Learning Now](https://ds500.paiml.com) - Fast-Track Your ML Career
* 🏢 Trusted by Fortune 500 Teams

Learn end-to-end ML engineering from industry veterans at [PAIML.COM](https://paiml.com)
## RDedupe
A Rust-based deduplication tool
### Goals
* Build a multiplatform, fast deduplication tool that uses Rust parallelization.
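
One way to approach this goal is sketched below using only the standard library. This is an illustration, not rdedupe's actual implementation: the function names `walk`, `hash_file`, and `find_duplicates` are hypothetical, the pattern is assumed to be a plain substring match on the file name, and `DefaultHasher` stands in for whatever digest the real tool uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::fs;
use std::hash::Hasher;
use std::io;
use std::path::{Path, PathBuf};
use std::thread;

/// Recursively collect files under `root` whose file name contains `pattern`.
fn walk(root: &Path, pattern: &str, out: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(root)? {
        let path = entry?.path();
        if path.is_dir() {
            walk(&path, pattern, out)?;
        } else if path
            .file_name()
            .map_or(false, |n| n.to_string_lossy().contains(pattern))
        {
            out.push(path);
        }
    }
    Ok(())
}

/// Hash a file's full contents. `DefaultHasher` is not cryptographic (assumption:
/// a real tool would use a stronger digest or compare bytes to rule out collisions).
fn hash_file(path: &Path) -> io::Result<u64> {
    let mut hasher = DefaultHasher::new();
    hasher.write(&fs::read(path)?);
    Ok(hasher.finish())
}

/// Return groups of files with identical content hashes.
/// The file list is split into chunks and each chunk is hashed on its own thread.
fn find_duplicates(root: &Path, pattern: &str, workers: usize) -> io::Result<Vec<Vec<PathBuf>>> {
    let mut files = Vec::new();
    walk(root, pattern, &mut files)?;

    let workers = workers.max(1);
    // `chunks` panics on a size of zero, so clamp to at least 1.
    let chunk_size = ((files.len() + workers - 1) / workers).max(1);

    let hashed: Vec<(u64, PathBuf)> = thread::scope(|s| {
        // Spawn every worker first, then join them all.
        let handles: Vec<_> = files
            .chunks(chunk_size)
            .map(|chunk| {
                s.spawn(move || {
                    chunk
                        .iter()
                        // Unreadable files are skipped rather than aborting the run.
                        .filter_map(|p| hash_file(p).ok().map(|h| (h, p.clone())))
                        .collect::<Vec<_>>()
                })
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("hash worker panicked"))
            .collect()
    });

    // Group paths by hash and keep only groups with more than one member.
    let mut groups: HashMap<u64, Vec<PathBuf>> = HashMap::new();
    for (hash, path) in hashed {
        groups.entry(hash).or_default().push(path);
    }
    let mut dupes: Vec<Vec<PathBuf>> = groups.into_values().filter(|g| g.len() > 1).collect();
    for group in &mut dupes {
        group.sort();
    }
    dupes.sort();
    Ok(dupes)
}
```

Calling `find_duplicates(Path::new("tests"), ".txt", 4)` would return one sorted group of paths per set of identical `.txt` files. Scoped threads keep the sketch dependency-free; a work-stealing pool from a crate could balance uneven file sizes better.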

#### Current Status
* Added 
* Added [progress bar](https://github.com/console-rs/indicatif)
#### Future Improvements
* Add a GUI
* Add a web interface
* Fix GitHub Actions Build process to not fail silently!
* Use Polars DataFrame and include statistics about files and generate a CSV report.
* Store logs about actions performed across multiple runs

### Building and Running
* Build: cd into rdedupe and run `make all`
* Run: `cargo run -- dedupe --path tests --pattern .txt`
* Run tests: `make test`

### OS X Install
* Install Rust via [rustup](https://rustup.rs/)
* Add to `~/.cargo/config`:

```toml
[target.x86_64-apple-darwin]
rustflags = [
  "-C", "link-arg=-undefined",
  "-C", "link-arg=dynamic_lookup",
]

[target.aarch64-apple-darwin]
rustflags = [
  "-C", "link-arg=-undefined",
  "-C", "link-arg=dynamic_lookup",
]
```
* Run `make all` in the rdedupe directory