https://github.com/newca12/multi-machine-dedup
Deduplication tool using SQLite to allow multi-machine features.
- Host: GitHub
- URL: https://github.com/newca12/multi-machine-dedup
- Owner: newca12
- License: gpl-3.0
- Created: 2022-01-01T13:27:50.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2022-12-22T22:01:42.000Z (almost 2 years ago)
- Last Synced: 2024-10-08T10:52:14.895Z (about 1 month ago)
- Language: Rust
- Size: 49.8 KB
- Stars: 1
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# multi-machine-dedup
## About ##
multi-machine-dedup is a deduplication tool using SQLite to allow multi-machine features.
multi-machine-dedup is an EDLA project.
The purpose of [edla.org](http://www.edla.org) is to promote the state of the art in various domains.
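To illustrate the general idea of hash-based indexing into SQLite, here is a minimal Python sketch. It is not the tool's implementation: the SHA-256 hash, the `file` table layout, and the helper names `file_sha256` and `index_directory` are all assumptions made for this example. Because each row carries a label, indexes built on different machines can later be compared by hash.

```python
import hashlib
import os
import sqlite3


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are never loaded whole.

    SHA-256 is an assumption; the real tool may use another hash.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_directory(conn, label, root):
    """Walk `root` recursively and store one row per regular file.

    The table layout is hypothetical and only mirrors the column
    names used in the README's example queries.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file "
        "(label TEXT, full_path TEXT, hash TEXT, size INTEGER)"
    )
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip broken symlinks and special files
            conn.execute(
                "INSERT INTO file VALUES (?, ?, ?, ?)",
                (label, path, file_sha256(path), os.path.getsize(path)),
            )
            count += 1
    conn.commit()
    return count
```

Calling `index_directory(sqlite3.connect("laptop.db"), "laptop", "/some/dir")` would fill a database that can be queried by hash; the file name and label here are placeholders.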
### Installation ###
```
cargo install multi-machine-dedup
```

## How to use it ##
Recursively index a directory under a label, storing the result in a SQLite database
```
multi-machine-dedup index -l --db
```

Check a directory
```
multi-machine-dedup check-integrity -l --db
```

Compare two databases
```
multi-machine-dedup compare --db1 --db2
```

## Example of SQL queries ##
You can use a convenient database tool like [DBeaver CE](https://dbeaver.io) or [SQLiteStudio](https://sqlitestudio.pl) to query the generated SQLite database.
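The queries in this section can also be run from a script instead of a GUI tool. The following is a minimal sketch using Python's standard `sqlite3` module against an in-memory database. The schema is an assumption that contains only the columns the example queries mention, and the sample rows, the `min_size` parameter, and the `HAVING` clause (which keeps only hashes that occur more than once) are additions made for illustration.

```python
import sqlite3

# Hypothetical minimal schema: only the columns named in the README queries.
SCHEMA = """
CREATE TABLE file (label TEXT, full_path TEXT, hash TEXT, size INTEGER);
"""

# Duplicate search, with the size threshold passed as a bound parameter.
DUPLICATES_SQL = """
SELECT label, full_path, file.hash, size, nb_dup
FROM file
JOIN (
    SELECT hash, COUNT(*) AS nb_dup
    FROM file
    WHERE size > :min_size
    GROUP BY hash
    HAVING COUNT(*) > 1
) AS t ON file.hash = t.hash
ORDER BY nb_dup DESC, size DESC, label, full_path
"""


def find_duplicates(conn, min_size):
    """Return rows for files larger than min_size whose hash is shared."""
    return conn.execute(DUPLICATES_SQL, {"min_size": min_size}).fetchall()


def demo_connection():
    """Build an in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO file VALUES (?, ?, ?, ?)",
        [
            ("laptop", "/home/a/photo.jpg", "h1", 2048),
            ("nas", "/backup/photo.jpg", "h1", 2048),
            ("nas", "/backup/notes.txt", "h2", 10),
            ("laptop", "/home/a/video.mp4", "h3", 900000),
        ],
    )
    return conn


if __name__ == "__main__":
    for row in find_duplicates(demo_connection(), min_size=1024):
        print(row)
```

To run it against a real database, replace `demo_connection()` with `sqlite3.connect(...)` pointing at the generated file and adjust the column names if the actual schema differs.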
Find the top duplicate files larger than a given size
```
select label, full_path, hash,size,nb_dup from file , (select hash, count(*) as nb_dup from file where size >
group by hash order by nb_dup DESC, size DESC) as T
where file.hash = T.hash order by nb_dup DESC, size DESC ;
```

Find all files with the same hash
```
select * from file where hash= ;
```

Find all files with the image/jpeg MIME type.
```
select * from hash where mime like 'image/jpeg' ;
```

## Tips ##
* Enable debug mode in PowerShell
```
$Env:LOG='debug'; cargo run ...
```

* Remove the LOG environment variable in PowerShell
```
remove-item Env:LOG
```

* Show help for a command
```
multi-machine-dedup --help
```
or
```
multi-machine-dedup help
```

## Roadmap ##
Inspired by https://github.com/hgrecco/dedup, multi-machine-dedup will probably offer similar features.
### License ###
© 2022 Olivier ROLAND. Distributed under the GPLv3 License.