https://github.com/newca12/multi-machine-dedup

Deduplication tool using SQLite to allow multi-machine features.

# multi-machine-dedup

## About ##

multi-machine-dedup is a deduplication tool using SQLite to allow multi-machine features.

multi-machine-dedup is an EDLA project.

The purpose of [edla.org](http://www.edla.org) is to promote the state of the art in various domains.

### Installation ###

```
cargo install multi-machine-dedup
```

## How to use it ##

Recursively index a directory, labelled with a \<label\>, into a SQLite database
```
multi-machine-dedup index -l --db
```
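
To illustrate what indexing means conceptually, here is a minimal Python sketch. It is not the tool's actual implementation. It assumes a hypothetical `file` table whose columns (`label`, `full_path`, `hash`, `size`) are borrowed from the SQL examples in this README, and it assumes SHA-256 as the hash function, which this README does not specify. The function name `index_directory` is also made up for the example.

```python
import hashlib
import os
import sqlite3


def index_directory(conn, label, root):
    """Walk `root` recursively and record one row per regular file.

    Returns the number of files indexed.
    """
    # Hypothetical schema: the column names mirror the queries in this README,
    # but the real schema used by multi-machine-dedup may differ.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file "
        "(label TEXT, full_path TEXT, hash TEXT, size INTEGER)"
    )
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip broken symlinks and other special entries
            # Assumption: SHA-256 over the file content, read in chunks.
            digest = hashlib.sha256()
            with open(path, "rb") as handle:
                for chunk in iter(lambda: handle.read(65536), b""):
                    digest.update(chunk)
            conn.execute(
                "INSERT INTO file (label, full_path, hash, size) "
                "VALUES (?, ?, ?, ?)",
                (label, path, digest.hexdigest(), os.path.getsize(path)),
            )
            count += 1
    conn.commit()
    return count
```

Storing the label with every row is what lets rows from several machines be told apart once they end up in, or are compared against, another database.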

Check the integrity of a directory
```
multi-machine-dedup check-integrity -l --db
```

Compare two databases
```
multi-machine-dedup compare --db1 --db2
```
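
As a rough illustration of what a cross-database comparison can look like, the Python sketch below uses SQLite's `ATTACH DATABASE` to join two database files on their hash column. This is only one possible approach and is not necessarily how the `compare` command works. The `file` table layout and the function name `common_hashes` are assumptions made for the example.

```python
import sqlite3


def common_hashes(db1_path, db2_path):
    """Return (path_in_db1, path_in_db2, hash) for content present in both."""
    conn = sqlite3.connect(db1_path)
    try:
        # Make the second database visible under the schema name "other".
        conn.execute("ATTACH DATABASE ? AS other", (db2_path,))
        # Assumption: both databases contain a `file` table with
        # `full_path` and `hash` columns.
        rows = conn.execute(
            "SELECT a.full_path, b.full_path, a.hash "
            "FROM main.file AS a "
            "JOIN other.file AS b ON a.hash = b.hash "
            "ORDER BY a.hash, a.full_path, b.full_path"
        ).fetchall()
    finally:
        conn.close()
    return rows
```

Each returned row pairs a path from the first database with a path from the second that has the same hash, so content duplicated across two machines shows up as one row per matching pair.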

## Example of SQL queries ##

You can use a convenient database tool like [DBeaver CE](https://dbeaver.io) or [SQLiteStudio](https://sqlitestudio.pl) to query the generated SQLite database.

Find the most-duplicated files larger than a given size
```
select label, full_path, file.hash, size, nb_dup from file , (select hash, count(*) as nb_dup from file where size > <size>
group by hash order by nb_dup DESC, size DESC) as T
where file.hash = T.hash order by nb_dup DESC, size DESC ;
```
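
The duplicate-count query can also be run from a script. The Python sketch below is a hedged example against an in-memory database: the table layout, the sample rows, and the names `DUPLICATES_QUERY` and `top_duplicates` are assumptions for illustration, and the size threshold is passed as a bound parameter. The `hash` column is written as `file.hash` because both the table and the subquery expose a column with that name. A `full_path` tie-breaker is added to the ordering so the output is deterministic.

```python
import sqlite3

# Same shape as the query above, with the threshold as a named parameter.
DUPLICATES_QUERY = """
SELECT label, full_path, file.hash, size, nb_dup
FROM file,
     (SELECT hash, COUNT(*) AS nb_dup
      FROM file
      WHERE size > :min_size
      GROUP BY hash) AS T
WHERE file.hash = T.hash
ORDER BY nb_dup DESC, size DESC, full_path
"""


def top_duplicates(conn, min_size):
    """Return files larger than `min_size`, most-duplicated first."""
    return conn.execute(DUPLICATES_QUERY, {"min_size": min_size}).fetchall()


def demo_connection():
    """Build a small in-memory database with made-up rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema; the real one may have more columns.
    conn.execute(
        "CREATE TABLE file "
        "(label TEXT, full_path TEXT, hash TEXT, size INTEGER)"
    )
    conn.executemany(
        "INSERT INTO file VALUES (?, ?, ?, ?)",
        [
            ("laptop", "/a/x.bin", "h1", 100),
            ("nas", "/b/x.bin", "h1", 100),
            ("laptop", "/a/y.bin", "h2", 500),
            ("laptop", "/a/small", "h3", 10),
        ],
    )
    return conn
```

Calling `top_duplicates(demo_connection(), 50)` lists the two `h1` files first (two copies each), then the single `h2` file, and omits `/a/small` because it is below the threshold.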

Find all files with the same hash
```
select * from file where hash = <hash> ;
```

Find all files with image/jpeg MIME-type.
```
select * from hash where mime like 'image/jpeg' ;
```

## Tips ##

* Enable debug mode in PowerShell
```
$Env:LOG='debug'; cargo run ...
```

* Remove the LOG environment variable in PowerShell
```
remove-item Env:LOG
```

* Show help for a \<command\>
```
multi-machine-dedup --help
```
or
```
multi-machine-dedup help
```

## Roadmap ##

Inspired by https://github.com/hgrecco/dedup, multi-machine-dedup will probably offer similar features.

### License ###
© 2022 Olivier ROLAND. Distributed under the GPLv3 License.