https://github.com/horgh/dupefile
Detect and optionally delete duplicate files in a directory tree
- Host: GitHub
- URL: https://github.com/horgh/dupefile
- Owner: horgh
- License: gpl-3.0
- Created: 2017-05-20T18:40:48.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2021-06-06T15:34:39.000Z (about 4 years ago)
- Last Synced: 2025-01-24T22:34:47.225Z (5 months ago)
- Topics: dedup, file, file-management, management
- Language: Go
- Size: 21.5 KB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: COPYING
README
This program helps me deal with duplicate files.
I am trying to organize my collection of images. It is disorganized and
contains many duplicates. I want to delete duplicates so I can have less
to organize.

# What this program does
This program takes a directory and calculates checksums for each file under
it. It then checks whether any two files have the same checksum, and if so
it reports the files as duplicates.

You can provide rules that define which file of a duplicate pair to keep
and which to delete. Deletion only happens in live mode. By default the
program only reports the duplicates.

# Defining rules
You define rules by writing a JSON file. Each rule specifies which file to
remove by identifying which directory holds the file to keep and which
holds the one to delete. This allows fine-grained control.

An example rule file with one rule looks like this:
```
{
  "rules": [
    {
      "keep": "/directory1",
      "remove": "/directory2"
    }
  ]
}
```

In this case, if we detect duplicate files `/directory1/example.png` and
`/directory2/example-test.png`, the program deletes
`/directory2/example-test.png` and keeps the other.

# Behaviour in more detail
- Recursively find all files.
- Calculate the checksum of each file.
- Check whether any two files have the same checksum.
- If they do, check whether the two files are really identical.
- If they are, take action. This may be to just report (in non-live mode) or
to remove one of them (in live mode).
- Report any two files with identical checksums.
- Report any two files with identical names.