https://github.com/dealfonso/searchdups
Search for duplicate files
- Host: GitHub
- URL: https://github.com/dealfonso/searchdups
- Owner: dealfonso
- License: apache-2.0
- Created: 2023-02-02T12:48:45.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2023-02-02T13:48:23.000Z (almost 2 years ago)
- Last Synced: 2024-10-03T19:37:15.090Z (about 1 month ago)
- Topics: command, command-line, command-line-tool, commandline, duplicate-detection, duplicates, files, python, python-script
- Language: Python
- Size: 9.77 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Search Duplicates (searchdups)
This is a simple application that searches for duplicate files in a set of folders. To decide whether two files are identical, it uses the `md5` or `sha256` algorithm. To improve performance, the application calculates a _smart hash_: it computes a partial hash first and finalizes the calculation only if needed.
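The partial-then-full hashing idea described above could be sketched roughly as follows. This is only an illustration, not the tool's actual code: the function names, the 4096-byte prefix size, and the two-stage grouping are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict

# Assumed prefix size for the partial hash; not taken from searchdups.
PARTIAL_BYTES = 4096


def partial_hash(path, algorithm="sha256", size=PARTIAL_BYTES):
    """Hash only the first `size` bytes of the file (cheap)."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        digest.update(handle.read(size))
    return digest.hexdigest()


def full_hash(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash the whole file, reading it in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(paths, algorithm="sha256"):
    """Return {full_hash: [paths]} for groups of two or more identical files."""
    # Stage 1: group files by their partial hash.
    candidates = defaultdict(list)
    for path in paths:
        candidates[partial_hash(path, algorithm)].append(path)

    # Stage 2: compute the full hash only for files whose partial hash collides.
    duplicates = defaultdict(list)
    for group in candidates.values():
        if len(group) < 2:
            continue  # unique prefix, so the file cannot have a duplicate
        for path in group:
            duplicates[full_hash(path, algorithm)].append(path)

    return {h: g for h, g in duplicates.items() if len(g) > 1}
```

In this sketch, a file whose first bytes are unique is never read in full, which is where the saving would come from on large files.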
Additionally, this application includes a pseudo _hash_ that only compares file names. With this _"hash algorithm"_, two files that have the same name are considered to be the same, even if their contents differ.
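As a rough illustration of the name-based pseudo hash, the grouping could look like the sketch below. The function name `group_by_name` is hypothetical, and treating the "name" as the base name of the path is an assumption made for the example.

```python
import os
from collections import defaultdict


def group_by_name(paths):
    """Group paths by file name only; file contents are never read."""
    groups = defaultdict(list)
    for path in paths:
        # Assumption: the "name" is the last path component.
        groups[os.path.basename(path)].append(path)
    # Keep only names that appear in more than one location.
    return {name: group for name, group in groups.items() if len(group) > 1}
```

Because no file is opened, this is fast, but it reports files with different contents as duplicates whenever their names match.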
The basic usage is:
```bash
$ searchdups -r .
> 8f8db820d89c39029a0629094e0f18c9*
/Users/calfonso/Programacion/norepo/searchdups/a1.jpg
/Users/calfonso/Programacion/norepo/searchdups/a11.jpg
```

Some other features are:
- Select the hash algorithm (using parameter `-H`).
- Search in subfolders (using flag `-r`).
- Consider hidden folders and files (using flag `-a`).
- Show a progress bar during the process (using flag `-p`).
- Select which files are processed (using the `-f` parameter for _sh-like_ filters, or the `-e` parameter for regular expressions).
- Exclude files from processing (using the `-F` parameter for _sh-like_ filters, or the `-E` parameter for regular expressions).
- Summarize the number of files and folders considered (using flag `-s`).
- Get the result in a file (using parameter `-o`).

Please check the CLI help to get updated information about the usage of this tool.
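To illustrate the difference between _sh-like_ filters and regular expressions, here is a minimal sketch using Python's standard `fnmatch` and `re` modules. The function `select_files` and its parameters are hypothetical, and the way include and exclude rules are combined (and the fact that they are matched against the file name) is an assumption, not a description of how the tool works internally.

```python
import fnmatch
import re


def select_files(names, include_globs=(), include_regexes=(),
                 exclude_globs=(), exclude_regexes=()):
    """Keep names matching any include rule (if given) and no exclude rule."""

    def matches(name, globs, regexes):
        # sh-like patterns such as "*.jpg" versus regular expressions
        # such as r"\.jpe?g$".
        return (any(fnmatch.fnmatchcase(name, g) for g in globs)
                or any(re.search(r, name) for r in regexes))

    has_includes = bool(include_globs or include_regexes)
    selected = []
    for name in names:
        if has_includes and not matches(name, include_globs, include_regexes):
            continue  # an include rule exists and this name does not match it
        if matches(name, exclude_globs, exclude_regexes):
            continue  # explicitly excluded
        selected.append(name)
    return selected


names = ["a.jpg", "b.png", "notes.txt", "a_backup.jpg"]
print(select_files(names, include_globs=["*.jpg"], exclude_regexes=[r"_backup"]))
```

With the sample names above, only `a.jpg` passes both the include pattern and the exclude expression.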
## Installation
To install the tool, clone the code and run the following command inside the cloned folder:
```shell
$ pip install .
```

or install it from the repositories:
```shell
$ pip install searchdups
```