https://github.com/cserajdeep/efficient_remover_of_duplicate_images
Image Processing and Hashing-based duplicate image remover python code that can deal with different sizes, intensity, and grayscale.
- Host: GitHub
- URL: https://github.com/cserajdeep/efficient_remover_of_duplicate_images
- Owner: cserajdeep
- License: mit
- Created: 2021-09-27T11:58:10.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2021-09-27T14:29:11.000Z (over 4 years ago)
- Last Synced: 2025-08-09T03:02:11.718Z (11 months ago)
- Topics: duplicate-detection, hashing, image-processing, python
- Language: Jupyter Notebook
- Homepage:
- Size: 3.98 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
"duplicates.zip" is an archive that contains 16 images in total: five unique cat images, four unique dog images, and seven duplicates. The duplicate images were generated by varying the dimensions, intensity, and color of the unique photos.
```ruby
colab$ !unzip "/content/duplicates.zip"
```
```ruby
colab$ ls -l duplicates/*.jpg duplicates/*.png | wc -l #count total image files with .jpg and .png
```
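The same count can be taken from a notebook cell without the shell. The following is a minimal sketch; the helper name `count_images` is hypothetical and not part of the repository. Like the glob above, it matches extensions case-sensitively and looks only at the top level of the folder.

```python
from pathlib import Path


def count_images(folder, extensions=(".jpg", ".png")):
    """Count files directly inside `folder` whose suffix is in `extensions`.

    Hypothetical helper for illustration; matching is case-sensitive,
    so "photo.JPG" is not counted, mirroring the shell glob.
    """
    return sum(
        1
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix in extensions
    )
```

For the sample archive, `count_images("duplicates")` should agree with the number printed by the `ls ... | wc -l` command.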
Run duplicate_images_remover.py with the required command-line arguments: the folder path (-f), the Hamming distance used as the threshold (-t), and whether to print details (-i, true/false).
```ruby
colab$ %run duplicate_images_remover.py -f /content/duplicates -t 2 -i false
```
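To illustrate how a hash plus a Hamming-distance threshold can match images that differ in size or intensity, here is a minimal pure-Python sketch. It assumes an average hash over a grayscale pixel grid, which is one common perceptual hash; the script itself may use a different hash. The names `average_hash`, `hamming_distance`, and `is_duplicate` are hypothetical and not taken from the repository.

```python
def average_hash(pixels, hash_size=8):
    """Return a hash_size*hash_size-bit average hash as an int.

    `pixels` is a grayscale image given as a list of rows of intensities.
    Assumes the image is at least hash_size pixels in each dimension.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    # Shrink the image to a hash_size x hash_size grid by block averaging,
    # which makes the hash independent of the original dimensions.
    for i in range(hash_size):
        r0, r1 = i * height // hash_size, (i + 1) * height // hash_size
        for j in range(hash_size):
            c0, c1 = j * width // hash_size, (j + 1) * width // hash_size
            block = [pixels[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            cells.append(sum(block) / len(block))
    # One bit per cell: is the cell brighter than the grid's mean?
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming_distance(hash_a, hash_b):
    """Number of bit positions in which the two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def is_duplicate(pixels_a, pixels_b, threshold=2):
    """Treat two images as duplicates if their hashes differ in <= threshold bits."""
    return hamming_distance(average_hash(pixels_a), average_hash(pixels_b)) <= threshold


# Synthetic 16x16 gradient and three variants of it.
base = [[r * 16 + c for c in range(16)] for r in range(16)]
upscaled = [[base[r // 2][c // 2] for c in range(32)] for r in range(32)]  # resized copy
dimmed = [[0.5 * v + 60 for v in row] for row in base]                     # intensity change
transposed = [[base[c][r] for c in range(16)] for r in range(16)]          # different image

print(is_duplicate(base, upscaled), is_duplicate(base, dimmed), is_duplicate(base, transposed))
```

A small threshold such as the `-t 2` used above tolerates a few flipped bits from resampling or compression while still separating images with different content.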

One may also explore the Structural Similarity Index (SSIM): https://scikit-image.org/docs/dev/auto_examples/transform/plot_ssim.html#sphx-glr-auto-examples-transform-plot-ssim-py
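As a rough illustration of what SSIM measures, the sketch below evaluates the SSIM formula once over whole images, using global means, variances, and covariance. This is a simplification: the scikit-image implementation linked above averages the index over local sliding windows, so its values will generally differ. The function name `global_ssim` is hypothetical, and the constants 0.01 and 0.03 are the commonly used defaults for the stabilizing terms.

```python
def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM between two equally sized grayscale images.

    `x` and `y` are flat sequences of pixel intensities. Simplified sketch:
    statistics are taken over the whole image, not over local windows.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("images must be non-empty and have the same size")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Small constants that stabilize the division when means or variances are near zero.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

Unlike the hash comparison, this requires both images to have the same dimensions, so resized duplicates would need to be brought to a common size first.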
# Efficient Remover of Duplicate Images