{"id":26728691,"url":"https://github.com/cserajdeep/efficient_remover_of_duplicate_images","last_synced_at":"2026-05-18T06:37:36.914Z","repository":{"id":165428906,"uuid":"410865755","full_name":"cserajdeep/Efficient_remover_of_duplicate_images","owner":"cserajdeep","description":"Image Processing and Hashing-based duplicate image remover python code that can deal with different sizes, intensity, and grayscale.  ","archived":false,"fork":false,"pushed_at":"2021-09-27T14:29:11.000Z","size":4172,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-08-09T03:02:11.718Z","etag":null,"topics":["duplicate-detection","hashing","image-processing","python"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cserajdeep.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-09-27T11:58:10.000Z","updated_at":"2023-09-08T18:26:54.000Z","dependencies_parsed_at":"2024-07-14T03:15:08.403Z","dependency_job_id":null,"html_url":"https://github.com/cserajdeep/Efficient_remover_of_duplicate_images","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/cserajdeep/Efficient_remover_of_duplicate_images","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cserajdeep%2FEfficient_remover_of_duplicate_images","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cserajdeep%2FEfficient_remover_of_duplicate_images/tags","r
eleases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cserajdeep%2FEfficient_remover_of_duplicate_images/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cserajdeep%2FEfficient_remover_of_duplicate_images/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cserajdeep","download_url":"https://codeload.github.com/cserajdeep/Efficient_remover_of_duplicate_images/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cserajdeep%2FEfficient_remover_of_duplicate_images/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33167797,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-18T05:43:36.989Z","status":"ssl_error","status_checked_at":"2026-05-18T05:43:19.133Z","response_time":71,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["duplicate-detection","hashing","image-processing","python"],"created_at":"2025-03-27T22:36:22.935Z","updated_at":"2026-05-18T06:37:36.900Z","avatar_url":"https://github.com/cserajdeep.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"\"duplicates.zip\" is a zipped folder that contains five unique cat images, four unique dog images, and 16 images in total (including seven duplicate images). 
These duplicate images are generated by varying the dimensions, intensity, and color of the existing unique photos. \u003cbr\u003e \n\n```ruby\ncolab$ !unzip \"/content/duplicates.zip\" \n```\n```ruby\ncolab$ ls -l duplicates/*.jpg duplicates/*.png | wc -l  # count the .jpg and .png image files\n```\n\nRun duplicate_images_remover.py with the required command-line arguments: the folder path (-f), the Hamming distance threshold (-t), and whether to print details (-i, true/false). \u003cbr\u003e\n```ruby\ncolab$ %run duplicate_images_remover.py -f /content/duplicates -t 2 -i false\n```\n\n\u003cimg width=\"280\" alt=\"duplicate_img_remover\" src=\"https://user-images.githubusercontent.com/18000553/134928552-c9acde2b-2eef-42ac-bda1-f62565300862.png\"\u003e\n\nOne may also explore the Structural Similarity Index (https://scikit-image.org/docs/dev/auto_examples/transform/plot_ssim.html#sphx-glr-auto-examples-transform-plot-ssim-py)\n\n# Efficient Remover of Duplicate Images\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcserajdeep%2Fefficient_remover_of_duplicate_images","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcserajdeep%2Fefficient_remover_of_duplicate_images","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcserajdeep%2Fefficient_remover_of_duplicate_images/lists"}