{"id":26578193,"url":"https://github.com/binbash23/dupfinder","last_synced_at":"2026-04-13T21:03:24.822Z","repository":{"id":187262052,"uuid":"278469921","full_name":"binbash23/dupfinder","owner":"binbash23","description":"Identify similar files","archived":false,"fork":false,"pushed_at":"2022-10-17T08:37:04.000Z","size":106,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2023-08-09T16:38:06.616Z","etag":null,"topics":["bash","duplicates","linux","photos","pictures","script","videos"],"latest_commit_sha":null,"homepage":null,"language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/binbash23.png","metadata":{"files":{"readme":"README","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2020-07-09T21:00:46.000Z","updated_at":"2023-08-09T16:38:20.056Z","dependencies_parsed_at":"2023-08-09T16:51:20.121Z","dependency_job_id":null,"html_url":"https://github.com/binbash23/dupfinder","commit_stats":null,"previous_names":["binbash23/dupfinder"],"tags_count":null,"template":null,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/binbash23%2Fdupfinder","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/binbash23%2Fdupfinder/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/binbash23%2Fdupfinder/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/binbash23%2Fdupfinder/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/binbash23","download_url":"https://codeload.github.com/binbash23/dupfinder/tar.gz/refs/heads/master",
"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245056557,"owners_count":20553854,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bash","duplicates","linux","photos","pictures","script","videos"],"created_at":"2025-03-23T04:28:06.011Z","updated_at":"2026-04-13T21:03:24.789Z","avatar_url":"https://github.com/binbash23.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n\nNAME\n\tdupfinder\n\nDESCRIPTION\n\nUse dupfinder to find duplicate files in big archives of files.\nDupfinder searches the given path for files, creates a hash database for\nall files and searches for files which have the same hash. So if you\nhave a picture or video with 2 diffenrent names or in different folders,\nyou can identify them and create an archive script and a delete script.\n\nDupfinder can be used to search for duplicate pictures in huge picture\ncollections for example.\n\nSYNOPSIS\n\n\tdupfinder OPTIONS\n   \nOPTIONS\n\n\t-a Search in path for duplicates (requires -r to set the root path). \n\t   This option scans the root path for files, creates hashes for them \n\t   and searches for duplicate files. The filehash database will also\n     be cleaned from files that do not exist any more in the filesystem.\n     You can use dupfinder -g \n\t   later to generate scripts for moving the duplicates to an archive \n\t   folder. 
So if you have a picture or video with 2 different names or \n\t   in different folders, you can identify them and create an archive \n\t   script and a delete script.\n\t   The delete script will only delete duplicates - one copy of each\n\t   duplicated file will be kept in the root path.\n\t-c Calculate hashes for files in filelist file and add them to \n\t   filehash database\n\t-C Create duplicate files file from hash database\n\t-D Show contents of duplicates file\n\t-d Delete nonexistent files from filehash database\n\t-f Search root path for files and create plain files file (requires -r)\n\t-g Generate scripts to copy duplicates and to delete duplicates\n\t-h Show usage information\n\t-H Show contents of hash database\n\t-L Show contents of filelist database\n\t-n Calculate new hash for all files found, even if they exist in hash db\n\t-N Skip calculating sum of filesizes (size calculation may take some time \n\t   in big archives)\n\t-p Create filelist database for root path (requires -r)\n\t-r [PATH] Set root path\n\t-R Delete filehash database files\n\t-s Show statistics of database\n\t-v Verbose mode\n\t-V Show program version\n\nEXAMPLES\n\n1) To search for duplicate files in path /tmp/Pictures use:\n\n\t\u003e ./dupfinder -avr /tmp/Pictures\n\n   3 files will be created:\n\n\t\u003e filelist.db - list of files in root path\n\t\u003e filehash.db - list of all files with every hash value\n\t\u003e duplicates.db - list of duplicate files in\n\t\t\t\n2) To get help for all options use:\n\n\t\u003e ./dupfinder -h\n\t\n3) To generate 2 scripts which can be used to archive the duplicates to\n   an archive folder and to delete the duplicates use:\n   \n\t\u003e ./dupfinder -g\n\t\n   2 scripts will be generated:\n   \n\t\u003e copy_dups_from_root_path_to_archive.sh \n   \n   This script will copy all duplicates in root path to\n   an archive folder in your current directory called \"duplicates\".\n   \n   \t\u003e 
delete_dups_from_root_path.sh\n\t\n   This script can be used to delete all duplicate files from the\n   root path. If you want to exclude several folders, you can modify\n   the script with grep for example.\n   The delete script contains only the duplicates. If there are 3\n   identical files, the delete script will delete only 2 of them.\n   \n   Just have a look into the delete script:\n   \n   \u003e tail delete_dups_from_root_path.sh \n     rm -v '/tmp/2015/921.JPG'\n     rm -v '/tmp/2015/.thumbs/923.JPG'\n     \n   If you don't want to delete files from the thumbnail folders, create\n   a new script:\n   \n   \u003e grep -v \".thumbs\" delete_dups_from_root_path.sh \u003e delete_dups_from_root_path_custom.sh\n\n4) To show statistics about the duplicates use:\n\n   \u003e ./dupfinder -s\n\n     Dupfinder Statistics\n\n     Files in filehash database : 4774\t(537 Kb)\tmodified: 2020-10-03 17:06:59.953627326 +0200\n     Files in plain file list   : 4774\t(378 Kb)\tmodified: 2020-10-03 17:06:08.454297780 +0200\n     Files in duplicates file   : 1529\t(130 Kb)\tmodified: 2020-10-03 17:07:04.149572704 +0200\n     Size of all duplicates     : 1.489 GB     \n\nREPORTING BUGS\n\tReport bugs to \u003cbinbash@gmx.net\u003e\n\nAUTHOR\n\tdupfinder by Jens Heine \u003cbinbash@gmx.net\u003e 2020\n\t             and Dennis Brossat \u003cdennis.brossat@email.de\u003e\n\t\t     and Benjamin Heine \u003cbenjaminheine@gmx.net\u003e\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbinbash23%2Fdupfinder","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbinbash23%2Fdupfinder","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbinbash23%2Fdupfinder/lists"}