Projects in Awesome Lists tagged with duplicates
A curated list of projects in awesome lists tagged with duplicates.
https://github.com/qarmin/czkawka
Multifunctional app to find duplicates, empty folders, similar images, etc.
cleaner duplicates gtk-rs multiplatform rust similar-images similar-music similar-videos
Last synced: 12 May 2025
https://github.com/kucherenko/jscpd
Copy/paste detector for programming source code.
clones-detection code-quality copy-paste cpd detect-duplications detector duplicates duplications quality
Last synced: 12 May 2025
https://github.com/sahib/rmlint
Extremely fast tool to remove duplicates and other lint from your filesystem
c deduplication duplicates fdupes filesystem lint python
Last synced: 14 May 2025
https://github.com/scinos/yarn-deduplicate
Deduplication tool for yarn.lock files
dedupe duplicated-packages duplicates lock-file yarn yarn-lock
Last synced: 11 May 2025
https://github.com/f483/dejavu
Quickly detect already witnessed data.
command-line command-line-tool deduplication duplicate-values duplicates go golang history memory probabilistic
Last synced: 20 Aug 2025
https://github.com/kristiankoskimaki/vidupe
Vidupe is a program that can find duplicate and similar video files. V1.211 was released on 2019-09-18.
duplicate-detection duplicate-videos duplicates videos
Last synced: 10 Apr 2025
https://github.com/kouhin/redux-dataloader
Loads async data for Redux apps focusing on preventing duplicated requests and dealing with async dependencies.
action async duplicates react redux thunk
Last synced: 21 Aug 2025
https://github.com/scrubbbbs/cbird
Command-line program for Content-Based Image Retrieval of images and videos. Includes tools for general search and de-duplication.
command-line-interface computer-vision content-based-image-retrieval duplicate-detection duplicate-files duplicates ffmpeg opencv qt6 similarity-search
Last synced: 06 Apr 2025
https://github.com/matteodelabre/mongoose-beautiful-unique-validation
Plugin for Mongoose that turns duplicate errors into regular Mongoose validation errors
duplicates errors mongoose plugin schema validation
Last synced: 06 Apr 2025
https://github.com/jvirkki/dupd
CLI utility to find duplicate files
c deduplication duplicate-files duplicatefilefinder duplicates fdupes
Last synced: 21 Mar 2025
https://github.com/cloud-py-api/mediadc
Nextcloud Media Duplicate Collector application
collector duplicate-detection duplicates media mediadc nextcloud nextcloud-apps nextcloud-vue-app open-source php python python3 single-page-app vue
Last synced: 06 Apr 2025
https://github.com/eyalroz/removedupes
Remove Duplicate Messages
cleaner cleanup duplicate-detection duplicates email email-parsing mail-client mail-folders mozilla productivity thunderbird thunderbird-addon thunderbird-extension
Last synced: 27 Mar 2025
https://github.com/StephaneCouturier/Katalog
Katalog is an application to manage catalogs of disks and files to search and get statistics.
backup cataloging catalogue differences duplicates external-content external-drives file file-manager file-statistics filemanager filesystem indexer katalog search storage
Last synced: 29 Apr 2025
https://github.com/vuolter/deplicate
Advanced Duplicate File Finder for Python
deplicate duplicate duplicate-detection duplicate-files duplicatefilefinder duplicates duplicates-removed duplication-finder finder macosx multi-filtering purge-duplicate-files pypi python scanning unix windows
Last synced: 24 Apr 2025
https://github.com/deplicate/deplicate
Advanced Duplicate File Finder for Python
deplicate duplicate duplicate-detection duplicate-files duplicatefilefinder duplicates duplicates-removed duplication-finder finder macosx multi-filtering purge-duplicate-files pypi python scanning unix windows
Last synced: 30 Apr 2025
https://github.com/PJDude/dude
Duplicates Detector is a cross-platform GUI utility for finding duplicate files, allowing you to delete or link them to save space. Duplicate files are displayed and processed on two synchronized panels for efficient and convenient operation.
cli deduplication duplicate duplicate-detection duplicate-files duplicates duplicates-removal easy easy-to-use easyui gui gui-application python python3 sha1 threads tkinter utility utility-application
Last synced: 06 Mar 2025
https://github.com/src-d/gemini
Advanced similarity and duplicate detection for source code at scale.
duplicate-detection duplicates hash source-code-analysis spark
Last synced: 05 May 2025
https://github.com/twpayne/find-duplicates
Find duplicate files quickly.
duplicate-detection duplicate-files duplicates find
Last synced: 17 Mar 2025
https://github.com/deric/es-dedupe
Tool for removing duplicate documents from Elasticsearch
duplicates duplicity elasticsearch
Last synced: 23 Apr 2025
https://github.com/src-d/apollo
Proof of concept for advanced similarity and duplicate detection in source code, built for our research efforts.
duplicate-detection duplicates python similarity similarity-search source-code
Last synced: 05 May 2025
https://github.com/lyonsyonii/akin
Rust crate that makes writing repetitive code easier and faster.
code duplicate duplicates repeat repetition rust rust-crate similar similars simpler write
Last synced: 08 Aug 2025
https://github.com/bartozzz/potential-duplicates-bot
A configurable GitHub App that checks for potential issue duplicates using the Damerau–Levenshtein distance algorithm.
bot duplicates github issues probot probot-app probot-plugin
Last synced: 23 Oct 2025
https://github.com/navid2zp/dups
A CLI tool to find/remove duplicate files supporting multi-core and different algorithms (MD5, SHA256, and XXHash).
duplicates go golang md5 sha256 xxhash
Last synced: 15 May 2025
https://github.com/gacarrillor/appendfeaturestolayer
QGIS Processing plugin to add an algorithm for upserting features from a source vector layer to an existing target vector layer.
append copy copy-paste duplicates etl export load load-data qgis qgis-processing qgis-processing-provider qgis3 qgis3-plugin update upsert
Last synced: 22 Mar 2025
https://github.com/rasheedsulayman/duplicatecontactsremover
📒 A simple app to optimize your address book and remove duplicate contacts.
androidx architecture-components contacts-manager dependency-injection duplicates kotlin material-design mvvm
Last synced: 14 Oct 2025
https://github.com/arikw/outlook-duplicated-items-remover
A VBA script that finds and moves duplicated items in selected outlook folders
duplicate-detection duplicates outlook vba vba-script vba-snippets
Last synced: 11 Sep 2025
https://github.com/mkearney/funique
⌚️ A faster unique() function
data-frame data-wrangling date-time duplicates mkearney-r-package posix posixct r r-package rstats unique
Last synced: 12 Apr 2025
https://github.com/lmammino/indexed-string-variation
Experimental JavaScript module to generate all possible variations of strings over an alphabet using an n-ary virtual tree
algorithm alphabet characters duplicates javascript javascript-library library node nodejs string strings tree variations virtual
Last synced: 06 May 2025
https://github.com/kevinpollet/pocket-deduper
Remove duplicates from your Pocket list.
cli dedupe duplicates go golang pocket tool
Last synced: 11 Apr 2025
https://github.com/raspi/duplikaatti
Remove duplicate files.
duplicate-files duplicates files go golang
Last synced: 19 Oct 2025
https://github.com/rsalmei/refine
Refine your file collections using Rust!
batch deduplicate duplicates files rename rust scan
Last synced: 17 Jul 2025
https://github.com/nicolasbizzozzero/dupe_eraser
A command-line tool that automates the deletion of duplicate files based on their hash or perceptual hash.
cli duplicate-detection duplicate-files duplicates file-management
Last synced: 12 Apr 2025
https://github.com/deadsoul/dugu
Find, remove and avoid duplicates with dugu: The Duplicates Guru
deduplication dugu duplicate-detection duplicate-files duplicatefilefinder duplicates duplicates-guru python
Last synced: 05 Apr 2025
https://github.com/danielpclark/dfm
Duplicate File Manager
duplicates file-indexing md5 recursively-search
Last synced: 10 Apr 2025
https://github.com/mesqueeb/compare-anything
Compares objects and tells you which props are duplicated and which are present only once.
compare compare-arrays compare-objects count-if countif duplicates find-duplicates javascript remove-duplicates typescript
Last synced: 10 Aug 2025
https://github.com/rix4uni/unew
A tool that combines the features of two commands, sort and tee, to add new lines to files while skipping duplicates.
bug-bounty bugbounty bugbountytips duplicates hacking infosec osint osint-resources osint-tool penetration-testing pentest-tool pentesting recon reconnaissance security security-tools threat-intelligence
Last synced: 15 Apr 2025
https://github.com/raspi/samanlainen
Delete duplicate files
duplicate-detection duplicate-files duplicates files rust
Last synced: 15 Apr 2025
https://github.com/artemanufrij/findfileconflicts
An elementary OS app
conflict duplicates linux vala
Last synced: 15 Apr 2025
https://github.com/tasleson/duplihere
Copy & Paste finder for structured text files.
clones-detection code-quality copy-paste cpd detect-duplications detector developer-tools duplicate-detection duplicates duplications quality research rust
Last synced: 22 Aug 2025
https://github.com/redonkulus/dump-deps
Dump NPM package dependencies to display packages with multiple versions.
Last synced: 28 Oct 2025
https://github.com/ant-js/compare-similarity
👁 Compare the similarity of two strings
compare duplicates similarity utils
Last synced: 15 May 2025
https://github.com/innovatrics/dedubcheck
dedubcheck - De-Duplicate Dependency Checker for Node.js monorepos
deduplication duplicates duplicity javascript nodejs nodejs-modules
Last synced: 13 Apr 2025
https://github.com/keyweeusr/bear
🐻 The decluttering deduplicator
cli-app clutterremoval duplicate-detection duplicates python
Last synced: 25 Jul 2025
https://github.com/supporterino/textanalyzer
analyzer duplicates python python3 sentences text text-analysis text-processing
Last synced: 10 Jun 2025
https://github.com/ajmalshahabudeen/bitwarden-duplicate-remover
When importing multiple CSV files, Bitwarden creates duplicate entries. This Python script removes the duplicates and keeps one copy of each entry.
bitwarden bitwarden-password-vault duplicate-detection duplicates duplicates-removal python
Last synced: 10 Jul 2025
https://github.com/codecliff/fdupesanalyzer
A script that analyzes the output of the fdupes Linux utility to find the level of overlap between directories. Written in R.
bash bash-script directory duplicates fdupes fdupes-linux-utility files r rstudio
Last synced: 23 Feb 2025
https://github.com/betaweb/twicejs
Manage duplicates, count item occurrences, dedupe an Array.
array array-manipulations base64 countable counter dedupe duplicate-detection duplicates duplicates-removal javascript js json occurrences
Last synced: 29 Dec 2025
https://github.com/pouyakary/dup
A tiny, fast command-line utility to find duplicate files within a directory
cli cmd duplicate-detection duplicate-files duplicates filesystem gnu-utilities utility
Last synced: 14 May 2025
https://github.com/jordicorbilla/duplicatechecker
☑️ .NET service that checks for duplicate rows in a SQL table using Levenshtein distance
duplicates levenshtein-distance
Last synced: 13 Apr 2025
https://github.com/mckael/goduf
A simple (but fast) duplicate file finder written in Go [Mirror repository]
cli duplicate-files duplicates go golang
Last synced: 03 Apr 2025
https://github.com/webis-de/sigir20-sampling-bias-due-to-near-duplicates-in-learning-to-rank
Sampling Bias Due to Near-Duplicates in Learning to Rank
bias duplicates information-retrieval learning-to-rank sigir sigir2020
Last synced: 07 Apr 2025
https://github.com/arasgungore/job-posting-duplicate-detection
A project aiming to leverage text embeddings and Milvus, a high-performance vector search engine, to detect duplicate job postings.
data-science docker-compose dockerfile duplicate-detection duplicates embedding embeddings exploratory-data-analysis job-posting job-postings machine-learning milvus natural-language-processing sentence-embedding sentence-embeddings sentence-encoder sentence-encoding sentence-transformers text-embedding vector-search-engine
Last synced: 09 Mar 2025
https://github.com/nbari/backup
Command line tool for creating encrypted backups avoiding duplicates
backup backup-utility duplicates encryption-decryption restore
Last synced: 06 Oct 2025
https://github.com/davidefiocco/dockerized-elasticsearch-duplicate-finder
Attempt to use MinHash to find duplicates in an Elasticsearch index
duplicates elasticsearch minhash python
Last synced: 13 Oct 2025
https://github.com/oscarsun72/delete_duplicate_files_from_the_source_directory
File Explorer deduplication (WindowsFormsApplication1): delete duplicate files from the source directory
duplicate duplicate-detection duplicate-files duplicates explorer explorer-filemanager file file-manager filemanagement filemanager filemanager-ui files filesystem
Last synced: 16 Mar 2025
https://github.com/moindalvs/learn_about_python_dataframes
Learn about Pandas Dataframe
clipboard-copy dataframe dataframes dropna duplicates duplicates-removal fillna gif import-csv ipython-display merge-dataframe missing-data pandas-dataframe pandas-dataframes pandas-python summary-statistics tocsv youtube-video
Last synced: 11 Mar 2025
https://github.com/theohbrothers/get-duplicatephotos
A script to locate duplicate image files between two sets of folders (e.g. Camera Roll folders vs other folders).
criteria date-taken duplicate-files duplicate-photos duplicates export file-metadata powershell pwsh script search
Last synced: 20 Mar 2025
https://github.com/davdiv/hashfolder
Simple command line tool that can create/update an sqlite database that contains the hash (by default SHA256) of all files inside a specified root folder.
checksum duplicate-detection duplicate-files duplicates sha256
Last synced: 16 May 2025
https://github.com/cfurrow/fzf-rdfind
Use rdfind + fzf to find and remove duplicate files
duplicates fzf fzf-scripts rdfind
Last synced: 07 Apr 2025
https://github.com/busterc/similars
👯 Find similar objects and partial duplicates in collections
arrays collections duplicate-detection duplicates similar-objects similarity-search
Last synced: 16 Jun 2025
https://github.com/derhuerst/key-map
Keep track of old keys when removing duplicates.
Last synced: 20 Feb 2025
https://github.com/miroslav-reiter/microsoft_outlook_vba
📧 Macros and scripts for Microsoft Outlook VBA: task automation, fixes, and tweaks for Microsoft Outlook
duplicates emails macro microsoft office outlook vba
Last synced: 22 Sep 2025
https://github.com/exitare/duplicateimagefinder
A tool to find duplicate images for given paths
duplicate-detection duplicates filemanagement images python python3
Last synced: 13 Aug 2025
https://github.com/cepr0/duplicate-parent-entities
Spring Data JPA - duplicated parent entities in 'join fetch' repository query methods
duplicates spring-boot spring-data-jpa
Last synced: 24 Jul 2025
https://github.com/alberanid/audiodedupe
Scan one or more directories for duplicated audio files.
audio deduplication duplicates files fingerprint music sound
Last synced: 27 Mar 2025
https://github.com/streanger/duplicate
Duplicate files viewer
duplicate-detection duplicates gui python tkinter-python
Last synced: 04 Apr 2025
https://github.com/intera/redmine_subject_autocomplete
Makes the new-issue subject field show an autocomplete that lists existing issues, to prevent duplicate tickets
didyoumean duplicates redmine redmine-plugin
Last synced: 26 Jul 2025
https://github.com/dealfonso/searchdups
Search for duplicate files
command command-line command-line-tool commandline duplicate-detection duplicates files python python-script
Last synced: 26 Jul 2025
https://github.com/jonas054/dupfind
Duplication finder for source code and other text files
Last synced: 12 Jun 2025
https://github.com/superjmn/dedup
Tool to detect duplicates and copy them to a curated directory (without duplicates)
dbscan-clustering duplicates gallery gallery-images optimize picture tools
Last synced: 31 Dec 2025
https://github.com/robb-fr/fast-dupes-finder
This repository provides clean, fast, shell-based scripts for finding duplicate files in a folder.
bash duplicate-detection duplicate-files duplicate-images duplicates fdupes fdupes-linux-utility
Last synced: 04 Jul 2025
https://github.com/sarfraznawaz2005/buttondisabler
Simple, dependency-free vanilla JavaScript package that disables the submit button to avoid duplicate form submissions.
duplicates es6 form library package submission validation
Last synced: 09 Apr 2025
https://github.com/americanhanko/columnmapper
A simple VLOOKUP-like function that can handle duplicates.
csv duplicate-detection duplicates vlookup
Last synced: 05 Sep 2025
https://github.com/hansalemaos/stridesduplicatefinder
Calculate overlapping values between two arrays and return the results as a DataFrame
duplicates fast numexpr numpy strides
Last synced: 02 Mar 2025
https://github.com/clmnin/dupper
Find duplicate files in a directory, in Rust
duplicates filesystem lint rust
Last synced: 16 Mar 2025
https://github.com/ndsvw/clone-the-tab
A Firefox Add-on that clones the active tab with just 1 click on the icon or with "Ctrl+Alt+D" / "Cmd+Alt+D".
clone duplicates firefox firefox-addon tab
Last synced: 05 Jul 2025
https://github.com/writetome51/array-remove-duplicates
Function that removes any duplicate items in the array
array array-manipulations duplicates javascript remove remove-duplicates
Last synced: 21 Feb 2025
https://github.com/writetome51/get-indexes-of-item-duplicates
Function that returns the indexes of duplicates of one item in an array
array duplicates element index indexes javascript typescript
Last synced: 21 Feb 2025
https://github.com/lazycatcoder/common-python
Implementing various tasks using Python
bacon bacon-cipher banned-words cesar cesar-cipher change common duplicate-detection duplicates fibbonacci lucky-tickets morse-code morze python ticket vigenere-cipher vignere
Last synced: 25 Feb 2025
https://github.com/prajjwol09/data-cleaning-project
This project cleans and standardizes a dataset and handles null values in a CSV file named "layoffs", using MySQL with MySQL Workbench as the workspace environment. The goal is to prepare the data for analysis.
cleaning-data columns data-analysis database duplicates mysql rows standard
Last synced: 28 Dec 2025
https://github.com/pklatka/photo-organizer
Portable and lightweight application for photo segregation.
duplicates lightweight organizer photo-organizer photos portable segregation
Last synced: 29 Mar 2025
https://github.com/bkb3/duplicate-bib-fix
Small Python script to check and replace duplicated bib entries in your .tex files
bib biber bibliography bibtex duplicates entries latex tex
Last synced: 11 Mar 2025
https://github.com/binbash23/dupfinder
Identify similar files
bash duplicates linux photos pictures script videos
Last synced: 23 Mar 2025
https://github.com/kstenerud/go-duplicates
Examines an arbitrary golang object and reports any duplicate pointers it finds
Last synced: 23 Mar 2025
https://github.com/theohbrothers/get-duplicatecontact
A script to locate duplicate or non-duplicate contacts between two `.csv` contact lists.
compare-object contact-lists contacts csv duplicates hashtable json keys nonduplicates object objects powershell pwsh script xml
Last synced: 20 Mar 2025
https://github.com/kandekore/advanced-duplicate-post-manager-main
A powerful tool for WordPress administrators to detect and manage duplicate content across posts, pages, categories, media, and custom post types. Assign 301 redirects, clean up media, and manage .htaccess rules from a user-friendly interface.
301 301-redirect 301-redirects admon csv duplicates htaccess media redirect seo slug wordpress
Last synced: 06 Aug 2025
https://github.com/dav-m85/hashsnap
Remove files that have duplicates elsewhere.
Last synced: 11 Oct 2025
https://github.com/andreid/godupes
Super Fast Go Duplicates Finder
duplicate-files duplicates go golang
Last synced: 01 Jul 2025
https://github.com/nikitaeverywhere/utils-find-duplicates
A simple script to find duplicated files in any directory.
duplicates find-duplicates find-files
Last synced: 05 Oct 2025
https://github.com/devjiwonchoi/duplicheck
Retrieve duplicates from strings, numbers, arrays, or objects.
Last synced: 31 Mar 2025
https://github.com/clement-berard/go-imap-backup
A collection of Go tools for managing IMAP emails, featuring backup capabilities and duplicate detection/cleanup.
Last synced: 14 Sep 2025