An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with dedupe

A curated list of projects in awesome lists tagged with dedupe.

https://github.com/restic/restic

Fast, secure, efficient backup program

backup dedupe deduplication go restic secure-by-default

Last synced: 12 May 2025

https://github.com/dedupeio/dedupe

🆔 A Python library for accurate and scalable fuzzy matching, record deduplication and entity resolution.

clustering datamade de-duplicating dedupe dedupe-library entity-resolution python python-library record-linkage

Last synced: 18 Dec 2025

https://github.com/scinos/yarn-deduplicate

Deduplication tool for yarn.lock files

dedupe duplicated-packages duplicates lock-file yarn yarn-lock

Last synced: 11 May 2025

https://github.com/nil0x42/duplicut

Remove duplicates from MASSIVE wordlist, without sorting it (for dictionary-based password cracking)

c cracking dedupe dictionary duplicate-detection hashcat hashes password password-cracking remove-duplicates uniq unique wordlist wordlist-generator wordlists

Last synced: 13 Apr 2025

https://github.com/zygo/bees

Best-Effort Extent-Same, a btrfs dedupe agent

btrfs dedup dedupe

Last synced: 18 Feb 2026

https://github.com/blakeembrey/free-style

Make CSS easier and more maintainable by using JavaScript

css css-in-js css-string dedupe hash javascript js minification typescript

Last synced: 08 Oct 2025

https://github.com/dedupeio/csvdedupe

🆔 Command line tool for deduplicating CSV files

cli csv-files dedupe entity-resolution record-linkage

Last synced: 13 Apr 2025

https://github.com/dedupeio/dedupe-examples

🆔 Examples for using the dedupe library

dedupe entity-resolution python record-linkage

Last synced: 18 Dec 2025

https://github.com/knjcode/imgdupes

Identifying and removing near-duplicate images using perceptual hashing.

dedupe deduplicate image perceptual-hashes perceptual-hashing

Last synced: 18 Jan 2026

https://laktak.github.io/chkbit/

Check your files for data corruption and run quick file deduplication

backup bitrot-detection btrfs cloud-backup data-degradation data-integrity dedup dedupe deduper deduplication disk-check storage-media

Last synced: 03 Apr 2026

https://github.com/kdeldycke/mail-deduplicate

📧 CLI to deduplicate emails from mailboxes.

babyl cleanup cli dedupe deduplication email mail mailbox maildir mbox mh mmdf python

Last synced: 13 Dec 2025

https://github.com/laktak/chkbit

Check your files for data corruption and run quick file deduplication

backup bitrot-detection btrfs cloud-backup data-degradation data-integrity dedup dedupe deduper deduplication disk-check storage-media

Last synced: 04 Apr 2025

https://github.com/jason89521/daxus

Daxus is a server state management library for React that provides full control over data, leading to a better user experience.

cache data dedupe hook react revalidate server-state-management user-experience

Last synced: 23 Jun 2025

https://github.com/zayne-labs/callapi

A lightweight fetching library packed with essential features (retries, interceptors, request deduplication and much more) while retaining an API surface similar to regular Fetch.

callapi dedupe fetch fetch-wrapper interceptors params plugins query request-dedupe retries schema standard-schema typesafe validation

Last synced: 27 Apr 2026

https://github.com/dssg/pgdedupe

A simple command line interface to the datamade/dedupe library.

data-cleaning database dedupe deduplication postgresql python record-linkage

Last synced: 21 Jan 2026

https://github.com/mighty-justice/django-super-deduper

Utilities for de-duping Django model instances

dedupe django python

Last synced: 30 Jul 2025

https://github.com/kevinpollet/pocket-deduper

Remove duplicates from your Pocket list.

cli dedupe duplicates go golang pocket tool

Last synced: 11 Apr 2025

https://github.com/futuresearch/everyrow-sdk

Intelligent pandas dataframe ops: sort, filter, dedupe & join by qualitative criteria

cleaning-data dedupe entity-resolution filtering llm-agents merging-algorithms pandas-dataframe ranking semantic-analysis

Last synced: 20 Feb 2026

https://github.com/dedupeio/dedupe-variable-address

Address Variable Type for dedupe

dedupe dedupe-variable

Last synced: 15 Apr 2025

https://github.com/dedupeio/dedupe-variable-name

Name variable type for dedupe

dedupe dedupe-variable

Last synced: 15 Apr 2025

https://github.com/samhirtarif/helper-methods-js

A repo that contains helper methods for common and not-so-common use cases

async dedupe deduplication deepcopy indexesof isasync

Last synced: 08 Mar 2025

https://github.com/mterron/swuniq

A command-line tool for deduplicating entries in a file or stream with constant memory usage

cli dedupe deduping deduplicate deduplication filter sliding-window uniq

Last synced: 22 Feb 2026

https://github.com/harpin-ai/toolkit-examples

Examples for trying out the harpin AI identity resolution and data quality toolkit

data-engineering data-quality dedupe deduplication entity-resolution identity identity-resolution spark

Last synced: 23 Apr 2025

https://github.com/lilydjwg/android-dedupefs

A filesystem for reading Android dedupe backup

android backup dedupe fuse

Last synced: 09 May 2026

https://github.com/dedupeio/dedupe-variable-fuzzycategory

Dedupe Variable for Fuzzy Categories

dedupe dedupe-variable

Last synced: 27 Aug 2025

https://github.com/dedupeio/parseratorvariable

Base class for dedupe variables for parsed fields

dedupe dedupe-variable

Last synced: 15 Apr 2025

https://github.com/barchart/aws-lambda-suppressor

JavaScript utility for suppressing duplicate AWS Lambda invocations

dedupe deduplication duplicate-detection dynamodb javascript lambda public-repository serverless

Last synced: 23 Jul 2025

https://github.com/stdlib-js/iter-unique-by-hash

Create an iterator which returns unique values according to a hash function.

dedupe deduplicate distinct hash iterable iterate iterator javascript node node-js nodejs stdlib uniq unique util utilities utility utils

Last synced: 26 Feb 2026

https://github.com/mattriley/node-duplicate-file-finder

Finds duplicate files across given directories without hashing.

dedupe duplicate-files hashless javascript nodejs npm-package

Last synced: 03 Sep 2025

https://github.com/stdlib-js/iter-unique-by

Create an iterator which returns unique values according to a predicate function.

dedupe deduplicate distinct iterable iterate iterator javascript node node-js nodejs predicate stdlib uniq unique util utilities utility utils

Last synced: 17 Apr 2026

https://github.com/stdlib-js/iter-dedupe-by

Create an iterator which removes consecutive values that resolve to the same value according to a provided function.

compress dedupe deduplicate deduplication duplicate iterable iterate iteration iterator javascript node node-js nodejs stdlib uniq unique util utilities utility utils

Last synced: 16 Aug 2025

https://github.com/jchristn/watsondedupeui

UI for WatsonDedupe library

compression dedupe deduplication watson-dedupe

Last synced: 31 Mar 2025

https://github.com/octivi/borg-backup-wrapper

Wrapper for the deduplicating archiver BorgBackup. It simplifies performing everyday tasks on multiple repositories.

backup bash borgbackup borgbase compression dedupe deduplication encryption servers

Last synced: 20 Feb 2026

https://github.com/stdlib-js/array-base-to-deduped

Copy elements to a new generic array after removing consecutive duplicated values.

array compress copy data dedupe deduplicate deduplication duplicate generic javascript node node-js nodejs stdlib structure types uniq unique

Last synced: 14 Jun 2025

https://github.com/jaredkoontz/bitwarden-dedup

Filters Bitwarden JSON files to find duplicate and "useless" entries.

bitwarden dedupe deduplication python

Last synced: 13 Oct 2025

https://github.com/cdaringe/dedupe-assert

Asserts that packages are truly deduped

assert dedupe npm packages redundant

Last synced: 17 May 2026

https://github.com/soenneker/soenneker.sets.concurrent.slidingwindow

A high-throughput, thread-safe set whose bucketed entries automatically expire after a fixed time window.

auto-expire concurrency concurrent csharp de-dupe dedupe dotnet object set sets slidingwindow slidingwindowconcurrentset threadsafe

Last synced: 22 Apr 2026

https://github.com/soenneker/soenneker.deduplication.bounded

A thread-safe high-performance bounded size deduplication utility for .NET.

bounded boundeddedupe csharp dedupe deduplication dotnet max maxsize object size

Last synced: 03 May 2026

https://github.com/derhuerst/callbag-keep-sequences

A callbag operator that passes through only sequences with minimum length.

callbag dedupe filter sequence streak

Last synced: 05 Mar 2026

https://github.com/marirs/dedupe_yara_rule-rs

Dedupe YARA rules (Rust version)

dedupe deduper rust rust-lang yara yara-rules yara-x

Last synced: 23 Apr 2025

https://github.com/sajad-net/dedupe

A lightweight and efficient command-line tool written in Go to help you find and remove duplicate files on your disk.

cli dedupe duplicate-detection go

Last synced: 09 Mar 2025