https://github.com/kdeldycke/mail-deduplicate

📧 CLI to deduplicate mails from mail boxes.
babyl cleanup cli dedupe deduplication email mail mailbox maildir mbox mh mmdf python

# Mail Deduplicate

[![Last release](https://img.shields.io/pypi/v/mail-deduplicate.svg)](https://pypi.org/project/mail-deduplicate)
[![Python versions](https://img.shields.io/pypi/pyversions/mail-deduplicate.svg)](https://pypi.org/project/mail-deduplicate)
[![Downloads](https://static.pepy.tech/badge/mail-deduplicate/month)](https://pepy.tech/projects/mail-deduplicate)
[![Unittests status](https://github.com/kdeldycke/mail-deduplicate/actions/workflows/tests.yaml/badge.svg?branch=main)](https://github.com/kdeldycke/mail-deduplicate/actions/workflows/tests.yaml?query=branch%3Amain)
[![Coverage status](https://codecov.io/gh/kdeldycke/mail-deduplicate/graph/badge.svg?token=81NWQAPjEQ)](https://app.codecov.io/gh/kdeldycke/mail-deduplicate)
[![Documentation status](https://github.com/kdeldycke/mail-deduplicate/actions/workflows/docs.yaml/badge.svg?branch=main)](https://github.com/kdeldycke/mail-deduplicate/actions/workflows/docs.yaml?query=branch%3Amain)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.7364256.svg)](https://doi.org/10.5281/zenodo.7364256)

**What is Mail Deduplicate?**

Provides the `mdedup` CLI, a utility to deduplicate mails from a set of boxes.


## Features

- Duplicate detection based on cherry-picked and normalized mail headers.
- Fetch mails from multiple sources.
- Reads from and writes to `mbox`, `maildir`, `babyl`, `mh` and `mmdf` formats.
- Deduplication strategies based on size, content, timestamp, file path or random choice.
- Copy, move or delete the resulting set of duplicates.
- Dry-run mode.
- Protection against false-positives with safety checks on size and content differences.
- Supports macOS, Linux and Windows.
- [Standalone executables](#executables) for Linux, macOS and Windows.
- Shell auto-completion for Bash, Zsh and Fish.
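
To give an intuition for header-based duplicate detection and a size-based strategy, here is a minimal sketch using only the Python standard library. It is not `mdedup`'s actual implementation: the header selection in `HEADERS`, the normalization rules, and the helper names `header_hash`, `group_duplicates` and `select_smaller` are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict
from email.message import Message

# Hypothetical header selection, for illustration only.
# The headers mdedup really uses may differ.
HEADERS = ("date", "from", "to", "subject", "message-id")


def normalize(value: str) -> str:
    """Collapse runs of whitespace and lowercase the header value."""
    return " ".join(value.split()).lower()


def header_hash(msg: Message) -> str:
    """Hash the selected headers after normalization.

    Missing headers are treated as empty strings.
    """
    canonical = "\n".join(
        f"{name}:{normalize(str(msg.get(name, '')))}" for name in HEADERS
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def group_duplicates(messages):
    """Return only the groups of messages sharing the same header hash."""
    groups = defaultdict(list)
    for msg in messages:
        groups[header_hash(msg)].append(msg)
    return [group for group in groups.values() if len(group) > 1]


def select_smaller(group):
    """Size-based strategy sketch: keep the biggest mail, flag the rest."""
    ranked = sorted(group, key=lambda m: len(m.as_bytes()), reverse=True)
    return ranked[1:]
```

Any iterable of `email.message.Message` objects works as input, including a `mailbox.mbox` instance from the standard library, since iterating over it yields its messages. A real tool would also compare sizes and contents inside each group before acting on it, which is the role of the safety checks listed above.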

## Installation

All [installation methods](https://kdeldycke.github.io/mail-deduplicate/install.html) are available in the documentation. Below are the most popular ones:

### Try it now

[`uv`](https://docs.astral.sh/uv/getting-started/installation/) is the fastest way to run `mdedup` on any platform, thanks to its [`uvx` command](https://docs.astral.sh/uv/guides/tools/#running-tools):

```shell-session
$ uvx --from mail-deduplicate -- mdedup
```

### macOS

`mdedup` is part of the official [Homebrew](https://brew.sh) default tap, so you can install it with:

```shell-session
$ brew install mail-deduplicate
```

### Executables

Standalone binaries of `mdedup`'s latest version are available as direct downloads for several platforms and architectures:

| Platform | `arm64` | `x86_64` |
| ----------- | -------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------- |
| **Linux** | [Download `mdedup-linux-arm64.bin`](https://github.com/kdeldycke/mail-deduplicate/releases/latest/download/mdedup-linux-arm64.bin) | [Download `mdedup-linux-x64.bin`](https://github.com/kdeldycke/mail-deduplicate/releases/latest/download/mdedup-linux-x64.bin) |
| **macOS** | [Download `mdedup-macos-arm64.bin`](https://github.com/kdeldycke/mail-deduplicate/releases/latest/download/mdedup-macos-arm64.bin) | [Download `mdedup-macos-x64.bin`](https://github.com/kdeldycke/mail-deduplicate/releases/latest/download/mdedup-macos-x64.bin) |
| **Windows** | [Download `mdedup-windows-arm64.exe`](https://github.com/kdeldycke/mail-deduplicate/releases/latest/download/mdedup-windows-arm64.exe) | [Download `mdedup-windows-x64.exe`](https://github.com/kdeldycke/mail-deduplicate/releases/latest/download/mdedup-windows-x64.exe) |

## Quickstart

> [!WARNING]
> Performance and memory usage: `mdedup`'s implementation is quite naive and keeps everything in memory.
>
> This is good enough for a volume of a couple of gigabytes, but the more emails `mdedup` tries to parse, the closer you get to the memory limits of your machine. At that point, [`mdedup` will exit abruptly](https://github.com/kdeldycke/mail-deduplicate/issues/362#issuecomment-1266743045), zapped by the [OOM killer](https://en.wikipedia.org/wiki/Out_of_memory) of your OS. Of course, your mileage may vary depending on your hardware.
>
> You can influence the implementation of this feature by contributing pull requests, [purchasing business support 🤝](https://github.com/sponsors/kdeldycke) or [sponsoring the project 🫶](https://github.com/sponsors/kdeldycke).

### Example