{"id":31725645,"url":"https://github.com/open-technology-foundation/mail-tools","last_synced_at":"2026-05-18T10:38:24.915Z","repository":{"id":318398067,"uuid":"1070020822","full_name":"Open-Technology-Foundation/mail-tools","owner":"Open-Technology-Foundation","description":"Fast email parsing utilities for extracting headers, message bodies, and cleaning bloat headers. Available as both standalone binaries and bash loadable builtins.","archived":false,"fork":false,"pushed_at":"2025-10-18T07:00:18.000Z","size":3453,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-04-14T18:38:17.432Z","etag":null,"topics":["bash","bash-scripting","mail","mail-header","mdir"],"latest_commit_sha":null,"homepage":"https://yatti.id/","language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Open-Technology-Foundation.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-05T05:10:01.000Z","updated_at":"2025-10-18T07:00:22.000Z","dependencies_parsed_at":"2025-10-08T03:21:57.905Z","dependency_job_id":"3796c75e-2bc5-491b-bb65-5346dcc4666d","html_url":"https://github.com/Open-Technology-Foundation/mail-tools","commit_stats":null,"previous_names":["open-technology-foundation/mailheader","open-technology-foundation/mail-tools"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Open-Technology-Foundation/
mail-tools","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Open-Technology-Foundation%2Fmail-tools","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Open-Technology-Foundation%2Fmail-tools/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Open-Technology-Foundation%2Fmail-tools/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Open-Technology-Foundation%2Fmail-tools/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Open-Technology-Foundation","download_url":"https://codeload.github.com/Open-Technology-Foundation/mail-tools/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Open-Technology-Foundation%2Fmail-tools/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33175173,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-18T09:27:30.708Z","status":"ssl_error","status_checked_at":"2026-05-18T09:27:28.300Z","response_time":71,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bash","bash-scripting","mail","mail-header","mdir"],"created_at":"2025-10-09T05:55:01.756Z","updated_at":"2026-05-18T10:38:24.432Z","avatar_url":"https://github.com/Open-Technology-Foundation.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Mail Tools\n\n[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)\n[![CI](https://github.com/Open-Technology-Foundation/mail-tools/workflows/CI/badge.svg)](https://github.com/Open-Technology-Foundation/mail-tools/actions)\n[![GitHub Issues](https://img.shields.io/github/issues/Open-Technology-Foundation/mail-tools)](https://github.com/Open-Technology-Foundation/mail-tools/issues)\n\nFast email parsing utilities for extracting headers, message bodies, and cleaning bloat headers. 
Available as both standalone binaries and bash loadable builtins.\n\n**Features:**\n- Professional directory structure (src/, scripts/, man/, examples/, tools/, build/)\n- Clean separation of source code and build artifacts\n- Comprehensive test suite with 632 real-world email files\n- Automated installation with dependency management\n- Bash completions for all utilities (tab completion for options and files)\n\n## Purpose and Use Cases\n\nThese utilities solve common email processing challenges in scripts and automation:\n\n**Use Cases:**\n- **Email archival**: Strip bloat headers before archiving to reduce storage by ~20% (mailheaderclean)\n- **Email parsing**: Extract sender, subject, dates from mailbox files (mailheader + parsing)\n- **Mailbox processing**: Process thousands of emails efficiently with bash builtins (10-20x faster)\n- **Email splitting**: Separate headers from body for independent processing\n- **Privacy**: Remove tracking headers and metadata before forwarding emails\n- **Thunderbird integration**: Clean emails while preserving client-specific headers\n- **Batch operations**: Clean entire directories of emails in-place\n\n**Why these tools?**\n- **Performance**: Bash builtins eliminate fork/exec overhead (~1-2ms → ~0.1ms per call)\n- **Simplicity**: Single-purpose utilities that do one thing well\n- **RFC 822 compliance**: Proper handling of header continuation lines\n- **Flexibility**: Dual implementation (binaries + builtins) for different contexts\n- **No dependencies**: Pure C, minimal external requirements\n\n**Comparison with alternatives:**\n- **vs formail/reformail**: Focused specifically on header/body extraction and cleaning\n- **vs python/perl**: No interpreter overhead, much faster for batch processing\n- **vs awk/sed**: Built-in RFC 822 support, easier to use correctly\n\n## Utilities\n\n### mailheader\nExtracts email headers (everything up to the first blank line).\n\n- Handles RFC 822 continuation lines (joins lines starting 
with whitespace)\n- Normalizes formatting (removes `\\r`, converts tabs to spaces)\n- Available as binary and builtin\n\n```bash\nmailheader email.eml\nmailheader -h          # Show help\n```\n\n### mailmessage\nExtracts email message body (everything after the first blank line).\n\n- Skips the header section entirely\n- Preserves message formatting\n- Available as binary and builtin\n\n```bash\nmailmessage email.eml\nmailmessage -h         # Show help\n```\n\n### mailheaderclean\nFilters non-essential email headers from complete email files.\n\n- Removes ~207 bloat headers (Microsoft Exchange, security vendors, tracking, etc.)\n- Keeps only the first \"Received\" header\n- Preserves essential routing headers and complete message body\n- Supports flexible header filtering via environment variables\n- Available as binary and builtin\n\n```bash\nmailheaderclean email.eml \u003e cleaned.eml\nmailheaderclean -l                        # List active removal headers\nmailheaderclean -h                        # Show help\n```\n\n**Environment variables** (processed in this order):\n1. `MAILHEADERCLEAN`: Replace entire removal list with custom headers (or use built-in if not set)\n2. `MAILHEADERCLEAN_PRESERVE`: Exclude specific headers from removal (e.g., for Thunderbird features)\n3. 
`MAILHEADERCLEAN_EXTRA`: Add additional headers to the removal list\n\n**Formula:** `(MAILHEADERCLEAN or built-in) - PRESERVE + EXTRA`\n\n**Wildcard patterns supported** (shell glob syntax):\n- `X-*` - Match any header starting with X-\n- `*-Status` - Match any header ending with -Status\n- `X-MS-*` - Match any header starting with X-MS-\n- `X-*-Status` - Match X- followed by anything, ending in -Status\n\n### mailgetaddresses\nBash script that extracts email addresses from From, To, and Cc headers in email files.\n\n- Handles various email formats (with/without names, quoted strings, etc.)\n- Multiple recipients per header\n- Optional output formatting (with names, separated by header type)\n- Selective header extraction\n- Accepts multiple files and/or directories as arguments\n- Processes all files in given directories\n- Directory exclusions (default: .Junk, .Trash, .Sent) with override support\n- **Name cleaning**: Decodes RFC 2047 encoded names, removes quotes and parenthetical notation\n- Removes redundant names (when name equals email address)\n\n```bash\nmailgetaddresses email.eml                      # Extract from single file\nmailgetaddresses -n email.eml                   # Include names with addresses\nmailgetaddresses -s email.eml                   # Separate by header type\nmailgetaddresses -H from,to email.eml           # Extract only From and To\nmailgetaddresses file1.eml file2.eml file3.eml  # Multiple files\nmailgetaddresses /path/to/maildir/              # Process directory (excludes .Junk,.Trash,.Sent)\nmailgetaddresses -x .spam,.Junk /path/to/mail/  # Custom directory exclusions\nmailgetaddresses --exclude '' /path/to/mail/    # No exclusions (process all subdirs)\nmailgetaddresses email.eml /path/to/maildir/    # Combine files and dirs\nmailgetaddresses /path/to/maildir/ | sort -fu   # Deduplicate (case-insensitive)\nmailgetaddresses -n /path/to/maildir/ | sort -fu \u003e contacts.txt  # Build contact list\nmailgetaddresses --help           
              # Show help\n```\n\n**Name Cleaning Examples:**\n```bash\n# Input:  =?utf-8?Q?Undangan_Pelatihan?= \u003cinvite@example.com\u003e\n# Output: Undangan Pelatihan \u003cinvite@example.com\u003e\n\n# Input:  'John Doe' \u003cjohn@example.com\u003e\n# Output: John Doe \u003cjohn@example.com\u003e\n\n# Input:  Jane Smith (jane@example.com) \u003cjane@example.com\u003e\n# Output: Jane Smith \u003cjane@example.com\u003e\n\n# Input:  bob@test.com \u003cbob@test.com\u003e\n# Output: bob@test.com\n```\n\n**Post-Processing: Filtering with Exclude Patterns:**\n\nTo exclude specific domains or patterns from the results, use `grep -v -F -f`:\n\n```bash\n# Create exclude pattern file\ncat \u003e /tmp/exclude-patterns.list \u003c\u003cEOF\nokusi\nhukumonline\n@singularityu.org\n@example.com\nEOF\n\n# Filter results using pattern file (case-sensitive)\nmailgetaddresses -n /path/to/maildir/ | sort -fu | grep -v -F -f /tmp/exclude-patterns.list\n\n# Case-insensitive filtering (recommended - matches okusi, Okusi, OKUSI, etc.)\nmailgetaddresses -n /path/to/maildir/ | sort -fu | grep -v -i -F -f /tmp/exclude-patterns.list\n\n# If patterns are regex (slower, only if needed)\nmailgetaddresses -n /path/to/maildir/ | sort -fu | grep -v -E -f /tmp/exclude-patterns.list\n```\n\n**grep options explained:**\n- `-v` = Invert match (exclude lines that match)\n- `-F` = Fixed strings (not regex) - much faster for literal patterns\n- `-i` = Case-insensitive matching\n- `-E` = Extended regex (only if patterns contain regex syntax)\n- `-f FILE` = Read patterns from file\n\nThis approach is efficient because `grep` is implemented in optimized C, and `-F` lets it use fast fixed-string search instead of regular-expression matching.\n\n### mailgetheaders\nBash script that parses email headers into a bash associative array for easy access in scripts.\n\n- Extracts all headers from an email file\n- Outputs bash code to populate an associative array\n- Handles continuation lines (RFC 822)\n- Ideal for scripting and parsing email 
metadata\n\n```bash\nmailgetheaders email.eml                    # Output bash array declaration\neval \"$(mailgetheaders email.eml)\"          # Populate Headers array in current shell\n\n# Use in scripts:\ndeclare -A Headers\neval \"$(mailgetheaders email.eml)\"\necho \"From: ${Headers[From]}\"\necho \"Subject: ${Headers[Subject]}\"\necho \"Date: ${Headers[Date]}\"\necho \"File: ${Headers[file]}\"\n\nmailgetheaders --help                       # Show help\n```\n\n### mailheaderclean-batch (script)\nProduction script for batch cleaning of email files or directories in-place.\n\n- Process single files or entire directories\n- Age filtering with `-d/--days` option\n- Configurable directory traversal depth\n- Preserves timestamps and permissions\n- Progress reporting and error handling\n- Available as `clean-email-headers` symlink for backwards compatibility\n\n```bash\nmailheaderclean-batch email.eml              # Clean single file\nmailheaderclean-batch /path/to/maildir       # Clean all files in directory\nmailheaderclean-batch -d 7 /path/to/maildir  # Only files from last 7 days\nmailheaderclean-batch -m 2 /path/to/maildir  # Traverse 2 levels deep\nmailheaderclean-batch -h                     # Show help\n\n# Also available via backwards-compatible symlink:\nclean-email-headers email.eml                # Same as mailheaderclean-batch\n```\n\nAll utilities support:\n- **Help options**: `-h` or `--help` for usage information\n- **Consistent exit codes**: 0 (success), 1 (file error), 2 (usage error)\n- **Dual implementation**:\n  - Standalone binaries for general use\n  - Bash loadable builtins for high-performance scripting (10-20x faster)\n\n## Installation\n\n### Prerequisites\n\nFor the bash loadable builtin:\n```bash\nsudo apt-get install bash-builtins  # Ubuntu/Debian\n```\n\n### One-Liner Install\n\nQuick installation without keeping the source:\n\n```bash\ngit clone https://github.com/Open-Technology-Foundation/mail-tools.git \u0026\u0026 cd mail-tools 
\u0026\u0026 sudo ./install.sh --builtin \u0026\u0026 cd .. \u0026\u0026 rm -rf mail-tools\n```\n\nThis will clone, install, and clean up the source in one command.\n\n### Getting the Source\n\nFor manual installation or to keep the source:\n\n```bash\ngit clone https://github.com/Open-Technology-Foundation/mail-tools.git\ncd mail-tools\n```\n\n**Optional**: Keep a system-wide copy of the source for future updates or rebuilds:\n\n```bash\n# Move to traditional source location (optional)\nsudo mv mail-tools /usr/local/src/\ncd /usr/local/src/mail-tools\n```\n\n**Note**: The source code is not required after installation. You can delete the cloned directory after running `install.sh` and re-clone from GitHub if needed later.\n\n### Quick Install (Recommended)\n\nUse the installation script for an interactive, automated installation:\n\n```bash\nsudo ./install.sh\n```\n\nThe script will:\n- Check prerequisites and build all utilities\n- Prompt whether to install the bash builtins (optional)\n- Install all files to the correct locations\n- Update the man database\n\n**Options:**\n```bash\nsudo ./install.sh --help              # Show all options\nsudo ./install.sh --builtin           # Force builtins (auto-installs dependencies)\nsudo ./install.sh --no-builtin        # Skip builtin installation\nsudo ./install.sh --uninstall         # Remove installation\nsudo ./install.sh --prefix=/opt       # Custom install location\nsudo ./install.sh --dry-run           # Preview without installing\n```\n\n**Note:** Using `--builtin` will automatically install the `bash-builtins` package if it's not already present (Debian/Ubuntu).\n\n### Manual Build and Install\n\nAlternatively, use make directly:\n\n```bash\n# Build all utilities\nmake\n\n# Install system-wide (requires sudo)\nsudo make install\n```\n\nThis installs:\n- Standalone binaries: `/usr/local/bin/{mailheader,mailmessage,mailheaderclean}`\n- Bash scripts: 
`/usr/local/bin/{mailgetaddresses,mailgetheaders,mailheaderclean-batch}` (includes backwards-compatible `clean-email-headers` symlink)\n- Loadable builtins: `/usr/local/lib/bash/loadables/{mailheader,mailmessage,mailheaderclean}.so`\n- Auto-load script: `/etc/profile.d/mail-tools.sh`\n- Bash completions: `/usr/local/share/bash-completion/completions/mail-tools`\n- Manpages: `/usr/local/share/man/man1/{mailheader,mailmessage,mailheaderclean,mailgetaddresses}.1`\n- Documentation: `/usr/local/share/doc/mail-tools/`\n\n### Verify Installation\n\n**Quick diagnostic script:**\n```bash\n# Run comprehensive installation check\nscripts/check-installation.sh\n```\n\nThis diagnostic script checks all installation components (binaries, builtins, man pages, profile script) and provides troubleshooting guidance if anything is missing. It's particularly useful for diagnosing builtin loading issues in non-interactive contexts.\n\n**Manual verification:**\n```bash\n# Check all installed programs\nwhich mailheader mailmessage mailheaderclean mailgetaddresses mailgetheaders mailheaderclean-batch\n\n# Verify backwards-compatible symlink works\nwhich clean-email-headers  # Symlink to mailheaderclean-batch\n\n# Check builtins (after opening new shell or sourcing profile)\nenable -a | grep mail\n\n# View help\nmailheader -h\nmailmessage -h\nmailheaderclean -h\nmailgetaddresses --help\nmailgetheaders --help\n\n# View manpages\nman mailheader\nman mailmessage\nman mailheaderclean\nman mailgetaddresses\n\n# Get help for builtins\nhelp mailheader\nhelp mailmessage\nhelp mailheaderclean\n```\n\n## Usage\n\n### Interactive Shell\n\nThe builtins are automatically loaded in new bash sessions:\n```bash\nmailheader email.eml\nmailmessage email.eml\nmailheaderclean email.eml\n```\n\n### Scripts\n\nScripts must explicitly enable the builtins:\n```bash\n#!/bin/bash\nenable -f mailheader.so mailheader 2\u003e/dev/null || true\nenable -f mailmessage.so mailmessage 2\u003e/dev/null || true\nenable -f mailheaderclean.so 
mailheaderclean 2\u003e/dev/null || true\n\n# Extract headers\nmailheader /path/to/email.eml\n\n# Extract message body\nmailmessage /path/to/email.eml\n\n# Clean bloat headers from email\nmailheaderclean /path/to/email.eml \u003e cleaned.eml\n\n# Split an email into parts\nmailheader email.eml \u003e headers.txt\nmailmessage email.eml \u003e body.txt\n\n# Remove custom headers\nMAILHEADERCLEAN_EXTRA=\"X-Custom,X-Internal\" mailheaderclean email.eml\n```\n\n### Cron Jobs\n\nCron requires explicit setup:\n```bash\n# Method 1: Source profile\n*/15 * * * * bash -c 'source /etc/profile.d/mail-tools.sh; mailheader /path/to/email.eml'\n\n# Method 2: Explicit enable\n*/15 * * * * bash -c 'enable -f mailheader.so mailheader; mailheader /path/to/email.eml'\n\n# Method 3: Use standalone binary\n*/15 * * * * /usr/local/bin/mailheader /path/to/email.eml\n\n# Method 4: Use production script\n0 2 * * * /usr/local/bin/mailheaderclean-batch -d 30 /var/mail/archive 2\u003e/var/log/email-clean.log\n```\n\n## Examples\n\n### Basic Usage\n\nGiven `test.eml`:\n```\nFrom: sender@example.com\nTo: recipient@example.com\nSubject: Test email with\n continuation line\nDate: Wed, 1 Jan 2025 12:00:00 +0000\n\nThis is the message body.\nIt can span multiple lines.\n```\n\nExtract headers:\n```bash\n$ mailheader test.eml\nFrom: sender@example.com\nTo: recipient@example.com\nSubject: Test email with continuation line\nDate: Wed, 1 Jan 2025 12:00:00 +0000\n```\n\nExtract message body:\n```bash\n$ mailmessage test.eml\nThis is the message body.\nIt can span multiple lines.\n```\n\nSplit email into components:\n```bash\n$ mailheader email.eml \u003e headers.txt\n$ mailmessage email.eml \u003e body.txt\n$ cat headers.txt body.txt  # Reconstruct email\n```\n\n### Cleaning Bloat Headers\n\nList currently active removal headers:\n```bash\n$ mailheaderclean -l\nx-ms-exchange-antispam-messagedata-0\nx-ms-exchange-antispam-messagedata-chunkcount\n...\nX-Spam-Score\nX-Proofpoint-NotVirusScanned\n\n# With 
environment variables applied\n$ MAILHEADERCLEAN_PRESERVE=\"List-Unsubscribe,X-Priority\" mailheaderclean -l\n# Shows built-in list minus preserved headers\n\n$ MAILHEADERCLEAN=\"X-Spam-Status,DKIM-Signature\" mailheaderclean -l\nX-Spam-Status\nDKIM-Signature\n```\n\nClean single file:\n```bash\n$ mailheaderclean email.eml \u003e cleaned.eml\n```\n\nUse custom removal list entirely:\n```bash\n$ MAILHEADERCLEAN=\"X-Spam-Status,Delivered-To\" mailheaderclean email.eml\n```\n\nPreserve specific headers (e.g., for Thunderbird):\n```bash\n$ MAILHEADERCLEAN_PRESERVE=\"List-Unsubscribe,X-Priority\" mailheaderclean email.eml\n```\n\nAdd custom headers to built-in list:\n```bash\n$ MAILHEADERCLEAN_EXTRA=\"X-Custom-Header,X-Internal\" mailheaderclean email.eml\n```\n\nComplex combination:\n```bash\n$ MAILHEADERCLEAN=\"DKIM-Signature,List-Unsubscribe\" \\\n  MAILHEADERCLEAN_PRESERVE=\"List-Unsubscribe\" \\\n  MAILHEADERCLEAN_EXTRA=\"X-Custom\" mailheaderclean email.eml\n# Result: Removes DKIM-Signature and X-Custom, preserves List-Unsubscribe\n```\n\nUse wildcard patterns:\n```bash\n# Remove all X- headers\n$ MAILHEADERCLEAN=\"X-*\" mailheaderclean email.eml\n\n# Remove all Microsoft headers\n$ MAILHEADERCLEAN_EXTRA=\"X-Microsoft-*,X-MS-*\" mailheaderclean email.eml\n\n# Remove all spam and status headers\n$ MAILHEADERCLEAN=\"X-Spam-*,*-Status\" mailheaderclean email.eml\n\n# Complex: remove all X-MS- headers except X-MS-TNEF-Correlator\n$ MAILHEADERCLEAN_EXTRA=\"X-MS-*\" \\\n  MAILHEADERCLEAN_PRESERVE=\"X-MS-TNEF-Correlator\" mailheaderclean email.eml\n```\n\n### Production Script Examples\n\nClean single file in-place:\n```bash\nmailheaderclean-batch email.eml\n```\n\nClean entire directory:\n```bash\nmailheaderclean-batch /path/to/maildir\n```\n\nClean only recent files (last 7 days):\n```bash\nmailheaderclean-batch -d 7 /path/to/maildir\n```\n\nClean with custom depth and verbosity:\n```bash\nmailheaderclean-batch -m 3 -v /path/to/maildir\n```\n\nQuiet mode for 
cron:\n```bash\nmailheaderclean-batch -q /path/to/maildir\n```\n\n### Advanced Examples\n\nParse headers into associative array:\n```bash\n#!/bin/bash\n\ndeclare -A Headers\neval \"$(mailgetheaders email.eml)\"\n\necho \"From: ${Headers[From]}\"\necho \"Subject: ${Headers[Subject]}\"\necho \"Date: ${Headers[Date]}\"\necho \"File: ${Headers[file]}\"\n```\n\nBulk email cleaning with progress:\n```bash\n#!/bin/bash\nenable -f mailheaderclean.so mailheaderclean 2\u003e/dev/null || true\n\nexport MAILHEADERCLEAN_PRESERVE=\"List-Unsubscribe,List-Post,X-Priority,Importance\"\n\n# Create the output directory before writing cleaned copies\nmkdir -p /tmp/cleaned\n\ncount=0\nfor email in ~/Maildir/cur/*; do\n  mailheaderclean \"$email\" \u003e \"/tmp/cleaned/$(basename \"$email\")\"\n  count=$((count + 1))\n  echo \"Cleaned $count: $(basename \"$email\")\"\ndone\n```\n\nEmail archive with storage savings:\n```bash\n#!/bin/bash\nenable -f mailheaderclean.so mailheaderclean 2\u003e/dev/null || true\n\n# Create the destination directory before writing cleaned copies\nmkdir -p /archive/clean\n\ntotal_saved=0\nfor email in /var/mail/archive/*.eml; do\n  original_size=$(stat -c%s \"$email\")\n  mailheaderclean \"$email\" \u003e \"/archive/clean/$(basename \"$email\")\"\n  cleaned_size=$(stat -c%s \"/archive/clean/$(basename \"$email\")\")\n  saved=$((original_size - cleaned_size))\n  total_saved=$((total_saved + saved))\n  echo \"Saved $saved bytes on $(basename \"$email\")\"\ndone\necho \"Total saved: $total_saved bytes\"\n```\n\n## Exit Codes\n\nAll utilities follow standard Unix exit code conventions:\n\n- **0** - Success\n- **1** - File error (cannot open/read file)\n- **2** - Usage error (incorrect arguments)\n\nExamples:\n```bash\nmailheader email.eml\necho $?  # 0 (success)\n\nmailheader /nonexistent/file\necho $?  # 1 (file error)\n\nmailheader\necho $?  
# 2 (usage error - missing argument)\n```\n\n## Performance\n\nThe bash builtins eliminate fork/exec overhead:\n\n- **Standalone binaries**: ~1-2ms per call (fork + exec + startup)\n- **Bash builtins**: ~0.1ms per call (in-process execution)\n\nFor scripts processing many emails, this provides **10-20x speedup**.\n\n### Benchmarking\n\nThe project includes comprehensive benchmarking tools to measure performance:\n\n```bash\n# Basic performance comparison\ntools/benchmark.sh\n\n# Detailed scaling analysis with multiple file counts\ntools/benchmark_detailed.sh\n```\n\nBoth scripts compare builtin vs standalone performance across different file counts. Results typically show:\n- **Small files (1-2KB)**: 15-20x speedup with builtins\n- **Large files (\u003e100KB)**: 8-12x speedup with builtins\n- **Overall**: 10-20x average speedup for typical email processing\n\n### Real-World Performance\n\nTesting with 632 real email files (8.3MB total):\n\n**mailheaderclean cleaning operation:**\n- **Standalone**: ~2.5 seconds\n- **Builtin**: ~0.15 seconds\n- **Speedup**: ~16x faster\n\n**Storage savings:**\n- Original size: 8.3MB\n- Cleaned size: 6.6MB\n- **Savings**: 1.7MB (~20% reduction)\n\n## Testing\n\nThe project includes a comprehensive test suite with **632 real email files** from various sources.\n\n### Running Tests\n\n```bash\ncd tests\n\n# Master test runner (runs all tests)\n./test_master.sh\n\n# Repository structure and build tests\n./test_structure.sh             # Validate directory structure\n./test_build_system.sh          # Test Makefile and build process\n./test_installation.sh          # Test install.sh functionality\n\n# Comprehensive tests (all 632 files)\n./test_all_mailheader.sh       # Test header extraction\n./test_all_mailmessage.sh      # Test message body extraction\n./test_all_mailheaderclean.sh  # Test header cleaning\n\n# Quick functionality tests\n./test_simple.sh                    # Basic functionality\n./test_builtin_vs_standalone.sh     # 
Verify identical output\n./validate_email_format.sh          # RFC 822 compliance\n\n# Script tests\n./test_mailgetaddresses.sh      # Test address extraction\n./test_mailgetheaders.sh        # Test header parsing\n\n# Environment variable tests\n./test_env_vars.sh\n./test_mailheaderclean.sh\n```\n\n### Test Results\n\nAll tests pass with 100% success rate on critical functionality:\n\n- ✓ **632/632** files produce valid header extraction\n- ✓ **632/632** files produce valid message extraction\n- ✓ **632/632** files produce valid cleaned output\n- ✓ **632/632** files maintain valid RFC 822 format after cleaning\n- ✓ **100%** identical output between standalone and builtin versions\n- ✓ All environment variable combinations work correctly\n\nSee `tests/README.md` for detailed test documentation.\n\n## Build Targets\n\nThe build system uses organized directories for clean separation of source and artifacts:\n\n```bash\nmake                      # Build all utilities (output to build/)\nmake all-mailheader       # Build mailheader only\nmake all-mailmessage      # Build mailmessage only\nmake all-mailheaderclean  # Build mailheaderclean only\nmake standalone           # Build all standalone binaries\nmake loadable             # Build all builtins\nmake clean                # Remove build/ directory\nsudo make install         # Install all utilities system-wide\nsudo make uninstall       # Remove all installed files\nmake help                 # Show all available targets\n```\n\nBuild artifacts are organized in `build/`:\n- `build/bin/` - Standalone executables\n- `build/lib/` - Loadable builtins (.so files)\n- `build/obj/` - Object files\n\n## How the Builtins Work\n\nThe builtins are automatically available in interactive shells via `/etc/profile.d/mail-tools.sh`:\n\n1. Sets `BASH_LOADABLES_PATH=/usr/local/lib/bash/loadables`\n2. 
Auto-loads builtins in interactive shells:\n   - `enable -f mailheader.so mailheader`\n   - `enable -f mailmessage.so mailmessage`\n   - `enable -f mailheaderclean.so mailheaderclean`\n3. Non-interactive contexts (scripts, cron) must explicitly enable them\n\nThe builtins seamlessly integrate with bash, appearing identical to native commands while providing significant performance benefits.\n\n## Bash Completions\n\nAll mail-tools utilities include intelligent bash completions for improved usability. Completions are automatically installed and become available in new bash sessions.\n\n**Features:**\n- Tab completion for command options (e.g., `-h`, `--help`, `-l`, `-n`, `-s`)\n- File and directory path completion\n- Smart option-specific suggestions:\n  - `mailgetaddresses -H \u003ctab\u003e` suggests common header names (from, to, cc, all, etc.)\n  - `mailgetaddresses -x \u003ctab\u003e` suggests common exclusion patterns (.Junk, .Trash, .Sent)\n  - `mailheaderclean \u003ctab\u003e` completes with `-l` and `-h` options\n  - `mailheaderclean-batch \u003ctab\u003e` completes with `-d`, `-m`, `-v`, `-q` options\n\n**Usage:**\n```bash\n# Type command and press TAB to see available options\nmailheaderclean -\u003cTAB\u003e\n# Shows: -l  -h  --help\n\n# Complete header options\nmailgetaddresses -H \u003cTAB\u003e\n# Shows: from to cc bcc reply-to from,to from,to,cc from,cc all\n\n# Complete exclusion patterns\nmailgetaddresses -x \u003cTAB\u003e\n# Shows: .Junk .Trash .Sent .Drafts .Spam .Junk,.Trash .Junk,.Trash,.Sent\n\n# File path completion works for all utilities\nmailheader /path/to/\u003cTAB\u003e\n# Shows available files and directories\n```\n\n**Activation:**\n- **New sessions**: Completions are automatically available after installation\n- **Current session**: Run `source /usr/local/share/bash-completion/completions/mail-tools`\n- **Verify**: Type `mailheaderclean -\u003cTAB\u003e` to test completion\n\n**Supported utilities:**\n- `mailheader` - Options: `-h`, 
`--help`\n- `mailmessage` - Options: `-h`, `--help`\n- `mailheaderclean` - Options: `-l`, `-h`, `--help`\n- `mailgetaddresses` - Options: `-n`, `-s`, `-H`, `-x`, `-h`, `--help` (with smart suggestions)\n- `mailgetheaders` - Options: `-h`, `--help`, `-V`, `--version`\n- `mailheaderclean-batch` - Options: `-d`, `-m`, `-v`, `-q`, `-V`, `--version`, `-h`, `--help`\n- `clean-email-headers` - Same as mailheaderclean-batch (symlink support)\n\n## Project Structure\n\n```\n.\n├── src/                           # Source code\n│   ├── mailheader.c                   # mailheader standalone binary\n│   ├── mailheader_loadable.c          # mailheader bash builtin\n│   ├── mailmessage.c                  # mailmessage standalone binary\n│   ├── mailmessage_loadable.c         # mailmessage bash builtin\n│   ├── mailheaderclean.c              # mailheaderclean standalone binary\n│   ├── mailheaderclean_loadable.c     # mailheaderclean bash builtin\n│   └── mailheaderclean_headers.h      # Shared header removal list (~207 headers)\n├── scripts/                       # Bash scripts\n│   ├── mailgetaddresses               # Address extraction script\n│   ├── mailgetheaders                 # Header parsing script\n│   ├── mailheaderclean-batch          # Production batch cleaning script\n│   ├── mail-tools.sh                  # Profile script for auto-loading builtins\n│   └── check-installation.sh          # Diagnostic script for verifying installation\n├── man/                           # Manual pages\n│   ├── mailheader.1\n│   ├── mailmessage.1\n│   ├── mailheaderclean.1\n│   └── mailgetaddresses.1\n├── examples/                      # Sample email files\n│   ├── test.eml\n│   └── test-bloat.eml\n├── tools/                         # Benchmarking utilities\n│   ├── benchmark.sh\n│   └── benchmark_detailed.sh\n├── build/                         # Build artifacts (generated)\n│   ├── bin/                           # Compiled binaries\n│   │   ├── mailheader\n│   │   ├── mailmessage\n│  
 │   └── mailheaderclean\n│   ├── lib/                           # Loadable builtins\n│   │   ├── mailheader.so\n│   │   ├── mailmessage.so\n│   │   └── mailheaderclean.so\n│   └── obj/                           # Object files\n│       ├── mailheader.o\n│       ├── mailmessage.o\n│       └── mailheaderclean.o\n├── tests/                         # Test suite (632 email files)\n│   ├── test_master.sh                 # Master test runner\n│   ├── test_structure.sh              # Repository structure validation\n│   ├── test_build_system.sh           # Build system validation\n│   ├── test_installation.sh           # Installation script testing\n│   ├── test_mailgetheaders.sh         # Header parsing script tests\n│   ├── test_all_mailheader.sh         # Comprehensive header tests\n│   ├── test_all_mailmessage.sh        # Comprehensive message tests\n│   ├── test_all_mailheaderclean.sh    # Comprehensive cleaning tests\n│   ├── test_*.sh                      # Additional functionality tests\n│   └── test-data/                     # 632 real email files\n├── mail-tools.bash_completions    # Bash completion definitions\n├── Makefile                       # Build system\n├── install.sh                     # Installation script\n├── README.md                      # User documentation\n└── LICENSE                        # GPL v3.0\n```\n\n## FAQ\n\n**Q: Do I need both the binaries and builtins?**\nA: The installation script installs both by default. Use binaries for general scripts and builtins for high-performance batch processing.\n\n**Q: Will this work with my maildir?**\nA: Yes, these tools work with any RFC 822 compliant email format, including Maildir, mbox, and individual .eml files.\n\n**Q: How do I customize which headers to remove?**\nA: Use the `MAILHEADERCLEAN_*` environment variables to control header filtering. 
See examples above.\n\n**Q: Are timestamps preserved when cleaning emails?**\nA: Yes, when using the `mailheaderclean-batch` script (or the backwards-compatible `clean-email-headers` symlink). For manual operations, use `touch -r` to copy the original file's timestamps onto the cleaned file.\n\n**Q: Can I use this in production?**\nA: Yes, all utilities are production-ready with comprehensive testing and error handling. The test suite validates against 632 real-world emails.\n\n## License\n\nGNU General Public License v3.0 or later. See the LICENSE file for details.\n\n## Contributing\n\nIssues and pull requests are welcome at [github.com/Open-Technology-Foundation/mail-tools](https://github.com/Open-Technology-Foundation/mail-tools)\n\n## Credits\n\nDeveloped by the Open Technology Foundation for efficient email processing in bash environments.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopen-technology-foundation%2Fmail-tools","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fopen-technology-foundation%2Fmail-tools","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopen-technology-foundation%2Fmail-tools/lists"}