{"id":32861909,"url":"https://github.com/ehamiter/mbox2db","last_synced_at":"2026-05-15T08:03:11.848Z","repository":{"id":322450115,"uuid":"1089136589","full_name":"ehamiter/mbox2db","owner":"ehamiter","description":"Convert a Gmail (.mbox) export into a SQLite db","archived":false,"fork":false,"pushed_at":"2025-11-04T13:41:35.000Z","size":9,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-11-04T15:21:25.203Z","etag":null,"topics":["gmail","mbox","rust","sqlite"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ehamiter.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-11-03T23:50:54.000Z","updated_at":"2025-11-04T13:44:15.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/ehamiter/mbox2db","commit_stats":null,"previous_names":["ehamiter/mbox2db"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/ehamiter/mbox2db","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ehamiter%2Fmbox2db","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ehamiter%2Fmbox2db/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ehamiter%2Fmbox2db/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ehamiter%2Fmbox2db/manifests","owner_url":"https:
//repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ehamiter","download_url":"https://codeload.github.com/ehamiter/mbox2db/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ehamiter%2Fmbox2db/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":283424652,"owners_count":26833720,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-11-08T02:00:06.281Z","response_time":57,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gmail","mbox","rust","sqlite"],"created_at":"2025-11-08T22:01:01.129Z","updated_at":"2025-11-08T22:02:01.009Z","avatar_url":"https://github.com/ehamiter.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# mbox2db\n\nA fast, simple Rust-based tool to convert large mbox email archives into optimized SQLite databases. 
Built for handling gigabyte-sized Gmail exports with maximum performance.\n\n## Installation\n\n```bash\ncargo install mbox2db\n```\n\n## Quick Start\n\n```bash\n# Convert mbox to SQLite (excludes Spam/Trash by default)\nmbox2db all-mail.mbox\n\n# Output: 2025-11-04-emails.db (in current directory)\n```\n\n## Basic SQL Queries\n\n```sql\n-- Count all emails\nSELECT COUNT(*) FROM emails;\n\n-- Get most recent emails\nSELECT subject, from_addr, date_parsed\nFROM emails\nORDER BY date_parsed DESC\nLIMIT 10;\n\n-- Search subject lines\nSELECT subject, date_parsed, from_addr\nFROM emails\nWHERE subject LIKE '%keyword%'\nORDER BY date_parsed DESC;\n\n-- Count emails by year\nSELECT strftime('%Y', date_parsed) as year, COUNT(*)\nFROM emails\nWHERE date_parsed IS NOT NULL\nGROUP BY year\nORDER BY year;\n```\n\n## Usage Options\n\n```\nmbox2db [OPTIONS] \u003cINPUT\u003e\n\nArguments:\n  \u003cINPUT\u003e  Input mbox file path\n\nOptions:\n  -o, --output \u003cOUTPUT\u003e              Custom output database path\n  -d, --destructive                  Overwrite existing database instead of auto-incrementing\n      --include-spam                 Include emails marked as Spam\n      --include-trash                Include emails marked as Trash\n      --include-spam-and-trash       Include both Spam and Trash emails\n  -h, --help                         Print help\n```\n\n## How to Export Gmail to mbox\n\n1. Go to [Google Takeout](https://takeout.google.com/)\n2. Deselect all products, then select **Mail**\n3. Click \"All Mail data included\" and select specific labels if desired\n4. Choose \"Export once\" and \"Send download link via email\"\n5. Select file format: `.zip` or `.tgz`\n6. Click \"Create export\"\n7. 
Download and extract the `.mbox` file\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eTechnical Details\u003c/b\u003e\u003c/summary\u003e\n\n## Features\n\n- **Lightning Fast**: Single-transaction writes with optimized SQLite settings (WAL mode, memory mapping, large cache)\n- **Smart Filtering**: Automatically excludes Spam and Trash by default (configurable)\n- **Auto-Incrementing Filenames**: Creates dated databases (e.g., `2025-11-03-emails.db`) that auto-increment to avoid overwriting\n- **Robust Date Parsing**: Handles 20+ malformed date formats commonly found in email archives\n- **Progress Indicator**: Modern spinner shows real-time progress and skipped email counts\n- **Full-Text Search Ready**: Creates indexes on common fields for instant queries\n\n## Building from Source\n\n```bash\n# Build release binary\ncargo build --release\n\n# Binary will be at ./target/release/mbox2db\n```\n\n## Examples\n\n### Basic Conversion (Default Behavior)\n\n```bash\n# Filters out Spam/Trash, creates dated output file\nmbox2db all-mail.mbox\n# Output: 2025-11-04-emails.db\n\n# Running again on the same day creates incremented file\nmbox2db all-mail.mbox\n# Output: 2025-11-04-emails-0001.db\n```\n\n### Include Spam/Trash\n\n```bash\n# Include spam emails only\nmbox2db all-mail.mbox --include-spam\n\n# Include trash emails only\nmbox2db all-mail.mbox --include-trash\n\n# Include both spam and trash\nmbox2db all-mail.mbox --include-spam-and-trash\n```\n\n### Custom Output Path\n\n```bash\n# Specify custom output location\nmbox2db all-mail.mbox -o ~/Documents/my-emails.db\n\n# Overwrite existing file (destructive mode)\nmbox2db all-mail.mbox -d -o emails.db\n```\n\n## Database Schema\n\n```sql\nCREATE TABLE emails (\n    id INTEGER PRIMARY KEY AUTOINCREMENT,\n    from_addr TEXT,\n    to_addr TEXT,\n    cc TEXT,\n    bcc TEXT,\n    subject TEXT,\n    date TEXT,              -- Original email date header\n    date_parsed TEXT,       -- Parsed datetime in SQLite format 
(YYYY-MM-DD HH:MM:SS)\n    message_id TEXT,\n    in_reply_to TEXT,\n    refs TEXT,              -- \"references\" header\n    content_type TEXT,\n    body_plain TEXT,\n    body_html TEXT\n);\n\n-- Indexes for fast queries\nCREATE INDEX idx_from ON emails(from_addr);\nCREATE INDEX idx_date ON emails(date);\nCREATE INDEX idx_date_parsed ON emails(date_parsed);\nCREATE INDEX idx_subject ON emails(subject);\n```\n\n## More SQL Query Examples\n\n### Search by Date\n\n```sql\n-- Get emails from 2025\nSELECT * FROM emails \nWHERE date_parsed LIKE '2025%'\nORDER BY date_parsed DESC;\n\n-- Get emails from a date range (half-open, so all of 2020-12-31 is included;\n-- BETWEEN ... AND '2020-12-31' would drop that day because date_parsed carries a time)\nSELECT subject, date_parsed, from_addr \nFROM emails \nWHERE date_parsed \u003e= '2020-01-01' AND date_parsed \u003c '2021-01-01'\nORDER BY date_parsed DESC;\n\n-- Count emails from specific sender\nSELECT COUNT(*) FROM emails WHERE from_addr LIKE '%user@example.com%';\n```\n\n### Full-Text Search\n\n```sql\n-- Search email body (substring match with LIKE; the body columns are not indexed)\nSELECT subject, from_addr, date_parsed \nFROM emails \nWHERE body_plain LIKE '%search term%' \n   OR body_html LIKE '%search term%'\nORDER BY date_parsed DESC;\n```\n\n### Email Threads\n\n```sql\n-- Find direct replies to a message via in_reply_to\nSELECT * FROM emails \nWHERE in_reply_to = '\u003csome-message-id\u003e'\nORDER BY date_parsed;\n```\n\n## Performance Notes\n\n- **Optimized SQLite Settings**:\n  - WAL (Write-Ahead Logging) mode for better concurrency\n  - NORMAL synchronous mode for fast writes\n  - 64MB cache size\n  - 30GB memory mapping\n  - Single transaction for all inserts (~10-100x faster)\n  \n- **Handles Large Files**: Tested with multi-GB mbox files containing 80,000+ emails\n\n- **Date Parsing**: Handles malformed dates including:\n  - Double-dash timezones (`--0400`)\n  - Single-digit time components (`9:47:11`)\n  - Two-digit years (`Jun 09`)\n  - Named timezones (`Eastern Daylight Time`, `GMT-0700`)\n  - Various date formats (`7/19/2005 8:11:52 AM`)\n\n\u003c/details\u003e\n\n## License\n\nMIT\n\n## Author\n\nEric 
Hamiter\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fehamiter%2Fmbox2db","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fehamiter%2Fmbox2db","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fehamiter%2Fmbox2db/lists"}