https://github.com/depayfi/mac-mails-parse-export-csv-json
Parse & extract Mail app (Mac OS) emails & save to structured CSV or JSON
apple-mail macos-mail mail mail-extractor mail-parser mail-to-csv mail-to-json
- Host: GitHub
- URL: https://github.com/depayfi/mac-mails-parse-export-csv-json
- Owner: DePayFi
- Created: 2022-08-29T11:49:51.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2022-09-19T11:49:20.000Z (over 3 years ago)
- Last Synced: 2025-09-18T14:53:08.169Z (9 months ago)
- Topics: apple-mail, macos-mail, mail, mail-extractor, mail-parser, mail-to-csv, mail-to-json
- Language: Python
- Homepage:
- Size: 566 KB
- Stars: 4
- Watchers: 3
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
README
# Mac OS Mail - Email Extractor
## Mail App Emails to CSV or JSON
**What it (currently) does:**
Email Export from Apple's Mail App on Mac OS -> Plaintext input -> Parse/Extract -> Save to CSV or JSON
*Sometimes it is handy to get structured data from a mailbox in Mail on Mac OS. This is a quick (still hacky) script that parses a plaintext email export from the Mail app and returns structured CSV or JSON.*
## Features
- **TXT mode**: Parse manually exported emails from a txt file.
- **EMLX mode**: Parse emlx files created by the Mail app.
- Export formats & versions per script execution
  - Export versions
    - "plain": one row per email; multiple recipient addresses are comma-separated
    - "exploded": recipient addresses are split, so each row contains only one receiving address (all other fields are duplicated)
  - Formats (for each version)
    - CSV
    - JSON
- Custom columns based on regex extraction
  - Custom regex extraction schemes can be defined in `config.py`.
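
To illustrate the idea of regex-based custom columns, here is a minimal sketch. The `CUSTOM_COLUMNS` mapping and the `extract_custom_columns` helper are hypothetical names chosen for this example; the structure actually expected in `config.py` may differ.

```python
import re

# Hypothetical scheme: column name -> regex with exactly one capture group.
# The real layout used in config.py may look different.
CUSTOM_COLUMNS = {
    "order_id": r"Order\s+#(\d+)",
    "invoice_total": r"Total:\s*([\d.]+)",
}


def extract_custom_columns(body, schemes=CUSTOM_COLUMNS):
    """Return one value per custom column; empty string if nothing matches."""
    row = {}
    for column, pattern in schemes.items():
        match = re.search(pattern, body)
        row[column] = match.group(1) if match else ""
    return row
```

Each extracted value would then be appended to the row of the email it was found in, next to the standard columns.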

## Input (text file mode)
1. Select the emails you want to parse in Mac OS Mail (hold Shift to select multiple, or press CMD + A to select all emails in the current mailbox).
2. File > Save As > Select Format: Plain Text > Save to the script directory and rename the file to "input.txt".
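
The following is a rough sketch of how such a plaintext export could be split into records. It assumes that every email in the export starts with a `From:` line followed by further header lines and a blank line before the body; that layout is an assumption, and a quoted `From:` line inside a body would be misread as a new email. `parse_plaintext_export` is a hypothetical helper, not the function used by the script.

```python
import re
from email.utils import parseaddr

HEADER_RE = re.compile(r"^(From|Subject|Date|To|Reply-To):\s*(.*)$")


def parse_plaintext_export(text):
    """Split an export into dicts, assuming each mail opens with 'From:'."""
    mails = []
    current = None
    in_headers = False
    for line in text.splitlines():
        match = HEADER_RE.match(line)
        if match and match.group(1) == "From":
            # Assumption: a "From:" line always opens a new mail.
            current = {"body": ""}
            mails.append(current)
            in_headers = True
        if current is None:
            continue  # skip anything before the first mail
        if in_headers and match:
            key = match.group(1).lower().replace("-", "_")
            current[key] = match.group(2).strip()
        elif in_headers and not line.strip():
            in_headers = False  # a blank line ends the header block
        else:
            in_headers = False
            current["body"] += line + "\n"
    for mail in mails:
        # Split "Name <address>" into separate name and address fields.
        name, address = parseaddr(mail.get("from", ""))
        mail["from_name"], mail["from_mail"] = name, address
        mail["body"] = mail["body"].strip()
    return mails
```

Calling `parse_plaintext_export(open("input.txt", encoding="utf-8").read())` would return one dict per email, which can then be written out as rows.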
## Input (emlx parser mode)
Edit `config.py` and set the file path of the mailbox you want to extract email data from.
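
For orientation, an `.emlx` file starts with a line holding the byte length of the message, followed by the raw message and a trailing property list. The sketch below reads the message part with the standard library; `parse_emlx` is a hypothetical helper for illustration and not necessarily how the script does it.

```python
import email
from email import policy


def parse_emlx(raw):
    """Parse the message part of an .emlx file from its raw bytes."""
    first_line, _, rest = raw.partition(b"\n")
    message_length = int(first_line.strip())  # byte count of the message
    # Everything after the message is Apple's plist metadata; ignore it here.
    return email.message_from_bytes(rest[:message_length], policy=policy.default)
```

Given the bytes of one file (for example via `pathlib.Path(some_path).read_bytes()`), the returned message exposes headers such as `msg["subject"]` and `msg["message-id"]`.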
## Run
> Tested with Python 3.10.6
```
pip install -r requirements.txt
```
1. edit config.py
2. run either in **txt file mode:**
```
# rename your txt file to input.txt, place it in the script dir, then execute:
python3 extract_txt.py
```
or in **emlx parser mode:**
```
# don't forget to edit config.py first, then execute:
python3 extract_txt.py
```
If you want to use a CLI with arguments (implementation in progress):
```
python3 mac-os-extract-mails.py
[Options]
-e      : run emlx parser
-t      : run txt file parser
-c      : open config file for editing
-sinput : select input file for txt file parser
-sodir  : select folder for outputs
```
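
Since the CLI is still in progress, the following is only a sketch of how the options listed above could be declared with `argparse`. The `dest` names, the mutually exclusive group and the default for `-sodir` are assumptions, not the script's actual implementation.

```python
import argparse


def build_parser():
    """Declare the options listed above (names and defaults are assumed)."""
    parser = argparse.ArgumentParser(prog="mac-os-extract-mails.py")
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("-e", dest="emlx", action="store_true",
                      help="run emlx parser")
    mode.add_argument("-t", dest="txt", action="store_true",
                      help="run txt file parser")
    parser.add_argument("-c", dest="edit_config", action="store_true",
                        help="open config file for editing")
    parser.add_argument("-sinput", default="input.txt",
                        help="input file for txt file parser")
    parser.add_argument("-sodir", default=".",
                        help="folder for outputs")
    return parser
```

For example, `build_parser().parse_args(["-t", "-sinput", "mails.txt"])` selects the txt parser with a custom input file.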
## Outputs

## Output Columns
```
- "date_utc"
- "date_iso"
- "date"
- "from"
- "from_name"
- "from_mail"
- "subject"
- "to"
- "to_name"
- "to_mail"
- "reply_to"
- "reply_to_name"
- "reply_to_mail"
- "body"
- "x-universally-unique-identifier"
- "message-id"
- "mime-version"
- "content-type"
```
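
To show how the "plain" and "exploded" versions relate to these columns, here is a minimal sketch. It assumes rows are dicts keyed by the column names above and that multiple recipients in `to_mail` are comma-separated; `explode_rows`, `to_csv` and `to_json` are hypothetical helpers, not the script's own functions.

```python
import csv
import io
import json


def explode_rows(rows):
    """Return one row per recipient; all other fields are duplicated."""
    exploded = []
    for row in rows:
        recipients = [a.strip() for a in row.get("to_mail", "").split(",") if a.strip()]
        # Keep a single row with an empty address if there is no recipient.
        for address in recipients or [""]:
            exploded.append({**row, "to_mail": address})
    return exploded


def to_csv(rows, columns):
    """Serialize rows to CSV text, keeping only the given columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialize rows to a JSON array of objects."""
    return json.dumps(rows, indent=2, ensure_ascii=False)
```

An email sent to two recipients yields one "plain" row and two "exploded" rows, and either list can be passed to `to_csv` or `to_json`.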
## Todo
- [x] Plaintext mode
- [x] Emlx mode
- [x] Explode outputs: Split multiple recipient email addresses into separate data rows (1 email = 1..* rows). Currently: 1 row = 1 email, recipient email addresses are comma-separated
- [x] Custom columns from RegEx outputs
- [x] Export json
- [ ] Command line tool
- [ ] EMLX mode: Add caching
- [ ] EMLX mode: Add mailbox folder picker (tkinter?)
- [ ] TXT mode: Add file picker for input file (+cli path option)
- [ ] Option: Create Google Sheet from Outputs
- [ ] Option: Setting utc min-date
- [ ] Option: Ignore Signatures
- [ ] Option: Traverse all mboxes if no mailbox is defined for a mail account folder