Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/willdurand/EmailReplyParser

PHP library for parsing plain text email content.
https://github.com/willdurand/EmailReplyParser

email parser php

Last synced: 3 months ago
JSON representation

PHP library for parsing plain text email content.

Awesome Lists containing this project

README

        

# EmailReplyParser

[![GitHub Actions](https://github.com/willdurand/EmailReplyParser/workflows/ci/badge.svg)](https://github.com/willdurand/EmailReplyParser/actions?query=workflow%3A%22ci%22+branch%3Amaster)
[![Total
Downloads](https://poser.pugx.org/willdurand/email-reply-parser/downloads.png)](https://packagist.org/packages/willdurand/email-reply-parser)
[![Latest Stable
Version](https://poser.pugx.org/willdurand/email-reply-parser/v/stable.png)](https://packagist.org/packages/willdurand/email-reply-parser)

**EmailReplyParser** is a PHP library for parsing plain text email content,
based on GitHub's [email_reply_parser](http://github.com/github/email_reply_parser)
library written in Ruby.

## Installation

The recommended way to install EmailReplyParser is through
[Composer](http://getcomposer.org/):

```shell
composer require willdurand/email-reply-parser
```

## Usage

Instantiate an `EmailParser` object and parse your email:

```php
parse($emailContent);
```

You get an `Email` object that contains a set of `Fragment` objects. The `Email`
class exposes two methods:

- `getFragments()`: returns all fragments;
- `getVisibleText()`: returns a string which represents the content considered
as "visible".

The `Fragment` represents a part of the full email content, and has the
following API:

```php
getFragments());

$fragment->getContent();

$fragment->isSignature();

$fragment->isQuoted();

$fragment->isHidden();

$fragment->isEmpty();
```

Alternatively, you can rely on the `EmailReplyParser` to either parse an email
or get its visible content in a single line of code:

```php
$email = \EmailReplyParser\EmailReplyParser::read($emailContent);

$visibleText = \EmailReplyParser\EmailReplyParser::parseReply($emailContent);
```

## Known Issues

### Quoted Headers

Quoted headers aren't picked up if there's an extra line break:

On , wrote:

> blah

Also, they're not picked up if the email client breaks it up into
multiple lines. GMail breaks up any lines over 80 characters for you.

On ,
wrote:
> blah

The above `On ....wrote:` can be cleaned up with the following regex:

```php
$fragment_without_date_author = preg_replace(
'/\nOn(.*?)wrote:(.*?)$/si',
"",
$fragment->getContent()
);
```

Note though that we're search for "on" and "wrote". Therefore, it won't work
with other languages.

Possible solution: Remove "[email protected]" lines...

### Weird Signatures

Lines starting with `-` or `_` sometimes mark the beginning of
signatures:

Hello

--
Rick

Not everyone follows this convention:

Hello

Mr Rick Olson
Galactic President Superstar Mc Awesomeville
GitHub

**********************DISCLAIMER***********************************
* Note: blah blah blah *
**********************DISCLAIMER***********************************

### Strange Quoting

Apparently, prefixing lines with `>` isn't universal either:

Hello

--
Rick

________________________________________
From: Bob [[email protected]]
Sent: Monday, March 14, 2011 6:16 PM
To: Rick

## Unit Tests

Setup the test suite using Composer:

$ composer install

Run it using PHPUnit:

$ ./vendor/bin/simple-phpunit

## Contributing

See CONTRIBUTING file.

## Credits

- GitHub
- William Durand

## License

EmailReplyParser is released under the MIT License. See the bundled LICENSE
file for details.