https://github.com/bookshelf-writer/puremail
A blazing‑fast, zero‑allocation Go package for strict e‑mail parsing, tag trimming and binary serialisation
email golang gomod normalize parser valid zero-alloc
- Host: GitHub
- URL: https://github.com/bookshelf-writer/puremail
- Owner: Bookshelf-Writer
- License: apache-2.0
- Created: 2025-07-24T22:59:49.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-07-25T00:50:46.000Z (6 months ago)
- Last Synced: 2025-07-25T05:15:14.255Z (6 months ago)
- Topics: email, golang, gomod, normalize, parser, valid, zero-alloc
- Language: Go
- Homepage:
- Size: 43.9 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md
[Go Report Card](https://goreportcard.com/report/github.com/Bookshelf-Writer/puremail)



# puremail
A **zero‑allocation**, high‑throughput Go library for *strict* e‑mail parsing, tag trimming, binary
serialisation and DNS‑MX probing.
The parser normalises case, removes disposable `+` / `=` tags **before** validation, caches its own
results and lets you hash or encode an address in a single line.
> **Focus:** production back‑ends that need predictable latency and memory
> footprint. Exhaustive RFC‑5322 edge‑cases are intentionally ignored.
---
## Features
| ✔ | Description |
|-------------------------------------|-----------------------------------------------------------------------------|
| **Prefix trimming** | `bob+promo=gophers@gmail.com` → `bob@gmail.com` (prefixes kept internally). |
| **RFC‑ish validation** | Login & domain checked against a pragmatic subset of the RFC. |
| **Parser cache** | Same address parsed only once thanks to `singleflight`; toggle via config. |
| **MX probing with smart cache** | `HasMX()` uses a sharded, TTL‑aware cache with concurrency limits. |
| **CRC‑protected bytes** | `Bytes()` / `Decode()` round‑trip with CRC‑32 guard. |
| **BLAKE2b‑160 hashes** | `Hash()` (login+domain) & `HashFull()` (including prefixes). |
| **100 % allocation‑free fast path** | All hot methods avoid heap use. |
| **Fuzz‑tested & benchmarked** | More than 500k parses per second on a single core (see `go test -bench .`). |
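
To make the prefix-trimming row concrete, here is a minimal standard-library sketch of the idea. It is not puremail's implementation: `splitTags` is a hypothetical helper, it allocates (unlike the library's advertised fast path), and it performs no RFC validation.

```go
package main

import (
	"fmt"
	"strings"
)

// splitTags is a hypothetical helper, not part of puremail.
// It lower-cases the address, cuts the local part at the first '+' or '=',
// and returns the canonical address plus the tags with their delimiters.
func splitTags(addr string) (canonical string, tags []string, ok bool) {
	at := strings.LastIndexByte(addr, '@')
	if at <= 0 || at == len(addr)-1 {
		return "", nil, false // missing '@', empty local part or empty domain
	}
	local, domain := strings.ToLower(addr[:at]), strings.ToLower(addr[at+1:])

	first := strings.IndexAny(local, "+=")
	if first < 0 {
		return local + "@" + domain, nil, true // nothing to trim
	}
	login, rest := local[:first], local[first:]
	if login == "" {
		return "", nil, false // address would consist of tags only
	}
	// Walk the remainder, closing a tag at every delimiter and at the end.
	start := 0
	for i := 1; i <= len(rest); i++ {
		if i == len(rest) || rest[i] == '+' || rest[i] == '=' {
			tags = append(tags, rest[start:i])
			start = i
		}
	}
	return login + "@" + domain, tags, true
}

func main() {
	canonical, tags, ok := splitTags("bob+promo=gophers@gmail.com")
	fmt.Println(canonical, tags, ok) // bob@gmail.com [+promo =gophers] true
}
```
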
---
## Installation
```bash
go get github.com/Bookshelf-Writer/puremail
```
---
## Quick start
```go
package main

import (
	"fmt"
	"log"

	"github.com/Bookshelf-Writer/puremail"
)

func main() {
	// Initialize with default configuration
	puremail.InitDefault()

	addr, err := puremail.New("Alice+dev=go@example.io")
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(addr.Mail())        // alice@example.io
	fmt.Println(addr.MailFull())    // alice+dev=go@example.io
	fmt.Printf("%x\n", addr.Hash()) // 20‑byte BLAKE2b‑160
}
```
The package can be configured with a custom `ConfigObj`:
```go
package main

import (
	"context"
	"time"

	"github.com/Bookshelf-Writer/puremail"
)

func main() {
	config := puremail.ConfigObj{
		NoCache: false,
		MX: puremail.ConfigMxObj{
			TllPos:                   12 * time.Hour,
			TllNeg:                   30 * time.Minute,
			RefreshAhead:             20 * time.Minute,
			TimeoutDns:               500 * time.Millisecond,
			TimeoutDnsBurst:          3 * time.Second,
			TimeoutRefresh:           60 * time.Second,
			ShardAbs:                 8,
			ShardMaxSize:             20_000,
			ConcurrencyLimitLookupMX: 500,
		},
		Ctx: context.Background(),
	}
	puremail.Init(config)
	// Use the package functions...
}
```
---
## Configuration (`ConfigObj`)
| Field | Type | Purpose / default |
|-------------|-------------------|------------------------------------------------------------------------------|
| **NoCache** | `bool` | `true` disables the internal *singleflight* cache used by `New` / `NewFast`. |
| **MX** | `ConfigMxObj` | Nested object that tunes the MX resolver cache (see below). |
| **Ctx** | `context.Context` | Root context for background goroutines. Defaults to `context.Background()`. |
### `ConfigMxObj`
| Field | Default | What it does |
|----------------------------|----------|--------------------------------------------------------------------|
| `TllPos` | `6h` | TTL for *positive* MX answers. |
| `TllNeg` | `15m` | TTL for *negative* answers (NXDOMAIN / no records). |
| `RefreshAhead` | `10m` | How long **before** TTL expiry an entry becomes eligible for asynchronous refresh. |
| `TimeoutDns` | `400ms` | Hard limit for a single DNS lookup. |
| `TimeoutDnsBurst` | `2s` | Upper bound when many lookups queue at once. |
| `TimeoutRefresh` | `90s` | How often the cleaner scans & evicts expired items. |
| `ShardAbs` | `4` | log₂ of the shard count: the default `4` gives `2⁴ = 16` shards. Valid range: 1 to 31. |
| `ShardMaxSize` | `10 000` | Max entries per shard (oldest entries are dropped first). |
| `ConcurrencyLimitLookupMX` | `250` | Global semaphore guarding parallel DNS queries. |
> Call `puremail.Init(&cfg)` once at program start.
> If you never call `Init`, the package behaves as if `puremail.InitDefault()` had been called.
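
The relationship between `ShardAbs` and the shard count can be sketched with the standard library alone. The README only states that the shard lookup uses a CRC‑32 hash; masking the checksum with `2^ShardAbs - 1` is an assumption made for illustration, and `shardIndex` is a hypothetical name, not a puremail function.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// shardIndex maps a domain to one of 2^shardAbs shards by masking its
// CRC-32 (IEEE) checksum. Hypothetical sketch; the real selection logic
// inside puremail may differ.
func shardIndex(domain string, shardAbs uint) uint32 {
	mask := uint32(1)<<shardAbs - 1 // e.g. shardAbs = 4 gives mask 0b1111
	return crc32.ChecksumIEEE([]byte(domain)) & mask
}

func main() {
	const shardAbs = 4
	fmt.Println("shards:", 1<<shardAbs) // shards: 16
	fmt.Println("index in range:", shardIndex("gmail.com", shardAbs) < 1<<shardAbs)
}
```

Because the shard count is a power of two, the mask replaces a modulo operation, which is one plausible reason for configuring the exponent rather than the count.
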
---
## Constructors
| Function | Behaviour |
|---------------------|---------------------------------------------------------|
| `New(s string)` | Validates and **trims prefixes** (`+`, `=`). |
| `NewFast(s string)` | Same validation, but `+` / `=` prefixes are not split off (faster). |
---
## API reference
### `EmailObj` methods
| Method | Returns | Comment |
|--------------|--------------------|------------------------------------------------------------|
| `Login()` | `string` | Local part without prefixes. |
| `Domain()` | `string` | Domain in lower‑case. |
| `Prefixes()` | `[]EmailPrefixObj` | Slice of preserved prefixes. |
| `Mail()` | `string` | Canonical `login@domain`. |
| `MailFull()` | `string` | Original address with prefixes. |
| `String()` | `string` | Debug representation. |
| `Bytes()` | `[]byte` | Binary payload + CRC‑32. |
| `Hash()` | `[20]byte` | BLAKE2b‑160 of login+domain. |
| `HashFull()` | `[20]byte` | Same, but includes prefixes. |
| `HasMX()` | `error` | `nil` if at least one MX exists. Cached, concurrency‑safe. |
### `EmailPrefixObj`
| Method | Purpose |
|------------|---------------------------------|
| `String()` | Original text (`"dev"`). |
| `Prefix()` | Delimiter char (`'+'` / `'='`). |
### Stand‑alone helpers
| Function | Use case |
|------------------|------------------------------------------|
| `Decode([]byte)` | Recreate `EmailObj` from `Bytes()` blob. |
---
## Usage examples
```go
addr, _ := puremail.New("bob+promo=gophers@gmail.com")
// 1. Basic fields
fmt.Println(addr.Login()) // bob
fmt.Println(addr.Domain()) // gmail.com
fmt.Println(addr.Mail()) // bob@gmail.com
fmt.Println(addr.MailFull()) // bob+promo=gophers@gmail.com
fmt.Println(addr.String()) // [ 'bob@gmail.com', ['+promo', '=gophers'] ]
// 2. Prefix enumeration
for _, p := range addr.Prefixes() {
	fmt.Printf("tag %c = %s\n", p.Prefix(), p.String())
}
// 3. Hashes
fmt.Printf("stable hash : %x\n", addr.Hash())
fmt.Printf("hash w/tags : %x\n", addr.HashFull())
// 4. Binary round‑trip
blob := addr.Bytes()
back, _ := puremail.Decode(blob)
fmt.Println(back.Mail()) // bob@gmail.com
// 5. MX check (cached)
if err := addr.HasMX(); err != nil {
	log.Printf("domain has no MX: %v", err)
}
// 6. NewFast: keep prefixes
fast, _ := puremail.NewFast("bob+promo=gophers@gmail.com")
fmt.Println(fast.MailFull()) // unchanged
```
---
## Encoding / decoding in detail
```go
e, _ := puremail.New("alice+dev=go@example.io")
payload := e.Bytes() // safe to store in Redis or pass over the wire
again, err := puremail.Decode(payload)
if err != nil {
	panic(err)
}
```
The format is:
```
[ ... ]
```
Any corruption (or truncated payload) is caught by the CRC check.
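
As an illustration of how a CRC‑32 guard works in general, the sketch below appends a checksum to a payload and verifies it on read. The layout used here (big-endian CRC‑32/IEEE trailer) and the names `seal` / `open` are assumptions for the example; they do not describe puremail's actual wire format.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

var errCorrupt = errors.New("payload corrupt or truncated")

// seal returns a copy of payload followed by its CRC-32 (IEEE), big-endian.
func seal(payload []byte) []byte {
	out := make([]byte, len(payload)+4)
	copy(out, payload)
	binary.BigEndian.PutUint32(out[len(payload):], crc32.ChecksumIEEE(payload))
	return out
}

// open verifies the trailing checksum and returns the payload without it.
func open(blob []byte) ([]byte, error) {
	if len(blob) < 4 {
		return nil, errCorrupt // too short to even hold the checksum
	}
	body := blob[:len(blob)-4]
	sum := binary.BigEndian.Uint32(blob[len(blob)-4:])
	if crc32.ChecksumIEEE(body) != sum {
		return nil, errCorrupt
	}
	return body, nil
}

func main() {
	blob := seal([]byte("alice@example.io"))
	body, err := open(blob)
	fmt.Println(string(body), err) // alice@example.io <nil>

	blob[0] ^= 0xFF // flip one byte to simulate corruption
	_, err = open(blob)
	fmt.Println(err) // payload corrupt or truncated
}
```
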
---
## MX cache life‑cycle
```
┌─parse.HasMX()──────────────────┐
│ shard lookup (CRC‑32 hash) │ ← constant‑time
│ ├─ fresh? → return │
│ └─ group.Do(domain, dnsQuery) │ ← singleflight + semaphore
└────────────────────────────────┘
```
* Positive TTL (`TllPos`) and negative TTL (`TllNeg`) are fully configurable.
* A background goroutine prunes expired entries every `TimeoutRefresh`.
* Cache size is bounded per shard; oldest keys are dropped.
---
## Limitations
* ASCII input only; supply punycode yourself (`пример.укр` → `xn--e1afmkfd.xn--j1amh`).
* No quoted‑local‑part, comments or IP‑literals.
* Max total length **254 bytes**.
* `HasMX()` issues network DNS lookups (honours context cancellation).
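
Two of these limits are cheap to check before handing input to the parser. The sketch below is a hypothetical pre-check (`precheck` is not a puremail function) covering only the ASCII and 254-byte rules; converting an internationalised domain to punycode still has to happen elsewhere.

```go
package main

import "fmt"

const maxAddrLen = 254 // total length limit stated above, in bytes

// precheck rejects input that is too long or contains non-ASCII bytes.
// It is a caller-side convenience sketch, not a replacement for parsing.
func precheck(addr string) error {
	if len(addr) > maxAddrLen {
		return fmt.Errorf("address is %d bytes, limit is %d", len(addr), maxAddrLen)
	}
	for i := 0; i < len(addr); i++ {
		if addr[i] >= 0x80 {
			return fmt.Errorf("non-ASCII byte at offset %d", i)
		}
	}
	return nil
}

func main() {
	fmt.Println(precheck("alice@example.io")) // <nil>
	fmt.Println(precheck("user@пример.укр"))  // non-ASCII byte at offset 5
}
```
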
---
### Mirrors
- https://git.bookshelf-writer.fun/Bookshelf-Writer/puremail