Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/fnando/email_data

This project is a compilation of datasets related to emails. Includes disposable emails, disposable domains, and free email services.
https://github.com/fnando/email_data

Last synced: 11 days ago
JSON representation

This project is a compilation of datasets related to emails. Includes disposable emails, disposable domains, and free email services.

Awesome Lists containing this project

README

        

# email_data

[![Tests](https://github.com/fnando/email_data/workflows/ruby-tests/badge.svg)](https://github.com/fnando/email_data)
[![Code Climate](https://codeclimate.com/github/fnando/email_data/badges/gpa.svg)](https://codeclimate.com/github/fnando/email_data)
[![Gem version](https://img.shields.io/gem/v/email_data.svg?label=Gem%20version)](https://rubygems.org/gems/email_data)
[![Gem downloads](https://img.shields.io/gem/dt/email_data.svg?label=Gem%20downloads)](https://rubygems.org/gems/email_data)
[![NPM version](https://img.shields.io/npm/v/%40fnando%2Femail_data.svg?label=NPM%20version)](https://npmjs.org/package/@fnando/email_data)
[![NPM downloads](https://img.shields.io/npm/dt/%40fnando%2Femail_data.svg?label=NPM%20downloads)](https://npmjs.org/package/@fnando/email_data)

This project is a compilation of datasets related to emails.

- Disposable emails
- Disposable domains
- Free email services
- Roles (e.g. info, spam, etc)

The data is compiled from several different sources, so thank you all for making
this data available.

## Ruby

### Installation

```bash
gem install email_data
```

Or add the following line to your project's Gemfile:

```ruby
gem "email_data"
```

### Usage

```ruby
require "email_data"

# instance pointing to the data directory.
EmailData.data_dir

# List of disposable domains. Punycode is expanded into ASCII domains.
EmailData.disposable_domains
EmailData.disposable_domains_with_mx
EmailData.disposable_domains_without_mx

# List of disposable emails. Some services use free email like Gmail to create
# disposable emails.
EmailData.disposable_emails

# List of free email services.
EmailData.free_email_domains

# List of roles. Can be used to filter out emails like info@ or all@.
EmailData.roles

# List of private relays like Apple's Hide My Email.
EmailData.private_relays

# List of country tlds.
EmailData.country_tlds

# List of official tlds.
EmailData.tlds

# List of second-level domains.
EmailData.slds

# A list of DNSBL providers.
EmailData.dnsbls
```

#### Data sources

By default, Ruby will load data from filesystem. You may want to load this data
from the database instead. `email-data` has support for ActiveRecord out of the
box. To use the ActiveRecord adapter, you must load
`email_data/source/active_record.rb`. You can easily do it so with Bundler's
`require` key.

```ruby
gem "email_data", require: "email_data/source/active_record"
```

Then, you need to assign the new data source.

```ruby
EmailData.source = EmailData::Source::ActiveRecord
```

If you need to configure a different database connection than the one defined by
`ActiveRecord::Base`, use `EmailData::Source::ActiveRecord::ApplicationRecord`
for that.

##### Creating the tables

To create the tables, use the migration code below (tweak it accordingly if you
use something different than PostgreSQL, or don't want to use `citext`).

```ruby
class SetupEmailData < ActiveRecord::Migration[6.1]
def change
enable_extension "citext"

create_table :tlds do |t|
t.citext :name, null: false
end

add_index :tlds, :name, unique: true

create_table :slds do |t|
t.citext :name, null: false
end

add_index :slds, :name, unique: true

create_table :country_tlds do |t|
t.citext :name, null: false
end

add_index :country_tlds, :name, unique: true

create_table :disposable_emails do |t|
t.citext :name, null: false
end

add_index :disposable_emails, :name, unique: true

create_table :disposable_domains do |t|
t.citext :name, null: false
end

add_index :disposable_domains, :name, unique: true

create_table :disposable_domains_with_mx do |t|
t.citext :name, null: false
end

add_index :disposable_domains_with_mx, :name, unique: true

create_table :disposable_domains_without_mx do |t|
t.citext :name, null: false
end

add_index :disposable_domains_without_mx, :name, unique: true

create_table :free_email_domains do |t|
t.citext :name, null: false
end

add_index :free_email_domains, :name, unique: true

create_table :roles do |t|
t.citext :name, null: false
end

add_index :roles, :name, unique: true

create_table :private_relays do |t|
t.citext :name, null: false
end

add_index :private_relays, :name, unique: true

create_table :dnsbls do |t|
t.citext :name, null: false
end

add_index :dnsbls, :name, unique: true
end
end
```

##### Loading the data

With PostgreSQL, you load the data using the `COPY` command. First, you'll need
to discover where your gems are being installed. Use `gem list` for that.

```console
$ gem list email_data -d

*** LOCAL GEMS ***

email_data (1601479967, 1601260789)
Author: Nando Vieira
Homepage: https://github.com/fnando/email_data
License: MIT
Installed at (1601479967): /usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0
This project is a compilation of datasets related to emails.
Includes disposable emails, disposable domains, and free email
services.
```

The you can load each dataset using `COPY`:

```sql
COPY tlds (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/tlds.txt';
COPY slds (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/slds.txt';
COPY country_tlds (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/country_tlds.txt';
COPY disposable_emails (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/disposable_emails.txt';
COPY disposable_domains (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/disposable_domains.txt';
COPY disposable_domains_with_mx (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/disposable_domains_with_mx.txt';
COPY disposable_domains_without_mx (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/disposable_domains_without_mx.txt';
COPY free_email_domains (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/free_email_domains.txt';
COPY roles (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/roles.txt';
COPY private_relays (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/private_relays.txt';
COPY dnsbls (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/dnsbls.txt';
```

Alternatively, you could create a migration that executes that same command;
given that you'd be running Ruby code, you can replace the steps to find the gem
path with `EmailData.data_dir`.

```ruby
class LoadEmailData < ActiveRecord::Migration[6.1]
def change
copy = lambda do |table_name|
connection = ActiveRecord::Base.connection
data_path = EmailData.data_dir

connection.execute <<~PG
COPY #{table_name} (name)
FROM '#{data_path.join(table_name)}.txt'
(FORMAT CSV)
PG
end

copy.call(:tlds)
copy.call(:slds)
copy.call(:country_tlds)
copy.call(:disposable_emails)
copy.call(:disposable_domains)
copy.call(:disposable_domains_with_mx)
copy.call(:disposable_domains_without_mx)
copy.call(:free_email_domains)
copy.call(:roles)
copy.call(:private_relays)
copy.call(:dnsbls)
end
end
```

## Node.js

### Installation

```console
$ yarn add @fnando/email_data
```

or

```console
$ npm install @fnando/email_data
```

### Usage

```js
const disposableEmails = require("@fnando/email_data/data/json/disposable_emails.json");
const disposableDomains = require("@fnando/email_data/data/json/disposable_domains.json");
const disposableDomainsWithMx = require("@fnando/email_data/data/json/disposable_domains_with_mx.json");
const disposableDomainsWithoutMx = require("@fnando/email_data/data/json/disposable_domains_without_mx.json");
const freeEmailDomains = require("@fnando/email_data/data/json/free_email_domains.json");
const roles = require("@fnando/email_data/data/json/roles.json");
const privateRelays = require("@fnando/email_data/data/json/private_relays.json");
const tlds = require("@fnando/email_data/data/json/tlds.json");
const slds = require("@fnando/email_data/data/json/slds.json");
const cctlds = require("@fnando/email_data/data/json/country_tlds.json");
const dnsbls = require("@fnando/email_data/data/json/dnsbls.json");
```

## Dataset

The dataset is updated automatically. If you have any manual entries you would
like to add, please make a pull request against the files `data/manual/*.txt`.

- `data/manual/disposable_domains.txt`: only domains from disposable servers
must go here.
- `data/manual/disposable_emails.txt`: only normalized email addresses that use
free email services must go here. E.g. `[email protected]`
must be added as `[email protected]`.
- `data/manual/free_email_domains.txt`: only free email services must go here.
- `data/manual/roles.txt`: list of role-based user names like `info` or
`no-reply`.
- `data/manual/private_relays.txt`: list of private relay services, like Apple's
Hide My Email.

## Maintainer

- [Nando Vieira](https://github.com/fnando)

## Contributors

- https://github.com/fnando/email_data/contributors

## Contributing

For more details about how to contribute, please read
https://github.com/fnando/email_data/blob/main/CONTRIBUTING.md.

## License

The gem is available as open source under the terms of the
[MIT License](https://opensource.org/licenses/MIT). A copy of the license can be
found at https://github.com/fnando/email_data/blob/main/LICENSE.md.

## Code of Conduct

Everyone interacting in the email_data project's codebases, issue trackers, chat
rooms and mailing lists is expected to follow the
[code of conduct](https://github.com/fnando/email_data/blob/main/CODE_OF_CONDUCT.md).