Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/dmitriiweb/extract-emails
Extract emails from a given website
- Host: GitHub
- URL: https://github.com/dmitriiweb/extract-emails
- Owner: dmitriiweb
- License: MIT
- Created: 2017-07-24T09:23:48.000Z (over 7 years ago)
- Default Branch: main
- Last Pushed: 2024-06-02T09:48:46.000Z (7 months ago)
- Last Synced: 2024-12-31T04:09:05.523Z (13 days ago)
- Topics: email, extract-emails, linkedin, parser, parsing, parsing-library, python, scraper
- Language: Python
- Homepage:
- Size: 12.1 MB
- Stars: 97
- Watchers: 2
- Forks: 35
- Open Issues: 4
- Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
# Extract Emails
![Image](https://github.com/dmitriiweb/extract-emails/blob/docs_improvements/images/email.png?raw=true)
[![PyPI version](https://badge.fury.io/py/extract-emails.svg)](https://badge.fury.io/py/extract-emails)
Extract emails and LinkedIn profiles from a given website
**Support the project with BTC**: *bc1q0cxl5j3se0ufhr96h8x0zs8nz4t7h6krrxkd6l*
[Documentation](https://dmitriiweb.github.io/extract-emails/)
## Requirements
- Python >= 3.10
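If you are unsure whether your interpreter meets this requirement, a quick generic check (not part of extract-emails itself) is:

```python
import sys

# extract-emails declares Python >= 3.10 (see the requirement above).
supported = sys.version_info >= (3, 10)
print("Python version OK" if supported else "Python too old for extract-emails")
```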
## Installation
```bash
pip install extract_emails[all]
# or
pip install extract_emails[requests]
# or
pip install extract_emails[selenium]
```

## Simple Usage
### As library
```python
from pathlib import Path

from extract_emails import DefaultFilterAndEmailFactory as Factory
from extract_emails import DefaultWorker
from extract_emails.browsers.requests_browser import RequestsBrowser as Browser
from extract_emails.data_savers import CsvSaver

websites = [
    "website1.com",
    "website2.com",
]

browser = Browser()
data_saver = CsvSaver(save_mode="a", output_path=Path("output.csv"))

for website in websites:
    factory = Factory(
        website_url=website, browser=browser, depth=5, max_links_from_page=1
    )
    worker = DefaultWorker(factory)
    data = worker.get_data()
    data_saver.save(data)
```

### As CLI tool
```bash
$ extract-emails --help
$ extract-emails --url https://en.wikipedia.org/wiki/Email -of output.csv -d 1
$ cat output.csv
email,page,website
[email protected],https://en.wikipedia.org/wiki/Email,https://en.wikipedia.org/wiki/Email
```

### Buy me a coffee
- **USDT** (TRC20): TXuYegp5L8Zf7wF2YRFjskZwdBxhRpvxBS
- **BEP20**: 0x4D51Db2B754eA83ce228F7de8EaEB93a88bdC965
- **TON**: UQA5quJljQz84RwzteN3uuKsdPTDee7a_GF5lgIgezA2oib5
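The `output.csv` written by the CLI above is plain comma-separated data with an `email,page,website` header, so it can be consumed with Python's standard `csv` module. A minimal sketch; the sample row and address below are invented for illustration, and in a real run you would read `output.csv` instead of the in-memory buffer:

```python
import csv
import io

# Hypothetical sample mirroring the header layout shown in the CLI example.
sample = (
    "email,page,website\n"
    "someone@example.com,https://en.wikipedia.org/wiki/Email,"
    "https://en.wikipedia.org/wiki/Email\n"
)

# Replace io.StringIO(sample) with open("output.csv") to read a real results file.
rows = list(csv.DictReader(io.StringIO(sample)))
for row in rows:
    print(row["email"], "found on", row["page"])
```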