Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/samwize/python-email-crawler
Searches Google and crawls the results for related email addresses
- Host: GitHub
- URL: https://github.com/samwize/python-email-crawler
- Owner: samwize
- Created: 2012-07-16T10:09:08.000Z (over 12 years ago)
- Default Branch: master
- Last Pushed: 2020-10-22T13:23:10.000Z (about 4 years ago)
- Last Synced: 2024-07-15T15:39:03.997Z (5 months ago)
- Language: Python
- Size: 18.6 KB
- Stars: 288
- Watchers: 30
- Forks: 128
- Open Issues: 21
Metadata Files:
- Readme: README.md
README
Python Email Crawler
====================

This Python script searches Google for the given keywords, crawls the web pages from the results, and returns all the email addresses it finds.
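To give a feel for the "return all emails found" step, here is a minimal sketch of pulling addresses out of a page's HTML. The pattern `EMAIL_RE` and the helper `find_emails` are illustrative names, not taken from `email_crawler.py`, and the regex is deliberately simplified; the script's own matching rules may differ. The sketch is written for Python 3.

```python
import re

# Simplified pattern, assumed for illustration; it is not the script's own regex
# and does not cover every valid address form.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(html):
    """Return the unique addresses in html, lowercased, in first-seen order."""
    seen = set()
    result = []
    for match in EMAIL_RE.findall(html):
        email = match.lower()
        if email not in seen:
            seen.add(email)
            result.append(email)
    return result


sample = '<a href="mailto:Jane@Example.com">Jane</a> or bob@example.org, jane@example.com'
print(find_emails(sample))  # ['jane@example.com', 'bob@example.org']
```

Lowercasing before deduplication treats `Jane@Example.com` and `jane@example.com` as the same address, which is usually what you want for a contact list.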
Requirements
------------

- sqlalchemy
- urllib2

If you don't have sqlalchemy, simply run `sudo pip install sqlalchemy`. urllib2 is part of the Python 2 standard library and needs no separate install.
Usage
-----

Start the search with a keyword. We use "iphone developers" as an example:

    python email_crawler.py "iphone developers"
The search and crawling process takes quite a while, as it retrieves up to 500 search results from Google and crawls up to 2 levels deep. It should crawl around 10,000 web pages :)
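The depth limit can be pictured as a breadth-first crawl that stops following links once a page is 2 levels away from a search result. The sketch below is one possible reading of that behaviour, not the script's actual implementation: `crawl`, `fetch`, `max_depth`, and `max_pages` are hypothetical names, search results are assumed to sit at depth 0, and the in-memory `pages` dict stands in for real HTTP requests. It targets Python 3.

```python
import re
from collections import deque
from urllib.parse import urljoin

# Naive link extraction, assumed for illustration only.
HREF_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)


def crawl(seed_urls, fetch, max_depth=2, max_pages=10000):
    """Breadth-first crawl; return the visited URLs in visit order.

    fetch is a hypothetical callable mapping a URL to its HTML, or None on failure.
    """
    seen = set()
    queue = deque()
    for url in seed_urls:
        if url not in seen:
            seen.add(url)
            queue.append((url, 0))  # seeds (search results) are depth 0

    visited = []
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        html = fetch(url)
        visited.append(url)
        # Pages at the depth limit are still visited, but their links are not followed.
        if html is None or depth >= max_depth:
            continue
        for href in HREF_RE.findall(html):
            link = urljoin(url, href)
            if link.startswith("http") and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited


# A tiny fake site, so the example runs without network access.
pages = {
    "http://a.test/": '<a href="/one">1</a>',
    "http://a.test/one": '<a href="/two">2</a>',
    "http://a.test/two": '<a href="/three">3</a>',
    "http://a.test/three": "",
}
print(crawl(["http://a.test/"], pages.get))
# ['http://a.test/', 'http://a.test/one', 'http://a.test/two']
```

With 500 seeds, a figure of around 10,000 pages works out to roughly 20 pages per search result, which is why the run is slow.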
After the process finishes, run this command to get the list of emails:

    python email_crawler.py --emails

The emails will be saved in `./data/emails.csv`.
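Once the CSV exists, a few lines of Python can summarise it. The sketch below counts addresses per domain. It assumes the first column of each row holds one email address, which the README does not confirm, so check the layout of your own `emails.csv` first. `count_domains` is a hypothetical helper, written for Python 3.

```python
import csv
from collections import Counter


def count_domains(lines):
    """Count addresses per domain.

    lines is any iterable of CSV text lines (an open file works).
    Assumes the first column of each row is an email address.
    """
    counts = Counter()
    for row in csv.reader(lines):
        if row and "@" in row[0]:
            domain = row[0].strip().lower().rsplit("@", 1)[1]
            counts[domain] += 1
    return counts


# Sample rows standing in for the contents of ./data/emails.csv.
sample_rows = ["jane@example.com", "bob@example.org", "amy@example.com", ""]
print(count_domains(sample_rows))
```

To run it on the real output, pass an open file instead of the sample list: `with open("./data/emails.csv", newline="") as handle: count_domains(handle)`.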