https://github.com/peterujah/email-crawl
PHP Email Web Crawler that uses curl and the command-line interface to extract emails from websites.
Last synced: 12 months ago
- Host: GitHub
- URL: https://github.com/peterujah/email-crawl
- Owner: peterujah
- Created: 2022-05-05T09:45:05.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2022-05-05T15:59:26.000Z (about 4 years ago)
- Last Synced: 2025-06-28T02:02:00.294Z (12 months ago)
- Language: PHP
- Homepage:
- Size: 31.3 KB
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
# email-crawl
PHP Email Web Crawler is a simple, easy-to-use class that uses curl and the command-line interface to extract email addresses from websites.
It can also perform deep extraction: it follows links found on the initial target website and extracts emails from those pages too.
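To make the idea of deep extraction concrete, here is a minimal sketch of the two building blocks such a crawler needs: pulling email-like strings out of a page and collecting the links to visit next. The helper names `extractEmails` and `extractLinks` and both regular expressions are assumptions made for illustration; they are not part of email-crawl and may differ from what the class does internally.

```php
<?php
// Illustrative sketch only: these helpers are hypothetical and not part of email-crawl.

/**
 * Return the unique, lower-cased email-like strings found in a chunk of HTML.
 */
function extractEmails(string $html): array
{
    // Deliberately loose pattern; it does not attempt full RFC 5322 validation.
    preg_match_all('/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/', $html, $matches);
    return array_values(array_unique(array_map('strtolower', $matches[0])));
}

/**
 * Return the unique absolute http(s) links found in href attributes.
 */
function extractLinks(string $html): array
{
    preg_match_all('/href=["\'](https?:\/\/[^"\']+)["\']/i', $html, $matches);
    return array_values(array_unique($matches[1]));
}

$html = '<a href="https://example.com/contact">Contact</a> '
      . '<p>Write to Info@example.com or sales@example.com.</p>';

print_r(extractEmails($html)); // emails found on this page
print_r(extractLinks($html));  // candidate pages for a deeper scan
```

A deep scan would then fetch each collected link and run the same email extraction on it, stopping once the configured limit is reached.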
## Installation
Installation is super-easy via Composer:
```cli
composer require peterujah/email-crawl
```
## Basic Usage
Initialize an email crawl instance with the target URL and a limit
```php
$craw = new EmailCrawl("https://example.com", 200);
```
Start the email crawling scan
```php
$craw->craw();
```
Get the scan result as a `CrawlResponse` instance
```php
$response = $craw->getResponse();
```
Get the response emails separated by new lines
```php
$data = $response->inLine();
```
Get the response emails separated by commas
```php
$data = $response->withComma();
```
Get response emails as an array
```php
$data = $response->asArray();
```
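As a rough picture of the three output shapes, the sketch below builds them from a plain list of strings with standard PHP functions. It is not the `CrawlResponse` implementation: the sample addresses are made up, and the exact separators (`"\n"` and `","`) are assumptions that may differ from what the library emits.

```php
<?php
// Illustrative sketch only; the separators below are assumptions, not confirmed library behavior.
$emails = ['info@example.com', 'sales@example.com'];

// One address per line (the shape inLine() is described as returning).
$inLine = implode("\n", $emails);

// Addresses joined with commas (the shape withComma() is described as returning).
$withComma = implode(',', $emails);

// The untouched list (the shape asArray() is described as returning).
$asArray = $emails;

echo $inLine, "\n";
echo $withComma, "\n";
```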
Print the response emails
```php
$response->printCommandResult($data);
```
Save the response emails to a file. The result is saved as a JSON string.
```php
$response->save("/path/save/craw/");
```
Save the response emails to a file. If string data is passed, it is saved as-is; otherwise the result is saved as a JSON string.
```php
$response->saveAs("/path/save/craw/", $data);
```
## Example
Create a file named `craw.php` and add the example code below.
With this example you can run the crawler directly from the command line, a browser, or PHP `shell_exec`.
```php
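<?php
// Show all errors while testing the crawler.
error_reporting(E_ALL);
ini_set('display_errors', '1');

require __DIR__ . '/plugins/autoload.php';

use Peterujah\NanoBlock\EmailCrawl;

// Defaults used when no arguments are passed.
$target = "https://example.com/contact";
$limit = 50;

if (!empty($argv[1])) {
    if (filter_var($argv[1], FILTER_VALIDATE_URL)) {
        // CLI usage: php craw.php <url> [limit]
        $target = $argv[1];
        $limit = (int) ($argv[2] ?? 50);
    } else {
        // shell_exec usage: the first argument is a base64-encoded serialized array.
        $req = unserialize(base64_decode($argv[1]), ['allowed_classes' => false]);
        $target = $req["target"];
        $limit = $req["max"] ?? 50;
    }
}

$craw = new EmailCrawl($target, $limit);
$response = $craw->craw()->getResponse();
$data = $response->inLine();
// Print the result, then save it under ./craw/.
$response->printCommandResult($data)->saveAs(__DIR__ . "/craw/", $data);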
```
To run the crawler from the command line, use the command below:
```cli
php craw.php https://google.com 50
```
To run the crawler through PHP `shell_exec`, create a file called `exec.php` and add the example script below.
Note: change `PHP_SHELL_EXECUTION_PATH` to the path of your PHP executable.
Once done, navigate to https://mycraw.example.com/exec.php
```php
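<?php
define("PHP_SHELL_EXECUTION_PATH", "path/to/php");

$crawOptions = array(
    'target' => 'https://example.com',
    'max' => 50,
);

// Serialize and base64-encode the options so they travel as a single CLI argument.
$crawRequest = base64_encode(serialize($crawOptions));
$crawScript = __DIR__ . "/craw.php";
$crawLogs = __DIR__ . "/craw_logs.log";

// Run craw.php and append its output (stdout and stderr) to the log file.
shell_exec(PHP_SHELL_EXECUTION_PATH . " " . escapeshellarg($crawScript) . " " . escapeshellarg($crawRequest) . " 'alert' >> " . escapeshellarg($crawLogs) . " 2>&1");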
```
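The request passed between `exec.php` and `craw.php` is a serialized array wrapped in base64. The sketch below shows that round trip in isolation so it can be checked without running a crawl. The `allowed_classes` option and the JSON variant are additions for illustration, not something the library requires; JSON is shown only as a possible alternative when the argument might come from an untrusted source.

```php
<?php
// Encode side: what exec.php does before calling shell_exec.
$crawOptions = ['target' => 'https://example.com', 'max' => 50];
$crawRequest = base64_encode(serialize($crawOptions));

// Decode side: what craw.php does with $argv[1].
// 'allowed_classes' => false (an addition here) stops unserialize from creating objects.
$decoded = unserialize(base64_decode($crawRequest), ['allowed_classes' => false]);

// Hypothetical alternative: carry the same options as JSON instead of serialize().
$jsonRequest = base64_encode(json_encode($crawOptions));
$fromJson = json_decode(base64_decode($jsonRequest), true);

var_dump($decoded === $crawOptions);
var_dump($fromJson === $crawOptions);
```

Both comparisons should report that the decoded options match the original array.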
# ATTENTION
It is advisable to run this code from the command-line interface for better performance.