Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/maxlen/webcrawler
Search engines crawlers
- Host: GitHub
- URL: https://github.com/maxlen/webcrawler
- Owner: maxlen
- Created: 2016-11-23T11:52:50.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2017-03-29T13:46:58.000Z (over 7 years ago)
- Last Synced: 2024-04-21T12:26:56.156Z (7 months ago)
- Language: PHP
- Size: 10.7 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# webcrawler
Search engine crawlers.

### Google example:
```
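// Optional proxy settings; leave the array empty to connect directly.
$proxy = []; // ['host' => '*.*.*.*', 'port' => '', 'login' => '', 'password' => '']
$page = 1; // results page to fetch (assumed here to start at 1)
$params = ['query' => 'test search', 'page' => $page, 'proxy' => $proxy];
// Select the Google strategy and run the search.
$crawler = new WebCrawler(['strategy' => new GoogleSearch()]);
print_r($crawler->crawl($params));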
```

### Site-parse example:
```
// Parse a single site instead of a search engine; the proxy array is left empty here.
$params = ['url' => 'http://your-site.com', 'proxy' => []];
$crawler = new WebCrawler(['strategy' => new SiteSearch()]);
print_r($crawler->crawl($params));
```
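
Both examples pass a `strategy` object to the crawler and then call `crawl()` with a parameter array. The README does not show how the library wires this together, so the following is only a minimal sketch of how such a strategy-based design could look. `CrawlStrategy`, `FakeSearch`, and `SketchCrawler` are hypothetical names invented for illustration and are not part of this package; the fake strategy performs no network requests.

```php
<?php
declare(strict_types=1);

// Hypothetical interface: every strategy turns a params array into results.
interface CrawlStrategy
{
    public function search(array $params): array;
}

// Hypothetical stand-in strategy that performs no network I/O.
final class FakeSearch implements CrawlStrategy
{
    public function search(array $params): array
    {
        $page = $params['page'] ?? 1; // default to the first page
        return ["result for '{$params['query']}' (page {$page})"];
    }
}

// Hypothetical crawler that delegates all work to the configured strategy.
final class SketchCrawler
{
    private CrawlStrategy $strategy;

    public function __construct(array $config)
    {
        if (!(($config['strategy'] ?? null) instanceof CrawlStrategy)) {
            throw new InvalidArgumentException("'strategy' must implement CrawlStrategy");
        }
        $this->strategy = $config['strategy'];
    }

    public function crawl(array $params): array
    {
        return $this->strategy->search($params);
    }
}

$crawler = new SketchCrawler(['strategy' => new FakeSearch()]);
print_r($crawler->crawl(['query' => 'test search', 'page' => 2, 'proxy' => []]));
```

Under this assumed design, supporting another search engine would mean adding one more class that implements the interface, while the calling code stays the same apart from the strategy it constructs.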