https://github.com/JayBizzle/Crawler-Detect
🕷 CrawlerDetect is a PHP class for detecting bots/crawlers/spiders via the user agent
- Host: GitHub
- URL: https://github.com/JayBizzle/Crawler-Detect
- Owner: JayBizzle
- License: mit
- Created: 2015-03-23T20:05:37.000Z (about 11 years ago)
- Default Branch: master
- Last Pushed: 2025-03-05T23:15:13.000Z (about 1 year ago)
- Last Synced: 2025-03-31T16:55:26.999Z (about 1 year ago)
- Topics: bots, crawler, detect, hacktoberfest, php, spider, user-agent
- Language: PHP
- Homepage: https://crawlerdetect.io
- Size: 8.95 MB
- Stars: 2,060
- Watchers: 55
- Forks: 262
- Open Issues: 10
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project
- awesome-php - CrawlerDetect - A PHP class for detecting bots/crawlers/spiders via the user agent. (Table of Contents / Scraping)
- fucking-awesome-php - CrawlerDetect - A PHP class for detecting bots/crawlers/spiders via the user agent. (Table of Contents / Scraping)
- php-awesome - CrawlerDetect - Web crawler detection (Libraries / Web Scraping/Proxies)
- web-stuff - Crawler Detect - Detect if the requestor is a crawler (based on user-agents) (PHP)
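- StarryDivineSky - JayBizzle/Crawler-Detect - Detect is a lightweight PHP utility class for identifying web crawlers, bots, and spiders via the user agent. Its core function is to provide a simple, efficient interface: developers can call class methods to quickly determine whether the current request comes from a crawler, and can retrieve detailed information about the crawler (such as its name, type, and owning company). It works from a built-in JSON-format database containing user agent signatures for a large number of mainstream crawlers, and developers can easily update or extend the data file to cover newly emerging crawler types. The project is implemented in pure PHP, requires no external libraries or frameworks, is compatible with PHP 5.4 and above, and integrates seamlessly into mainstream PHP frameworks such as Laravel and Symfony. Its API is simple and intuitive: for example, the `isBot()` method determines whether a request comes from a crawler, the `getBot()` method retrieves crawler information, and the `isCrawler()` method detects whether it is a specific type of crawler. Because the data is stored as standalone JSON files, developers can quickly customize rules or exclude false positives. The project also supports distinguishing search engine crawlers (such as Googlebot) from ordinary crawlers, helping developers identify traffic sources more precisely. The overall design emphasizes performance and flexibility and suits scenarios that need to distinguish crawlers from real user visits, such as preventing crawlers from abusing resources or implementing access control policies. (Backend development frameworks and projects / PHP development)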
README
## About CrawlerDetect
CrawlerDetect is a PHP class for detecting bots/crawlers/spiders via the `user agent` and `http_from` header. It can currently detect thousands of bots/spiders/crawlers.
### Installation
```
composer require jaybizzle/crawler-detect
```
### Usage
```PHP
use Jaybizzle\CrawlerDetect\CrawlerDetect;

$CrawlerDetect = new CrawlerDetect;

// Check the user agent of the current 'visitor'
if ($CrawlerDetect->isCrawler()) {
    // true if crawler user agent detected
}

// Pass a user agent as a string
if ($CrawlerDetect->isCrawler('Mozilla/5.0 (compatible; Sosospider/2.0; +http://help.soso.com/webspider.htm)')) {
    // true if crawler user agent detected
}

// Output the name of the bot that matched (if any)
echo $CrawlerDetect->getMatches();
```
### Contributing
If you find a bot/spider/crawler user agent that CrawlerDetect fails to detect, please submit a pull request with the regex pattern added to the `$data` array in `Fixtures/Crawlers.php` and add the failing user agent to `tests/crawlers.txt`.
Failing that, just create an issue with the user agent you have found, and we'll take it from there :)
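Before opening a pull request, it can help to check locally that a candidate pattern matches the failing user agent and leaves an ordinary browser user agent alone. The sketch below uses only PHP's built-in `preg_match`. The helper `matchCandidate`, the pattern `ExampleBot`, and both user agent strings are hypothetical, and wrapping the fragment in `/.../i` is an assumption made for illustration rather than a description of how CrawlerDetect compiles its patterns.

```PHP
<?php
// Hypothetical helper (not part of CrawlerDetect): returns the matched text,
// or null when the pattern fragment does not match the user agent.
// Assumption: the fragment is evaluated as a case-insensitive regex delimited
// by "/", so a literal "/" inside the fragment must be escaped as "\/".
function matchCandidate(string $patternFragment, string $userAgent): ?string
{
    $regex = '/' . $patternFragment . '/i';

    if (preg_match($regex, $userAgent, $matches) === 1) {
        return $matches[0];
    }

    return null;
}

// Hypothetical values for illustration only.
$candidate = 'ExampleBot';
$failingUserAgent = 'Mozilla/5.0 (compatible; ExampleBot/1.2; +http://example.com/bot)';
$browserUserAgent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0';

// The candidate should match the bot user agent...
var_dump(matchCandidate($candidate, $failingUserAgent));

// ...and should not match a regular browser.
var_dump(matchCandidate($candidate, $browserUserAgent));
```

With these inputs the first call returns the matched text and the second returns `null`. A pattern that also matches the browser string is probably too broad and would risk false positives.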
### Laravel Package
If you would like to use this with Laravel, please see [Laravel-Crawler-Detect](https://github.com/JayBizzle/Laravel-Crawler-Detect).
### Symfony Bundle
To use this library with Symfony 2/3/4, check out the [CrawlerDetectBundle](https://github.com/nicolasmure/CrawlerDetectBundle).
### Yii2 Extension
To use this library with the Yii2 framework, check out [yii2-crawler-detect](https://github.com/AlikDex/yii2-crawler-detect).
### ES6 Library
To use this library with Node.js or any ES6-based application, check out [es6-crawler-detect](https://github.com/JefferyHus/es6-crawler-detect).
### Python Library
To use this library in a Python project, check out [crawlerdetect](https://github.com/moskrc/CrawlerDetect).
### JVM Library (written in Java)
To use this library in a JVM project (including Java, Scala, Kotlin, etc.), check out [CrawlerDetect](https://github.com/nekosoftllc/crawler-detect).
### .NET Library
To use this library in a .NET Standard (including .NET Core) based project, check out [NetCrawlerDetect](https://github.com/gplumb/NetCrawlerDetect).
### Ruby Gem
To use this library with Ruby on Rails or any Ruby-based application, check out the [crawler_detect](https://github.com/loadkpi/crawler_detect) gem.
### Go Module
To use this library with Go, check out the [crawlerdetect](https://github.com/x-way/crawlerdetect) module.
_Parts of this class are based on the brilliant [MobileDetect](https://github.com/serbanghita/Mobile-Detect)_
