Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/JefferyHus/es6-crawler-detect
:spider: This is an ES6 adaptation of the original PHP library CrawlerDetect. It helps you detect bots, crawlers and spiders via the user agent.
bots crawler detection es6-javascript spider
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/JefferyHus/es6-crawler-detect
- Owner: JefferyHus
- License: mit
- Created: 2017-10-31T15:24:18.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2024-11-01T07:24:15.000Z (2 months ago)
- Last Synced: 2024-11-09T22:51:44.412Z (2 months ago)
- Topics: bots, crawler, detection, es6-javascript, spider
- Language: TypeScript
- Homepage:
- Size: 3.21 MB
- Stars: 90
- Watchers: 4
- Forks: 30
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
- awesome-morocco - [es6-crawler-detect](https://github.com/JefferyHus/es6-crawler-detect) - ES6 module to help you detect and block bots, crawlers and spiders. By [@jefferyhus](https://github.com/JefferyHus) (Uncategorized / Uncategorized)
README
# Crawler Detect
[![DeepScan grade](https://deepscan.io/api/teams/16465/projects/19756/branches/518343/badge/grade.svg)](https://deepscan.io/dashboard#view=project&tid=16465&pid=19756&bid=518343)
[![npm version](https://badge.fury.io/js/es6-crawler-detect.svg)](https://badge.fury.io/js/es6-crawler-detect)
[![contributions welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg?style=flat)](https://github.com/JefferyHus/es6-crawler-detect/issues)
## About
This library is an ES6 version of the original PHP class [CrawlerDetect](https://github.com/JayBizzle/Crawler-Detect). It helps you detect bots, crawlers and spiders purely by scanning the user-agent string, which you can pass in directly or let the library read from `request.headers`.
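To illustrate the general idea of user-agent scanning, here is a minimal, self-contained sketch. It is not the library's actual implementation: the `SAMPLE_PATTERNS` list and the `detect` and `matchCrawler` helpers are hypothetical names made up for this example, and the real pattern list is far larger.

```javascript
// Hypothetical, heavily shortened pattern list (assumption for illustration only).
const SAMPLE_PATTERNS = ['Googlebot', 'bingbot', 'Sosospider'];

// Combine all patterns into a single case-insensitive alternation.
const crawlerRegex = new RegExp(SAMPLE_PATTERNS.join('|'), 'i');

// Returns the matched substring, or null when nothing matches.
function matchCrawler(userAgent) {
  if (typeof userAgent !== 'string' || userAgent.length === 0) return null;
  const result = crawlerRegex.exec(userAgent);
  return result ? result[0] : null;
}

// Accepts either a raw user-agent string or a headers object
// (Node.js lower-cases header names, hence 'user-agent').
function detect(input) {
  const userAgent =
    typeof input === 'string' ? input : (input && input['user-agent']) || '';
  return matchCrawler(userAgent);
}

console.log(detect('Mozilla/5.0 (compatible; Sosospider/2.0; +http://help.soso.com/webspider.htm)'));
console.log(detect({ 'user-agent': 'Googlebot/2.1 (+http://www.google.com/bot.html)' }));
```

A regular browser user agent yields `null` here, while the two calls above return the name of the pattern that matched.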
## Installation
`npm install es6-crawler-detect`
## Usage
### ECMAScript 6 (ES6)
```javascript
'use strict';

const express = require('express')
const { Crawler, middleware } = require('es6-crawler-detect')

const app = express()

app.get('/your/route', async function (request, response) {
  // create a new Crawler instance
  const CrawlerDetector = new Crawler(request)

  // check the current visitor's useragent
  if (CrawlerDetector.isCrawler()) {
    // true if crawler user agent detected
  }

  // or check a user agent string
  if (CrawlerDetector.isCrawler('Mozilla/5.0 (compatible; Sosospider/2.0; +http://help.soso.com/webspider.htm)')) {
    // true if crawler user agent detected
  }

  // Output the name of the bot that matched (if any)
  response.send(CrawlerDetector.getMatches())
})

/**
 * Or by using the middleware
 */
app.use(middleware((request, response) => {
  // do something here
  // e.g. console.log(request.body)
  // e.g. return response.status(403).send('Forbidden')
}))

app.get('/crawler', async function (request, response) {
  // or check a user agent string
  request.Crawler.isCrawler('Mozilla/5.0 (compatible; Sosospider/2.0; +http://help.soso.com/webspider.htm)')

  // Output the name of the bot that matched (if any)
  response.send(request.Crawler.getMatches())
})
```
### ECMAScript 5 (ES5)
```xml
CrawlerDetect - the web crawler detection library
```
```javascript
// create a new Crawler instance
var CrawlerDetector = new Crawler.Crawler();
var userAgentString = navigator.userAgent;
// check the current visitor's useragent
if ( CrawlerDetector.isCrawler(userAgentString) )
{
// true if crawler user agent detected
}
// Output the name of the bot that matched (if any)
console.debug(CrawlerDetector.getMatches());
```
## Contributing
If you find a bot/spider/crawler user agent that CrawlerDetect fails to detect, please submit a pull request with the regex pattern added to the `data` array in `./crawler/crawlers.js`.
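Before opening a pull request, it can help to sanity-check a candidate pattern against a few user agents. The sketch below is one possible way to do that. `checkPattern` is a hypothetical helper, not part of es6-crawler-detect, `ExampleBot` is a made-up crawler name, and case-insensitive matching is an assumption about how the patterns are applied.

```javascript
// Hypothetical helper (not part of the library): reports which sample
// user agents a candidate pattern misses or wrongly flags.
function checkPattern(pattern, shouldMatch, shouldNotMatch) {
  // Case-insensitive matching is an assumption made for this sketch.
  const regex = new RegExp(pattern, 'i');
  return {
    missed: shouldMatch.filter((ua) => !regex.test(ua)),
    falsePositives: shouldNotMatch.filter((ua) => regex.test(ua)),
  };
}

// 'ExampleBot' is a made-up crawler used purely for illustration.
const report = checkPattern(
  'ExampleBot\\/\\d+',
  ['Mozilla/5.0 (compatible; ExampleBot/1.2; +http://example.com/bot)'],
  ['Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36']
);

console.log(report);
```

Empty `missed` and `falsePositives` arrays suggest the pattern is specific enough to propose. A pattern that also flags ordinary browsers should be tightened first.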