https://github.com/javascriptdude/nginx_global_bot_blocker

Instructions On Blocking Bots Globally in nginx

Sometimes you want to block bots globally for every site served by a single nginx instance. These instructions describe one way to do that.

1) Edit `/etc/nginx/nginx.conf` and add the following block at the end of the `http` block:
```
## Global BOT Blocker ##
# Sets $_global_bot_blocker to 1 when the User-Agent matches any pattern below.
# "~*" marks a case-insensitive regular expression.
map $http_user_agent $_global_bot_blocker {
    default 0;
    ~*(google|bing|yandex|msnbot) 1;
    ~*(AltaVista|Googlebot|Slurp|BlackWidow|Bot|ChinaClaw|Custo|DISCo|Download|Demon|eCatch|EirGrabber|EmailSiphon|EmailWolf|SuperHTTP|Surfbot|WebWhacker) 1;
    ~*(Express|WebPictures|ExtractorPro|EyeNetIE|FlashGet|GetRight|GetWeb!|Go!Zilla|Go-Ahead-Got-It|GrabNet|Grafula|HMView) 1;
    ~*(HTTrack|Stripper|Sucker|Indy|InterGET|Ninja|JetCar|Spider|larbin|LeechFTP|Downloader|tool|Navroad|NearSite|NetAnts|tAkeOut|WWWOFFLE) 1;
    ~*(NetSpider|Vampire|NetZIP|Octopus|Offline|PageGrabber|Foto|pavuk|pcBrowser|RealDownload|ReGet|SiteSnagger|SmartDownload|SuperBot|WebSpider) 1;
    ~*(Teleport|VoidEYE|Collector|WebAuto|WebCopier|WebFetch|WebGo|WebLeacher|WebReaper|WebSauger|eXtractor|Quester|WebStripper|WebZIP|Wget|Widow|Zeus) 1;
    ~*(Twengabot|htmlparser|libwww|Python|perl|urllib|scan|Curl|email|PycURL|Pyth|PyQ|WebCollector|WebCopy|webcraw) 1;
}
```
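* Adjust the patterns above to your preference. As written, they also match search engine crawlers (`google|bing|yandex|msnbot`) and anything with `Bot` in its User-Agent, so remove those entries if you want your sites indexed.
* The above config also blocks curl and wget, because their default `User-Agent` strings match `Curl` and `Wget`. Such requests only get through if the client sends a different `User-Agent` header.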
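
As an illustration of adjusting the list, here is a minimal sketch of a narrower map that keeps only a handful of the scraper and harvester names from the block above. The selection is an arbitrary example, not a recommendation. Search engine crawlers, curl and wget are not matched by it. It is meant to replace the map above, not to be added alongside it.

```
## Global BOT Blocker (narrower variant, example selection only) ##
map $http_user_agent $_global_bot_blocker {
    default 0;
    # Site copiers and offline downloaders taken from the full list
    ~*(HTTrack|WebCopier|WebZIP|SiteSnagger|Teleport) 1;
    # Email harvesters taken from the full list
    ~*(EmailSiphon|EmailWolf|ExtractorPro) 1;
}
```

Any User-Agent that matches none of the patterns falls through to `default 0` and is served normally.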

2) In the `location` block of your site, add the following: `if ($_global_bot_blocker = 1) { return 403; }`
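
For context, a site definition using the check might look like the sketch below. The `listen` port, `server_name` and `root` path are placeholders assumed for illustration. Only the `if` line comes from the step above.

```
server {
    listen 80;
    server_name example.com;          # placeholder host name

    location / {
        # Reject any request whose User-Agent matched the global map
        if ($_global_bot_blocker = 1) { return 403; }

        root /var/www/html;           # placeholder document root
    }
}

# After editing, check the syntax and reload nginx:
#   nginx -t && nginx -s reload
#
# Quick check from a shell (prints only the HTTP status code):
#   curl -s -o /dev/null -w "%{http_code}\n" http://example.com/
#   curl -s -o /dev/null -w "%{http_code}\n" -A "Mozilla/5.0" http://example.com/
```

With the full map from step 1, the first curl command should return 403, because curl's default User-Agent matches `Curl`. The second should be served normally, provided the site has content at that path.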