Instructions On Blocking Bots Globally in nginx
https://github.com/javascriptdude/nginx_global_bot_blocker
- Host: GitHub
- URL: https://github.com/javascriptdude/nginx_global_bot_blocker
- Owner: JavaScriptDude
- License: mit
- Created: 2022-07-31T05:32:14.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2022-07-31T05:34:54.000Z (almost 3 years ago)
- Last Synced: 2025-02-12T07:54:35.578Z (3 months ago)
- Size: 3.91 KB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
- License: LICENSE
README
Sometimes you want to block bots globally across all of the sites served by an nginx instance. These instructions illustrate a solution to this use case.
1) Edit: `/etc/nginx/nginx.conf` and add the following block to the end of the `http` block:
```
## Global BOT Blocker ##
map $http_user_agent $_global_bot_blocker {
    default 0;
    ~*(google|bing|yandex|msnbot) 1;
    ~*(AltaVista|Googlebot|Slurp|BlackWidow|Bot|ChinaClaw|Custo|DISCo|Download|Demon|eCatch|EirGrabber|EmailSiphon|EmailWolf|SuperHTTP|Surfbot|WebWhacker) 1;
    ~*(Express|WebPictures|ExtractorPro|EyeNetIE|FlashGet|GetRight|GetWeb!|Go!Zilla|Go-Ahead-Got-It|GrabNet|Grafula|HMView|Go!Zilla|Go-Ahead-Got-It) 1;
    ~*(rafula|HMView|HTTrack|Stripper|Sucker|Indy|InterGET|Ninja|JetCar|Spider|larbin|LeechFTP|Downloader|tool|Navroad|NearSite|NetAnts|tAkeOut|WWWOFFLE) 1;
    ~*(GrabNet|NetSpider|Vampire|NetZIP|Octopus|Offline|PageGrabber|Foto|pavuk|pcBrowser|RealDownload|ReGet|SiteSnagger|SmartDownload|SuperBot|WebSpider) 1;
    ~*(Teleport|VoidEYE|Collector|WebAuto|WebCopier|WebFetch|WebGo|WebLeacher|WebReaper|WebSauger|eXtractor|Quester|WebStripper|WebZIP|Wget|Widow|Zeus) 1;
    ~*(Twengabot|htmlparser|libwww|Python|perl|urllib|scan|Curl|email|PycURL|Pyth|PyQ|WebCollector|WebCopy|webcraw) 1;
}
```
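The map can also carry exceptions. nginx tests the regular expressions in a `map` block in the order they appear and uses the first match, so placing a `0` entry before the blocking patterns exempts an agent that a later pattern would otherwise catch. A sketch (the `MyUptimeMonitor` agent name is hypothetical, not from the original list):

```
map $http_user_agent $_global_bot_blocker {
    default 0;
    # Exempt a specific agent; listed first, so it wins even though
    # the "Bot" pattern below would also match it
    ~*MyUptimeMonitorBot 0;
    ~*(google|bing|yandex|msnbot) 1;
    ~*(AltaVista|Googlebot|Slurp|BlackWidow|Bot|ChinaClaw) 1;
    # ... remaining patterns as above ...
}
```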
* The patterns above can be adjusted to your preference; note that the first pattern also matches major search-engine crawlers (Google, Bing, Yandex), which will prevent those sites from being indexed
* The above config also blocks `curl` and `wget` unless you override the `User-Agent` header
2) In the `location` block of each site that should enforce the blocker, add the following: `if ($_global_bot_blocker = 1) { return 403; }`
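Putting both steps together, a minimal site configuration might look like the sketch below (the server name and root path are placeholders, not part of the original instructions):

```
server {
    listen 80;
    server_name example.com;

    location / {
        # Reject any client whose User-Agent matched the global map
        if ($_global_bot_blocker = 1) { return 403; }
        root /var/www/example;
    }
}
```

After editing, `sudo nginx -t` validates the configuration and `sudo systemctl reload nginx` applies it. A quick check: `curl -A "Googlebot" http://example.com/` should return `403 Forbidden`, while a normal browser User-Agent is served as usual.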