https://github.com/fabriziosalmi/patterns
Automated OWASP CRS and Bad Bot Detection for Nginx, Apache, Traefik and HAProxy
apache bad-requests bot-detection caddy caddyserver crs firewall-configuration firewall-rules malicious-url-detection mod-security nginx owasp waf web-application-firewall
- Host: GitHub
- URL: https://github.com/fabriziosalmi/patterns
- Owner: fabriziosalmi
- License: mit
- Created: 2024-12-21T00:00:15.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2026-05-28T00:58:34.000Z (6 days ago)
- Last Synced: 2026-05-28T02:23:53.819Z (6 days ago)
- Topics: apache, bad-requests, bot-detection, caddy, caddyserver, crs, firewall-configuration, firewall-rules, malicious-url-detection, mod-security, nginx, owasp, waf, web-application-firewall
- Language: Python
- Homepage: https://fabriziosalmi.github.io/patterns/
- Size: 1.36 MB
- Stars: 307
- Watchers: 4
- Forks: 7
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md
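# Patterns

Production-grade WAF rules, on autopilot.

Automated OWASP Core Rule Set and bad-bot patterns, converted into native configurations for Nginx, Apache, Traefik, and HAProxy, refreshed every day.

[Documentation](https://fabriziosalmi.github.io/patterns/) · [Get started](https://fabriziosalmi.github.io/patterns/getting-started) · [Latest release](https://github.com/fabriziosalmi/patterns/releases/latest)
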
---
## Why Patterns
The OWASP Core Rule Set (CRS) is the de facto open-source rule base behind ModSecurity, but plugging it into anything other than Apache is non-trivial. Patterns automates the whole pipeline:
1. Pull the latest CRS rules straight from upstream.
2. Convert them into the **native** syntax of each web server — not a generic shim.
3. Package the output as ready-to-deploy archives, refreshed every day by GitHub Actions.
You get equivalent protection across SQL injection, XSS, RCE, LFI, and bad-bot traffic, regardless of which proxy you run.
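
To make step 2 more concrete, here is a minimal sketch of how a JSON intermediate could be turned into Nginx `map` blocks. The rule fields (`category`, `pattern`), the `$waf_block_*` variable names, and the helpers `nginx_quote` and `rules_to_nginx_map` are assumptions made for illustration; the real schema of `owasp_rules.json` and the logic in `json2nginx.py` may differ.

```python
from collections import defaultdict

# Hypothetical rules; the field names are assumptions, not the real schema.
SAMPLE_RULES = [
    {"category": "sqli", "pattern": r"union\s+select"},
    {"category": "xss", "pattern": r"<script[^>]*>"},
    {"category": "sqli", "pattern": r"union\s+select"},  # duplicate on purpose
]


def nginx_quote(pattern: str) -> str:
    """Escape backslashes and double quotes for a double-quoted Nginx string."""
    return pattern.replace("\\", "\\\\").replace('"', '\\"')


def rules_to_nginx_map(rules, variable="$request_uri"):
    """Group patterns by category and emit one map block per category."""
    by_category = defaultdict(list)
    for rule in rules:
        patterns = by_category[rule["category"]]
        if rule["pattern"] not in patterns:  # drop duplicates, keep order
            patterns.append(rule["pattern"])

    lines = []
    for category in sorted(by_category):
        lines.append(f"map {variable} $waf_block_{category} {{")
        lines.append("    default 0;")
        for pattern in by_category[category]:
            # "~*" asks Nginx for a case-insensitive regex match.
            lines.append(f'    "~*{nginx_quote(pattern)}" 1;')
        lines.append("}")
    return "\n".join(lines)


print(rules_to_nginx_map(SAMPLE_RULES))
```

Grouping by category keeps each generated `map` small and lets a server block act on one category at a time.
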
## Highlights
| | |
|---|---|
| **OWASP CRS coverage** | SQLi, XSS, RCE, LFI, RFI, plus generic anomaly and protocol-violation rules. |
| **Native output** | Nginx `map`/`if`, Apache `SecRule`, Traefik middleware TOML, HAProxy ACL files. |
| **Bad-bot blocking** | Curated User-Agent lists from public sources, with safe defaults that do **not** block major search engines. |
| **Daily refresh** | A scheduled GitHub Actions workflow rebuilds every backend and publishes a fresh release. |
| **Pre-built archives** | Skip the toolchain — download `nginx_waf.zip`, `apache_waf.zip`, `traefik_waf.zip`, or `haproxy_waf.zip`. |
| **Composable** | Each backend is a small Python converter on top of one JSON intermediate. Adding a new platform is a few hundred lines. |
> Using **Caddy**? See the dedicated [`caddy-waf`](https://github.com/fabriziosalmi/caddy-waf) project.
## Quick start
### Option 1 — download a pre-built release
```bash
# Pick the archive that matches your stack
curl -LO https://github.com/fabriziosalmi/patterns/releases/latest/download/nginx_waf.zip
unzip nginx_waf.zip -d /etc/nginx/waf_patterns
```
Then follow the [Nginx](https://fabriziosalmi.github.io/patterns/nginx),
[Apache](https://fabriziosalmi.github.io/patterns/apache),
[Traefik](https://fabriziosalmi.github.io/patterns/traefik), or
[HAProxy](https://fabriziosalmi.github.io/patterns/haproxy) integration guide.
### Option 2 — build from source
Requires **Python 3.11+**, `pip`, and `git`.
```bash
git clone https://github.com/fabriziosalmi/patterns.git
cd patterns
pip install -r requirements.txt
python owasp2json.py # 1. Fetch the latest OWASP CRS into owasp_rules.json
python json2nginx.py # 2. Convert into Nginx WAF config
python json2apache.py # …or Apache (ModSecurity)
python json2traefik.py # …or Traefik middleware
python json2haproxy.py # …or HAProxy ACL files
python badbots.py # 3. Generate bad-bot blocklists
```
Generated files land in `waf_patterns/<platform>/`, for example `waf_patterns/nginx/`.
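
The build steps above can also be driven from one small script. This is a sketch, not part of the repository: `build_commands` and `run_build` are hypothetical helpers, and the sketch assumes it is run from the repository root with the requirements already installed. Only the script names come from this README.

```python
import subprocess
import sys

# Converter scripts as listed in the build instructions.
CONVERTERS = {
    "nginx": "json2nginx.py",
    "apache": "json2apache.py",
    "traefik": "json2traefik.py",
    "haproxy": "json2haproxy.py",
}


def build_commands(platforms):
    """Return the commands to run, in the order the README lists them."""
    unknown = sorted(set(platforms) - set(CONVERTERS))
    if unknown:
        raise ValueError(f"unknown platform(s): {', '.join(unknown)}")
    steps = ["owasp2json.py"]                     # 1. fetch the CRS
    steps += [CONVERTERS[p] for p in platforms]   # 2. convert per platform
    steps.append("badbots.py")                    # 3. bad-bot blocklists
    return [[sys.executable, script] for script in steps]


def run_build(platforms):
    """Run every step; check=True stops at the first failing script."""
    for command in build_commands(platforms):
        subprocess.run(command, check=True)


# Dry run: show what would be executed for two platforms.
for command in build_commands(["nginx", "haproxy"]):
    print(" ".join(command))
```

Calling `run_build(["nginx", "haproxy"])` would execute the same steps instead of printing them.
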
## Architecture
```text
┌─────────────────────┐    daily cron    ┌──────────────────────┐
│ coreruleset/        │ ───────────────▶ │ owasp2json.py        │
│ coreruleset (GH)    │                  │ → owasp_rules.json   │
└─────────────────────┘                  └──────────┬───────────┘
                                                    │
       ┌─────────────────┬─────────────────┬────────┴─────────┐
       ▼                 ▼                 ▼                  ▼
 json2nginx.py    json2apache.py    json2traefik.py    json2haproxy.py
       │                 │                 │                  │
       ▼                 ▼                 ▼                  ▼
 nginx_waf.zip    apache_waf.zip    traefik_waf.zip    haproxy_waf.zip
                    (published as a GitHub Release)
```
Each converter is independent, idempotent, and configured exclusively through environment variables (`INPUT_FILE`, `OUTPUT_DIR`). Full reference at [docs/api](https://fabriziosalmi.github.io/patterns/api).
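
A converter configured this way might read its settings as in the sketch below. Only the variable names `INPUT_FILE` and `OUTPUT_DIR` come from this README; the default values, the `load_settings` and `write_if_changed` helpers, and the output file name are assumptions for illustration. Skipping the write when the content is unchanged is one simple way to keep repeated runs idempotent.

```python
import os
import tempfile
from pathlib import Path


def load_settings(env=os.environ):
    """Read converter settings from the environment (defaults are assumed)."""
    return {
        "input_file": Path(env.get("INPUT_FILE", "owasp_rules.json")),
        "output_dir": Path(env.get("OUTPUT_DIR", "waf_patterns/nginx")),
    }


def write_if_changed(path: Path, content: str) -> bool:
    """Write content only when it differs; return True if the file changed."""
    if path.exists() and path.read_text(encoding="utf-8") == content:
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    return True


# Demo in a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    settings = load_settings({"OUTPUT_DIR": tmp})
    target = settings["output_dir"] / "waf_rules.conf"
    print(write_if_changed(target, "# generated\n"))  # True: first run writes
    print(write_if_changed(target, "# generated\n"))  # False: nothing changed
```
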
## Repository layout
```text
patterns/
├── owasp2json.py          # Pull and parse OWASP CRS into a JSON intermediate
├── json2nginx.py          # JSON → Nginx (map + if directives)
├── json2apache.py         # JSON → Apache (ModSecurity SecRule)
├── json2traefik.py        # JSON → Traefik (middleware TOML)
├── json2haproxy.py        # JSON → HAProxy (ACL files)
├── badbots.py             # Public bot lists → per-platform blocklists
├── import_*_waf.py        # Optional installers for each platform
├── waf_patterns/          # Generated outputs
│   ├── nginx/
│   ├── apache/
│   ├── traefik/
│   └── haproxy/
├── docs/                  # VitePress documentation site
├── tests/                 # Validation tests for each backend
└── .github/workflows/     # Daily build + release automation
```
## Integration in 60 seconds
### Nginx
```nginx
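http {
    # Load the generated maps once, in the http context.
    include /etc/nginx/waf_patterns/nginx/waf_maps.conf;
    include /etc/nginx/waf_patterns/nginx/bots.conf;

    server {
        # Apply the rules in every server block you want to protect.
        include /etc/nginx/waf_patterns/nginx/waf_rules.conf;
        if ($bad_bot) { return 403; }
    }
}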
```
### Apache (ModSecurity)
```apache
SecRuleEngine On
Include /etc/apache2/waf_patterns/apache/*.conf
```
### Traefik
```yaml
http:
  routers:
    app:
      rule: "Host(`example.com`)"
      service: app
      middlewares: [waf-protection@file, bot-blocker@file]
```
### HAProxy
```haproxy
frontend http-in
    bind *:80
    acl waf_match path,url_dec -m reg -i -f /etc/haproxy/waf.acl
    acl bad_bot hdr(User-Agent) -m reg -i -f /etc/haproxy/bots.acl
    http-request deny deny_status 403 if waf_match || bad_bot
```
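
Before reloading HAProxy, it can help to check which request paths a pattern file would match. The sketch below approximates the `waf_match` ACL in Python: it URL-decodes the path and runs a case-insensitive regex search for each pattern, skipping empty lines and `#` comments. It is an approximation only. Python's `re` module and HAProxy's regex engine are not identical, `load_patterns` and `matches_waf` are hypothetical helpers, and the sample patterns are invented rather than taken from the generated `waf.acl`.

```python
import re
from urllib.parse import unquote


def load_patterns(text: str):
    """Compile one case-insensitive regex per non-empty, non-comment line."""
    compiled = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            compiled.append(re.compile(line, re.IGNORECASE))
    return compiled


def matches_waf(path: str, patterns) -> bool:
    """Return True if any pattern matches the URL-decoded path."""
    decoded = unquote(path)  # rough stand-in for HAProxy's url_dec converter
    return any(p.search(decoded) for p in patterns)


# Invented sample patterns, one per line as in an ACL pattern file.
SAMPLE_ACL = "\n".join([
    "# path traversal and a basic SQL injection probe",
    r"\.\./",
    r"union\s+select",
])

patterns = load_patterns(SAMPLE_ACL)
for path in ("/files/%2e%2e/%2e%2e/etc/passwd",
             "/search/UNION%20SELECT",
             "/index.html"):
    print(path, matches_waf(path, patterns))
```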
Full guides — with logging, whitelists, and tuning — live in the [docs](https://fabriziosalmi.github.io/patterns/).
## Bad-bot example output (Nginx)
```nginx
# http context
map $http_user_agent $bad_bot {
    default         0;
    "~*AhrefsBot"   1;
    "~*SemrushBot"  1;
    "~*MJ12bot"     1;
    "~*GPTBot"      1;
}

# server or location context
if ($bad_bot) { return 403; }
```
The default list blocks SEO crawlers, AI training bots, and known scanners while explicitly **allowing** major search engines (Google, Bing, DuckDuckGo, Yandex, Baidu).
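
How such a map could be produced is sketched below: take a list of User-Agent tokens, drop anything on an allowlist of search-engine crawlers, and emit one case-insensitive `map` entry per remaining token. The allowlist tokens, the sample input, and the `build_bot_map` helper are assumptions for illustration; the real lists come from the public sources that `badbots.py` pulls.

```python
import re

# Assumed crawler tokens for the search engines named above.
ALLOWED = {"googlebot", "bingbot", "duckduckbot", "yandexbot", "baiduspider"}


def build_bot_map(user_agents, allowed=ALLOWED):
    """Build an Nginx map block that flags every non-allowlisted token."""
    blocked = {}
    for ua in user_agents:
        token = ua.strip()
        key = token.lower()
        if token and key not in allowed:
            blocked.setdefault(key, token)  # case-insensitive de-duplication

    lines = ["map $http_user_agent $bad_bot {", "    default 0;"]
    for key in sorted(blocked):
        # re.escape keeps regex metacharacters in a token literal.
        lines.append(f'    "~*{re.escape(blocked[key])}" 1;')
    lines.append("}")
    return "\n".join(lines)


print(build_bot_map(["AhrefsBot", "Googlebot", "GPTBot", "ahrefsbot ", ""]))
```

Keeping the allowlist check inside the generator means an upstream list that happens to include a search-engine crawler cannot silently block it.
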
## Automation
| Workflow | Schedule | Purpose |
|----------|----------|---------|
| [`update_patterns.yml`](.github/workflows/update_patterns.yml) | Daily + manual | Re-fetch CRS, regenerate every backend, publish a release |
| [`test_nginx.yml`](.github/workflows/test_nginx.yml) | On PR | Validate generated Nginx rules against a live container |
| [`test_apache_docker.yml`](.github/workflows/test_apache_docker.yml) | On PR | Validate generated Apache rules against ModSecurity in Docker |
| [`docs.yml`](.github/workflows/docs.yml) | On `docs/` change | Build and deploy the VitePress docs to GitHub Pages |
All workflows run on **GitHub-hosted runners** (`ubuntu-latest`).
## Documentation
The full documentation lives at **[fabriziosalmi.github.io/patterns](https://fabriziosalmi.github.io/patterns/)** — built with [VitePress](https://vitepress.dev/) and deployed automatically.
- [Getting Started](https://fabriziosalmi.github.io/patterns/getting-started)
- [Nginx](https://fabriziosalmi.github.io/patterns/nginx) · [Apache](https://fabriziosalmi.github.io/patterns/apache) · [Traefik](https://fabriziosalmi.github.io/patterns/traefik) · [HAProxy](https://fabriziosalmi.github.io/patterns/haproxy)
- [Bad Bot Detection](https://fabriziosalmi.github.io/patterns/badbots)
- [API & Scripts Reference](https://fabriziosalmi.github.io/patterns/api)
## Contributing
1. Fork the repository.
2. Create a feature branch: `git checkout -b feature/your-change`.
3. Commit and push.
4. Open a pull request — the test workflows will run automatically.
See [CONTRIBUTING.md](CONTRIBUTING.md) for details and [SECURITY.md](SECURITY.md) for the disclosure policy.
## License
Released under the [MIT License](LICENSE).
## Resources
- [OWASP Core Rule Set](https://github.com/coreruleset/coreruleset)
- [ModSecurity](https://modsecurity.org/)
- [Nginx](https://nginx.org/) · [Apache HTTPD](https://httpd.apache.org/) · [Traefik](https://traefik.io/) · [HAProxy](https://www.haproxy.org/)
- [ai.robots.txt](https://github.com/ai-robots-txt/ai.robots.txt) — upstream AI-bot list
---