https://github.com/fabriziosalmi/patterns

Automated OWASP CRS and Bad Bot Detection for Nginx, Apache, Traefik, and HAProxy

Patterns


Production-grade WAF rules, on autopilot.



Automated OWASP Core Rule Set and bad-bot patterns, converted into native configurations for Nginx, Apache, Traefik, and HAProxy, refreshed every day.



---

## Why Patterns

The OWASP Core Rule Set (CRS) is the de facto open-source rule base behind ModSecurity, but plugging it into anything other than Apache is non-trivial. Patterns automates the whole pipeline:

1. Pull the latest CRS rules straight from upstream.
2. Convert them into the **native** syntax of each web server — not a generic shim.
3. Package the output as ready-to-deploy archives, refreshed every day by GitHub Actions.

You get equivalent protection across SQL injection, XSS, RCE, LFI, and bad-bot traffic, regardless of which proxy you run.
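
To make step 2 more concrete, here is a minimal sketch of how a single-line CRS `SecRule` might be reduced to a JSON-friendly record before conversion. The function name `parse_secrule`, the record fields (`id`, `variables`, `pattern`), the made-up rule ID, and the regex-only handling are assumptions for illustration; they are not taken from `owasp2json.py`, which may work quite differently.

```python
import json
import re

# Matches single-line rules of the form: SecRule VARIABLES "OPERATOR" "ACTIONS"
SECRULE_RE = re.compile(
    r'^SecRule\s+(?P<variables>\S+)\s+'
    r'"(?P<operator>(?:[^"\\]|\\.)*)"\s+'
    r'"(?P<actions>(?:[^"\\]|\\.)*)"\s*$'
)
ID_RE = re.compile(r"\bid:(\d+)")


def parse_secrule(line):
    """Return a dict for an @rx rule, or None if the line is not one."""
    match = SECRULE_RE.match(line.strip())
    if match is None:
        return None
    operator = match.group("operator")
    if not operator.startswith("@rx "):
        return None  # this sketch ignores @pm, @detectSQLi, and other operators
    rule_id = ID_RE.search(match.group("actions"))
    return {
        "id": rule_id.group(1) if rule_id else None,
        "variables": match.group("variables").split("|"),
        "pattern": operator[len("@rx "):],
    }


# Hypothetical rule; the ID is invented and does not refer to a real CRS rule.
sample = 'SecRule ARGS|REQUEST_URI "@rx (?i)union\\s+select" "id:100001,phase:2,deny,status:403"'
print(json.dumps(parse_secrule(sample)))
```

Real CRS files also contain multi-line rules, chained rules, and non-regex operators, so a production parser needs considerably more than this.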

## Highlights

| Feature | Details |
|---|---|
| **OWASP CRS coverage** | SQLi, XSS, RCE, LFI, RFI, plus generic anomaly and protocol-violation rules. |
| **Native output** | Nginx `map`/`if`, Apache `SecRule`, Traefik middleware TOML, HAProxy ACL files. |
| **Bad-bot blocking** | Curated User-Agent lists from public sources, with safe defaults that do **not** block major search engines. |
| **Daily refresh** | A scheduled GitHub Actions workflow rebuilds every backend and publishes a fresh release. |
| **Pre-built archives** | Skip the toolchain — download `nginx_waf.zip`, `apache_waf.zip`, `traefik_waf.zip`, or `haproxy_waf.zip`. |
| **Composable** | Each backend is a small Python converter on top of one JSON intermediate. Adding a new platform is a few hundred lines. |

> Using **Caddy**? See the dedicated [`caddy-waf`](https://github.com/fabriziosalmi/caddy-waf) project.

## Quick start

### Option 1 — download a pre-built release

```bash
# Pick the archive that matches your stack
curl -LO https://github.com/fabriziosalmi/patterns/releases/latest/download/nginx_waf.zip
unzip nginx_waf.zip -d /etc/nginx/waf_patterns
```

Then follow the [Nginx](https://fabriziosalmi.github.io/patterns/nginx),
[Apache](https://fabriziosalmi.github.io/patterns/apache),
[Traefik](https://fabriziosalmi.github.io/patterns/traefik), or
[HAProxy](https://fabriziosalmi.github.io/patterns/haproxy) integration guide.

### Option 2 — build from source

Requires **Python 3.11+**, `pip`, and `git`.

```bash
git clone https://github.com/fabriziosalmi/patterns.git
cd patterns
pip install -r requirements.txt

python owasp2json.py # 1. Fetch the latest OWASP CRS into owasp_rules.json
python json2nginx.py # 2. Convert into Nginx WAF config
python json2apache.py # …or Apache (ModSecurity)
python json2traefik.py # …or Traefik middleware
python json2haproxy.py # …or HAProxy ACL files
python badbots.py # 3. Generate bad-bot blocklists
```

Generated files land in a per-platform subdirectory of `waf_patterns/` (for example, `waf_patterns/nginx/`).
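
If you rebuild often, the same sequence can be wrapped in a small driver so that a failed step stops the build. This is only a sketch: `build_commands` and `run_build` are hypothetical helpers, not part of the repository, and the script names are the ones listed in the steps above.

```python
import subprocess
import sys

# Script names come from the build steps; the mapping itself is illustrative.
CONVERTERS = {
    "nginx": "json2nginx.py",
    "apache": "json2apache.py",
    "traefik": "json2traefik.py",
    "haproxy": "json2haproxy.py",
}


def build_commands(backends):
    """Return the commands to run, in order: fetch, convert, bad bots."""
    unknown = [name for name in backends if name not in CONVERTERS]
    if unknown:
        raise ValueError(f"unknown backend(s): {', '.join(unknown)}")
    steps = ["owasp2json.py"]
    steps += [CONVERTERS[name] for name in backends]
    steps.append("badbots.py")
    return [[sys.executable, script] for script in steps]


def run_build(backends):
    """Run each step from the repository root; stop at the first failure."""
    for command in build_commands(backends):
        subprocess.run(command, check=True)  # raises CalledProcessError on error


# Show the plan without executing anything.
for command in build_commands(["nginx", "haproxy"]):
    print(" ".join(command))
```

Calling `run_build(["nginx", "haproxy"])` from the repository root would then execute the plan.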

## Architecture

```text
┌─────────────────────┐    daily cron     ┌──────────────────────┐
│ coreruleset/        │ ───────────────▶  │ owasp2json.py        │
│ coreruleset (GH)    │                   │ → owasp_rules.json   │
└─────────────────────┘                   └──────────┬───────────┘
                                                     │
         ┌─────────────────┬──────────────────┬──────┴──────────┐
         ▼                 ▼                  ▼                 ▼
   json2nginx.py    json2apache.py     json2traefik.py   json2haproxy.py
         │                 │                  │                 │
         ▼                 ▼                  ▼                 ▼
   nginx_waf.zip    apache_waf.zip     traefik_waf.zip   haproxy_waf.zip
                    (published as a GitHub Release)
```

Each converter is independent, idempotent, and configured exclusively through environment variables (`INPUT_FILE`, `OUTPUT_DIR`). Full reference at [docs/api](https://fabriziosalmi.github.io/patterns/api).
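
As a rough illustration of that contract, the sketch below shows what a new backend converter could look like. Only the variable names `INPUT_FILE` and `OUTPUT_DIR` come from the text above; the JSON schema (a list of objects with a `pattern` key), the default values, and the `waf.acl` output name are assumptions made for the example.

```python
import json
import os
import tempfile
from pathlib import Path


def convert(input_file, output_dir):
    """Write one regex per line; rerunning on the same input gives the same file."""
    rules = json.loads(Path(input_file).read_text(encoding="utf-8"))
    # Assumed schema: a list of objects that each carry a "pattern" key.
    # Sorting and de-duplicating keeps the output deterministic.
    patterns = sorted({rule["pattern"] for rule in rules if rule.get("pattern")})
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / "waf.acl"
    target.write_text("\n".join(patterns) + "\n", encoding="utf-8")
    return target


def main():
    # Defaults are illustrative; only the variable names come from the README.
    input_file = os.environ.get("INPUT_FILE", "owasp_rules.json")
    output_dir = os.environ.get("OUTPUT_DIR", "waf_patterns/example")
    return convert(input_file, output_dir)


# Demo with throwaway files so the sketch runs without a real rule set.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "owasp_rules.json"
    source.write_text(json.dumps([
        {"id": "2", "pattern": "(?i)union\\s+select"},
        {"id": "1", "pattern": "\\.\\./"},
        {"id": "3", "pattern": "\\.\\./"},
    ]), encoding="utf-8")
    os.environ["INPUT_FILE"] = str(source)
    os.environ["OUTPUT_DIR"] = str(Path(tmp) / "out")
    print(main().read_text(encoding="utf-8"), end="")
```

Keeping all I/O behind two environment variables means the same script can run unchanged locally and in a scheduled workflow.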

## Repository layout

```text
patterns/
├── owasp2json.py # Pull and parse OWASP CRS into a JSON intermediate
├── json2nginx.py # JSON → Nginx (map + if directives)
├── json2apache.py # JSON → Apache (ModSecurity SecRule)
├── json2traefik.py # JSON → Traefik (middleware TOML)
├── json2haproxy.py # JSON → HAProxy (ACL files)
├── badbots.py # Public bot lists → per-platform blocklists
├── import_*_waf.py # Optional installers for each platform
├── waf_patterns/ # Generated outputs
│ ├── nginx/
│ ├── apache/
│ ├── traefik/
│ └── haproxy/
├── docs/ # VitePress documentation site
├── tests/ # Validation tests for each backend
└── .github/workflows/ # Daily build + release automation
```

## Integration in 60 seconds

### Nginx

```nginx
http {
    # Maps must be loaded in the http context.
    include /etc/nginx/waf_patterns/nginx/waf_maps.conf;
    include /etc/nginx/waf_patterns/nginx/bots.conf;

    server {
        # Rules and the bad-bot check apply per server block.
        include /etc/nginx/waf_patterns/nginx/waf_rules.conf;
        if ($bad_bot) { return 403; }
    }
}
```

### Apache (ModSecurity)

```apache

SecRuleEngine On
Include /etc/apache2/waf_patterns/apache/*.conf

```

### Traefik

```yaml
http:
  routers:
    app:
      rule: "Host(`example.com`)"
      service: app
      middlewares: [waf-protection@file, bot-blocker@file]
```

### HAProxy

```haproxy
frontend http-in
    bind *:80
    acl waf_match path,url_dec -m reg -i -f /etc/haproxy/waf.acl
    acl bad_bot hdr(User-Agent) -m reg -i -f /etc/haproxy/bots.acl
    http-request deny deny_status 403 if waf_match || bad_bot
```

Full guides — with logging, whitelists, and tuning — live in the [docs](https://fabriziosalmi.github.io/patterns/).
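
Before enabling a deny rule, it can help to dry-run a few sample requests against the pattern files. The sketch below approximates the two HAProxy ACLs above in Python: it URL-decodes the path (roughly what `url_dec` does) and applies case-insensitive regexes (like `-m reg -i`). Python's `re` is not the regex engine HAProxy uses, so treat the result as a rough check only. `load_patterns` and `is_blocked` are hypothetical helpers, and the sample patterns are made up.

```python
import re
from urllib.parse import unquote


def load_patterns(lines):
    """Compile one case-insensitive regex per non-empty, non-comment line."""
    compiled = []
    for line in lines:
        text = line.strip()
        if text and not text.startswith("#"):
            compiled.append(re.compile(text, re.IGNORECASE))
    return compiled


def is_blocked(path, user_agent, waf_patterns, bot_patterns):
    """Mirror `waf_match || bad_bot`: block if either list matches."""
    decoded = unquote(path)
    waf_match = any(p.search(decoded) for p in waf_patterns)
    bad_bot = any(p.search(user_agent) for p in bot_patterns)
    return waf_match or bad_bot


# Made-up sample patterns; in practice, read the lines from waf.acl and bots.acl.
waf = load_patterns([r"\.\./", r"union\s+select"])
bots = load_patterns(["MJ12bot", "SemrushBot"])

print(is_blocked("/index.html", "Mozilla/5.0", waf, bots))
print(is_blocked("/a/%2e%2e/%2e%2e/etc/passwd", "Mozilla/5.0", waf, bots))
print(is_blocked("/", "Mozilla/5.0 (compatible; mj12bot/v1.4)", waf, bots))
```

Running known-good URLs from your own access logs through a check like this is a cheap way to spot false positives before they reach production.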

## Bad-bot example output (Nginx)

```nginx
# In the http context:
map $http_user_agent $bad_bot {
    default 0;
    "~*AhrefsBot" 1;
    "~*SemrushBot" 1;
    "~*MJ12bot" 1;
    "~*GPTBot" 1;
}

# Inside a server block:
if ($bad_bot) { return 403; }
```

The default list blocks SEO crawlers, AI training bots, and known scanners while explicitly **allowing** major search engines (Google, Bing, DuckDuckGo, Yandex, Baidu).
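
A map like the one above could be generated from a plain list of User-Agent names along these lines. This is a sketch under stated assumptions, not the logic of `badbots.py`: `render_bot_map` is a hypothetical helper, and the `ALLOWED` set uses the commonly published crawler tokens for the search engines named above rather than the project's actual allowlist.

```python
import re

# Crawler tokens for the search engines named above (assumed, not the project's list).
ALLOWED = {"googlebot", "bingbot", "duckduckbot", "yandexbot", "baiduspider"}


def render_bot_map(user_agents):
    """Render an Nginx map that sets $bad_bot to 1 for each listed agent."""
    seen = set()
    lines = ["map $http_user_agent $bad_bot {", "    default 0;"]
    for agent in user_agents:
        name = agent.strip()
        key = name.lower()
        if not name or key in ALLOWED or key in seen:
            continue  # skip blanks, allowlisted crawlers, and duplicates
        seen.add(key)
        # Escape regex metacharacters so names match literally.
        escaped = re.escape(name).replace('"', '\\"')
        lines.append(f'    "~*{escaped}" 1;')
    lines.append("}")
    return "\n".join(lines)


print(render_bot_map(["AhrefsBot", "Googlebot", "MJ12bot", "ahrefsbot", "GPTBot"]))
```

Filtering the allowlist at generation time, rather than in the Nginx config, keeps the deployed map small and makes the search-engine exception easy to audit.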

## Automation

| Workflow | Schedule | Purpose |
|----------|----------|---------|
| [`update_patterns.yml`](.github/workflows/update_patterns.yml) | Daily + manual | Re-fetch CRS, regenerate every backend, publish a release |
| [`test_nginx.yml`](.github/workflows/test_nginx.yml) | On PR | Validate generated Nginx rules against a live container |
| [`test_apache_docker.yml`](.github/workflows/test_apache_docker.yml) | On PR | Validate generated Apache rules against ModSecurity in Docker |
| [`docs.yml`](.github/workflows/docs.yml) | On `docs/` change | Build and deploy the VitePress docs to GitHub Pages |

All workflows run on **GitHub-hosted runners** (`ubuntu-latest`).

## Documentation

The full documentation lives at **[fabriziosalmi.github.io/patterns](https://fabriziosalmi.github.io/patterns/)** — built with [VitePress](https://vitepress.dev/) and deployed automatically.

- [Getting Started](https://fabriziosalmi.github.io/patterns/getting-started)
- [Nginx](https://fabriziosalmi.github.io/patterns/nginx) · [Apache](https://fabriziosalmi.github.io/patterns/apache) · [Traefik](https://fabriziosalmi.github.io/patterns/traefik) · [HAProxy](https://fabriziosalmi.github.io/patterns/haproxy)
- [Bad Bot Detection](https://fabriziosalmi.github.io/patterns/badbots)
- [API & Scripts Reference](https://fabriziosalmi.github.io/patterns/api)

## Contributing

1. Fork the repository.
2. Create a feature branch: `git checkout -b feature/your-change`.
3. Commit and push.
4. Open a pull request — the test workflows will run automatically.

See [CONTRIBUTING.md](CONTRIBUTING.md) for details and [SECURITY.md](SECURITY.md) for the disclosure policy.

## License

Released under the [MIT License](LICENSE).

## Resources

- [OWASP Core Rule Set](https://github.com/coreruleset/coreruleset)
- [ModSecurity](https://modsecurity.org/)
- [Nginx](https://nginx.org/) · [Apache HTTPD](https://httpd.apache.org/) · [Traefik](https://traefik.io/) · [HAProxy](https://www.haproxy.org/)
- [ai.robots.txt](https://github.com/ai-robots-txt/ai.robots.txt) — upstream AI-bot list

---


Built and maintained by Fabrizio Salmi.