{"id":25460837,"url":"https://github.com/libops/captcha-protect","last_synced_at":"2025-10-24T19:06:49.490Z","repository":{"id":278039219,"uuid":"934306672","full_name":"libops/captcha-protect","owner":"libops","description":"Traefik middleware to add an anti-bot challenge to individual IPs in a subnet when traffic spikes are detected from that subnet","archived":false,"fork":false,"pushed_at":"2025-07-09T10:39:33.000Z","size":148,"stargazers_count":12,"open_issues_count":5,"forks_count":2,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-08-30T01:46:10.508Z","etag":null,"topics":["anti-bot","captcha","rate-limiter","traefik","traefik-plugin"],"latest_commit_sha":null,"homepage":"https://plugins.traefik.io/plugins/67b387474e61aa8e06b25368/captcha-protect","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"unlicense","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/libops.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-02-17T15:59:37.000Z","updated_at":"2025-07-09T10:39:29.000Z","dependencies_parsed_at":"2025-03-15T07:24:36.300Z","dependency_job_id":"93d3e0e2-d733-4dee-a496-f4d7e2407bb3","html_url":"https://github.com/libops/captcha-protect","commit_stats":null,"previous_names":["libops/captcha-protect"],"tags_count":36,"template":false,"template_full_name":null,"purl":"pkg:github/libops/captcha-protect","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/libops%2Fcaptcha-protect","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/libops%2Fcaptcha-protect/tags","releases_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/repositories/libops%2Fcaptcha-protect/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/libops%2Fcaptcha-protect/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/libops","download_url":"https://codeload.github.com/libops/captcha-protect/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/libops%2Fcaptcha-protect/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273227307,"owners_count":25067684,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-02T02:00:09.530Z","response_time":77,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anti-bot","captcha","rate-limiter","traefik","traefik-plugin"],"created_at":"2025-02-18T05:04:46.599Z","updated_at":"2025-10-24T19:06:44.468Z","avatar_url":"https://github.com/libops.png","language":"Go","readme":"# Captcha Protect\n[![lint-test](https://github.com/libops/captcha-protect/actions/workflows/lint-test.yml/badge.svg)](https://github.com/libops/captcha-protect/actions/workflows/lint-test.yml)\n[![Go Report Card](https://goreportcard.com/badge/github.com/libops/captcha-protect)](https://goreportcard.com/report/github.com/libops/captcha-protect)\n\nTraefik middleware to challenge individual IPs in a subnet when traffic spikes are detected from that subnet, using a 
captcha of your choice for the challenge (turnstile, recaptcha, or hcaptcha). **Requires traefik `v2.11.1` or above**\n\nYou may have seen CAPTCHAs added to individual forms on the web to prevent bots from spamming submissions. This plugin extends that concept to your entire site (or specific routes on your site), effectively placing your entire site behind a CAPTCHA. However, the CAPTCHA is only triggered when a spike in traffic is detected from the same IP subnet. Once the CAPTCHA is successfully completed, that IP is no longer challenged, allowing uninterrupted browsing.\n\n\u003cdetails\u003e\u003csummary\u003eanti-bot decision tree\u003c/summary\u003e\n\n```mermaid\nflowchart TD\n    Client(Client accesses path on website) --\u003e IP{Has client passed captcha challenge in the last 24h?}\n    IP -- Yes --\u003e Continue(Go to original destination)\n    IP -- No --\u003e IP_BYPASS{Is client IP excluded by captcha-protect config?}\n    IP_BYPASS -- Yes --\u003e Continue(Go to original destination)\n    IP_BYPASS -- No --\u003e GOOD_BOT{Is client IP hostname in allowed bot list?}\n    GOOD_BOT -- No --\u003e PROTECTED_ROUTE{Is this route protected?}\n    GOOD_BOT -- Yes --\u003e CANONICAL_URL_BOT{Are there URL parameters?}\n    CANONICAL_URL_BOT -- Yes --\u003e PROTECTED_ROUTE{Is this route prefix in protectRoutes?}\n    CANONICAL_URL_BOT -- No --\u003e Continue(Go to original destination)\n    PROTECTED_ROUTE -- Yes --\u003e RATE_LIMIT{Is this IP in a range seeing increased traffic?}\n    PROTECTED_ROUTE -- No --\u003e Continue(Go to original destination)\n    RATE_LIMIT -- Yes --\u003e REDIRECT(Redirect to /challenge)\n    RATE_LIMIT -- No --\u003e Continue(Go to original destination)\n    REDIRECT --\u003e CHALLENGE{turnstile/recaptcha/hcaptcha challenge}\n    CHALLENGE -- Pass --\u003e Continue(Go to original destination)\n    CHALLENGE -- Fail --\u003e Stuck\n```\n\u003c/details\u003e\n\n## Config\n\n### Example\n\nBelow is an example `docker-compose.yml` 
with traefik as the frontend, and nginx as the backend. nginx is using this middleware to protect routes on the site that start with `/` (`protectRoutes: \"/\"`)\n\nSince the config values aren't specified, captcha-protect would use the default `rateLimit: 20` and `window: 86400` so any IPv4 in `X.Y.0.0/16` (or ipv6 in `/64`) could only access the site 20 times before individual IPs in that subnet are required to pass a captcha to continue browsing.\n\n```yaml\nnetworks:\n    default:\nservices:\n    nginx:\n        image: nginx:${NGINX_TAG}\n        labels:\n            traefik.enable: true\n            traefik.http.routers.nginx.entrypoints: http\n            traefik.http.routers.nginx.service: nginx\n            traefik.http.routers.nginx.rule: Host(`${DOMAIN}`)\n            traefik.http.services.nginx.loadbalancer.server.port: 80\n            traefik.http.routers.nginx.middlewares: captcha-protect@docker\n            traefik.http.middlewares.captcha-protect.plugin.captcha-protect.rateLimit: 0\n            traefik.http.middlewares.captcha-protect.plugin.captcha-protect.ipv4subnetMask: 8\n            traefik.http.middlewares.captcha-protect.plugin.captcha-protect.window: 864000\n            traefik.http.middlewares.captcha-protect.plugin.captcha-protect.protectRoutes: \"/\"\n            traefik.http.middlewares.captcha-protect.plugin.captcha-protect.captchaProvider: turnstile\n            traefik.http.middlewares.captcha-protect.plugin.captcha-protect.siteKey: ${TURNSTILE_SITE_KEY}\n            traefik.http.middlewares.captcha-protect.plugin.captcha-protect.secretKey: ${TURNSTILE_SECRET_KEY}\n            traefik.http.middlewares.captcha-protect.plugin.captcha-protect.goodBots: apple.com,archive.org,commoncrawl.org,duckduckgo.com,facebook.com,google.com,googlebot.com,googleusercontent.com,instagram.com,kagibot.org,linkedin.com,msn.com,openalex.org,twitter.com,x.com\n            traefik.http.middlewares.captcha-protect.plugin.captcha-protect.persistentStateFile: 
/tmp/state.json\n        networks:\n            default:\n                aliases:\n                  - nginx\n    traefik:\n        image: traefik:${TRAEFIK_TAG}\n        command: \u003e-\n            --api.insecure=false\n            --api.dashboard=false\n            --api.debug=false\n            --ping=true\n            --entryPoints.http.address=:80\n            --providers.docker=true\n            --providers.docker.network=default\n            --experimental.plugins.captcha-protect.modulename=github.com/libops/captcha-protect\n            --experimental.plugins.captcha-protect.version=v1.9.4\n        volumes:\n            - /var/run/docker.sock:/var/run/docker.sock:z\n            - /CHANGEME/TO/A/HOST/PATH/FOR/STATE/FILE:/tmp/state.json:rw\n        ports:\n            - \"80:80\"\n        networks:\n            default:\n                aliases:\n                    - traefik\n        healthcheck:\n            test: traefik healthcheck --ping\n        depends_on:\n            nginx:\n                condition: service_started\n```\n### Config options\n\n| **Parameter**           | **Type (Required)**     | **Default**              | **Description**                                                                                                                                                                                  |\n|-------------------------|-------------------------|--------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `mode`                  | `string`                | `prefix`                 | Must be: `prefix`, `suffix`, `regex`. Matching does not include query parameters. `excludeRoutes` always uses `prefix` except when `mode: regex`. 
Only use `regex` when needed                   |\n| `protectRoutes`         | `[]string` (required)   | `\"\"`                     | Comma-separated list of route prefixes/suffixes/regex patterns to protect.                                                                                                                       |\n| `excludeRoutes`         | `[]string`              | `\"\"`                     | Comma-separated list of route prefixes to **never** protect. e.g., `protectRoutes: \"/\"` protects the entire site. `excludeRoutes: \"/ajax\"` would never challenge any route starting with `/ajax` |\n| `captchaProvider`       | `string` (required)     | `\"\"`                     | The captcha type to use. Supported values: `turnstile`, `hcaptcha`, and `recaptcha`.                                                                                                             |\n| `siteKey`               | `string` (required)     | `\"\"`                     | The captcha site key.                                                                                                                                                                            |\n| `secretKey`             | `string` (required)     | `\"\"`                     | The captcha secret key.                                                                                                                                                                          |\n| `rateLimit`             | `uint`                  | `20`                     | Maximum requests allowed from a subnet before a challenge is triggered.                                                                                                                          |\n| `window`                | `int`                   | `86400`                  | Duration (in seconds) for monitoring requests per subnet.                                                                                                                                        
|\n| `ipv4subnetMask`        | `int`                   | `16`                     | CIDR subnet mask to group IPv4 addresses for rate limiting.                                                                                                                                      |\n| `ipv6subnetMask`        | `int`                   | `64`                     | CIDR subnet mask to group IPv6 addresses for rate limiting.                                                                                                                                      |\n| `ipForwardedHeader`     | `string`                | `\"\"`                     | Header to check for the original client IP if Traefik is behind a load balancer.                                                                                                                 |\n| `ipDepth`               | `int`                   | `0`                      | How deep past the last non-exempt IP to fetch the real IP from `ipForwardedHeader`. Default 0 returns the last IP in the forward header                                                          |\n| `goodBots`              | `[]string` (encouraged) | *see below*              | List of second-level domains for bots that are never challenged or rate-limited.                                                                                                                 |\n| `protectParameters`     | `string`                | `\"false\"`                | Forces rate limiting even for good bots if URL parameters are present. Useful for protecting faceted search pages.                                                                               |\n| `protectFileExtensions` | `[]string`              | `\"\"`                     | Comma-separated file extensions to protect. By default, your protected routes only protect html files. This is to prevent files like CSS/JS/img from tripping the rate limit.                    
|\n| `protectHttpMethods`    | `[]string`              | `\"GET,HEAD\"`             | Comma-separated list of HTTP methods to protect against                                                                                                                                          |\n| `exemptIps`             | `[]string`              | `privateIPs`             | CIDR-formatted IPs that should never be challenged. Private IP ranges are always exempt.                                                                                                         |\n| `exemptUserAgents`      | `[]string`              | `\"\"`                     | Comma-separated list of case-insensitive user agent **prefixes** to never challenge. e.g. `exemptUserAgents: edge` would never challenge useragents like \"Edge/12.4 ...\"                         |\n| `challengeURL`          | `string`                | `\"/challenge\"`           | URL where challenges are served. This will override existing routes if there is a conflict. Setting to blank will have the challenge presented on the same page that tripped the rate limit.     |\n| `challengeTmpl`         | `string`                | `\"./challenge.tmpl.html\"`| Path to the Go HTML template for the captcha challenge page.                                                                                                                                     |\n| `challengeStatusCode`   | `int`                   | `200`                    | HTTP Response status code to return when serving a challenge                                                                                                                                     |\n| `enableStatsPage`       | `string`                | `\"false\"`                | Allows `exemptIps` to access `/captcha-protect/stats` to monitor the rate limiter.                                                                                                               
|\n| `logLevel`              | `string`                | `\"INFO\"`                 | Log level for the middleware. Options: `ERROR`, `WARNING`, `INFO`, or `DEBUG`.                                                                                                                   |\n| `persistentStateFile`   | `string`                | `\"\"`                     | File path to persist rate limiter state across Traefik restarts. In Docker, mount this file from the host.                                                                                       |\n\n\n### Good Bots\n\nTo avoid having this middleware impact your SEO score, it's recommended to provide a value for `goodBots`. By default, no bots will be allowed to crawl your protected routes beyond the rate limit unless their second level domain (e.g. `google.com`) is configured as a good bot.\n\nA good default value for `goodBots` would be:\n\n```\ngoodBots: apple.com,archive.org,duckduckgo.com,facebook.com,google.com,googlebot.com,googleusercontent.com,instagram.com,kagibot.org,linkedin.com,msn.com,openalex.org,twitter.com,x.com\n```\n\n**However** if you set the config parameter `protectParameters=\"true\"`, even good bots won't be allowed to crawl protected routes if a URL parameter is on the request (e.g. `/foo?bar=baz`). 
This `protectParameters` feature is meant to help protect faceted search pages.\n\n\n## Overriding the challenge template file\n\nYou will probably want to theme the CAPTCHA challenge page to match the style of your site.\n\nYou can do that by copying the [challenge.tmpl.html](./challenge.tmpl.html) file in this repo into your docker compose project, mounting it into your traefik container\n\n```yaml\n    traefik:\n        volumes:\n            - ./host/path/to/challenge.tmpl.html:/challenge.tmpl.html:ro\n```\n\nand pointing the middleware to your overridden template with\n\n```yaml\n            traefik.http.middlewares.captcha-protect.plugin.captcha-protect.challengeTmpl: \"/challenge.tmpl.html\"\n```\n\nWhen you override the challenge template, the process probably looks like:\n\n1. Copy an HTML file from your existing site (so the challenge looks like the rest of your site).\n2. Replace some `\u003cdiv\u003e` in the HTML body of the file copied in step 1 with the `\u003cform\u003e...\u003c/form\u003e\u003cscript\u003e...\u003c/script\u003e` HTML tags/contents in this repo's default [challenge.tmpl.html](./challenge.tmpl.html). You must copy the `form` and `script` tags exactly as they are in the original challenge template. They use Go's templating language to inject the proper site key and other variables into the HTML response when a challenge is presented.\n3. Make sure the `\u003chead\u003e` of your overridden template contains:\n\n```\n    \u003cscript src=\"{{ .FrontendJS }}\" async defer referrerpolicy=\"no-referrer\"\u003e\u003c/script\u003e\n```\n\n## Similar projects\n\n- [Traefik RateLimit middleware](https://doc.traefik.io/traefik/middlewares/http/ratelimit/) - the core traefik ratelimit middleware will start sending 429 responses based on individual IPs, which might not be good enough to protect against traffic coming from distributed networks. 
Also, this plugin (captcha-protect) can leave files such as static assets out of the rate limiter so they do not count toward the rate limit.\n- [crowdsec-bouncer-traefik-plugin](https://github.com/maxlerebourg/crowdsec-bouncer-traefik-plugin) has a captcha option, but requires integrating with crowdsec to verify individual IPs. This plugin (captcha-protect) instead just checks the traffic actually visiting your site and verifies the traffic is from a person only when the traffic exceeds some rate limit you configure.\n\n## Attribution\n\n- the original implementation of this logic was [a drupal module called turnstile_protect](https://www.drupal.org/project/turnstile_protect). This traefik plugin was made to make the challenge logic even more performant than that Drupal module, and also to provide this bot protection to non-Drupal websites\n- making general captcha structs to support multiple providers was based on the work in [crowdsec-bouncer-traefik-plugin](https://github.com/maxlerebourg/crowdsec-bouncer-traefik-plugin)\n- in-memory cache thanks to https://github.com/patrickmn/go-cache\n\n## When to enable regex\n\nWhen possible, keep regex disabled: in the example benchmark below, a regex match is roughly an order of magnitude slower than a prefix check.\n\nHowever, when needed it can be enabled with `mode: regex`\n\n```\n$ go mod init bench\n$ cat \u003c\u003c EOF \u003e bench_test.go\npackage main\n\nimport (\n\t\"regexp\"\n\t\"strings\"\n\t\"testing\"\n)\n\nvar (\n\ttestPath = \"/api/v1/user/profile\"\n\tprefix   = \"/api/v1\"\n\tregex    = regexp.MustCompile(\"^/api/v1\")\n)\n\nfunc BenchmarkHasPrefix(b *testing.B) {\n\tfor i := 0; i \u003c b.N; i++ {\n\t\t_ = strings.HasPrefix(testPath, prefix)\n\t}\n}\n\nfunc BenchmarkRegexMatch(b *testing.B) {\n\tfor i := 0; i \u003c b.N; i++ {\n\t\t_ = regex.MatchString(testPath)\n\t}\n}\nEOF\n$ go test -bench=. 
-benchmem\n```\n\n```\ngoos: darwin\ngoarch: amd64\ncpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz\nBenchmarkHasPrefix-12     \t340856451\t         3.415 ns/op\t       0 B/op\t       0 allocs/op\nBenchmarkRegexMatch-12    \t27992568\t        41.20 ns/op\t       0 B/op\t       0 allocs/op\nPASS\n```\n\n\n## How to monitor the rate limiter\n\nIf you set `enableStatsPage` to true, it allows `exemptIps` to access `/captcha-protect/stats` to monitor the rate limiter. The main JSON key to look at on the stats page is the top-level \"rate\" key, which lists the subnets that are currently forced to be challenged based on request patterns and the `captcha-protect` configuration values used.\n\nIf you are using a computer within the `exemptIps` and have access to the command line tools `curl` and `jq`, here is a recipe to list the top 25 subnets being challenged:\n\n```bash\ncurl -s https://example.com/captcha-protect/stats |   jq -r '.rate | to_entries | sort_by(.value) | .[] | \"\\(.key): \\(.value)\"' |   tail -25\n```\n\nThis JSON state data is also found in the `state.json` file that you should have configured in your `docker-compose.yml` using the `persistentStateFile` setting and volume definition. NOTE: this file should only be changed by `captcha-protect` and not manually.\n\n## Troubleshooting\n\nHere is a way to troubleshoot your `captcha-protect` setup.\n\n### Verify that your Turnstile site-key is configured properly\n\nOne reason `captcha-protect` may not work is that the Cloudflare Turnstile widget site-key or secret-key is not properly set for `captcha-protect` to access. Below is a way to confirm that the Turnstile site-key is configured correctly. (**WARNING**: There is currently no easy way to check the secret-key, since the secret-key should never be displayed on a webpage or shared.)\n1. Visit the `captcha-protect` \"challenge\" URL, which is set to `https://example.com/challenge` by default.\n1. 
You should see a web page that says \"Verifying connection\" \n\u003cbr\u003e**NOTE**: If you customized the `challengeTmpl` configuration, the page may say something different.\n1. Look at the HTML source code for the page `https://example.com/challenge` by right-clicking on the page and selecting \"View page source\" (on Chrome).\n1. In the HTML source view that opens up, look for a `\u003cdiv\u003e` tag that has an attribute named `data-sitekey`, and check whether its value matches your Cloudflare Turnstile widget site-key value. \n\u003cbr\u003e**TIP**: You need to log in to your Cloudflare online account and go to the Turnstile section to see your site-key and secret-key values.\n1. If the site-key value in the HTML `\u003cdiv\u003e` tag does not match, update the `docker-compose.yml` and/or `.env` file to correctly pass the site-key value. Also check that the Cloudflare Turnstile widget secret-key is set correctly.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flibops%2Fcaptcha-protect","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flibops%2Fcaptcha-protect","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flibops%2Fcaptcha-protect/lists"}