{"id":49496293,"url":"https://github.com/madic-creates/claude-alert-analyzer","last_synced_at":"2026-06-05T07:00:51.317Z","repository":{"id":349430077,"uuid":"1197560626","full_name":"madic-creates/claude-alert-analyzer","owner":"madic-creates","description":"AI-powered alert analysis, receives Alertmanager and CheckMK webhooks, gathers diagnostic context, and uses Claude for root-cause analysis via ntfy notifications","archived":false,"fork":false,"pushed_at":"2026-05-30T11:07:22.000Z","size":4716,"stargazers_count":3,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-30T11:23:29.567Z","etag":null,"topics":["alertmanager","checkmk","claude-ai","devops","kubernetes","monitoring","ntfy","webhook"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/madic-creates.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":"NOTICE","maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-31T17:21:14.000Z","updated_at":"2026-05-30T11:07:25.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/madic-creates/claude-alert-analyzer","commit_stats":null,"previous_names":["madic-creates/claude-alert-analyzer"],"tags_count":119,"template":false,"template_full_name":null,"purl":"pkg:github/madic-creates/claude-alert-analyzer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/madic-creates%2Fclaude-alert-analyzer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/madic-creates%2Fclaude-alert-analyzer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/madic-creates%2Fclaude-alert-analyzer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/madic-creates%2Fclaude-alert-analyzer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/madic-creates","download_url":"https://codeload.github.com/madic-creates/claude-alert-analyzer/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/madic-creates%2Fclaude-alert-analyzer/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33932048,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-05T02:00:06.157Z","response_time":120,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["alertmanager","checkmk","claude-ai","devops","kubernetes","monitoring","ntfy","webhook"],"created_at":"2026-05-01T09:30:32.731Z","updated_at":"2026-06-05T07:00:51.301Z","avatar_url":"https://github.com/madic-creates.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Claude Alert Analyzer\n\n\u003ca href=\"docs/screenshot01.png\"\u003e\u003cimg src=\"docs/screenshot01.png\" alt=\"Analyzer Result\" height=\"400\" align=\"left\" hspace=\"20\"\u003e\u003c/a\u003e\n\nLLM-powered root-cause analysis for monitoring alerts. Incoming alerts from Alertmanager (Kubernetes) or CheckMK trigger automated diagnostic collection, which is sent to Claude for analysis. The resulting root-cause assessment is delivered to operators via [ntfy](https://ntfy.sh).\n\nInstead of staring at a 3 AM \"DiskPressure\" alert and manually running ten `kubectl` / `ssh` commands, you get a short markdown summary on your phone: likely cause, blast radius, suggested remediation — derived from real metrics, events, pod logs, and (for CheckMK hosts) live SSH diagnostics.\n\n\u003cbr clear=\"left\"\u003e\n\n## Blog posts (German)\n\n- [KI-gestützte Alert-Analyse für Kubernetes und CheckMK](https://www.geekbundle.org/ki-gestuetzte-alert-analyse-fuer-kubernetes-und-checkmk/)\n- [Claude Analyzer — Entwicklung](https://www.geekbundle.org/claude-analyzer-entwicklung/)\n\n## How it works\n\n```\nAlert fires → Webhook → Gather diagnostics → Claude / LLM API → ntfy notification\n```\n\nTwo independent analyzers share a common library but run as separate binaries:\n\n| Analyzer | Alert Source | Diagnostics Gathered |\n|----------|-------------|---------------------|\n| **k8s-analyzer** | Alertmanager webhook | Prometheus metrics, K8s events, pod status, pod logs + agentic `kubectl_exec` / `promql_query` loop |\n| **checkmk-analyzer** | CheckMK notification script | CheckMK REST API (host/service state), agentic SSH diagnostics |\n\nBoth deduplicate repeat alerts (configurable cooldown) and process work concurrently (5 workers, queue depth 20). All diagnostic output is passed through a secret-redaction filter before leaving the analyzer.\n\n## Quick start (Docker — CheckMK)\n\nThe fastest way to try the **checkmk-analyzer**. For Kubernetes, see [`docs/install-k8s.md`](docs/install-k8s.md).\n\nYou need:\n\n- An [Anthropic API key](https://console.anthropic.com/) or compatible token\n- An [ntfy](https://ntfy.sh) server\n- SSH key + `known_hosts` for the monitored hosts (unprivileged user)\n- A CheckMK automation user\n\n```bash\nmkdir -p ./ssh\ncp /path/to/id_ed25519  ./ssh/id_ed25519\ncp /path/to/known_hosts ./ssh/known_hosts\nchmod 600 ./ssh/id_ed25519\n\ndocker run -d --name checkmk-analyzer \\\n  --restart unless-stopped --read-only --cap-drop ALL --user 65534:65534 \\\n  -p 127.0.0.1:8080:8080 -p 127.0.0.1:9101:9101 \\\n  -v \"$(pwd)/ssh:/ssh:ro\" \\\n  -e WEBHOOK_SECRET=\"change-me\" \\\n  -e ANTHROPIC_API_KEY=\"sk-ant-...\" \\\n  -e CHECKMK_API_URL=\"https://checkmk.example.com/mysite/check_mk/api/1.0/\" \\\n  -e CHECKMK_API_USER=\"automation\" \\\n  -e CHECKMK_API_SECRET=\"...\" \\\n  -e NTFY_PUBLISH_URL=\"https://ntfy.example.com\" \\\n  -e NTFY_PUBLISH_TOPIC=\"checkmk-analysis\" \\\n  ghcr.io/madic-creates/claude-alert-checkmk-analyzer:latest\n\ncurl -sf http://127.0.0.1:8080/health\n```\n\nThen install the CheckMK notification script and create a notification rule — full steps in [`docs/install-checkmk.md`](docs/install-checkmk.md).\n\n## Container images\n\nPre-built images are published to GHCR on every push to `main`:\n\n```\nghcr.io/madic-creates/claude-alert-kubernetes-analyzer:latest\nghcr.io/madic-creates/claude-alert-checkmk-analyzer:latest\n```\n\n| Image | Base | Size |\n|-------|------|------|\n| `claude-alert-kubernetes-analyzer` | `scratch` | ~60 MB |\n| `claude-alert-checkmk-analyzer` | `alpine:3.23` | ~20 MB (includes `openssh-client`) |\n\n## Documentation\n\n**Install**\n\n- [`docs/install-k8s.md`](docs/install-k8s.md) — Kubernetes deployment (Kustomize manifests, Alertmanager wiring, RBAC)\n- [`docs/install-checkmk.md`](docs/install-checkmk.md) — CheckMK deployment (Docker / Compose, notification script, rules, optional `ai_context` host attribute)\n\n**Operations**\n\n- [`docs/configuration.md`](docs/configuration.md) — full env-var reference (shared, k8s, checkmk, LLM provider, storm robustness)\n- [`docs/observability.md`](docs/observability.md) — API endpoints, Prometheus metrics, scrape config, logging\n- [`docs/hardening.md`](docs/hardening.md) — runtime hardening, agentic-loop guardrails, RBAC and SSH details\n- [`docs/cost-and-storm-protection.md`](docs/cost-and-storm-protection.md) — operator guide for prompt caching, severity-based routing, token-cost dashboards, storm-mode and circuit-breaker rollout\n\n**Development \u0026 maintenance**\n\n- [`docs/development.md`](docs/development.md) — build, test, project layout, key patterns, CI/CD\n- [`docs/cost-and-storm-protection-internals.md`](docs/cost-and-storm-protection-internals.md) — architecture and component reference for the cost/storm features\n- [`docs/pre-commit.md`](docs/pre-commit.md) — pre-commit hook configuration\n- [`docs/renovate.md`](docs/renovate.md) — dependency update automation\n- [`docs/cleanup-ghcr.md`](docs/cleanup-ghcr.md) — GHCR tag retention\n\n## License\n\nLicensed under the [Apache License, Version 2.0](LICENSE). See [`NOTICE`](NOTICE) for attribution of bundled third-party software.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmadic-creates%2Fclaude-alert-analyzer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmadic-creates%2Fclaude-alert-analyzer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmadic-creates%2Fclaude-alert-analyzer/lists"}