https://github.com/FluidifyAI/Regen
Open-source incident management: alerts, on-call, AI post-mortems. Self-hosted alternative to PagerDuty & incident.io. Works with Prometheus, Grafana, Datadog, Slack, and Teams. Free forever, BYO-AI.
Last synced: 23 days ago
- Host: GitHub
- URL: https://github.com/FluidifyAI/Regen
- Owner: FluidifyAI
- License: other
- Created: 2026-02-07T17:13:23.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2026-05-23T19:10:22.000Z (28 days ago)
- Last Synced: 2026-05-23T20:22:09.834Z (28 days ago)
- Topics: ai, alerting, devops, grafana, incident-management, observability, on-call, open-source, pagerduty-alternative, prometheus, self-hosted, slack, sre
- Language: Go
- Homepage: https://fluidify.ai
- Size: 40.8 MB
- Stars: 38
- Watchers: 4
- Forks: 7
- Open Issues: 14
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Codeowners: .github/CODEOWNERS
- Security: .github/SECURITY.md
Awesome Lists containing this project
- awesome-sre-tools - Regen - Open-source, self-hosted incident management with alert ingestion, on-call scheduling, escalation policies, AI-powered post-mortems, and Slack/Teams integration. AGPLv3 — self-hosted alternative to PagerDuty and Grafana OnCall. (Incident Management / Incident Response / IT Alerting / On-Call / Container Orchestration)
README
Part of the FluidifyAI open-source suite
---
> Unlimited alert noise reduction and incidents, unlimited on-call schedules, and unlimited AI postmortems and handoff digests.
> The **one-stop alternative to PagerDuty + incident.io**, with **1-click import from Grafana OnCall/PagerDuty**.
---
## Features
- Incident lifecycle with immutable timeline
- On-call rotations, layers, overrides
- Escalation policies with multi-step timeouts
- Alert ingestion — Prometheus, Grafana, CloudWatch, generic webhook
- Slack integration — channels, bot commands, timeline sync
- Microsoft Teams integration — Adaptive Cards, bot commands
- 1-click migration from Grafana OnCall/PagerDuty
- AI incident summaries + post-mortem drafts (BYO key — OpenAI/Anthropic/Ollama)
- AI postmortems, handoffs, and summaries sync with Slack/Teams
- SSO / SAML — Okta, Azure AD, Google Workspace — **free, always**
- Docker Compose + Kubernetes Helm chart
- PostgreSQL HA + Redis Sentinel support
- No limits on incidents/AI features
---
## Integrations
| Category | Tools |
|---|---|
| **Alert ingestion** | Prometheus Alertmanager · Grafana · AWS CloudWatch · Generic webhook |
| **Chat** | Slack · Microsoft Teams · Telegram |
| **AI** | OpenAI · Anthropic · Ollama (BYO key — local or cloud) |
| **Auth** | SAML 2.0 — Okta · Azure AD · Google Workspace · any compliant IdP |
| **Migration** | Grafana OnCall · PagerDuty |
| **Deploy** | Docker Compose · Kubernetes Helm · bare metal |
---
## Highlights of AI Capabilities
### Incident Summarization
### Historical Pattern Matching
### Post-Mortem Agent
### Handoff Digest
---
## Fluidify Regen vs PagerDuty / incident.io / Grafana OnCall
| | Regen | PagerDuty | incident.io | Grafana OnCall |
|---|---|---|---|---|
| Price | Free | $21–50/user/mo | $30+/user/mo | Archived |
| Self-hosted | ✅ | ❌ | ❌ | ✅ (archived) |
| Open source | AGPLv3 | ❌ | ❌ | Apache 2.0 |
| SSO | ✅ Free | 💰 Paid | 💰 Paid | ✅ Free |
| BYO AI | ✅ | ❌ | ❌ | ❌ |
| Agent-native | ✅ | ❌ | ❌ | ❌ |
| Alert + incident + on-call in one | ✅ | 💰💰💰 Paid | 💰💰💰 Paid | 💰💰💰 Paid |
| 1-Click imports | ✅ | ❌ | ❌ | ❌ |
---
> ## Migrate in 1 click from
>
> - [PagerDuty](docs/migrations/pagerduty.md)
> - [Grafana OnCall](docs/migrations/grafana-oncall.md)
---
## Installation
```bash
docker pull ghcr.io/fluidifyai/regen:latest
```
For detailed installation guides, see:
- [Docker](install-docker.md)
- [Docker Compose](install-docker-compose.md)
- [Kubernetes](install-kubernetes.md)
---
## Built for production
### Benchmark results (HA stack · Apple M2 / Colima · 2026-03-31)
| Scenario | Result |
|---|---|
| Webhook ingestion p99 | **< 10 ms** (target: < 200 ms) |
| Webhook sustained p50 / p95 | **1.55 ms / 2.82 ms** |
| API reads p95 (list / detail) | **4.42 ms / 2.83 ms** |
| Peak throughput (burst test) | **3,917 RPS — 0 × 5xx** |
| PostgreSQL failover RTO | **11 s** (Patroni + HAProxy, target: < 60 s) |
| Redis failover RTO | **5 s** (Sentinel 3-node quorum) |
| In-flight requests lost on rolling deploy | **0** |
> These numbers were captured on a single-machine local HA stack, so expect production results to differ with your hardware and network.
> Reproduce yourself: `make load-test` and `make chaos-db`. Full methodology in [docs/RELIABILITY.md](docs/RELIABILITY.md).
### How it stays up
- **Zero-downtime deploys** — rolling restarts drain in-flight requests before pod shutdown (SIGTERM → 30 s drain → exit)
- **PostgreSQL HA** — Patroni manages automatic primary election; HAProxy re-routes to the new primary within one health-check interval (3 s). No app restart, no config change.
- **Redis Sentinel** — 3-node quorum detects primary loss; workers reconnect to new master automatically
- **Kubernetes-native** — HPA, health-gated rolling deploys, resource limits out of the box
- **Webhook flood protection** — rate limiter returns 429 before the DB sees load spikes; validated at 3,917 RPS with zero OOM events
- **Full observability** — `/metrics` (Prometheus) + pre-built Grafana dashboard in `deploy/grafana/`
### Send a test alert
```bash
curl -X POST http://localhost:8080/api/v1/webhooks/prometheus \
  -H "Content-Type: application/json" \
  -d '{
    "receiver": "fluidify-regen",
    "status": "firing",
    "alerts": [{
      "status": "firing",
      "labels": {"alertname": "TestAlert", "severity": "critical"},
      "annotations": {"summary": "Test alert from curl"},
      "startsAt": "2024-01-01T00:00:00Z"
    }]
  }'
```
An incident is created automatically. If Slack is configured, a dedicated channel appears within seconds.
---
## Security
- **Authentication**: bcrypt (cost 12), timing-safe comparison, 5-attempt account lockout, HTTP-only SameSite=Strict session cookies
- **No SQL injection surface**: All database access uses GORM parameterized queries — no raw string interpolation
- **Webhook verification**: Slack (HMAC-SHA256 + replay protection), Teams (RSA/OIDC), CloudWatch (RSA + SSRF-safe cert validation)
- **Rate limiting**: Redis Lua script enforcing three tiers — 10/min on auth endpoints, 120/min unauthenticated, 600/min authenticated
- **Security headers**: CSP, HSTS (2 years), X-Frame-Options, X-Content-Type-Options, Permissions-Policy on every response
- **Container hardening**: non-root UID 1001, read-only filesystem, all Linux capabilities dropped
- **CORS**: explicit allowlist via `CORS_ALLOWED_ORIGINS`; dev-only fallback to localhost
- **Frontend**: no `dangerouslySetInnerHTML`, no secrets in bundle, session token never accessible to JavaScript
Before going to production, review the **[Production Security Checklist](SECURITY.md#11-production-security-checklist)**, which covers the prerequisites: TLS, PostgreSQL password, Redis auth, and CORS origins.
Full security architecture: [SECURITY.md](SECURITY.md)
---
## Contributing
We love contributions big and small. This is how you join us:
```bash
# Start backend dependencies (database + Redis)
docker-compose up -d db redis
# Run backend with hot reload
cd backend && go run ./cmd/regen/... serve
# Run frontend with hot reload
cd frontend && npm install && npm run dev
```
Before raising a PR:
- Read the setup & workflow in [CONTRIBUTING.md](CONTRIBUTING.md)
- Discover all developer commands with `make help`
- Have a big idea? [Let’s discuss it first](https://github.com/FluidifyAI/Regen/discussions)
---
## Support us
If you find Regen useful, consider supporting us by:
- Star this repo - It helps others discover Regen
- [Guide us](https://github.com/FluidifyAI/Regen/issues/new) - Every issue you raise goes into building Regen
---
## License
[AGPLv3](LICENSE) — free forever, including SSO.
---
Built by FluidifyAI · your incident data belongs to you