{"id":48456145,"url":"https://github.com/aurevlan/whatisup","last_synced_at":"2026-05-10T02:05:28.438Z","repository":{"id":344379654,"uuid":"1169638327","full_name":"AurevLan/WhatIsUp","owner":"AurevLan","description":"Open-source uptime monitoring: multi-probe geographic correlation, browser scenario recorder, real-time dashboard, teams RBAC, SSO/OIDC, public status pages. One Docker command to deploy.","archived":false,"fork":false,"pushed_at":"2026-05-03T21:43:17.000Z","size":2312,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-03T22:27:54.459Z","etag":null,"topics":["alerting","devops","docker","fastapi","incident-management","infrastructure-monitoring","monitoring","multi-probe","open-source","playwright","python","self-hosted","slo","ssl","status-page","uptime","uptime-monitoring","vuejs"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AurevLan.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-01T01:19:29.000Z","updated_at":"2026-05-03T21:43:12.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/AurevLan/WhatIsUp","commit_stats":null,"previous_names":["aurevlan/whatisup"],"tags_count":35,"template":false,"template_full_name":null,"purl":"pkg:github/AurevLan/WhatIsUp","repository_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/repositories/AurevLan%2FWhatIsUp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AurevLan%2FWhatIsUp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AurevLan%2FWhatIsUp/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AurevLan%2FWhatIsUp/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AurevLan","download_url":"https://codeload.github.com/AurevLan/WhatIsUp/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AurevLan%2FWhatIsUp/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32587823,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-03T22:12:39.696Z","status":"ssl_error","status_checked_at":"2026-05-03T22:09:10.534Z","response_time":103,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["alerting","devops","docker","fastapi","incident-management","infrastructure-monitoring","monitoring","multi-probe","open-source","playwright","python","self-hosted","slo","ssl","status-page","uptime","uptime-monitoring","vuejs"],"created_at":"2026-04-06T23:03:10.208Z","updated_at":"2026-05-03T23:01:27.672Z","avatar_url":"https://github.com/AurevLan.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ch1 
align=\"center\"\u003eWhatIsUp\u003c/h1\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cstrong\u003eThe self-hosted uptime platform that actually tells you \u003cem\u003ewhere\u003c/em\u003e things break — and stops shouting when it shouldn't.\u003c/strong\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  Multi-probe geographic correlation · real-time dashboard · SLO tracking · intelligent alerting · public status pages · mobile app.\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"LICENSE\"\u003e\u003cimg alt=\"License: MIT\" src=\"https://img.shields.io/badge/License-MIT-blue.svg\"\u003e\u003c/a\u003e\n  \u003cimg alt=\"Version\" src=\"https://img.shields.io/badge/version-1.8.0\"\u003e \u003c!-- x-release-please-version --\u003e\n  \u003cimg alt=\"Python 3.14\" src=\"https://img.shields.io/badge/Python-3.14-blue\"\u003e\n  \u003cimg alt=\"Vue 3\" src=\"https://img.shields.io/badge/Vue-3.5-42b883\"\u003e\n  \u003cimg alt=\"FastAPI\" src=\"https://img.shields.io/badge/FastAPI-0.125+-009688\"\u003e\n  \u003cimg alt=\"PostgreSQL 16\" src=\"https://img.shields.io/badge/PostgreSQL-16-336791\"\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"#quick-start\"\u003eQuick start\u003c/a\u003e ·\n  \u003ca href=\"#whats-new-in-15\"\u003eWhat's new in 1.5\u003c/a\u003e ·\n  \u003ca href=\"#why-whatisup\"\u003eWhy WhatIsUp\u003c/a\u003e ·\n  \u003ca href=\"#features\"\u003eFeatures\u003c/a\u003e ·\n  \u003ca href=\"#architecture\"\u003eArchitecture\u003c/a\u003e ·\n  \u003ca href=\"CHANGELOG.md\"\u003eChangelog\u003c/a\u003e\n\u003c/p\u003e\n\n---\n\n## Why WhatIsUp\n\nThere's no shortage of uptime tools. WhatIsUp focuses on three things most of them don't do well at once:\n\n- 🌍 **Real multi-probe correlation** — deploy lightweight probes in any datacenter, office, or region, and let WhatIsUp tell you if an outage is global, regional, or probe-local. 
One failed probe no longer means one false page.\n- 🔕 **Alerting that shuts up** — flapping suppression, incident groups, dependency-aware cascade suppression, maintenance windows, storm protection, and a brand-new **impact preview** (see v1.1 below) so you calibrate thresholds with data instead of vibes.\n- 🎛 **Self-hosted, batteries included** — one `docker compose up`, no SaaS lock-in, no per-monitor pricing. Playwright scenarios, SSO/OIDC, teams \u0026 RBAC, IaC import/export, and a mobile app all ship in the box.\n\nIt's built for teams who want Datadog-grade monitoring without Datadog-grade bills, and who'd rather own their data than rent it.\n\n---\n\n## What's new in 1.5\n\n**Wave 1 Tier 1 complete** — eight ⭐ items shipped at once to close the SRE adoption + UX backlog. 100% backward-compatible, two additive migrations.\n\n- ⌨️ **Command palette v2 + global shortcuts** (T1-10 / T1-15) — Cmd/Ctrl+K opens fuzzy search across monitors, incidents, recent items and actions. Hover a row to **pause/resume a monitor** or **acknowledge an incident** without leaving the palette. New keyboard shortcuts (`g d/m/i/a/p/s` for navigation, `c` to create, `/` to search, `?` for the cheatsheet) wired everywhere. Recent visits persist in `localStorage` (capped at 12).\n- 🔕 **Programmable alert silences** (T1-01) — new `AlertSilence` resource and `Silences` page in the sidebar. Mute alerts for one monitor or all of them during a known-noisy window (cert renewal, deploy storm) without distorting uptime. Built-in 15 m / 1 h / 4 h / 1 d duration presets, status badges (Active / Scheduled / Past). Dispatch is short-circuited *before* any external IO.\n- 📱 **Quick-ack \u0026 snooze from mobile push** (T1-04) — FCM payload now ships `actions: [ack, snooze_1h, snooze_4h]`. The mobile app calls the matching endpoint on tap, no need to open the UI. New `Incident.snooze_until` field with auto-clear on resolve. 
Bounded duration (5 min – 24 h).\n- ✅ **Multi-select bulk actions enriched** (T1-12) — `MonitorsView` gets two new dropdowns: **Move to group** and **Add tag**. `IncidentsView` gains a per-row checkbox plus an **Acknowledge all** button — one round-trip via the new `POST /incidents/bulk-ack` endpoint.\n- 🪄 **CreateMonitor wizard** (T1-14) — new 3-step flow (type → target → review + notifications) replaces the legacy modal for the four most common types (HTTP, TCP, DNS, heartbeat). Mobile-scrollable body. Falls back to the advanced form for scenario / composite / keyword / json_path.\n- ✨ **Polish: skeletons, empty states, replayable tour** (T1-16 / T1-18) — generic `SkeletonBox` / `Row` / `Text` components replace `animate-pulse` placeholders on Dashboard, Monitors and MonitorDetail. Six empty states standardised with contextual CTAs and doc links. The onboarding wizard is now replayable from any empty state via `?tour=1`.\n\nSee the full [CHANGELOG](CHANGELOG.md#150---2026-04-25) for the per-item breakdown, including 14 new pytest cases and 50 new vitest cases (suite at 161 green).\n\n---\n\n## What's new in 1.4\n\n- 🔗 **Shareable filter URLs** (T1-11) — MonitorsView and IncidentsView now persist their filters (search, status, type, group, days) via both the querystring *and* localStorage. Refresh keeps your view; copying the URL reproduces the exact same filtering for a teammate. A new generic `useFilterPreset` composable drives both views with 8 tests.\n- 🌍 **User timezone preference** (T1-13) — New `User.timezone` field (IANA, nullable). Default `null` means the browser's resolved zone is used. Settings page gets a \"Preferences\" card with a 45-zone picker + auto option. Dates across IncidentsView and MonitorDetailView now format in your selected zone; hover a date to see the absolute ISO + zone tooltip. 
Backend validates against `zoneinfo.available_timezones()` — invalid zones are rejected with 422.\n- 🆕 **`PATCH /auth/me`** — self-update endpoint limited to non-privileged fields (`full_name`, `timezone`). Escalation attempts are silently ignored.\n\nSee the full [CHANGELOG](CHANGELOG.md#140---2026-04-24) for details.\n\n---\n\n## What's new in 1.3\n\n- 📖 **Per-monitor runbooks** — attach an incident response procedure (Markdown) to any monitor. A dedicated `Runbook` tab appears on the detail page *only* when enabled, and the rendered content is shown inline on open incidents in the incidents list — so the on-call engineer sees the steps to take without leaving the page. Unchecking the toggle wipes the content server-side (no orphan data). Built-in safe Markdown renderer (headings, task checkboxes, code blocks, http/https links only) — no extra dependency.\n- ⚡ **Dashboard load-time — 8 s → \u003c 200 ms** — the monitor list's aggregate queries were scanning 1.6 M `check_results` rows on every request. Sparkline switched from a `row_number() OVER (PARTITION BY)` window function (7 s) to a `JOIN LATERAL ... 
LIMIT 20` (3 ms, index-only per monitor), plus a new BRIN index on `check_results.checked_at` for time-window aggregates (P95 288 ms → 75 ms, uptime bulk 147 ms → 30 ms).\n- 🩹 **Stability fixes** — FastAPI `X-Forwarded-Proto` handling (no more HTTPS → HTTP redirect breakage behind nginx), `/monitors/graph` route order (was 422), sidebar menu click (vue-router 5 slot navigate rewrite), charts rendering (`apexchart` global registration), stricter CSP with external theme init, Cache-Control hardening on `/index.html`.\n\nSee the full [CHANGELOG](CHANGELOG.md#130---2026-04-23) for details.\n\n---\n\n## What's new in 1.1\n\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003ctd width=\"50%\"\u003e\u003cimg src=\"docs/screenshots/alert-matrix-cards.svg\" alt=\"Alert matrix with cards and impact preview\"\u003e\u003c/td\u003e\n    \u003ctd width=\"50%\"\u003e\u003cimg src=\"docs/screenshots/alert-templates.svg\" alt=\"One-click alerting templates\"\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\n      \u003cstrong\u003eAlert matrix v2 — cards + impact preview\u003c/strong\u003e\u003cbr\u003e\n      The per-monitor alerting panel is now a stack of collapsible cards: one card per condition, coloured channel chips, an \"Advanced\" section that hides noise, and a live \u003ccode\u003e≈ N / 30j\u003c/code\u003e badge that replays the last 30 days of data through your rules so you can calibrate thresholds \u003cem\u003ebefore\u003c/em\u003e they page you.\n    \u003c/td\u003e\n    \u003ctd\u003e\n      \u003cstrong\u003eOne-click alerting templates\u003c/strong\u003e\u003cbr\u003e\n      Apply a preset (Standard, Strict/Paging, Low noise) in a single click. Built-in templates ship seeded in the database and admins can create, edit and delete their own from a dedicated section in the Alerts page. 
Channels stay empty — you still decide where alerts fire.\n    \u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd colspan=\"2\"\u003e\u003cimg src=\"docs/screenshots/tags-rbac.svg\" alt=\"Monitor tags and tag-scoped RBAC\"\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd colspan=\"2\"\u003e\n      \u003cstrong\u003eMonitor tags \u0026 tag-scoped RBAC\u003c/strong\u003e\u003cbr\u003e\n      Label monitors with \u003ccode\u003eenv:prod\u003c/code\u003e, \u003ccode\u003eteam:backend\u003c/code\u003e, \u003ccode\u003etier:critical\u003c/code\u003e, whatever makes sense. Filter dashboards and lists by tag. Target alert rules at a tag (\u003ccode\u003eAlertRule.tag_selector\u003c/code\u003e) so one rule covers every monitor that carries it. Grant users \u003ccode\u003eview\u003c/code\u003e/\u003ccode\u003eedit\u003c/code\u003e/\u003ccode\u003eadmin\u003c/code\u003e access scoped to a tag via \u003ccode\u003eUserTagPermission\u003c/code\u003e.\n    \u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\nSee the full [CHANGELOG](CHANGELOG.md#110---2026-04-14) for the complete list, including the removal of the never-implemented `uptime_below` condition.\n\n---\n\n## Screenshots\n\n| Dashboard | Monitor detail |\n|-----------|---------------|\n| ![Dashboard](docs/screenshots/dashboard.svg) | ![Monitor detail](docs/screenshots/monitor-detail.svg) |\n\n| Monitors list | Probe map |\n|--------------|-----------|\n| ![Monitors](docs/screenshots/monitors-view.svg) | ![Probes](docs/screenshots/probes-map.svg) |\n\n| Public status page | Scenario builder |\n|-------------------|-----------------|\n| ![Status page](docs/screenshots/public-status.svg) | ![Scenario](docs/screenshots/scenario-builder.svg) |\n\n| Alert matrix v2 | Alerting templates |\n|-----------------|-------------------|\n| ![Alert matrix](docs/screenshots/alert-matrix-cards.svg) | ![Templates](docs/screenshots/alert-templates.svg) |\n\n| Tags \u0026 RBAC | Browser extension 
recorder |\n|-------------|----------------------------|\n| ![Tags](docs/screenshots/tags-rbac.svg) | ![Extension](docs/screenshots/extension-recorder.svg) |\n\n---\n\n## Features\n\n### Monitoring\n- **HTTP / HTTPS** — status codes, redirect following, response time, SSL certificate expiry\n- **TCP** — port reachability (databases, SSH, SMTP, custom services)\n- **UDP** — datagram probe; ICMP port-unreachable = down, timeout = filtered/open\n- **DNS** — record resolution with optional value assertion (A, AAAA, CNAME, MX, TXT, NS); drift detection (baseline auto-learn); cross-probe consistency check with split-horizon support\n- **Keyword** — response body scan with optional negate mode\n- **JSON Path** — structured response validation (e.g. `$.status == \"ok\"`)\n- **SMTP** — banner + EHLO handshake with optional STARTTLS; measures banner-to-ready time\n- **Ping** — ICMP round-trip time via system `ping`\n- **Domain expiry** — WHOIS lookup; configurable warning days before domain expiration\n- **Browser scenarios** — multi-step Playwright automation (navigate, click, fill, assert, extract, screenshot) with Core Web Vitals (LCP, CLS, INP)\n- **Composite monitors** — aggregate multiple monitors with `all_up`, `any_up`, `majority_up`, or `weighted_up` rules; drives the full incident pipeline\n- **Heartbeat / cron monitoring** — dead-man's switch for scheduled jobs; unique ping URL per monitor\n- **Advanced assertions** — regex body check, response header validation (exact or `/regex/`), JSON Schema validation\n\n### Infrastructure\n- **Multi-probe architecture** — deploy lightweight probe agents in any location; correlate outages geographically\n- **Network type** — tag each probe as `external` (public internet) or `internal` (corporate LAN) to distinguish internal vs external failures\n- **Probe map on dashboard** — Leaflet world map with per-probe 24h uptime (🟢 ≥ 99 % / 🟡 ≥ 90 % / 🔴 \u003c 90 %) and online/offline status; auto-refreshes every 60 s\n- **Network 
scope per monitor** — restrict each monitor to `all`, `internal`, or `external` probes; useful for LAN-only services\n- **Probe groups** — admin-defined groups; assign probes and grant visibility to specific users\n- **City / address geocoding** — type any address or city to auto-resolve GPS coordinates (Nominatim, no API key)\n\n### Observability\n- **Real-time dashboard** — WebSocket push, no polling\n- **SLO / Error budget** — configurable target (%) and window (days); burn rate and budget-remaining tracking\n- **SLA reports** — custom date range, uptime %, incident list, P95 response time; JSON download\n- **Custom push metrics** — `POST /api/v1/metrics/{monitor_id}` for business KPIs (orders, latency…)\n- **Annotations** — timestamped notes on the monitor timeline (deployments, changes)\n- **Response time trend** — 6-hour rolling comparison with colour-coded indicator\n\n### Incidents \u0026 alerting\n- **Alert matrix v2 (1.1)** — card-based editor: one card per condition, coloured channel chips, collapsible \"Advanced\" params (threshold, min-duration, re-notify, business-hours schedule), multi-select condition picker, and per-condition \"How it works\" help in plain language\n- **Impact preview (1.1)** — live `≈ N / 30j` badge on each rule, computed server-side by replaying the proposed configuration against the last 30 days of check results and incidents (statistical tail estimate for anomaly detection)\n- **Alerting templates (1.1)** — apply a preset (Standard, Strict/Paging, Low noise) in one click; templates are stored in DB and managed from a dedicated section in the Alerts page; superadmins create/edit their own, built-in templates are read-only\n- **Automatic incident lifecycle** — open on failure, resolve on recovery, flapping detection with per-monitor thresholds\n- **Incident groups** — monitors sharing the same failing probes within a 90 s window are grouped into one persistent incident group; one notification instead of N\n- **Monitor 
dependencies** — when a parent monitor is down, child incidents are automatically suppressed; eliminates cascade alert storms\n- **Alert storm protection** — per-rule rate cap (`storm_max_alerts` within `storm_window_seconds`); forced digest when threshold is exceeded\n- **Performance baseline alerting** — alert when response time exceeds a configurable multiple of the 7-day rolling hourly baseline\n- **Anomaly detection** — z-score against a 7-day rolling mean ± stddev, filtered to the same ±3 h window of the day so day/night traffic patterns are respected\n- **Tag-scoped alert rules (1.1)** — target a single rule at every monitor carrying a given tag via `AlertRule.tag_selector`\n- **Auto post-mortem** — Markdown report generated on incident resolution (timeline, alerts, metrics)\n- **Alert channels** — Email (SMTP), Webhook (HMAC-SHA256), Telegram Bot, Slack, PagerDuty, Opsgenie, [Signal](#signal-alerts), FCM (native mobile push)\n- **Persistent digest** — digest scheduling stored in Redis; survives server restarts\n- **Maintenance windows** — suppress alerts during planned downtime; group-level suppression support\n\n### Public status pages\n- **Shareable URL** — `/status/{slug}`, no login required\n- **90-day history bars** — daily uptime visualisation per component\n- **Incident timeline** — 30-day incident log with duration\n- **Email subscriptions** — visitors subscribe to outage updates; secure unsubscribe token\n\n### Platform\n- **Monitor tags \u0026 tag-scoped RBAC (1.1)** — label monitors with free-form `key:value` tags (`env:prod`, `team:backend`, `tier:critical`); filter lists and dashboards by tag; grant users `view`/`edit`/`admin` access scoped to a tag via `UserTagPermission`; one alert rule can target every monitor carrying a given tag\n- **Teams \u0026 RBAC** — create teams, invite members with 4 roles (`owner` \u003e `admin` \u003e `editor` \u003e `viewer`); monitors, groups, channels, and maintenance windows can be team-scoped; 
backward-compatible — single-user mode preserved when no teams are created\n- **SSO / OIDC** — OpenID Connect PKCE flow; link user accounts to any OIDC provider (Keycloak, Authentik, Auth0, Google…); optional auto-provisioning of new accounts on first login; configured entirely from the admin GUI (no restart required)\n- **Admin panel** — dedicated UI for user management (`is_active`, `can_create_monitors`), probe group access control, all-monitors view, and live OIDC settings\n- **Probe groups** — admin-defined groups linking probes to users; regular users see only the probes assigned to their groups\n- **Network scope** — per-monitor `network_scope` field (`all` / `internal` / `external`); restricts which probe types run each check (e.g. internal-only services stay on LAN probes)\n- **Multi-language** — English (default) and French; toggle in the top bar; persisted to `localStorage`\n- **Light / dark theme** — toggle in top bar; auto-detected from `prefers-color-scheme`; persisted to `localStorage`\n- **Onboarding wizard** — guided 4-step setup for new users (first monitor, first alert); auto-dismissed after completion\n- **Infrastructure-as-Code** — `GET /api/v1/config` exports full config as JSON; `PUT /api/v1/config` imports declaratively with diff, dry-run, and prune support; resources matched by name for idempotence\n- **Plugin architecture** — check types and alert channels use a registry-based plugin system; extend without modifying core code\n- **Bulk actions** — multi-select monitors; bulk enable / pause / delete / export CSV\n- **Audit trail** — every admin action logged with before/after diff\n- **Data retention** — configurable auto-purge of old check results (default: 90 days)\n- **One-command deploy** — interactive wizard generates secrets, `.env`, and starts the stack\n- **Accessibility** — `prefers-reduced-motion` support, skip-to-content link, ARIA labels on interactive elements\n\n### Browser extension — scenario recorder\n\nThe WhatIsUp Chrome 
extension records browser actions and sends them directly to a monitor:\n\n1. Click **Start recording** in the extension popup\n2. Navigate and interact with any website — clicks, form fills (including passwords), and navigations are captured automatically\n3. Click **Stop** then **Send to WhatIsUp** — the scenario is created as a monitor in one click\n\n**Security**: password values are stored as `{{password_N}}` placeholders in the step list; the real values are kept in a separate encrypted store, encrypted at rest with Fernet, and masked in all API responses. They are decrypted only when delivered to the probe at check time.\n\nInstall the extension from `extension/` by loading it as an unpacked extension in Chrome (`chrome://extensions → Load unpacked`).\n\n---\n\n## Minimum requirements\n\n### Central server (API + frontend + PostgreSQL + Redis)\n\n| Probes | Monitors | CPU | RAM | Disk | PostgreSQL | Redis |\n|--------|----------|-----|-----|------|------------|-------|\n| 1–3 | ≤ 50 | 2 vCPU | 2 GB | 20 GB SSD | shared (in-stack) | shared (in-stack) |\n| 3–10 | 50–200 | 4 vCPU | 4 GB | 40 GB SSD | shared or dedicated | shared |\n| 10–30 | 200–1 000 | 4–8 vCPU | 8 GB | 80 GB SSD | dedicated (4 GB RAM) | dedicated (1 GB) |\n| 30+ | 1 000+ | 8+ vCPU | 16 GB | 160 GB+ SSD | dedicated (8 GB+ RAM) | dedicated (2 GB+) |\n\n**Disk growth** — each check result row is ~300 bytes. 
With 200 monitors × 60 s interval × 5 probes, expect ~2.5 GB/month in PostgreSQL before retention purge (default: 90 days).\n\n### Probe agent\n\n| Mode | CPU | RAM | Notes |\n|------|-----|-----|-------|\n| HTTP / TCP / DNS / Ping only | 1 vCPU | 256 MB | Lightweight; runs on any VPS or Raspberry Pi |\n| With Playwright scenarios | 2 vCPU | 1 GB | Chromium loaded on demand; set `MAX_CONCURRENT_SCENARIOS=2` |\n| High-volume (100+ monitors) | 2 vCPU | 1–2 GB | Increase `MAX_CONCURRENT_CHECKS` (default: 10) |\n\n### Network\n\n| Component | Ports | Protocol |\n|-----------|-------|----------|\n| Central server (prod) | 80, 443 | HTTP/S (Nginx reverse proxy) |\n| Central server (dev) | 5173 (frontend), 8000 (API) | HTTP |\n| PostgreSQL | 5432 | TCP (internal only) |\n| Redis | 6379 | TCP (internal only) |\n| Probe → Server | 443 (or 8000 dev) | HTTPS outbound only |\n\n### Software\n\n- Docker ≥ 24 and Docker Compose v2\n- Linux amd64 or arm64 (all images are multi-arch)\n\n---\n\n## Quick start\n\n### Requirements\n\n- Docker ≥ 24 and Docker Compose v2\n- 2 GB RAM minimum (see [Minimum requirements](#minimum-requirements) for sizing)\n- Ports 80 / 443 available (production) or 5173 / 8000 (development)\n\n### Development (local)\n\n```bash\ngit clone https://github.com/AurevLan/WhatIsUp.git\ncd WhatIsUp\n\n# Start all services (PostgreSQL, Redis, API, frontend, local probe)\ndocker compose up -d\n\n# Check that all services report healthy\ndocker compose ps\n```\n\n| Service | URL |\n|---------|-----|\n| Frontend (Vite dev server) | http://localhost:5173 |\n| API (FastAPI) | http://localhost:8000 |\n| API docs (Swagger UI) | http://localhost:8000/docs |\n\nOn first start an **admin account** and a **local probe** are created automatically. 
The admin password is written to `/shared/ADMIN_PASSWORD` inside the server container:\n\n```bash\ndocker compose exec server cat /shared/ADMIN_PASSWORD\n# Delete the file after reading\ndocker compose exec server rm /shared/ADMIN_PASSWORD\n```\n\n### Production deploy\n\n\u003e **Recommended** — use the interactive wizard for all deployments:\n\n```bash\nbash deploy.sh\n```\n\nThe wizard generates secrets, writes `.env`, starts the stack, and **displays the admin password on screen** before securely deleting the temp file. See [`deploy.sh`](#deploying-with-deploysh) below for details.\n\n#### Manual production setup\n\n```bash\n# 1. Copy and edit the environment file\ncp .env.example .env\n\n# 2. Generate required secrets\nSECRET_KEY=$(openssl rand -hex 32)\nFERNET_KEY=$(python3 -c \\\n  \"from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())\")\n\n# Add to .env\necho \"SECRET_KEY=$SECRET_KEY\" \u003e\u003e .env\necho \"FERNET_KEY=$FERNET_KEY\" \u003e\u003e .env\n\n# 3. Start the production stack\ndocker compose -f docker-compose.prod.yml up -d\n\n# 4. 
Apply database migrations\ndocker compose -f docker-compose.prod.yml exec server alembic upgrade head\n```\n\n#### Environment variables\n\n| Variable | Required | Default | Description |\n|----------|----------|---------|-------------|\n| `SECRET_KEY` | ✅ prod | — | JWT signing key (`openssl rand -hex 32`) |\n| `FERNET_KEY` | ✅ prod | — | Fernet key for encrypting alert secrets at rest |\n| `DATABASE_URL` | ✅ | `postgresql+asyncpg://whatisup:whatisup@localhost/whatisup` | PostgreSQL connection string |\n| `REDIS_URL` | — | `redis://localhost:6379/0` | Redis connection string |\n| `CORS_ALLOWED_ORIGINS` | ✅ prod | `http://localhost:5173` | Comma-separated HTTPS origins |\n| `ENVIRONMENT` | — | `production` | Set to `development` to relax security checks |\n| `REGISTRATION_OPEN` | — | `true` | `false` = invite-only after first user |\n| `DATA_RETENTION_DAYS` | — | `90` | Days to keep check results (0 = keep forever) |\n| `SMTP_HOST` | — | `localhost` | SMTP server for email alerts |\n| `SMTP_PORT` | — | `587` | SMTP port |\n| `SMTP_USER` | — | — | SMTP username |\n| `SMTP_PASSWORD` | — | — | SMTP password |\n| `SMTP_FROM` | — | `noreply@example.com` | Sender address |\n| `OIDC_ENABLED` | — | `false` | Enable OIDC login (can also be set from admin GUI) |\n| `OIDC_ISSUER_URL` | — | — | OIDC provider discovery URL (e.g. `https://accounts.google.com`) |\n| `OIDC_CLIENT_ID` | — | — | Client ID registered with the OIDC provider |\n| `OIDC_CLIENT_SECRET` | — | — | Client secret (stored encrypted in DB when set from admin GUI) |\n| `OIDC_REDIRECT_URI` | — | — | Callback URL (leave empty to auto-detect from request base URL) |\n| `OIDC_SCOPES` | — | `openid email profile` | Space-separated OIDC scopes |\n| `OIDC_AUTO_PROVISION` | — | `true` | Create user accounts on first OIDC login |\n\n---\n\n## Deploying probe agents\n\nProbes are lightweight Python processes that run checks from a given location and report results to the central server. 
Deploy as many as you need in different datacenters, offices, or cloud regions.\n\n### 1. Register the probe\n\nGo to **Probes → Register probe** in the UI:\n1. Enter a **name** (e.g. `paris-dc1`) and **location** (any address, city, or landmark)\n2. Click **Locate** — Nominatim resolves the location to GPS coordinates automatically\n3. Choose **Network type**: `External` (public internet) or `Internal` (corporate LAN)\n4. Save — copy the API key displayed **only once**\n\n### 2. Run the probe\n\n```bash\ndocker run -d \\\n  --name whatisup-probe \\\n  --restart unless-stopped \\\n  -e CENTRAL_URL=https://your-whatisup.example.com \\\n  -e PROBE_API_KEY=wiu_your_api_key_here \\\n  -e PROBE_LOCATION=\"Paris DC1\" \\\n  ghcr.io/your-org/whatisup-probe:latest\n```\n\nOr with Docker Compose:\n\n```yaml\n# docker-compose.probe.yml\nservices:\n  probe:\n    image: ghcr.io/your-org/whatisup-probe:latest\n    restart: unless-stopped\n    environment:\n      CENTRAL_URL: https://your-whatisup.example.com\n      PROBE_API_KEY: wiu_your_api_key_here\n      PROBE_LOCATION: \"Paris DC1\"\n      MAX_CONCURRENT_CHECKS: \"10\"\n      HEARTBEAT_INTERVAL: \"15\"\n```\n\n### Probe environment variables\n\n| Variable | Required | Default | Description |\n|----------|----------|---------|-------------|\n| `CENTRAL_URL` | ✅ | — | WhatIsUp server base URL |\n| `PROBE_API_KEY` | ✅ | — | API key from probe registration |\n| `PROBE_LOCATION` | — | `unknown` | Display name in the UI |\n| `MAX_CONCURRENT_CHECKS` | — | `10` | Max parallel checks |\n| `MAX_CONCURRENT_SCENARIOS` | — | `2` | Max concurrent Playwright/Chromium instances (subset of `MAX_CONCURRENT_CHECKS`; reduce on low-memory machines) |\n| `HEARTBEAT_INTERVAL` | — | `15` | Seconds between server heartbeats |\n\n---\n\n## Signal alerts\n\nWhatIsUp sends Signal messages through a small REST gateway that runs alongside the server — it does not talk to Signal directly. 
The gateway project is [**bbernhard/signal-cli-rest-api**](https://github.com/bbernhard/signal-cli-rest-api), a maintained wrapper around the official `signal-cli`.\n\n### 1. Run the gateway\n\nAdd a service to your `docker-compose.yml`:\n\n```yaml\nsignal-api:\n  image: bbernhard/signal-cli-rest-api:latest\n  restart: unless-stopped\n  environment:\n    - MODE=normal\n  volumes:\n    - ./signal-data:/home/.local/share/signal-cli\n  ports:\n    - \"8080:8080\"\n```\n\n### 2. Register a phone number\n\nFollow the [gateway's README](https://github.com/bbernhard/signal-cli-rest-api#register-a-number). Typical flow:\n\n```bash\n# Request the SMS code\ncurl -X POST \"http://localhost:8080/v1/register/+33612345678\"\n\n# Enter the code you received\ncurl -X POST \"http://localhost:8080/v1/register/+33612345678/verify/123456\"\n```\n\n### 3. Add a Signal channel in WhatIsUp\n\nIn the UI: **Alerts → Add channel → Signal**, then fill:\n\n| Field | Example |\n|---|---|\n| **API URL** | `http://signal-api:8080` (internal hostname if the gateway is in the same Compose network) |\n| **Sender number** | `+33612345678` (E.164 format, the number you registered above) |\n| **Recipients** | `+33612345678, +33698765432` (comma-separated; Signal group IDs are also accepted as recipients) |\n\nClick **Test** to send a confirmation message. 
The channel configuration (`api_url`, `sender_number`, `recipients`) is encrypted at rest with Fernet like every other alert channel.\n\nImplementation: [`server/whatisup/services/channels/signal.py`](server/whatisup/services/channels/signal.py).\n\n---\n\n## Heartbeat monitoring (cron jobs)\n\nCreate a monitor of type **Heartbeat**, copy the generated ping URL, then call it from your job:\n\n```bash\n# In your crontab or CI pipeline\ncurl -s https://your-whatisup.example.com/api/v1/ping/your-heartbeat-slug\n```\n\nWhatIsUp opens an incident automatically if no ping arrives within `interval + grace` seconds.\n\n---\n\n## Custom push metrics\n\nPush any numeric metric from your application and visualise it alongside uptime data:\n\n```bash\ncurl -X POST https://your-whatisup.example.com/api/v1/metrics/{monitor_id} \\\n  -H \"Authorization: Bearer $TOKEN\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"metric_name\": \"orders_per_minute\", \"value\": 42.5, \"unit\": \"req/min\"}'\n```\n\nMetrics appear as time-series graphs grouped by `metric_name` in the monitor detail view.\n\n---\n\n## API reference\n\nFull interactive documentation at `/docs` (Swagger UI) and `/redoc`.\n\n### Authentication\n\n```bash\nTOKEN=$(curl -s -X POST https://your-whatisup.example.com/api/v1/auth/login \\\n  -H \"Content-Type: application/x-www-form-urlencoded\" \\\n  -d \"username=admin@example.com\u0026password=your_password\" \\\n  | jq -r '.access_token')\n\ncurl https://your-whatisup.example.com/api/v1/monitors/ \\\n  -H \"Authorization: Bearer $TOKEN\"\n```\n\n### Selected endpoints\n\n| Method | Endpoint | Description |\n|--------|----------|-------------|\n| `GET` | `/api/v1/monitors/` | List monitors |\n| `POST` | `/api/v1/monitors/` | Create monitor |\n| `POST` | `/api/v1/monitors/bulk` | Bulk enable / pause / delete |\n| `POST` | `/api/v1/monitors/{id}/trigger-check` | Trigger immediate check |\n| `GET` | `/api/v1/monitors/{id}/slo` | SLO / error budget status |\n| 
`GET` | `/api/v1/monitors/{id}/report` | SLA report (custom date range) |\n| `GET` | `/api/v1/monitors/{id}/incidents/{inc}/postmortem` | Auto post-mortem (Markdown) |\n| `GET` | `/api/v1/monitors/{id}/annotations` | List timeline annotations |\n| `POST` | `/api/v1/metrics/{monitor_id}` | Push custom metric |\n| `GET` | `/api/v1/metrics/{monitor_id}` | List custom metrics |\n| `GET` | `/api/v1/public/pages/{slug}/monitors` | Public status page data (no auth) |\n| `POST` | `/api/v1/public/pages/{slug}/subscribe` | Subscribe to status page |\n| `GET` | `/api/v1/ping/{slug}` | Heartbeat ping |\n| `GET` | `/api/v1/config/` | Export full config (IaC) |\n| `PUT` | `/api/v1/config/` | Import declarative config (IaC) |\n| `POST` | `/api/v1/teams/` | Create team |\n| `GET` | `/api/v1/teams/` | List user's teams |\n| `POST` | `/api/v1/teams/{id}/members` | Add team member |\n| `GET` | `/api/v1/onboarding/status` | Onboarding progress |\n| `POST` | `/api/v1/onboarding/complete` | Mark onboarding done |\n| `GET` | `/api/v1/status/monitors` | External status API |\n\n---\n\n## Architecture\n\n```\n┌─────────────────────────────────────────────────────────┐\n│                        Browser                           │\n│   Vue 3 · Pinia · Vite · Tailwind · ApexCharts · Leaflet│\n│   vue-i18n (EN / FR)                                    │\n└───────────────────────┬─────────────────────────────────┘\n                        │ HTTP + WebSocket\n┌───────────────────────▼─────────────────────────────────┐\n│                    FastAPI server                         │\n│  auth · monitors · probes · alerts · metrics · ws        │\n│  slowapi · structlog · Alembic · Prometheus metrics      │\n└─────┬──────────────────┬──────────────────┬─────────────┘\n      │                  │                  │\n┌─────▼──────┐  ┌────────▼──────┐  ┌───────▼───────────┐\n│ PostgreSQL │  │     Redis     │  │   Probe agent(s)  │\n│  (main DB) │  │ cache · pub/  │  │  APScheduler      │\n│            │  │ 
sub · rate    │  │  Playwright        │\n└────────────┘  └───────────────┘  └───────────────────┘\n```\n\n| Layer | Location |\n|-------|----------|\n| API endpoints | `server/whatisup/api/v1/` |\n| ORM models | `server/whatisup/models/` |\n| Pydantic schemas | `server/whatisup/schemas/` |\n| Business logic | `server/whatisup/services/` |\n| Core (config, security, db) | `server/whatisup/core/` |\n| Probe agent | `probe/whatisup_probe/` |\n| Frontend | `frontend/src/` |\n\n---\n\n## Development\n\n### Tests \u0026 linting\n\n```bash\n# Backend (server + probe), run from the repository root\n(cd server \u0026\u0026 pip install -e \".[dev]\" \u0026\u0026 pytest)\n(cd probe \u0026\u0026 pip install -e \".[dev]\" \u0026\u0026 pytest)\nruff check . \u0026\u0026 ruff format .\npip-audit\n\n# Frontend (Vitest + jsdom)\ncd frontend\nnpm install\nnpm test\nnpm run lint\nnpm audit\n```\n\nTests also run inside Docker:\n\n```bash\ndocker compose run --rm --no-deps server pytest tests/\ndocker compose run --rm --no-deps probe pytest tests/\ndocker run --rm -v ./frontend:/app -w /app node:25-alpine npx vitest run\n```\n\n### Database migrations\n\n```bash\ncd server\n\n# Generate after model changes\nalembic revision --autogenerate -m \"short description\"\n\n# Apply\nalembic upgrade head\n\n# Roll back one step\nalembic downgrade -1\n```\n\n---\n\n## Deploying with `deploy.sh`\n\nThe root `deploy.sh` script is an interactive wizard (in French) that handles the entire production setup. Run it with:\n\n```bash\nbash deploy.sh\n```\n\n### Deployment modes\n\n| Mode | Description |\n|------|-------------|\n| **1 — Serveur + sonde centrale** | Full platform with a local probe (recommended for single-server setups) |\n| **2 — Serveur seul** | Server only; add remote probes later |\n| **3 — Sonde distante** | Standalone probe agent that auto-enrolls to an existing server via API |\n\n### What the wizard does\n\n1. **Checks dependencies** — Docker, Docker Compose, `curl`, `openssl`\n2. 
**Generates secrets** — `SECRET_KEY` (hex), `FERNET_KEY` (Fernet), PostgreSQL and Redis passwords\n3. **Prompts for configuration** — domain name, SMTP settings, DNS servers (for probe modes), Let's Encrypt email\n4. **Generates `.env` files** — `.env` for the server stack, `.env.probe` for remote probe mode; file permissions set to `600`\n5. **Self-signed certificate** — generates a temporary TLS cert if Let's Encrypt is not configured\n6. **Probe auto-enrollment** (mode 3) — registers the probe via `POST /api/v1/probes/register` and writes the API key to `.env.probe`\n7. **Starts the stack** — builds and launches Docker Compose services\n8. **Displays credentials** — reads the admin password from a temp file, displays it in a framed box, then deletes the file from the container (first boot only)\n\n\u003e **Tip**: for Let's Encrypt, ensure port 80 is reachable from the internet and set your DNS A record before running the wizard.\n\n---\n\n## Security\n\n- **JWT** — HS256, access 15 min + refresh 7 days, Redis-revocable\n- **OIDC / SSO** — PKCE authorization-code flow; `oidc_client_secret` encrypted at rest with Fernet; secret never returned by the API\n- **Probe auth** — `X-Probe-Api-Key` bcrypt 12 rounds + Redis cache 300 s\n- **WebSocket auth** — JSON message frame (`{\"type\":\"auth\",\"token\":\"…\"}`), never a URL parameter\n- **Secrets at rest** — Fernet encryption for alert channel secrets (SMTP passwords, Telegram tokens, webhook secrets, PagerDuty / Opsgenie keys), OIDC client secret, **and** scenario variables (`secret: true`); `FERNET_KEY` is required in production (server refuses to start without it)\n- **SSRF protection** — all outbound HTTP requests (webhooks, OIDC discovery, probe checks, scenario navigation) validated against private/loopback/link-local IP ranges; redirect targets re-validated after following\n- **CORS** — explicit origins only; HTTP origins rejected in production\n- **CSP** — `default-src 'self'; script-src 'self'`\n- **Rate 
limiting** — all mutating endpoints rate-limited (30/min PATCH/DELETE, 60/min public pages); login 10/min, register 5/min, heartbeat 30/min, results 60/min, monitor creation 10/min\n- **Input validation** — Pydantic schemas use `extra=\"forbid\"` to reject unexpected fields on all create/update endpoints\n- **WebSocket** — per-IP connection limit enforced before the auth handshake; public slug validated against DB before accepting\n- **Ownership enforcement** — all mutating endpoints (including alert rule delete) verify resource ownership via JOIN; superadmin bypass is explicit\n- **Docker** — non-root user in all images; CPU/memory resource limits in production\n\nSee [SECURITY.md](SECURITY.md) for the responsible disclosure policy.\n\n---\n\n## Changelog\n\nSee [CHANGELOG.md](CHANGELOG.md) for the full version history.\n\n## License\n\nMIT — see [LICENSE](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faurevlan%2Fwhatisup","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faurevlan%2Fwhatisup","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faurevlan%2Fwhatisup/lists"}