# KubeQuest

**Kubernetes incident training for DevOps and SRE engineers.**

Practice real-world Kubernetes troubleshooting through interactive quizzes, War Room incident simulations, and daily challenges. Build production-grade debugging instincts before production does it for you.

[![Live](https://img.shields.io/badge/Live-kubequest.online-00D4FF?style=flat-square&logo=vercel)](https://www.kubequest.online/)
[![License: MIT](https://img.shields.io/badge/License-MIT-green?style=flat-square)](LICENSE)
[![CI](https://img.shields.io/github/actions/workflow/status/or-carmeli/KubeQuest/ci.yml?branch=main&style=flat-square&label=CI)](https://github.com/or-carmeli/KubeQuest/actions/workflows/ci.yml)
[![Tests](https://img.shields.io/badge/tests-742%2B%20passed-10B981?style=flat-square)](https://github.com/or-carmeli/KubeQuest/actions)
[![Supabase](https://img.shields.io/badge/Supabase-PostgreSQL-3ECF8E?style=flat-square&logo=supabase)](https://supabase.com)
[![Status](https://img.shields.io/badge/Status-status.kubequest.online-10B981?style=flat-square)](https://status.kubequest.online)

---

[kubequest.online](https://www.kubequest.online/) - no registration required, works instantly in guest mode.



---

## Table of Contents

- [Features](#features)
- [War Room](#war-room)
- [PRO Subscription](#pro-subscription)
- [Learning Path](#learning-path)
- [Tech Stack](#tech-stack)
- [Architecture](#architecture)
- [Billing Architecture](#billing-architecture)
- [Security Model](#security-model)
- [Observability](#observability)
- [CI/CD & Supply Chain Security](#cicd--supply-chain-security)
- [Local Development](#local-development)
- [Environment Variables](#environment-variables)
- [Testing](#testing)
- [Kubernetes Deployment](#kubernetes-deployment)
- [Project Structure](#project-structure)
- [Troubleshooting](#troubleshooting)
- [Contributing](#contributing)
- [Disclaimer](#disclaimer)
- [License](#license)

---

## Features

### Free Tier
- **Topic Quizzes** - 8 topics, 3 difficulty levels each, progressively unlocked
- **Daily Challenge** - 5 fresh questions every day
- **Mixed Quiz** - random questions across all topics
- **War Room** - 9 catalog incident scenarios
- **Interview Mode** - mandatory timer, hints disabled, exam pressure
- **Leaderboard** - global ranking with tier badges
- **Achievements** - milestone-based reward system
- **Hebrew / English** - full bilingual support with RTL layout
- **Guest Mode** - no account needed; sign up to sync progress
- **Kubernetes Guide** - built-in cheatsheet for quick reference
- **Roadmap View** - visual learning path through all topics

### PRO Tier
- **Unlimited Generated Incidents** - template-based War Room scenarios with randomized variations
- **Architecture Scenarios** - infrastructure decision-making simulations
- **Advanced SRE Training** - production-grade troubleshooting drills

---

## War Room

The War Room is KubeQuest's flagship feature: an interactive incident simulator that presents realistic Kubernetes production failures.

Users investigate system signals using simulated `kubectl` commands, discover clues stage by stage, and identify root causes of failures including misconfigured Services, missing ConfigMaps, broken probes, storage provisioning failures, DNS resolution issues, network policy blocks, OOM kills, and node disk pressure.

**9 catalog incidents** are available for free. Each is scored on investigation quality, efficiency, hints used, and root cause accuracy. Solved incidents contribute to the global leaderboard.

**Generated incidents** (PRO) use templates with randomized parameters - different namespaces, service names, pod hashes, and failure variations - providing unlimited practice without repeating the same scenario.
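
The idea behind template-based generation can be sketched as follows. This is a minimal illustration, not the actual engine: the template shape, the `generateIncident` function, and the seeded `mulberry32` generator are all assumptions made for the example.

```javascript
// Hypothetical sketch: none of these names are taken from the KubeQuest codebase.

// mulberry32-style PRNG: deterministic, so the same seed yields the same incident.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // value in [0, 1)
  };
}

// Pick one element of a non-empty array using the seeded generator.
function pick(rand, items) {
  return items[Math.floor(rand() * items.length)];
}

// Build a 5-character suffix that looks like a pod hash.
function podHash(rand) {
  const alphabet = "bcdfghjklmnpqrstvwxz2456789";
  let out = "";
  for (let i = 0; i < 5; i++) {
    out += alphabet[Math.floor(rand() * alphabet.length)];
  }
  return out;
}

// Illustrative template: one failure class with its randomizable parameters.
const template = {
  id: "missing-configmap",
  namespaces: ["payments", "checkout", "staging"],
  services: ["api-gateway", "orders", "billing-worker"],
  variations: ["configmap deleted", "configmap key renamed"],
};

// Fill the template with values drawn from the seeded generator.
function generateIncident(tpl, seed) {
  const rand = mulberry32(seed);
  const service = pick(rand, tpl.services);
  return {
    templateId: tpl.id,
    namespace: pick(rand, tpl.namespaces),
    service,
    pod: `${service}-${podHash(rand)}`,
    variation: pick(rand, tpl.variations),
  };
}
```

Seeding the generator keeps a scenario reproducible (useful for replays and bug reports), while a new seed produces a fresh variation of the same failure class.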

---

## PRO Subscription

KubeQuest uses a centralized FREE vs PRO feature gating architecture with server-authoritative subscription state.

### How It Works

1. User clicks "Upgrade to PRO" in the upgrade modal
2. Frontend calls a Supabase Edge Function (`create-checkout`)
3. Edge Function creates a Lemon Squeezy checkout session and returns the URL
4. User completes payment on Lemon Squeezy's hosted checkout
5. Lemon Squeezy sends a webhook to the `lemonsqueezy-webhook` Edge Function
6. Webhook verifies the HMAC-SHA256 signature and upserts subscription state into `user_subscriptions`
7. Frontend reads subscription via `get_my_subscription()` RPC - returns `tier: pro` or `tier: free`

### Billing Security

- **Server-authoritative**: PRO access is determined solely by the `user_subscriptions` table, populated by verified webhooks
- **Fail-closed**: any error in subscription fetch defaults to FREE tier
- **No client trust**: localStorage cannot grant PRO access (DEV-only override is dead-code-eliminated in production builds)
- **Webhook signature verification**: HMAC-SHA256 via Web Crypto API
- **RLS**: users can only read their own subscription row; only the service role (webhooks) can write
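
The fail-closed rule can be sketched as a small resolver. The names `resolveTier` and `fetchTier` are hypothetical, and the example assumes the fetch yields a `{ data, error }` pair where `data` carries a `tier` field.

```javascript
// Hypothetical sketch: map the result of a subscription fetch to a tier.
function resolveTier(result) {
  // Missing result or any reported error falls back to FREE.
  if (!result || result.error) return "free";
  // Only an explicit "pro" value grants PRO; anything else is FREE.
  return result.data && result.data.tier === "pro" ? "pro" : "free";
}

// Wrap the fetch so that thrown exceptions (network failures, timeouts) also fail closed.
async function fetchTier(fetchSubscription) {
  try {
    return resolveTier(await fetchSubscription());
  } catch {
    return "free";
  }
}
```

Because PRO is granted only on an explicit positive answer, every unexpected state (error, empty payload, unknown tier string) degrades to FREE rather than accidentally unlocking paid features.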

### Billing Management

PRO users can manage their subscription through Lemon Squeezy's customer portal, accessible via the "Manage Billing" button in the upgrade modal. Cancellation removes PRO access at the end of the billing period.
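
The end-of-period behavior can be sketched as an entitlement check. The row shape (`status`, `endsAt`) and the status strings are assumptions for illustration; the actual `user_subscriptions` columns may differ.

```javascript
// Hypothetical sketch: decide whether a subscription row still grants PRO.
function hasProAccess(subscription, now = new Date()) {
  if (!subscription) return false;
  if (subscription.status === "active") return true;
  // A cancelled subscription keeps access until the paid period ends.
  if (subscription.status === "cancelled" && subscription.endsAt) {
    // An unparseable date yields NaN, which compares false and therefore fails closed.
    return new Date(subscription.endsAt).getTime() > now.getTime();
  }
  return false;
}
```

For example, a row with `status: "cancelled"` and an `endsAt` in the future still returns `true`; once `now` passes `endsAt`, the same row returns `false`.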

---

## Learning Path

| # | Topic | Key Areas |
|---|-------|-----------|
| 1 | **Workloads & Scheduling** | Pods, Deployments, StatefulSets, DaemonSets, Jobs, taints/tolerations, HPA |
| 2 | **Networking & Services** | Services, Ingress, CoreDNS, NetworkPolicy, kube-proxy |
| 3 | **Cluster Operations** | kubeadm init/join/upgrade, etcd backup & restore, Static Pods, certificates |
| 4 | **Config & Secrets** | ConfigMaps, Secrets, RBAC, ServiceAccounts, Pod Security Admission |
| 5 | **Storage & Helm** | PV/PVC, StorageClass, dynamic provisioning, access modes, Helm |
| 6 | **Troubleshooting & Debugging** | CrashLoopBackOff, ImagePullBackOff, Node NotReady, DNS issues, probe failures |
| 7 | **OS & Linux Deep Dive** | Processes, memory, CPU, networking, container runtime internals |
| 8 | **Argo & GitOps** | ArgoCD, sync policies, ApplicationSets, App of Apps |

---

## Tech Stack

| Layer | Technology |
|-------|-----------|
| Frontend | [React 19](https://react.dev) + [Vite](https://vitejs.dev) |
| Backend | [Supabase](https://supabase.com) (PostgreSQL + Auth + Edge Functions) |
| Billing | [Lemon Squeezy](https://lemonsqueezy.com) (checkout, webhooks, customer portal) |
| Hosting | [Vercel](https://vercel.com) (Edge Network + CDN) |
| Error Tracking | [Sentry](https://sentry.io) (production errors + Web Vitals) |
| Analytics | [Vercel Analytics](https://vercel.com/analytics) + custom event catalog |
| CI/CD | GitHub Actions (build, test, scan, sign, attest) |
| Supply Chain | Cosign + Trivy + SBOM + Provenance |
| Security | CSP, HSTS, CORS, RLS, CodeQL, npm audit, Gitleaks, Kyverno |
| Monitoring | Supabase Edge Functions + pg_cron (60s health checks) |
| Testing | [Vitest](https://vitest.dev) (742+ tests) |
| Containerization | Docker (multi-stage, nginx:alpine, ~25MB) |

---

## Architecture

### Runtime

```mermaid
flowchart TB
USER([User])

subgraph Frontend["Frontend (Vercel)"]
SPA["React SPA"]
PWA["Service Worker<br/>Cache + Recovery"]
end

subgraph Backend["Supabase"]
AUTH["Authentication"]
RPC["RPC Functions<br/>Scoring + Gating"]
EDGE["Edge Functions<br/>Billing + Health"]
end

subgraph Billing["Lemon Squeezy"]
CHECKOUT["Checkout"]
WEBHOOK["Webhook"]
PORTAL["Customer Portal"]
end

subgraph Database["PostgreSQL"]
DB[("Quiz Data<br/>Subscriptions<br/>Progress")]
end

USER -->|HTTPS| SPA
SPA --> PWA
SPA --> AUTH
SPA --> RPC
RPC --> DB
EDGE --> DB
WEBHOOK -->|HMAC verified| EDGE
EDGE --> CHECKOUT
EDGE --> PORTAL

style Frontend fill:#111827,stroke:#00D4FF,stroke-width:2px,color:#ffffff
style Backend fill:#111827,stroke:#A855F7,stroke-width:2px,color:#ffffff
style Billing fill:#111827,stroke:#10B981,stroke-width:2px,color:#ffffff
style Database fill:#111827,stroke:#F59E0B,stroke-width:2px,color:#ffffff
```

### Server-Authoritative Scoring

The client never sees correct answers until after submission. All scoring runs through `SECURITY DEFINER` RPC functions with rate limiting. Quiz questions use stable content-derived keys for deduplication across content reseeds.
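
A content-derived key could look like the sketch below. The hashing scheme (FNV-1a over normalized text), the `q_` prefix, and the question shape are assumptions chosen for illustration; the source does not specify how the real keys are computed.

```javascript
// Hypothetical sketch: derive a key from question content instead of a row id.

// FNV-1a (32-bit) applied to UTF-16 code units, returned as 8 hex characters.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Collapse whitespace and case so cosmetic edits do not change the key.
function normalize(value) {
  return value.trim().replace(/\s+/g, " ").toLowerCase();
}

// The key depends only on topic, text, and options, never on the database id.
function stableQuestionKey(question) {
  const canonical = [question.topic, question.text, ...question.options]
    .map(normalize)
    .join("\u0001");
  return `q_${fnv1a(canonical)}`;
}
```

Because the database id is not part of the input, reseeding the content table with new ids leaves the key unchanged, so previously answered questions can still be recognized.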

### Feature Gating

```
featureGating.js → Feature catalog (FREE/PRO per feature)
subscriptionTier.js → Server-synced tier cache
useProAccess.js → React hook: { tier, isPro, canAccess, loading }
UpgradeModal.jsx → Checkout CTA / Manage Billing / Guest prompt
```

All gating decisions flow through `hasFeatureAccess(featureKey, userTier)`. No component directly checks subscription state.
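
A minimal sketch of such a central check is shown below. Only the `hasFeatureAccess(featureKey, userTier)` signature comes from this README; the catalog contents and feature keys are hypothetical.

```javascript
// Hypothetical feature catalog: each feature maps to the minimum tier it requires.
const FEATURES = Object.freeze({
  topic_quizzes: "free",
  war_room_catalog: "free",
  generated_incidents: "pro",
  architecture_scenarios: "pro",
});

// Single decision point for gating; unknown features are denied.
function hasFeatureAccess(featureKey, userTier) {
  if (!Object.hasOwn(FEATURES, featureKey)) return false;
  if (FEATURES[featureKey] === "free") return true;
  return userTier === "pro";
}
```

Keeping the catalog and the decision in one place means a feature can move between tiers by editing a single entry, with no changes to the components that call the check.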

---

## Billing Architecture

```mermaid
sequenceDiagram
actor User
participant App as React App
participant EF as Edge Function
participant LS as Lemon Squeezy
participant DB as PostgreSQL

User->>App: Click "Upgrade to PRO"
App->>EF: create-checkout (JWT auth)
EF->>LS: Create checkout session
LS-->>EF: Checkout URL
EF-->>App: { url }
App->>LS: Redirect to checkout
LS->>User: Payment form
User->>LS: Complete payment
LS->>EF: Webhook (HMAC-SHA256)
EF->>DB: Upsert user_subscriptions
User->>App: Return (?checkout=success)
App->>DB: get_my_subscription()
DB-->>App: { tier: "pro" }
```

### Edge Functions

| Function | Purpose | Auth |
|----------|---------|------|
| `create-checkout` | Creates Lemon Squeezy checkout URL | JWT required |
| `create-portal` | Returns customer portal URL for billing management | JWT required |
| `lemonsqueezy-webhook` | Processes subscription events, upserts DB | HMAC signature |
| `health-check` | Monitors all services every 60s | Service role |

### Subscription Table

The `user_subscriptions` table uses provider-agnostic column names (`provider_customer_id`, `provider_subscription_id`, `provider_price_id`) to support billing provider changes without schema migrations.

---

## Security Model

Six layers of defense - no layer trusts the one above it:

| Layer | Controls |
|-------|----------|
| Edge | HTTPS enforced, HSTS (1 year, preload), strict CORS |
| Application | CSP (no inline scripts), X-Frame-Options DENY, COOP/CORP |
| API | `SECURITY DEFINER` RPCs, rate limiting (120 checks/minute) |
| Database | Row Level Security on all tables, server-side answer validation |
| Billing | Webhook signature verification, fail-closed entitlement, no client trust |
| Code | CodeQL, Trivy, Gitleaks, npm audit, Kyverno, Dependabot |
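
The API-layer rate limit can be illustrated with a sliding-window counter. In KubeQuest the limit is enforced server-side inside the RPC functions; the JavaScript below is only a sketch of the logic, and `createRateLimiter` and its per-user keying are assumptions.

```javascript
// Hypothetical sketch: sliding-window limiter, 120 checks per 60 seconds by default.
function createRateLimiter(limit = 120, windowMs = 60000) {
  const hits = new Map(); // userId -> timestamps (ms) of recent accepted checks

  // Returns true if the call is allowed, false if the user is over the limit.
  return function allow(userId, now = Date.now()) {
    const recent = (hits.get(userId) || []).filter((t) => now - t < windowMs);
    if (recent.length >= limit) {
      hits.set(userId, recent);
      return false;
    }
    recent.push(now);
    hits.set(userId, recent);
    return true;
  };
}
```

Rejected calls are not recorded, so a client that keeps hammering the endpoint regains access as soon as its oldest accepted checks age out of the window.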

---

## Observability

### Health Monitoring

A self-monitoring loop built on Supabase:
1. **pg_cron** triggers a health-check Edge Function every 60 seconds
2. The function checks DB, API, Quiz Engine, Leaderboard, and Auth
3. Results are written to status tables (append-only for history)
4. 3 consecutive failures auto-create an incident and send an email alert via Resend
5. The frontend polls and renders a live status page at [status.kubequest.online](https://status.kubequest.online)
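
The escalation rule in step 4 can be sketched as a check over recent results. The function name, the history shape, and the `hasOpenIncident` flag are assumptions for illustration.

```javascript
// Hypothetical sketch: open an incident after N consecutive failed checks.
const FAILURE_THRESHOLD = 3;

// history is ordered oldest to newest; each entry is { ok: boolean }.
function shouldOpenIncident(history, hasOpenIncident = false, threshold = FAILURE_THRESHOLD) {
  if (hasOpenIncident) return false; // avoid duplicate incidents and repeated alerts
  if (history.length < threshold) return false;
  return history.slice(-threshold).every((check) => !check.ok);
}
```

Requiring consecutive failures filters out single transient errors, and the open-incident guard keeps a long outage from generating a new alert every 60 seconds.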

### Error Tracking

[Sentry](https://sentry.io) captures production errors including uncaught exceptions, render crashes, and RPC failures. PII is scrubbed before transmission. Web Vitals (LCP, CLS, INP) are reported as breadcrumbs.

### Synthetic Monitoring

GitHub Actions workflows verify production health externally:
- `synthetic-monitor.yml` - every 6 hours: homepage, API, response time, security headers
- `uptime.yml` - every 30 minutes: homepage and auth endpoint

Documentation: [docs/monitoring.md](docs/monitoring.md) | [docs/observability.md](docs/observability.md)

---

## CI/CD & Supply Chain Security

### PR Gate

All checks must pass before merge:

| Check | Tool |
|-------|------|
| Build | Vite production build |
| Test | Vitest (742+ tests) |
| RPC Signatures | Validates Supabase RPC parameter contracts |
| npm Audit | Dependency vulnerability scan |
| Gitleaks | Secret scanning |
| CodeQL | Static analysis |
| Trivy | Container vulnerability scan |
| K8s Policies | Kyverno manifest validation |

### Publish Pipeline

Push to `main` triggers Release Please (version bump + CHANGELOG). Version tags trigger Docker build, Trivy scan, GHCR push, SBOM + provenance attestation, and Cosign keyless signing.

### Supply Chain Security

- **Cosign** keyless image signing via GitHub OIDC
- **SBOM** attached to every published image
- **Provenance** attestation (`mode=max`)
- **Trivy** fails on HIGH/CRITICAL CVEs
- **Kyverno** enforces trusted registries, no-latest tags, resource limits
- **Dependabot** weekly updates for npm, Docker, Actions

---

## Local Development

### Prerequisites

- Node.js 18+
- A [Supabase](https://supabase.com) account (optional - guest mode works without it)

### Setup

```bash
git clone https://github.com/or-carmeli/KubeQuest.git
cd KubeQuest
npm install
cp .env.example .env # add credentials
npm run dev # http://localhost:5173
```

### Scripts

```bash
npm run dev # development server
npm run build # production build
npm run preview # preview production build locally
npm run test # run tests (vitest)
```

---

## Environment Variables

### Frontend (Vercel / `.env`)

| Variable | Required | Description |
|----------|----------|-------------|
| `VITE_SUPABASE_URL` | Yes | Supabase project URL |
| `VITE_SUPABASE_ANON_KEY` | Yes | Supabase anonymous key |
| `VITE_SENTRY_DSN` | No | Sentry DSN for error tracking |
| `VITE_SENTRY_ENVIRONMENT` | No | Sentry environment tag |

### Supabase Edge Functions (secrets)

| Secret | Required | Description |
|--------|----------|-------------|
| `LEMONSQUEEZY_API_KEY` | Yes | Lemon Squeezy API key |
| `LEMONSQUEEZY_STORE_ID` | Yes | Lemon Squeezy store ID |
| `LEMONSQUEEZY_VARIANT_ID` | Yes | Product variant for PRO subscription |
| `LEMONSQUEEZY_WEBHOOK_SECRET` | Yes | Webhook signing secret (HMAC-SHA256) |
| `SITE_URL` | Yes | Production URL for checkout redirects |
| `RESEND_API_KEY` | No | Email alerts for health check incidents |

> Guest mode, quizzes, and War Room catalog work without any billing secrets. Billing secrets are only needed for PRO subscription functionality.

---

## Testing

742+ tests covering:

| Area | What's Tested |
|------|--------------|
| Quiz state | Persistence, resume, cross-type isolation, corrupt data recovery |
| Scoring | Server-authoritative validation, deduplication, stable question keys |
| Security | Auth guards, rate limit propagation, RPC parameter validation, input sanitization |
| Feature gating | FREE/PRO access, tier resolution, fail-closed behavior |
| Billing | Subscription status mapping, active/inactive transitions, checkout/portal stubs |
| Analytics | Event catalog completeness, uniqueness, error resilience |
| Incidents | Unlock progression, deterministic sort, completion invariants |
| Architecture | Scenario unlock logic, scoring calculations |

```bash
npm run test
```

---

## Kubernetes Deployment

The `k8s/` directory contains production-ready manifests for self-hosting on any Kubernetes cluster.

| Manifest | Purpose |
|----------|---------|
| `namespace.yaml` | Isolated namespace `kubequest` |
| `deployment.yaml` | 2 replicas, resource limits, liveness + readiness probes |
| `service.yaml` | ClusterIP on port 80 |
| `ingress.yaml` | nginx Ingress with TLS via cert-manager |
| `hpa.yaml` | HPA: scales between 2 and 10 replicas, targeting 70% CPU utilization |

```bash
kubectl apply -f k8s/
```

Docker image available at `ghcr.io/or-carmeli/kubequest`.

---

## Project Structure

```
src/
  App.jsx               # Main application
  main.jsx              # Entry point (Sentry, Web Vitals, React mount)
  api/                  # Supabase RPCs (quiz, monitoring, analytics)
  config/               # Feature gating, billing, subscription tier, analytics events
  content/              # Quiz questions, daily challenges, incidents, scenarios
  components/           # UI components (quiz, roadmap, stats, upgrade modal)
    architecture/       # Architecture scenario components
    shared/             # Reusable UI primitives
  hooks/                # Custom React hooks (useProAccess, useIsMobile, useTimeRange)
  utils/                # Helpers (storage, i18n, telemetry, vercel env detection)
  features/
    sandbox/            # War Room incident engine
      data/             # 9 catalog incidents
      engine/           # Command execution, scoring, progress, hints, generation
      components/       # Terminal, result card, investigation replay, share
      views/            # WarRoomView (incident list + investigation)
public/
  sw.js                 # Service worker (offline cache, stale asset recovery)
  boot.js               # Blank-screen safety net, SW registration
supabase/
  migrations/           # 83 database migrations
  functions/            # Edge Functions (billing, health check, metrics)
  config.toml           # Function-level JWT configuration
k8s/                    # Kubernetes manifests
policies/               # Kyverno admission policies
.github/
  workflows/            # 10 CI/CD workflows
  dependabot.yml        # Weekly dependency updates
docs/                   # Monitoring, observability, privacy, K8s policies
```

---

## Troubleshooting

### "Unexpected token '<'" errors in Sentry

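Vercel Analytics and Speed Insights scripts (`/_vercel/*.js`) are served only by Vercel deployments. Elsewhere, a request for them typically returns the SPA's HTML instead, and the browser fails to parse it as JavaScript, which surfaces as `Unexpected token '<'`. To prevent this, the components are rendered only on production hostnames (`kubequest.online`) and are skipped on localhost and `vite preview`.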

### Stale assets after deployment

The service worker detects MIME type mismatches when Vercel serves HTML for old hashed asset URLs. It purges caches and triggers a client reload automatically. A blank-screen safety net in `boot.js` shows recovery UI if React fails to render within 8 seconds.
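
The MIME mismatch check can be sketched as a pure function. The name `isStaleAssetResponse` and the exact extension list are assumptions; the real service worker may inspect responses differently.

```javascript
// Hypothetical sketch: detect HTML being served for a script or stylesheet URL.
function isStaleAssetResponse(url, contentType) {
  // The base URL is a placeholder that only lets relative paths parse.
  const path = new URL(url, "https://example.invalid").pathname;
  const expectsScriptOrStyle = /\.(js|mjs|css)$/.test(path);
  const gotHtml = (contentType || "").toLowerCase().startsWith("text/html");
  return expectsScriptOrStyle && gotHtml;
}
```

Inside a `fetch` handler, a `true` result would be the signal to purge caches and ask the client to reload; for example, `isStaleAssetResponse("/assets/index-abc123.js", "text/html; charset=utf-8")` returns `true`.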

### Billing "not configured" errors

Edge Functions require Lemon Squeezy secrets to be set via `supabase secrets set`. Without them, checkout returns a user-facing error and users stay on the FREE tier (fail-closed).
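
A pre-flight check for this condition might look like the sketch below. The secret names come from the Environment Variables section; the helper itself and the way the environment is passed in are assumptions.

```javascript
// Secrets the billing functions need, as listed under Environment Variables.
const REQUIRED_BILLING_SECRETS = [
  "LEMONSQUEEZY_API_KEY",
  "LEMONSQUEEZY_STORE_ID",
  "LEMONSQUEEZY_VARIANT_ID",
  "LEMONSQUEEZY_WEBHOOK_SECRET",
  "SITE_URL",
];

// Hypothetical helper: list the required secrets that are unset or blank.
function missingBillingSecrets(env) {
  return REQUIRED_BILLING_SECRETS.filter((name) => {
    const value = env[name];
    return typeof value !== "string" || value.trim() === "";
  });
}
```

If the returned list is non-empty, the function can respond with a "not configured" error before contacting Lemon Squeezy, and the missing names can be logged to show which `supabase secrets set` calls are still needed.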

### Quiz answer RPC failures

The `check_quiz_answer` and `check_daily_answer` RPCs use `stable_question_key` for content-derived identification. Client-side validation rejects null/empty keys before calling the RPC.

---

## Contributing

Contributions welcome - new questions, bug fixes, incident scenarios.
See [CONTRIBUTING.md](CONTRIBUTING.md) for setup instructions and guidelines.

---

## Disclaimer

KubeQuest is an independent training platform and is not affiliated with, sponsored by, or endorsed by the Linux Foundation, the CNCF, or any certification body. CKA and Kubernetes are trademarks of The Linux Foundation.

---

## License

[MIT](LICENSE) - Or Carmeli