{"id":48037229,"url":"https://github.com/ponack/crucible-iap","last_synced_at":"2026-06-16T23:01:08.768Z","repository":{"id":347842269,"uuid":"1195455927","full_name":"ponack/crucible-iap","owner":"ponack","description":"Self-hosted Infrastructure Automation Platform — open-source Spacelift or Terraform Cloud alternative","archived":false,"fork":false,"pushed_at":"2026-06-13T22:24:01.000Z","size":10845,"stargazers_count":8,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-06-13T23:07:46.337Z","etag":null,"topics":["automation","devops","golang","infrastructure-as-code","oidc","opentofu","self-hosted","sveltekit","terraform"],"latest_commit_sha":null,"homepage":"https://www.forgedinfeatherstechnology.com/crucible-iap","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ponack.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":"docs/roadmap.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-29T17:36:14.000Z","updated_at":"2026-06-13T22:24:04.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/ponack/crucible-iap","commit_stats":null,"previous_names":["ponack/crucible-iap"],"tags_count":130,"template":false,"template_full_name":null,"purl":"pkg:github/ponack/crucible-iap","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ponack%2Fcrucible-iap","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositor
ies/ponack%2Fcrucible-iap/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ponack%2Fcrucible-iap/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ponack%2Fcrucible-iap/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ponack","download_url":"https://codeload.github.com/ponack/crucible-iap/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ponack%2Fcrucible-iap/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34426745,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-16T02:00:06.860Z","response_time":126,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["automation","devops","golang","infrastructure-as-code","oidc","opentofu","self-hosted","sveltekit","terraform"],"created_at":"2026-04-04T14:00:00.125Z","updated_at":"2026-06-16T23:01:08.754Z","avatar_url":"https://github.com/ponack.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Crucible IAP — Infrastructure Automation Platform\n\n\u003cimg src=\"assets/crucible-logo-transparent.png\" alt=\"Crucible IAP\" width=\"160\" /\u003e\n\nA self-hosted, privacy-first alternative to Spacelift. Push code → Crucible plans it → review → apply. 
State, policy, and audit trail stay on your own infrastructure.\n\n| | |\n| :---: | :---: |\n| ![Dashboard](assets/screenshots/dashboard.png) | ![Plan approval](assets/screenshots/run-plan-approval.png) |\n| **Dashboard** — active runs, approvals, recent activity. | **Plan → confirm → apply** — review the plan diff, then click Confirm. |\n| ![Stack detail with tags](assets/screenshots/stack-detail-tags.png) | ![Self-service blueprints](assets/screenshots/blueprint-list.png) |\n| **Stack detail** — tags, pinning, dependency graph, drift config. | **Blueprints** — parameterized templates app teams deploy via a form. |\n| ![Blueprint detail](assets/screenshots/blueprint-details.png) | ![Policy test playground](assets/screenshots/policy-test-playground.png) |\n| **Blueprint detail** — param table, types, defaults, publish controls. | **Policy playground** — test OPA/Rego rules against synthetic input. |\n| ![Policy git sources](assets/screenshots/policy-git-sources.png) | |\n| **Policy GitOps** — sync `.rego` files from a git repo on push. 
| |\n\n\u003e **Not a technical user?** Visit the [Crucible IAP product page](https://www.forgedinfeatherstechnology.com/crucible-iap) for screenshots, feature highlights, and an overview of what Crucible can do for your team.\n\n[![CI](https://github.com/ponack/crucible-iap/actions/workflows/ci.yml/badge.svg)](https://github.com/ponack/crucible-iap/actions/workflows/ci.yml)\n[![Latest Release](https://img.shields.io/github/v/release/ponack/crucible-iap)](https://github.com/ponack/crucible-iap/releases/latest)\n[![Go Report Card](https://goreportcard.com/badge/github.com/ponack/crucible-iap/api)](https://goreportcard.com/report/github.com/ponack/crucible-iap/api)\n[![zizmor](https://img.shields.io/badge/zizmor-passing-brightgreen)](https://github.com/ponack/crucible-iap/actions/workflows/ci.yml)\n![License: AGPL-3.0](https://img.shields.io/badge/license-AGPL--3.0-blue)\n![Status: Release Candidate](https://img.shields.io/badge/status-release%20candidate-green)\n\n---\n\nCrucible IAP orchestrates OpenTofu, Terraform, Ansible, and Pulumi runs with policy enforcement, built-in state storage, drift detection, and a full audit trail — all running in your own infrastructure with no SaaS dependency.\n\n## Features\n\n| Area | What you get |\n| ---- | ------------ |\n| **GitOps \u0026 runs** | Push or PR triggers a tracked run (plan → confirm → apply) with PR comments and commit status checks. Works with GitHub, GitLab, Gitea, Gogs, Bitbucket Cloud, and Azure DevOps (HMAC-verified webhooks; ADO Basic-auth). Tracked / proposed / destroy / drift run types, auto-apply, and scheduled drift detection. |\n| **Policy-as-code** | OPA/Rego policies at `pre_plan`, `post_plan`, `pre_apply`, `trigger`, `login`, `approval`, and `validation` hooks. Blocking denies + non-blocking warnings. Approval gating on blast radius. GitOps sync — store `.rego` files in a git repo and Crucible syncs them on push (GitHub / GitLab, HMAC-verified). 
Standalone `/policies/test` playground with OPA evaluation trace. Compliance policy packs (SOC 2, CIS AWS Foundations, HIPAA, PCI-DSS) — one-click install and attach to any stack. Continuous validation — periodic policy checks against live state with configurable intervals and status-change alerts. Full append-only audit log. |\n| **State \u0026 runners** | OpenTofu, Terraform, Ansible, and Pulumi. Built-in Terraform HTTP backend on MinIO (zero config) or per-stack S3 / GCS / Azure Blob overrides. Each run in a fresh, read-only, capability-dropped Docker container — cosign-signed, digest-pinned runner image. |\n| **Secrets \u0026 identity** | Per-stack OIDC workload identity federation with AWS, GCP, Azure, Vault, Authentik, or any OIDC IdP — no static cloud credentials. Encrypted stack env vars + reusable variable sets. External secret stores: AWS Secrets Manager, Vault KV v2, Bitwarden, Vaultwarden. Optional BYOK — wrap the vault master key with your own AWS KMS, HashiCorp Vault Transit, or Azure Key Vault key, with online rotation and no server restart. See [`docs/security.md`](docs/security.md) for crypto details. |\n| **Auth \u0026 access** | SSO via OIDC (Authentik, Okta, GitHub, Keycloak, anything OIDC) with PKCE, or single-operator local auth. Org-level RBAC (viewer / member / admin) with automatic IdP group → role mapping on login. Per-stack `viewer` / `approver` membership and service-account API tokens for CI. Rate-limited, hardened login. |\n| **Observability** | Embedded Grafana on `/monitoring` (8 panels, 30 s refresh). Prometheus + Grafana shipped in-box. Per-stack Slack, Discord, Teams, Gotify, ntfy, and email notifications. Webhook delivery log with full payload inspection. Outgoing webhooks (HMAC-signed HTTP POST) for PagerDuty, ServiceNow, or custom tooling. Infracost monthly cost delta surfaced in every run. |\n| **Deployment** | Single `docker compose up` — Caddy, API, Worker, UI, PostgreSQL, MinIO, Prometheus, Grafana. 
Let's Encrypt TLS via Caddy or the `external-proxy` profile for nginx / Traefik / your own Caddy. Optional bundled Authentik IdP. Built-in Terraform module registry, private provider registry, and stack templates. External agents: drop `docker-compose.agent.yml` + `.env.agent` on any remote host and run `docker compose -f docker-compose.agent.yml up -d` — no other files needed. |\n\nFull feature list with security/crypto specifics: [`docs/operator-guide.md`](docs/operator-guide.md) and [`docs/security.md`](docs/security.md).\n\n## Quick start\n\n\u003e **Just trying it out?** Follow [`docs/quickstart.md`](docs/quickstart.md) for a 15-minute local walkthrough (bundled Caddy on `https://localhost` with local auth) that takes you from `docker compose up` to a plan → confirm → apply run.\n\n**Prerequisites:** `docker` (with compose v2), `openssl`, and a free port 443 (or 80/443 if using Let's Encrypt).\n\n```bash\ncp .env.example .env\n# Edit .env — set CRUCIBLE_BASE_URL, CRUCIBLE_SECRET_KEY, POSTGRES_PASSWORD, etc.\n\n# Create the runner network once (used by ephemeral job containers)\ndocker network create crucible-runner\n\ndocker compose up -d\n```\n\nCrucible IAP will be available at `https://localhost` (self-signed cert — accept the browser warning on first visit). Caddy provisions a real TLS certificate automatically when `CRUCIBLE_BASE_URL` is a public hostname and `CADDY_ACME_EMAIL` is set.\n\nAfter login, the dashboard greets you with a one-click path to your first stack. 
Click **Create your first stack** to connect a Git repo — or use one of the starter templates:\n\n| Template | Tool | Use this template |\n| --- | --- | --- |\n| `crucible-quickstart` | OpenTofu | [Use template →](https://github.com/ponack/crucible-quickstart/generate) |\n| `crucible-quickstart-terragrunt` | Terragrunt | [Use template →](https://github.com/ponack/crucible-quickstart-terragrunt/generate) |\n\n## Deployment options\n\n### Bundled Caddy (default)\n\nZero-config TLS via Let's Encrypt or self-signed. Everything in one `docker compose up`.\n\n```bash\ndocker compose up -d\n```\n\n### External reverse proxy\n\nUse your existing nginx, Traefik, or Caddy instance instead.\n\n```bash\ndocker network create crucible-runner   # if not already created\ndocker compose --profile external-proxy up -d\n```\n\nThe API binds to `127.0.0.1:8080` and the UI to `127.0.0.1:3000` by default. Point your proxy at those addresses. Ready-to-use config examples are in [`deploy/proxy-examples/`](deploy/proxy-examples/):\n\n| File | Proxy |\n| ---- | ----- |\n| [`nginx.conf`](deploy/proxy-examples/nginx.conf) | nginx |\n| [`traefik.yml`](deploy/proxy-examples/traefik.yml) | Traefik v3 |\n| [`caddy-standalone.Caddyfile`](deploy/proxy-examples/caddy-standalone.Caddyfile) | Caddy (external) |\n\n### Bundled Authentik IdP (optional)\n\nAdd `--profile authentik` to include a self-hosted Authentik instance. 
Skip this if you already have an IdP.\n\n```bash\n# Default Caddy + Authentik\ndocker compose --profile authentik up -d\n\n# External proxy + Authentik\ndocker compose --profile external-proxy --profile authentik up -d\n```\n\n## Architecture\n\n```text\nGitHub / GitLab / Bitbucket / Azure DevOps webhook\n    │\n    ▼\nBrowser / CI\n    │\n    ▼\nReverse proxy (Caddy bundled, or nginx / Traefik / your own)\n    │\n    ├── /auth, /api, /health  →  Crucible API (Go + Echo)\n    │                                │\n    │                     ┌──────────┼──────────────┐\n    │                     ▼          ▼              ▼\n    │               PostgreSQL     MinIO       OPA engine\n    │               (DB + queue    (state,     (embedded,\n    │                + audit log)   plans,      Rego)\n    │                               logs)\n    │                     │\n    │              River job queue (PostgreSQL)\n    │                     │\n    │           Crucible Worker (separate container)\n    │           (no public ports, has Docker socket)\n    │                     │\n    │           Docker SDK → ephemeral runner container\n    │                        (tofu / terraform / ansible / pulumi)\n    │\n    │  — or —\n    │\n    │           crucible-agent (external host, any cloud / on-prem)\n    │           polls /api/v1/agent/claim → runs same Docker images\n    │           streams logs back → reports outcome\n    │\n    └── /*  →  Crucible UI (SvelteKit SSR)\n```\n\nSee [docs/architecture.md](docs/architecture.md) for the full design including security model, state backend protocol, and policy evaluation hooks.\n\n## Documentation\n\n| Document | Description |\n| -------- | ----------- |\n| **[docs/quickstart.md](docs/quickstart.md)** | **Your first stack in 15 minutes — start here** |\n| [docs/iac-101.md](docs/iac-101.md) | **New to Infrastructure as Code?** Plan/apply/state explained, why Crucible vs raw Terraform |\n| [docs/glossary.md](docs/glossary.md) | All 
Crucible + Terraform / Pulumi / Ansible / OPA / VCS terminology in one place |\n| [docs/troubleshooting.md](docs/troubleshooting.md) | Common user-facing errors and fixes — plan failed, state locked, policy denied, webhook silent, etc. |\n| [docs/architecture.md](docs/architecture.md) | Component diagram, request flow, security model, DB schema |\n| [docs/operator-guide.md](docs/operator-guide.md) | Deployment, configuration reference, external worker agents, backup, monitoring, troubleshooting |\n| [docs/security.md](docs/security.md) | Threat model, hardening checklist, vulnerability reporting. GitHub Actions workflows audited with [zizmor](https://github.com/zizmorcore/zizmor) on every PR. |\n| [docs/policies.md](docs/policies.md) | Rego policy authoring guide — all policy types, input/output shapes, examples |\n| [docs/guides/policy-gitops.md](docs/guides/policy-gitops.md) | Policy-as-code GitOps — sync `.rego` files from a git repo, webhook setup, mirror mode |\n| [docs/policies/README.md](docs/policies/README.md) | Ready-to-use policy templates (no-destroy, blast radius, tags, EC2 allowlist, public access, approval gates, and more) |\n| [docs/roadmap.md](docs/roadmap.md) | Expanded roadmap with implementation notes |\n| [docs/guides/team-setup.md](docs/guides/team-setup.md) | Org roles, per-stack RBAC, approval policies, recommended starter policy set |\n| [docs/guides/projects.md](docs/guides/projects.md) | Hierarchical org → project → stack model with per-project RBAC for multi-team deployments |\n| [docs/guides/cli.md](docs/guides/cli.md) | The `crucible` command-line client — trigger runs, check status, approve, scripting patterns |\n| [docs/guides/variable-sets.md](docs/guides/variable-sets.md) | Reusable bundles of env vars attached to many stacks at once — credentials, monitoring tokens, shared tags |\n| [docs/guides/tags.md](docs/guides/tags.md) | Color-coded stack labels for filtering, grouping, and policy-driven approval gates |\n| 
[docs/guides/external-secrets.md](docs/guides/external-secrets.md) | Fetch secrets from AWS Secrets Manager, HashiCorp Vault, Bitwarden SM, or Vaultwarden at run time |\n| [docs/guides/stack-templates.md](docs/guides/stack-templates.md) | Reusable stack configurations — create templates, deploy from them, good design practices |\n| [docs/guides/blueprints.md](docs/guides/blueprints.md) | Self-service stack deployment — platform teams publish blueprints, app teams fill in a form and deploy |\n| [docs/guides/provider-registry.md](docs/guides/provider-registry.md) | Private Terraform provider registry — publish binaries, configure Terraform, GPG signing |\n| [docs/guides/deploy-aws-ecs.md](docs/guides/deploy-aws-ecs.md) | Production deployment on AWS — ECS Fargate (API), EC2 (worker), RDS, S3, ALB |\n| [docs/guides/deploy-gcp-cloudrun.md](docs/guides/deploy-gcp-cloudrun.md) | Production deployment on GCP — Cloud Run (API), Compute Engine (worker), Cloud SQL, GCS |\n| [docs/guides/deploy-azure-aca.md](docs/guides/deploy-azure-aca.md) | Production deployment on Azure — Container Apps (API), Azure VM (worker), PostgreSQL Flexible Server, Blob Storage |\n| [docs/guides/aws.md](docs/guides/aws.md) | AWS credentials, S3 remote state backend, minimal IAM role, recommended AWS policies |\n| [docs/guides/gcp.md](docs/guides/gcp.md) | GCP credentials, GCS remote state backend, Workload Identity Federation, minimal IAM roles |\n| [docs/guides/azure.md](docs/guides/azure.md) | Azure credentials, Blob Storage remote state, federated identity, minimal role assignments |\n| [docs/guides/digitalocean.md](docs/guides/digitalocean.md) | DigitalOcean Cloud — API token setup, Droplets, Spaces remote state, cost-control policies |\n| [docs/guides/hetzner.md](docs/guides/hetzner.md) | Hetzner Cloud — hcloud token, server provisioning, ARM vs x86, Hetzner Robot notes |\n| [docs/guides/kubernetes.md](docs/guides/kubernetes.md) | Kubernetes + Helm — cluster vs workload stack split, auth 
options, CRDs, Helm release patterns |\n| [docs/guides/aws-nuke.md](docs/guides/aws-nuke.md) | Automated AWS sandbox account cleanup with aws-nuke — three-stack setup, dry-run verification, and a self-resetting demo loop |\n| [docs/guides/proxmox.md](docs/guides/proxmox.md) | End-to-end guide: managing Proxmox VMs with GitOps and policy enforcement |\n| [docs/guides/ansible.md](docs/guides/ansible.md) | End-to-end guide: running Ansible playbooks with check → confirm → apply and policy enforcement |\n| [docs/guides/pulumi.md](docs/guides/pulumi.md) | End-to-end guide: running Pulumi programs with preview → confirm → up and built-in MinIO state backend |\n| [docs/guides/terragrunt.md](docs/guides/terragrunt.md) | End-to-end guide: Terragrunt multi-module stacks, `run-all` lifecycle, built-in state backend wiring, multi-environment layout |\n| [docs/guides/cloudflare.md](docs/guides/cloudflare.md) | Managing Cloudflare infrastructure as code — bootstrap with cf-terraforming, Crucible stack setup, OPA policies |\n| [docs/guides/spacelift-migration.md](docs/guides/spacelift-migration.md) | Migrating from Spacelift to Crucible — concept mapping, state migration paths, and a working Cloudflare example |\n| [docs/guides/tfc-migration.md](docs/guides/tfc-migration.md) | Migrating from Terraform Cloud / Terraform Enterprise — concept mapping, state pull, Sentinel→Rego translation |\n| [docs/guides/github-actions.md](docs/guides/github-actions.md) | Triggering Crucible runs from GitHub Actions workflows — build/deploy chains, PR previews, scheduled cleanup, workflow_dispatch |\n\n## Connecting a Git repository\n\nEvery stack has a unique webhook URL and secret. Find them on the stack detail page in the UI, or via the API:\n\n```http\nGET /api/v1/stacks/:id\n→ { \"webhook_url\": \"https://crucible.example.com/api/v1/webhooks/\u003cstack-id\u003e\",\n    \"webhook_secret\": \"...\" }\n```\n\n### GitHub\n\n1. 
Go to your repository → **Settings** → **Webhooks** → **Add webhook**\n2. **Payload URL** — paste the `webhook_url` from above\n3. **Content type** — `application/json`\n4. **Secret** — paste the `webhook_secret`\n5. **Which events?** — choose **Let me select individual events**, then tick **Pushes** and **Pull requests**\n6. Click **Add webhook**\n\nCrucible will now create a **tracked** run (plan → confirm → apply) on every push to the stack's configured branch, and a **proposed** run (plan only, no apply) on every pull request.\n\n### GitLab\n\n1. Go to your project → **Settings** → **Webhooks** → **Add new webhook**\n2. **URL** — paste the `webhook_url`\n3. **Secret token** — paste the `webhook_secret`\n4. Tick **Push events** and **Merge request events**\n5. Click **Add webhook**\n\n### Rotating the secret\n\nIf the secret is ever exposed, rotate it without downtime:\n\n```bash\ncurl -X POST https://crucible.example.com/api/v1/stacks/\u003cid\u003e/webhook/rotate \\\n  -H \"Authorization: Bearer \u003caccess-token\u003e\"\n# → { \"webhook_secret\": \"\u003cnew-secret\u003e\" }\n```\n\nUpdate the secret in your repository's webhook settings immediately after.\n\n## Run types\n\n| Trigger | Run type | What happens |\n| --- | --- | --- |\n| Push to tracked branch | `tracked` | Plan → wait for human confirmation → apply |\n| Push to tracked branch (`auto_apply=true`) | `tracked` | Plan → auto-apply if policy passes |\n| Pull request / Merge request | `proposed` | Plan only — result posted, no apply |\n| Manual (from UI or API) | `tracked` / `proposed` / `destroy` | As configured |\n| Drift detection | `proposed` | Plan only — alerts on diff |\n\n## State backend configuration\n\nPoint any OpenTofu or Terraform stack at Crucible's built-in state backend:\n\n```hcl\nterraform {\n  backend \"http\" {\n    address        = \"https://crucible.example.com/api/v1/state/\u003cstack-id\u003e\"\n    lock_address   = 
\"https://crucible.example.com/api/v1/state/\u003cstack-id\u003e\"\n    unlock_address = \"https://crucible.example.com/api/v1/state/\u003cstack-id\u003e\"\n    username       = \"\u003cstack-id\u003e\"\n    password       = \"\u003cstack-token-secret\u003e\"\n  }\n}\n```\n\nStack tokens are managed in the UI (Settings → Tokens) or via the API. State is stored in MinIO with full version history.\n\n## Cloud OIDC workload identity federation\n\nCrucible acts as its own OIDC identity provider. Every run mints a short-lived signed JWT that cloud providers exchange for temporary credentials — no static cloud secrets are stored in Crucible. AWS, GCP, Azure, HashiCorp Vault, Authentik, Keycloak, Zitadel, Dex, and any OIDC-compatible IdP are supported; configure per-stack or set an org-level default in **Settings → General → Cloud OIDC Default**.\n\nPer-provider setup (IAM trust policies, GCP workload identity pools, Entra federated credentials, Vault JWT roles, IdP exchange endpoints) and the JWT claims reference are in [`docs/operator-guide.md#cloud-oidc-workload-identity-federation`](docs/operator-guide.md#cloud-oidc-workload-identity-federation).\n\n## Policy-as-code\n\nAttach OPA/Rego policies to stacks to enforce guardrails before runs are allowed to apply:\n\n```rego\npackage crucible\n\n# Result object: deny and warn messages for the plan\nplan := result if {\n  result := {\n    \"deny\":    deny_msgs,\n    \"warn\":    warn_msgs,\n    \"trigger\": [],\n  }\n}\n\n# Deny any plan that would destroy a resource\ndeny_msgs contains msg if {\n  input.resource_changes[_].change.actions[_] == \"delete\"\n  msg := \"destroy operations require an explicit destroy run\"\n}\n\n# Warn about each resource that will be updated\nwarn_msgs contains msg if {\n  # Bind the resource once so the address belongs to the resource being updated\n  rc := input.resource_changes[_]\n  rc.change.actions[_] == \"update\"\n  msg := sprintf(\"resource %s will be updated\", [rc.address])\n}\n```\n\nPolicy types: `post_plan` (most common), `pre_plan`, `pre_apply`, `trigger` (downstream stacks), `login`, `approval`, `validation`.\n\n## Development\n\nRequirements: Go 1.25+, Node.js 22+, pnpm, 
Docker\n\n```bash\n# Start dependencies (PostgreSQL + MinIO only)\ndocker compose -f deploy/docker-compose.dev.yml up -d\n\n# Start API (migrations run automatically on startup)\ncd api \u0026\u0026 go run ./cmd/crucible-iap\n\n# Run UI (in a separate terminal, from the repository root)\ncd ui \u0026\u0026 pnpm install \u0026\u0026 pnpm dev\n```\n\nThe UI dev server proxies `/api` and `/auth` to the API at `localhost:8080` automatically.\n\n### Running tests\n\n```bash\n# Unit tests (no DB needed); the subshell keeps the working directory unchanged\n(cd api \u0026\u0026 go test ./internal/policy/...)\n\n# Integration tests (requires PostgreSQL); quote the URL so the shell does not glob the `?`\nexport TEST_DATABASE_URL='postgres://crucible:crucible@localhost:5432/crucible_test?sslmode=disable'\n(cd api \u0026\u0026 go test -race ./...)\n```\n\n## Roadmap\n\n- [x] OIDC authentication with personal org auto-provisioning\n- [x] Stack management (CRUD, tokens, policies)\n- [x] Run lifecycle state machine (queued → planning → unconfirmed → applying → finished)\n- [x] Terraform/OpenTofu HTTP state backend\n- [x] Ephemeral Docker runner with MinIO plan artifact storage\n- [x] OPA/Rego policy evaluation engine\n- [x] Append-only audit log (tamper-resistant at DB level)\n- [x] GitHub and GitLab webhook ingestion (push + PR/MR events)\n- [x] List pagination on all collection endpoints\n- [x] RBAC enforcement (viewer / member / admin) + org invite flow\n- [x] Settings UI — member management, role changes, invite links\n- [x] Automatic migrations on startup\n- [x] Prometheus metrics + Grafana dashboards (built-in, served at `/grafana`)\n- [x] Monitoring page — Grafana panels embedded in the Crucible UI; eight panels: HTTP request rate, error rate, latency, run completions, queue depth, active runs, stack count, and run success rate; no separate Grafana tab needed for day-to-day observability\n- [x] Org-level Gotify and ntfy defaults — configure default push notification endpoints in Settings; new stacks inherit them\n- [x] Structured `/health` endpoint (DB status, version, uptime)\n- [x] Policy management UI + drift detection 
scheduling\n- [x] Operator documentation + security hardening guide\n- [x] Stack-level environment variables — AES-256-GCM encrypted at rest, injected into runner containers; plain (non-secret) values are shown inline in the UI with per-row Edit/Replace actions; secret values remain write-only\n- [x] PR/MR feedback — plan summary comments and commit status checks on GitHub and GitLab\n- [x] Slack notifications — configurable per-stack event subscriptions\n- [x] Gotify notifications — per-stack Gotify server URL + encrypted app token; fires on plan complete, run succeeded/failed\n- [x] ntfy notifications — per-stack ntfy topic URL + optional Bearer token; fires on plan complete, run succeeded/failed\n- [x] External secret store integrations — AWS Secrets Manager (Sig v4, no SDK), HashiCorp Vault KV v2 (token + AppRole), Bitwarden Secrets Manager (E2E decryption), Vaultwarden (self-hosted; PBKDF2/Argon2id + AES-CBC vault crypto)\n- [x] Multi-cloud state backend options — S3 / S3-compatible (Sig v4), GCS (JWT + OAuth2), Azure Blob Storage (SharedKeyLite)\n- [x] Gitea and Gogs webhook support — modern X-Hub-Signature-256 compat + legacy X-Gitea-Signature fallback\n- [x] Per-stack VCS provider config (github/gitlab/gitea) with custom instance base URL for self-hosted deployments\n- [x] Remote state sharing — cross-stack `terraform_remote_state` with per-relationship tokens minted on the source stack and injected as env vars at run time\n- [x] Auto-remediate drift — automatically queue a tracked apply run after a drift detection run reports changes\n- [x] Artifact retention policy — configurable retention period for plan files and run logs; deleted on a daily background sweep\n- [x] Org-level notification defaults — pre-fill Slack webhook and VCS provider config for new stacks\n- [x] Intuitive dashboard — landing page showing org-wide health at a glance: active/failed runs, stacks with drift, recent audit events, and inline approve/discard/cancel actions without 
navigating into individual stacks\n- [x] External worker agents — deploy `crucible-agent` on any host with Docker access; agents poll the Crucible API for queued runs, execute them locally, and stream logs back; multiple agents per pool with `FOR UPDATE SKIP LOCKED` claim safety; stacks assign to a pool via Settings → Runner; separate optional binary, not bundled with the main image\n- [x] Stack dependency graph — first-class upstream/downstream relationships with automatic downstream triggers after a successful apply; cycle detection via recursive CTE\n- [x] Variable sets — define a shared group of env vars once and attach to multiple stacks; eliminates repetition across similar stacks\n- [x] Stack templates / blueprints — create new stacks pre-filled from a saved template (tool, repo, branch, project root, auto-apply, drift settings)\n- [x] Manual run with variable overrides — trigger a one-off run with temporary env var overrides without changing stack config\n- [x] Service account API tokens — machine-readable tokens not tied to a user session, for CI pipelines and automation\n- [x] CI linting — gofmt, go vet, gocyclo, ineffassign, misspell, staticcheck run on every PR; `make lint` target for local use\n- [x] Ansible support — check → confirm → apply lifecycle with PLAY RECAP parsing, inventory auto-detection, and destroy playbook support\n- [x] Pulumi support — preview → confirm → up lifecycle with built-in MinIO DIY S3 backend, TypeScript/JavaScript/Python runners, and changeSummary parsing for PR comments\n- [x] Email notifications — SMTP (STARTTLS/SMTPS/plaintext) per-stack email address; fires on plan complete, run succeeded/failed; configured in Settings → Notifications\n- [x] Webhook delivery log — record of incoming webhook payloads and whether they triggered a run, to debug missed or skipped events\n- [x] Webhook re-delivery — re-trigger a run from any past delivery directly in the UI; replays the stored payload without requiring a new push or manual 
re-configuration\n- [x] Environment TTL / auto-destroy — set a scheduled destroy time on any stack; a background scheduler fires a destroy run at the deadline and clears the TTL so it only fires once; prevents dev/feature environment sprawl\n- [x] Terraform provider caching — provider binaries cached in MinIO after first download; subsequent runs restore from cache before `terraform init` so registry downloads are skipped; platform-filtered (linux_amd64 / linux_arm64); cache miss is non-fatal (falls back to registry automatically)\n- [x] Terraform module registry — private module registry backed by MinIO; implements the Terraform Module Registry Protocol v1 (`/.well-known/terraform.json` discovery, versions, download, archive, search); publish via UI upload or git-tag auto-publish; README auto-extracted from archive and rendered as markdown; download count tracked; yank individual versions; service account tokens authenticate the Terraform CLI via `~/.terraformrc`\n- [x] Resource explorer — browse Terraform state resources in the UI with filtering by type and address\n- [x] Policy-as-code GitOps — sync `.rego` files from a git repository into policies on push; HMAC-verified webhooks, GitHub and GitLab support, mirror mode, policy type inference from directory structure or inline `# crucible:type` comments\n- [x] Cost estimation — integrate with Infracost or similar to surface per-run cost delta alongside the plan summary\n- [x] Fine-grained RBAC — per-stack viewer/approver roles in addition to the org-wide admin/member/viewer hierarchy; restricted stacks hidden from non-members\n- [x] Exportable config — download a full JSON snapshot of stacks, policies, variable sets, templates, blueprints, and worker pools; non-secret env vars included in plaintext, secret vars as name-only placeholders; import on any instance with conflict-skip semantics (existing resources matched by name are never overwritten); both operations audit-logged; Settings → Export / Import tab\n- 
[x] Custom run hooks — per-stack pre/post-plan and pre/post-apply bash scripts; configured in the stack settings UI, injected as env vars, executed inside the runner container; a non-zero exit fails the run\n- [x] Context-aware approval policies — OPA `approval` hook evaluates plan context (run type, trigger, add/change/destroy counts, stack name) and returns `require_approval: true` to gate runs behind explicit sign-off; `deny` fails the run immediately\n- [x] Startup config validation — `RUNNER_MEMORY_LIMIT` and `RUNNER_CPU_LIMIT` validated at boot; server refuses to start on invalid values rather than silently running containers unbounded\n- [x] OIDC workload identity federation — Crucible acts as its own OIDC identity provider; each run receives a short-lived signed JWT; configure per-stack or set an org-level default in Settings → General to exchange it for temporary credentials with AWS, GCP, Azure, HashiCorp Vault, Authentik, or any generic OIDC-compatible IdP — no static secrets in Crucible\n- [x] Notification test buttons — one-click test delivery for org-level Slack, Gotify, and ntfy endpoints directly from Settings → Notifications; confirms credentials are wired correctly without waiting for a run\n- [x] Per-stack RBAC on remote state links — configuring a cross-stack `terraform_remote_state` link now requires at least approver role on the source stack; prevents org members from granting access to state they cannot manage\n- [x] Auth endpoint rate hardening — per-IP rate limits tightened on `/auth/callback` (OAuth code exchange) and `/auth/refresh` (token renewal); service account tokens lock out after 20 failures per 5-minute window per IP\n- [x] argon2id token hashing — service account and stack tokens upgraded from unsalted SHA-256 to argon2id (32 MB / 2 iterations / 1 thread); existing tokens lazily upgraded on first use with no forced rotation; new SA token format embeds UUID for O(1) point-lookup\n- [x] httpOnly session cookies — refresh token 
moved from localStorage to a server-set `crucible_refresh` httpOnly `SameSite=Strict` cookie; access token kept in JS memory only; page-reload session restored transparently via silent cookie exchange\n- [x] Stack dependency flow diagram — upstream/downstream relationships visualised as an SVG flow diagram on the stack detail page; bezier-curve arrows, indigo-highlighted current stack, clickable dep nodes; zero new dependencies\n- [x] Org context switching after invite acceptance — accepting an org invite now immediately switches the active session to the invited org; users land on that org's stacks without needing to log out and back in\n- [x] Org name editing — admins can rename their organisation from Settings → Organisation; slug remains stable for URL routing\n- [x] Scheduled runs — cron-based plan, apply, or destroy runs per stack independent of code pushes; standard 5-field cron expressions (`0 2 * * *` = 2 am daily); next run time shown inline; worker polls every minute and enqueues the appropriate run type automatically\n- [x] Stack locking / maintenance mode — per-stack flag that prevents new runs from being queued; operators set it before manual cloud console changes and release it when done; prevents race conditions during incident response; lock reason shown as an amber banner on the stack page\n- [x] Run annotations — free-text operator note on any run (\"deployed for hotfix\", \"reverting per oncall\"); closes the audit gap between who triggered a run and why; inline click-to-edit on the run detail page\n- [x] Generic outgoing webhooks — fire arbitrary HTTP POST on run state changes to PagerDuty, ServiceNow, Jira, or custom tooling; HMAC-signed, configurable per event type, delivery log with retry\n- [x] SSO group → role mapping — automatically assign org roles from IdP group claims on every login; eliminates manual invite management for large teams on Authentik, Okta, Keycloak, or GitHub\n- [x] Cost estimation — integrate Infracost (self-hosted 
server supported) to surface per-run monthly cost delta alongside the plan summary\n- [x] IaC security scanning — built-in Checkov / Trivy scan post-plan; findings surfaced as structured results in the run detail alongside OPA policy output; configurable severity threshold to block apply\n- [x] Private provider registry — full Terraform Provider Registry Protocol v1; upload binaries per OS/arch, SHA-256 checksums served dynamically, GPG public key management per namespace for `terraform providers lock`; air-gapped deployments reference providers via `source = \"host/namespace/type\"`\n- [x] Per-stack run concurrency cap — set `max_concurrent_runs` on any stack; worker enforces the cap at job start and fails the run immediately if the limit is reached; 0 / unset = unlimited\n- [x] Self-service infrastructure blueprints — platform teams define blueprints (repo, tool, params) and publish them; app teams deploy new stacks by filling in a form without touching IaC config; params are injected as encrypted `TF_VAR_*` env vars; string, number, bool, and select param types; per-param env prefix; deploy runs atomically in a single transaction\n- [x] OPA policy test playground — standalone `/policies/test` page; pick any saved policy, paste synthetic JSON, run it and see allow/deny/warn/trigger results with optional OPA evaluation trace; genuine differentiator — neither Spacelift nor TF Cloud has this built in\n- [x] PR preview environments — auto-create a stack from a template when a PR opens, auto-destroy when it closes; branch name drives workspace isolation; pairs with stack dependencies for full per-PR environment chains\n- [x] AI run troubleshooting — one-click \"Explain failure\" on failed runs; sends log context to the Claude API and returns a structured root-cause explanation and suggested fix; API key set in Settings → AI troubleshooting (no container restart needed) or via `ANTHROPIC_API_KEY` env var\n- [x] Multi-provider AI troubleshooting — AI provider is now 
configurable from Settings; choose Anthropic (Claude) or any OpenAI-compatible endpoint (OpenAI, OpenRouter, OpenWebUI, Ollama, etc.); custom base URL and model fields added; existing `ANTHROPIC_API_KEY` deployments carry over automatically via migration 059\n- [x] Stack clone — one-click Clone button on any stack detail page; copies tool/repo config, runner image, hooks, worker pool, env vars (re-encrypted under the new stack's key), and tags into a new draft stack; state, runs, tokens, and notification secrets are not copied\n- [x] Run re-trigger — ↺ Re-trigger button on any terminal run (failed, canceled, discarded, finished); creates a new run with the same type and variable overrides against the current branch head; respects stack lock and per-stack RBAC\n- [x] AWS Cloud OIDC session duration default — saving an AWS Cloud OIDC config without specifying a session duration no longer returns a 500; defaults to 3600 s\n- [x] OIDC token injection — runner no longer fails with \"container rootfs is marked read-only\"; token written from the entrypoint into tmpfs after container start instead of via CopyToContainer before start\n- [x] Worker OIDC base URL — `CRUCIBLE_BASE_URL` added to the worker service environment; without it the JWT issuer was empty and no OIDC token was minted, causing \"No valid credential sources found\" on every OIDC-configured run\n- [x] Provider cache restore crash — empty provider cache no longer kills the run; `jq -r '.keys[]'` on an empty array fed an empty key into the restore loop, causing `rm -f` to target the cache directory itself and exit non-zero under `set -e`\n- [x] Dark / light mode switcher — system preference detected on first visit; persists to `localStorage`; sun/moon toggle in the sidebar header; native browser elements (scrollbars, inputs) follow the theme via CSS `color-scheme`; no flash of wrong theme on hard reload; smooth 150ms transition when toggling\n- [x] Sidebar header polish — logo size increased; version badge 
and theme toggle relocated from the bottom footer to the top header bar (flush right of the wordmark), keeping the footer to email + sign out only; dashboard page no longer capped at a fixed max-width\n- [x] Forge UI — teal-slate design system with OKLCH hue-shifted zinc scale; Heroicons icon sidebar with grouped nav sections (Core / Config / Ops) and teal active-state left-border; RunLifecycle 5-step rail on every run detail page; terminal-style log viewer; full indigo→teal color migration across all 33 UI pages\n- [x] Toast notifications — all 48 browser-native `alert()` popups replaced with a teal-accented toast store; error / success / info variants auto-dismiss after 4.5 s and stack in the bottom-right corner; `aria-live` polite for screen readers\n- [x] Consistent empty states — shared `EmptyState` component with a teal icon badge, heading, and subtext on all 10 list pages (stacks, runs, policies, worker pools, variable sets, blueprints, templates, module registry, provider registry, audit log)\n- [x] Cold / Hot / Neutral Forge theme switcher — three named forge themes selectable from color-swatch dot buttons in the sidebar; Cold Forge (teal-slate, hue 185 OKLCH, `#2DD4BF`), Hot Forge (copper-amber, hue 42 OKLCH, `#D4883C`), Neutral Forge (standard zinc, hue 286, `#818cf8` indigo accent — pre-redesign look); each combines independently with dark / light mode for six total palette combinations; persists to `localStorage` with anti-FOUC protection\n- [x] Dashboard redesign — priority-zone layout (Action Required → Live → History); stat cards with Heroicons icons and alert-colour tinting when values are non-zero; \"+ New stack\" CTA in the header; update-available notice demoted to a subtle link; version display removed from the main content area\n- [x] Ember / Frost / Gold Forge themes — three additional forge palette options: Ember Forge (hue 15, `#E05252` deep red), Frost Forge (hue 220, `#60A5FA` arctic blue), Gold Forge (hue 78, `#F5C542` warm gold); 
sidebar now shows six dot swatches; each has a paired light-mode variant and anti-FOUC protection\n- [x] Stack tags — org-scoped, color-coded labels managed in Settings → Tags; attach multiple tags to any stack; tag pills visible in the stack list and stack detail header\n- [x] Tag filtering — filter the stacks list and runs list by one or more tags via a dropdown with color swatches and active filter pills\n- [x] Stack pinning — pin any stack to float it to the top of the stacks list; toggle via the pin icon on each row\n- [x] Bulk approve — \"Approve all (N)\" button on the dashboard Action Required zone; confirms all pending runs in one click\n- [x] Starter policies — [`ponack/crucible-policies`](https://github.com/ponack/crucible-policies) public repo with 8 ready-made OPA policies (destroy protection, cost gates, business-hours controls, and more); one-click quick-connect from the Policy Git Sources page\n- [x] Policy sync fix — `org_id` ambiguity in the policy git sync worker caused all sync jobs to fail silently; fixed column qualification in the JOIN query\n- [x] Policy input shape — cost fields (`cost_add`, `cost_change`, `cost_remove`) added to OPA policy input; `approval` type added to UI type labels, badges, templates, and input schema\n- [x] Blueprint export/import — export any blueprint as a portable JSON file (`schema_version` + all config + params, no ids or timestamps); import via file upload or paste on the blueprints list page; imported blueprints start as drafts; audit-logged\n- [x] Run navigation correctness — navigating away from a confirmed plan+apply run no longer shows the previous run's data; stale `pollFinal` callbacks from the confirm flow are now cancelled on navigation; the stack detail page refreshes its run list every 10 s so newly created runs appear without a page reload\n- [x] Blueprint params fix — blueprints with no parameters no longer cause the detail page to hang on \"Loading…\"; Go's `json:\"omitempty\"` was silently 
dropping the `params` field from API responses when empty; the detail and deploy pages now guard against a missing field defensively\n- [x] Blueprint detail hang (all blueprints) — `tool_version`, `repo_url`, `runner_image`, and `drift_schedule` were also tagged `omitempty`; any blueprint with empty values for those fields caused the same Svelte 5 re-render hang; all four fields now always present in API responses\n- [x] Blueprint detail and deploy page hang (final fix) — `Array.sort()` mutates in place; calling it on a Svelte 5 reactive proxy throws `state_unsafe_mutation`, which Svelte catches and rolls back — leaving the page permanently on \"Loading...\"; fixed by spreading a copy before sorting: `[...params].sort(...)`\n- [x] Stack clone — duplicate any existing stack into a new one pre-filled with all config (tool, repo, branch, env vars, policies, notifications, tags, runner pool, hooks, drift settings); clone action available from the stack list and stack detail page; cloned stack starts with no runs and no state\n- [x] Run re-trigger — queue a new run directly from any historical run without navigating back to the stack; retrigger button on the run detail page creates a fresh run for the same stack and immediately navigates to it\n- [x] Discord notifications — per-stack Discord incoming webhook; fires on plan complete, run succeeded/failed; org-level default in Settings → Notifications inherited by new stacks; one-click test delivery\n- [x] Microsoft Teams notifications — per-stack Teams incoming webhook or Power Automate HTTP trigger; same event coverage as Discord; org-level default; one-click test delivery\n- [x] Audit log JSON export — audit log export now supports `?format=json` in addition to CSV; returns a JSON array of all matching events respecting active filters; available from the audit log page alongside the existing CSV export\n- [x] Audit log action filter — free-text action input replaced with a category dropdown (run.*, stack.*, policy.*, 
org.*, blueprint.*, varset.*, sa.*, tag.*) for cleaner filtering without knowing exact action names\n- [x] Approval timeout auto-discard — configurable global timeout (in hours) in Settings → Notifications; a background sweep every 5 minutes discards runs stuck in `unconfirmed` or `pending_approval` beyond the deadline and records a `run.approval_expired` audit event; set to 0 to disable\n- [x] Approval expiry on startup — the approval expiry sweep now runs immediately when the server starts, not only on the first 5-minute tick; runs that expired during a server restart or downtime are caught and discarded within seconds of boot\n- [x] Audit JSON export streaming — large audit exports no longer buffer all rows into memory before responding; rows are marshalled and flushed individually so memory usage is O(1) regardless of result set size\n- [x] OIDC provider init failure — if the OIDC provider URL is unreachable at startup the server now logs a structured error with the issuer URL and exits cleanly rather than panicking with a Go stack trace\n- [x] CLI tool — `crucible` standalone Go binary (`api/cmd/crucible`) for use in CI pipelines and terminals; config stored in `~/.config/crucible/config.yaml` with `CRUCIBLE_URL` / `CRUCIBLE_TOKEN` env overrides and per-invocation `--url` / `--token` flags; commands: `configure` (interactive setup), `stacks list`, `stacks show \u003cid\u003e`, `runs list [--stack \u003cid\u003e]`, `runs trigger \u003cstack-id\u003e [--type proposed|tracked|destroy]`, `runs approve`, `runs confirm`, `runs discard`, `runs status [--watch]`; `--json` flag emits raw API JSON for scripting; `-q` / `--quiet` prints only the ID\n- [x] OpenTofu / Terraform version pinning — per-stack `tool_version` field; when set, the runner downloads the exact binary from the official release URLs (OpenTofu: GitHub releases, Terraform: HashiCorp releases) before execution; arch-detected at runtime (amd64 / arm64); empty = use the version baked into the runner image; 
propagated through all enqueue paths (manual, webhook, drift, schedule, TTL, downstream triggers, auto-apply)\n- [x] State version diff — state snapshot captured after every successful apply (MinIO, keyed by serial, deduped on `stack_id + serial`); State History section on the stack detail page lists up to 50 versions with serial, resource count, linked run, and timestamp; click Diff to load a structured diff against the previous serial: added resources in green, removed in red, changed in yellow with per-attribute before/after values; string values capped at 512 chars\n- [x] GitHub App integration — first-class GitHub App support; register an App in Settings → GitHub App (encrypted credentials, auto-generated webhook + setup URLs); install flow with HMAC-signed state round-trip; single global webhook per App replacing per-stack webhook URLs; per-stack installation picker; push and PR events dispatched via App auth; installation access tokens used for PR comments and commit status checks; backwards-compatible with existing PAT-based stacks\n- [x] GitHub App setup guidance — Settings → GitHub App now shows a 6-step setup guide with required permissions (Contents/Metadata/Pull requests/Commit statuses), webhook events (Push/Pull request/Create), and inline \"where to find this\" hints for every registration form field; post-registration URL section labels each URL with the exact GitHub field name to paste into; stack detail page shows an \"App token active\" / \"Using PAT\" status badge so the current auth mode is visible at a glance\n- [x] GitHub App discoverability — Settings → GitHub App shows a \"Ready — now connect to stacks\" callout once an installation exists; stack Integrations section shows a cross-reference hint linking directly to the GitHub App authentication section when an App is available but the stack is not yet connected; Callback URL vs Setup URL distinction clarified in both the setup guide and the URL wiring section\n- [x] GitHub App installation 
sync — \"Sync from GitHub\" button calls the GitHub API directly to recover installations missed when the Setup URL was not configured before first install; amber warning in the empty installations state explains the required ordering and directs users to sync if the callback was missed\n- [x] GitHub App bug fixes — installations now correctly appear in the UI after registration (pgx binary scan failure on `timestamptz` → `string` silently dropped every row; fixed to `time.Time`); sync errors surface the full GitHub API response body instead of a generic message; Docker build `ERR_PNPM_LOCKFILE_CONFIG_MISMATCH` resolved by aligning pnpm to v10 in Dockerfile and CI\n- [x] UI polish — settings sections (IaC scanning, Infracost, Slack/Discord/Teams/Gotify/ntfy/SMTP) now show a teal \"Configured\" badge matching the existing AI Troubleshooting indicator; filter bar controls on Stacks and Runs pages stay fixed-width at all viewport sizes (CSS specificity fix); Policies, Registry, Providers, Blueprints, Stack Templates, Variable Sets, and Monitoring pages aligned to the full-width layout used by Dashboard and Audit Log; Settings content pane padding unified to match the rest of the app\n- [x] Stack health scoring — per-stack health badge (healthy / degraded / unhealthy / unknown) on the stacks list, computed from the ratio of finished to failed runs across the last 10 non-drift terminal runs; no migration required (pure SQL subquery); green / amber / red / grey dot with percentage tooltip; unknown shown for stacks with no run history\n- [x] Multi-org support — single Crucible instance hosting multiple isolated organizations; instance-admin role with dedicated sidebar section; cross-org CRUD (create, archive/restore, force-add members) via 9 admin endpoints; `CRUCIBLE_INSTANCE_ADMIN_EMAIL` bootstrap; `CRUCIBLE_DISABLE_PERSONAL_ORGS` for MSP/managed-tenant deployments; archived orgs filtered from all member surfaces; audit events on every admin action; migration 071\n- [ ] 
RustFS object storage — replace the bundled MinIO with RustFS for a fully Rust-native, S3-compatible object store; same API surface, lower resource footprint\n- [x] Projects / Spaces — hierarchical org → project → stack layout; per-project RBAC (admin / member / viewer) with org-member fallback; stacks optionally assigned to a project (unassigned stacks remain visible to all org members); project list page with stack and member counts; project detail with stacks tab and members tab; project filter on the stacks list; project assignment in stack create and edit forms; per-project member management with inline role changes; audit events on project create and delete\n- [x] Bitbucket Cloud and Azure DevOps VCS — first-class webhook ingestion, PR comments, and commit status checks; Bitbucket uses HMAC-SHA256 (`X-Hub-Signature`) + Basic workspace:app_password for API calls; Azure DevOps uses Basic-auth webhook validation + Basic `:PAT` for Commit Statuses and PR Threads APIs; `vcs_username` column added to `stacks` for Bitbucket workspace identification; migration 069\n- [x] ChatOps approvals — approve, confirm, and discard runs directly from Slack, Teams, Discord, Gotify, ntfy, and email notifications; HMAC-SHA256-signed action links (24 h TTL, keyed to `CRUCIBLE_SECRET_KEY`) embedded in plan notifications for all six channels; clicking a link performs the action server-side and redirects to the run detail page; no Slack App, bot token, or extra installation required — works with any incoming webhook; confirm/approve and discard links vary by run state (`unconfirmed` vs `pending_approval`)\n- [x] Terragrunt support — first-class `tool: terragrunt` value; `run-all plan/apply/destroy` orchestration with `--terragrunt-non-interactive`; binary auto-downloaded from GitHub releases per run (`CRUCIBLE_TOOL_VERSION` sets the version, default `0.72.1`); arch-detected at runtime (amd64 / arm64) and cached in `/tmp/versioned/`; drift detection supported (run-all plan exit code); 
`TF_HTTP_*` state backend wired to Crucible's built-in state API; Terragrunt option added to stack create, stacks list filter, blueprints, and stack templates UI; green tool badge on stack detail page; no plan artifact upload (run-all produces per-module plans, not a single binary)\n- [x] Customer-managed encryption keys (BYOK) — wrap the vault master key with a key in your own KMS; AWS KMS (Sig v4 to `TrentService.Encrypt`/`.Decrypt`), HashiCorp Vault Transit (token or AppRole), and Azure Key Vault (AAD client-credentials → wrapkey/unwrapkey) all supported with no vendor SDK dependencies; auth credentials live in env vars so the vault can boot without first decrypting any rows; admin UI at Settings → BYOK with status, enable (with wrap/unwrap canary test before commit), online rotation, and disable; each transition re-encrypts every vault-protected row in a single transaction and atomically swaps the in-memory master post-commit so no server restart is needed; transitions emit `byok.enabled` / `byok.rotated` / `byok.disabled` audit events; migration 070\n- [x] Audit log SIEM streaming — event-driven fan-out of every audit event to any number of per-org destinations; destinations: Splunk (HEC, newline-delimited JSON), Datadog Logs API, Elasticsearch (`_bulk` NDJSON, Basic auth or API key), generic webhook (HMAC-SHA256 `X-Crucible-Signature`), GCP SecOps / Chronicle (UDM batch, RSA JWT service account auth), Wazuh (REST API or TCP syslog), Graylog (GELF HTTP); configs vault-encrypted at rest; River worker (`SIEMDeliveryWorker`) picks up the job within milliseconds of `audit.Record`; delivery status tracked in `siem_event_deliveries` (status, attempts, last error, timestamp); test-connection endpoint per destination; UI at **Settings → SIEM Streaming** with destination table, type-specific config modal, and delivery log with filter; migration 068\n- [x] Compliance policy packs — installable OPA bundles for SOC 2 (CC6.1, CC6.7, CC7.2), CIS AWS Foundations (IAM root 
keys, S3 public access, password policy), HIPAA (PHI encryption, audit logging), and PCI-DSS (no public ingress, TLS enforcement); Rego bodies embedded in the binary and served as first-class `policy_git_sources` rows (`pack_slug` column); catalog at **Policies → Compliance Packs**; one-click pack attach/detach on stack detail page; pack policies wired into `post_plan` evaluation via `stack_policy_sources` join; sync queues a live fetch from [`ponack/crucible-policies`](https://github.com/ponack/crucible-policies); migrations 065–066\n- [x] Plan diff between runs — `tofu show -json` output uploaded alongside the binary plan artifact (MinIO, `plans/{runID}.json`, no DB column); `GET /api/v1/stacks/:id/plan-diff?from=\u0026to=` compares `resource_changes` between any two plan runs on the same stack — new resources in green, removed in red, changed in yellow with per-attribute before/after values (strings capped at 512 chars, no-op resources skipped); Plan Comparison section on the stack detail page with from/to run dropdowns and inline diff render; pre-feature runs return a clear 404\n- [x] Notification channel icons — ntfy icon updated to the official SimpleIcons path (correct teal `#317F6F` brand colour, replacing the generic bell); Gotify icon updated to the multi-colour dashboardicons logo (tan/cyan/white official palette on a white rounded background, replacing the generic chat-bubble-lightning glyph)\n- [x] Budget alerts — per-stack plan-change thresholds (max adds, changes, destroys); breach fires notifications through all configured channels (Slack, Discord, Teams, Gotify, ntfy, email) via a dedicated `BudgetAlert` notifier method; optional `plan_block_on_alert` flag holds the run in `unconfirmed` instead of auto-applying when any threshold is exceeded; thresholds configured in the stack detail settings form; migration 064 adds nullable INT columns + bool flag to stacks\n- [x] In-app run analytics — `GET /api/analytics/runs?days=N` (7–90 day window) returns 
daily run counts by status, per-stack summaries (total, success %, plan add/change/destroy totals), and org-wide overview; `/analytics` page with overview cards, CSS bar chart by day, and per-stack breakdown table; Analytics entry added to sidebar\n- [x] Continuous validation — periodic OPA policy re-evaluation against current Terraform state, independent of run lifecycle; new `validation` policy type evaluates `input.state`; per-stack configurable interval (minutes, 0 = disabled); River worker fetches state from MinIO, evaluates all attached validation policies, stores result in `stack_validation_results`, updates `validation_status` on the stack; status-change notifications fan out to all configured channels (Slack, Discord, Teams, Gotify, ntfy, email); status dot on stacks list table; result history and manual Validate button on stack detail page; migration 067\n- [x] Infracost cost analytics + per-stack budget threshold — **Analytics → Cost estimates** tab surfaces the Infracost estimated monthly cost delta already collected per run (no new cloud credentials needed); org-wide cost overview (net delta, cost add/remove, runs with cost data), daily cost-add bar chart, and per-stack breakdown table with over-budget row highlighting; `GET /analytics/costs?days=N` endpoint (1–90 day window); stack detail page gains an Infracost sparkline showing the last 20 cost-bearing runs (bars turn red when a run exceeds the threshold); `budget_threshold_usd` field on stacks: when set, `checkBudgetThresholds` fires a budget alert notification (and optionally blocks auto-apply) when a run's Infracost `cost_add` exceeds the USD limit; configured in the stack settings Budget alerts section alongside the existing plan-change thresholds; migration 072\n- [x] Bug fixes (v0.8.37) — three runner / worker fixes shipped together: (1) **Infracost binary was missing** from the runner image — `infracost breakdown` is now installed in the Dockerfile and called after every plan phase so cost 
estimates populate in the Analytics page; (2) **Terraform state lost after apply** — the runner now writes `crucible_backend_override.tf.json` into the working directory before `init` to force the HTTP backend regardless of whether the user's code declares one; without this, code with no backend block silently wrote state to the ephemeral tmpfs and it vanished on container exit; the stack detail Resources tab now distinguishes a load error from genuinely empty state; (3) **Long-running jobs killed after 1 minute** — River's `JobTimeoutDefault` is 1 minute and was never overridden, so any run exceeding 60 seconds was cancelled with a misleading \"run timed out after 90 minutes\" error; `RunWorker.Timeout()` now returns `-1` to disable River's own timeout while `streamAndWait` manages the real deadline independently from `context.Background()`\n- [x] Bug fix (v0.8.38) — **run log viewer showed wrong run's output**: clicking a run from the dashboard could display log lines from a completely different run; root cause: pgxpool does not `UNLISTEN` when a connection is returned to the pool, so a connection that had subscribed to run A's log channel retained that subscription and delivered buffered notifications from run A when reused for run B's SSE stream; fix: `UNLISTEN *` before `conn.Release()` in the log SSE handler; bonus fix: `pending_approval` runs now correctly serve their stored plan log from object storage instead of opening a live LISTEN that never receives data\n- [x] Per-resource cost breakdown (v0.9.0) — Infracost was producing per-resource detail in its breakdown JSON but the runner discarded all of it and only reported aggregate totals; now the runner parses `.projects[].diff.resources[]` and ships each line item to a new `POST /api/v1/internal/runs/:id/cost-resources` endpoint; the run detail page gains a collapsible *Cost breakdown* section showing each resource's before/after/delta with the biggest movers surfaced first (sorted by absolute delta DESC); 
aggregate cost reporting unchanged for back-compat; migration 073 adds `run_cost_resources` with cascade-delete and a unique constraint on `(run_id, resource_address)`\n- [x] Per-org concurrent-run quota (v0.9.0) — protects shared / MSP deployments from any single org monopolising the worker queue; new `org_quotas` table (NULL = unlimited; orgs without a row default to unlimited); `CheckConcurrentQuota` called from every user-initiated run creation path (UI Create, Trigger Drift, Retrigger, webhook); 429 on overflow with a descriptive message; webhook delivery log records `org_quota_exceeded` so missed triggers are visible; counts queued / preparing / planning / applying / unconfirmed / pending_approval (stuck-awaiting-confirmation runs still consume a slot — intentional); system-internal inserts (drift scheduler, TTL cleanup, auto-remediate from finalize) bypass the cap so scheduled work can't be starved; new **Settings → Resource Quotas** page with current-usage card (count + progress bar at 80% / 100% thresholds), admin-only edit + member status view; migration 074\n- [x] Approval escalation notifications (v0.9.0) — runs sitting in `unconfirmed` or `pending_approval` longer than a per-stack threshold fire a one-time ⏰ escalation notification through the stack's existing notification channels (Slack, Discord, Teams, Gotify, ntfy, email) — useful for paging on-call when an apply has been waiting too long; race-safe via `UPDATE ... 
RETURNING` so two workers can't double-fire; escalation only notifies, the existing `approval_timeout_hours` path still owns auto-discarding stuck runs; per-stack `escalation_after_minutes` input in stack settings; amber `⏰ Escalated · \u003ctime\u003e` line on the run detail page once fired; `run.escalated` audit event with `after_minutes` context; migration 075\n- [x] Major documentation expansion (v0.9.0) — **For IaC beginners:** new `docs/iac-101.md` (what is IaC, plan/apply/state, why Crucible vs raw Terraform), `docs/glossary.md` (every Crucible + Terraform / Pulumi / Ansible / OPA / VCS term in one place), `docs/troubleshooting.md` (user-facing errors — plan failed, state locked, policy denied, webhook silent, OIDC loop, drift); the quickstart's flat *What's next* table is replaced with role-based funnels (Infrastructure / Platform / Security / Power user). **For underdocumented features:** `docs/guides/cli.md` (the `crucible` CLI was fully implemented but invisible), `docs/guides/projects.md` (hierarchical org → project → stack RBAC), `docs/guides/variable-sets.md`, `docs/guides/tags.md`, `docs/guides/external-secrets.md` (AWS SM / HC Vault / Bitwarden SM / Vaultwarden setup). **For new audiences:** `docs/guides/digitalocean.md`, `docs/guides/hetzner.md` (cost-conscious clouds), `docs/guides/kubernetes.md` (cluster-vs-workload split, three auth options, Helm patterns, CRD gotchas), `docs/guides/github-actions.md` (build-and-deploy, PR-preview cleanup, `workflow_dispatch` patterns with both CLI and curl recipes), `docs/guides/tfc-migration.md` (Terraform Cloud → Crucible). 
**Plus** `docs/guides/terragrunt.md` and a `ponack/crucible-quickstart-terragrunt` template repo for the same Use-this-template workflow as the OpenTofu starter\n- [x] Monorepo path filters on triggers (v0.9.1) — per-stack `trigger_paths` glob list evaluated against the union of added / modified / removed files in the push payload; runs only fire when at least one changed file matches at least one glob; default behaviour unchanged when the list is empty; gobwas/glob syntax (`**`, `*`, `?`, character classes); skipped events recorded in `webhook_deliveries` with reason `path_filter_no_match`; GitHub / Gitea / Gogs / GitLab push payloads carry changed-file data inline so the filter applies there; Bitbucket / Azure DevOps push events and all PR events default-allow (their payloads don't include changed files); closes the biggest current gap vs Digger, Atlantis, and Scalr for teams running monorepos; migration 076\n- [x] Sequential approver chains (v0.9.1) — beyond single-approval gating, define an ordered list of `{name, approver_user_ids[]}` on the stack; a run that enters `pending_approval` advances step by step (any approver in step N satisfies it; step N+1 is notified only after step N approves); run advances to `unconfirmed` only when every step has an approval; per-step state stored in `run_chain_approvals` with `UNIQUE (run_id, step_index, approver_id)` to prevent duplicate approvals; new `Notifier.ChainStepReady` broadcasts the next-step prompt through every configured channel (Slack / Discord / Teams / Gotify / ntfy / email); `GET /api/v1/runs/:id/chain` returns the per-step progress for the run detail view; stack edit gets a card-per-step builder; run detail gets a numbered list with status dots (✓ approved, pulsing amber current, grey future); validates that every `approver_user_id` is an active org member on PATCH; empty chain preserves the existing single-approval semantics; migration 078\n- [x] Webhook push policies (v0.9.1) — per-stack 
`skip_commit_message_patterns` (case-sensitive substring match) and `skip_actors` (case-insensitive equality on the webhook sender's login) reject incoming events before they create a run; typical use is silencing `[skip ci]` commits and Dependabot / Renovate bots; actor parsing wired for GitHub (`sender.login`), GitLab (`user_username` / `user.username`), and the GitHub-compatible Gitea / Gogs payloads; Bitbucket and Azure DevOps actor parsing is on the v0.10 roadmap; skipped events show in `webhook_deliveries` with reason `skip_commit_message:\u003cpattern\u003e` or `skip_actor:\u003clogin\u003e` for traceability; migration 077\n- [x] Stack edit form sidebar nav (v0.9.2) — the stack edit form had grown to 30+ fields on a single scroll; replaced with a left-sidebar layout matching the existing `/settings` pattern. Nine sections: General, Runtime, Validation, Approvals, Webhooks, Scheduling, Lifecycle hooks, PR previews, Budget alerts. Same form state, same save semantics — users navigate by topic instead of scrolling. New Stack page unchanged (already minimal)\n- [x] Settings section tabs (v0.9.2) — both the General settings page (8 sections: Account, Runner, Retention, Cloud OIDC, IaC scanning, Infracost, AI, Instance info) and the Notifications page (8 channels: Slack, Discord, Teams, Gotify, ntfy, Email, Approval timeout, VCS defaults) gain horizontal section tabs so one topic is visible at a time; tabs are at the top of the content area rather than a second sidebar (avoids sidebar-in-sidebar inside the outer `/settings` layout)\n- [x] DNS / domain requirements documented (v0.9.2) — new \"Do I need a DNS domain?\" table in `docs/operator-guide.md` with eight scenarios (quickstart on laptop, internal VPN, Let's Encrypt, public VCS webhooks, OIDC SSO, cloud OIDC federation, ChatOps) and per-row reasoning. 
One-liner pointer added to `docs/quickstart.md` Prerequisites so new users land on the answer fast\n- [x] Stack detail page section tabs (v0.9.3) — the stack detail page previously rendered 25 read-only sections in one 3,800-line scroll; now grouped into six tabs at the top: **Overview** (stack details, read-only lifecycle hooks, continuous validation, cost sparkline, recent runs), **State** (resources, state history, plan comparison, remote state sources, external backend, state tokens), **Config** (tags, env vars, variable sets, dependencies), **Policies** (policies + packs), **Notifications \u0026 integrations** (notifications, VCS / secret integrations, GitHub App, Cloud OIDC), and **Webhooks \u0026 access** (webhook URL, deliveries, outgoing, access list, module publishing). Tab strip hidden while editing so the edit-form sidebar from v0.9.2 keeps full width\n- [x] Active-tab theme consistency (v0.9.3) — every navigation surface now reads the Forge theme's `--accent`, `--accent-muted`, and `--accent-border` CSS variables for the active state, replacing the inconsistent mix of hard-coded zinc-800 / amber-400 / amber-300 colours. Affects the outer `/settings` sidebar, the stack edit sidebar, and the General / Notifications horizontal tab strips. Active highlight automatically tracks the chosen theme\n- [x] Supply-chain transparency: SBOM + cosign attestation (v0.9.4) — every tagged release now ships a CycloneDX SBOM for the runner image, generated by `anchore/sbom-action` against the just-built digest. The SBOM is bound to the image digest via `cosign attest --type cyclonedx` (registry-side) and attached to the GitHub Release as a downloadable asset. Pairs with the existing cosign signature + image-digest pin in `RUNNER_DEFAULT_IMAGE`. 
Downstream operators can verify what's inside the container they pull\n- [x] UX polish: typed-confirm destructive flows + auth error detail (v0.9.4) — new reusable `TypedConfirmModal` component matching the look of the stack-destroy modal; wired into Stack delete (type the stack name), Project delete (type the project name), and Run delete (type `DELETE`). Replaces the prior single-click `window.confirm()`. **Plus:** required-field asterisks on the New Stack form (Name / Tool / Repo URL); OIDC `error` / `error_description` URL params surfaced on the auth callback page instead of a generic \"Failed to parse auth token\"; the Settings → Account section is no longer a dead-end card and gets a Sign-out button plus quick links to API tokens and Organization\n- [x] URL-driven tab state (v0.9.4) — every tabbed page wires its active section to a URL query param: `/settings?tab=`, `/settings/notifications?tab=`, `/stacks/\u003cid\u003e?tab=` (detail) and `?edittab=` (edit form). Reload preserves the active tab, bookmarks deep-link to a specific tab, shared links open on the right tab, and browser back/forward moves between tabs. The default tab on each page omits the param to keep the canonical URL clean. Invalid IDs fall back to the default\n- [x] Stack detail header description wrap fix (v0.9.4) — the stack title row stuffed breadcrumb, tool badge, description, and tag pills into one flex container; with many action buttons on the right, descriptions on mid-size stacks would wrap one word per line. Description now sits on its own line below the badge row with `break-words`; badges and tag pills stay on the same row with `flex-shrink-0` so they keep their natural widths\n- [x] SSE log stream disconnect indicator (v0.9.5) — the run detail page streams logs over Server-Sent Events; previously a dropped connection (browser tab backgrounded, network blip, reverse-proxy idle timeout) would silently freeze the log without telling the user the stream had ended. 
Now a yellow banner appears the moment the `EventSource` enters the `CLOSED` state with a manual **Reconnect** button; reconnect re-opens the stream and resumes appending from where it left off. No effect on completed runs (their final log is served from object storage and doesn't need a stream)\n- [x] Skeleton loaders across list and detail pages (v0.9.5) — the bare *Loading…* text on 22 list / detail pages was replaced with content-shaped skeletons (three variants: line, card, and table-row, with a CSS shimmer). Each page picks the variant that matches its content so the loading state telegraphs the real layout instead of a vertical bar of text. Faster perceived loads without changing the data fetch\n- [x] List-page polish — Policies search + Runs filter URL persistence (v0.9.5) — the Policies list gains a client-side search input that filters by name, description, or type with a live `N of M` counter; the Runs list now persists status / type / stack / tag filters in the URL query string so reload, bookmark, and shared links keep the user's filter selection. `goto({ replaceState: true, keepFocus: true, noScroll: true })` keeps the navigation cheap and non-disruptive\n- [x] Bulk delete on the runs list (v0.9.5) — checkbox column with a sticky bulk-action bar that appears once one or more runs are selected; the **Select all** header checkbox selects only terminal runs (finished, failed, canceled, discarded) since those are the only ones the backend will delete. 
Confirmation reuses the typed-confirm modal (type `DELETE`); the modal stays open during the client-side delete loop showing per-item progress and a final toast summarising successes and failures\n- [x] Global keyboard shortcuts (v0.9.5) — vim-style shortcuts: `?` toggles a help modal listing every binding; `g` + `d / p / s / r / l / a / o / ,` jumps to Dashboard / Projects / Stacks / Runs / Policies / Audit / Worker Pools / Settings; `/` focuses the current page's primary search input; `Esc` closes the help modal. A floating pill appears while a `g`-prefix is pending so users see the second key is being awaited (1.5s timeout). Listener ignores keystrokes in input / textarea / contenteditable surfaces, doesn't intercept Ctrl / Cmd / Alt combos, and a `?` button in the sidebar footer makes the feature discoverable\n- [x] Mobile responsiveness (v0.9.6) — the desktop sidebar (224px fixed) used to eat half the screen on a phone, making the app effectively unusable on mobile. Sidebar now collapses to an overlay drawer on `\u003cmd`, opened via a hamburger in a slim mobile-only top bar at the top of `\u003cmain\u003e` (with the Crucible wordmark); drawer auto-closes on route change. List-page card wrappers switched from `overflow-hidden` to `overflow-x-auto overflow-y-hidden` (both axes non-visible so border-radius clipping is preserved) so wide tables scroll horizontally inside the card. Filter rows on /runs and /stacks gain `flex-wrap` so widgets stack instead of squashing. Page padding drops to `p-4` on phones (16px reclaimed per side). Closes the last Tier-D UX item\n- [x] Karpathy-lens code review + cleanup (v0.9.6) — installed the [karpathy-skills plugin](https://github.com/multica-ai/andrej-karpathy-skills) and self-reviewed the codebase. 
Acted on three findings: (1) **dropped unused escape-hatches** in `KeyboardShortcuts.svelte` — `data-no-shortcuts` attribute and `data-shortcut=\"search\"` selector fallback both had zero callers (Karpathy #2 Simplicity First); (2) **extracted the Policies tab** out of the 4,101-line `stacks/[id]/+page.svelte` god component as a sibling component (`stack-detail/PoliciesTab.svelte`), shaving 171 lines / 6 `$state` declarations / 6 imports / 4 helpers / 2 derived from the parent and validating the pattern for the remaining tabs; policies + packs + catalog now lazy-load on tab open instead of pre-fetching on every stack page visit; (3) **deferred** the proposed stack-table schema split — 40 columns in `updateStackReq` look like overload but the cost (6 migrations, JOINs everywhere) outweighs the benefit when the columns aren't actually causing day-to-day pain\n- [x] Backend test scaffold + backfilled tests (v0.9.6) — the existing `runs/statemachine_test.go` had usable DB-integration helpers but they were package-private, so the rest of the codebase (116 .go files, only 8 with tests) couldn't reuse them. New `internal/testutil` package exports `Pool(t)`, `InsertOrg`, `InsertUser`, `InsertStack`, `InsertRun` — all register `t.Cleanup`, skip locally when `TEST_DATABASE_URL` is unset, run against the CI Postgres service container otherwise. Refactored `statemachine_test.go` to use them. Backfilled tests for three recently-shipped features that landed without coverage: `pathsMatchAnyGlob` (path filter, 13 table-driven cases including the \"all-globs-invalid → no-filter\" safety claim), `pushPolicySkipReason` (push policy, 10 cases covering case-sensitivity rules and check ordering), and `chain.go` (sequential approver chain advancement, 6 integration tests covering decode, step advancement, ineligibility rejection, no-op on satisfied chains). 
Going forward, any new feature is one import away from a test\n- [x] Worker pool auto-scaling signals (v0.9.7) — Crucible can't reach the operator's infrastructure to spawn agents, but it can publish per-pool demand signals so the operator's platform (Kubernetes HPA, Compose, CloudWatch) reacts. Three new Prometheus gauges labelled by `pool_id` + `pool_name`: `crucible_worker_pool_queue_depth` (runs in `queued` waiting for an agent), `crucible_worker_pool_running_runs` (runs actively executing on the pool), and `crucible_worker_pool_seen` (1 if any agent checked in within the last 60s). Polled every 30s alongside the existing gauges; `.Reset()` first so deleted pools have their labels cleared on the next tick. Disabled pools excluded from the scrape. `docs/operator-guide.md` gains a Kubernetes HPA recipe (via `prometheus-adapter` external metrics), Docker Compose scaling notes, and a recommended scaling-policy starting point\n- [x] Per-project monthly cost quotas (v0.9.7) — extends the v0.9.0 concurrent-run quota with monthly Infracost-sum caps at the project level. New `monthly_budget_usd` + `budget_enforcement` columns on projects (migration 079) — NULL budget means \"no quota,\" enforcement is `warn` (notify only) or `block` (notify + inhibit auto-apply, run sits in `unconfirmed` for human override). New `MonthToDateSpend` and `CheckCostQuota` helpers wire into the post-plan gate alongside the existing stack-level `plan_block_on_alert`. Project edit form (admin-only) gains the budget input + enforcement dropdown; project header shows a `$X of $Y this month` meter that goes amber at 80% and red at 100%. Spend semantics: only `finished` runs count toward MTD (actuals, not forecasts) to avoid double-counting the run being checked. Per-tag quotas deferred — multi-tag stacks raise budget-ownership ambiguity; revisit after per-project gets real use\n- [x] Compliance evidence export (v0.9.7) — downloadable ZIP bundle for SOC 2 / HIPAA / PCI evidence requests. 
Admin picks a window (and optionally a project + tag filter) at **Settings → Compliance export**; server streams a bundle containing the period's runs, audit events, policy results, and chain approvals in both CSV and JSON, plus a manifest with an HMAC-SHA256 signature so the recipient can verify the bundle wasn't tampered with after generation. Signature uses the existing `CRUCIBLE_SECRET_KEY` so no new operator secret is required (verify via `openssl dgst -sha256 -hmac \"$SECRET\" manifest.json`). Synchronous build for v1; the UI sets the expectation for 10-30s on large windows. Per-stack PDFs, scheduled exports, and async job-queue mode deferred — JSON/CSV covers every documented audit framework requirement\n- [x] Cost forecasting + block-on-forecast (v0.9.7) — pairs with the per-project quota above. New `block_on_forecast` BOOLEAN column on projects (migration 080); when TRUE the post-plan gate also fires on run-rate projections, not just actuals. Forecast model is deliberately simple: `mtd × days_in_month / max(days_elapsed, 1)` — equals MTD on the last day, full linear extrapolation on day 1, no Holt-Winters / seasonality / multi-month trend fit (operators want a predictable rule, not a black box). `CheckCostQuota` gains `Forecast`, `ForecastExceeded`, and `BlockOnForecast` fields; finalize gate emits a forecast-breach notification with text like *\"forecast: $X end-of-month at current rate (budget $Y, MTD $Z)\"* and the existing `budget_enforcement='block'` path inhibits auto-apply. Project header gets a thin pip on the budget bar where the forecast lands plus an \"On pace for $X by month-end\" subline\n- [x] CVE remediation — vitest + cve-lite (v0.9.7) — bumped `vitest` from 2.1.9 to 4.1.0 to resolve GHSA-5xrq-8626-4rwp (CSRF in the dev server's WebSocket handler). Pinned `cve-lite-cli` and added a new CI step (`pnpm exec cve-lite . 
--no-open --fail-on critical`) so future critical CVEs in `package.json` / `package-lock.json` block the PR build instead of being noticed only at the next dependabot run\n- [x] Multi-stack DAG orchestration (v0.9.8) — closes the biggest open workflow gap vs Spacelift's \"Run Configurations.\" Four shipped pieces: **(1) Fan-in coordination** — when an upstream finishes, the downstream only triggers if it has no in-flight run AND every other upstream is \"newer\" than its last run-start (uninitialised upstreams treated as non-blocking so new edges don't deadlock the graph). Replaces the dumb fan-out where N upstreams produced N parallel downstream runs per upstream-finish event. **(2) Per-edge conditional triggers** (migration 081) — single-condition predicate on the upstream's just-finished run: field ∈ {`type`, `plan_add`, `plan_change`, `plan_destroy`, `cost_change`, `is_drift`}, op ∈ {==, !=, \u003e, \u003c, \u003e=, \u003c=}, value. Downstream fires only when predicate matches. State-output-based predicates deferred (would require MinIO state read + caching design). **(3) Per-edge retry policy** (migration 082) — `retry_count` (0–10) + `retry_backoff_seconds` (1–3600); failed dep-triggered runs auto-reschedule with exponential backoff (`delay = backoff × 2^attempt`). New `triggered_by_dep_id` + `retry_attempt` columns on runs let the failure path find the edge and count attempts. 
**(4) Visual DAG view** at `/dependencies` — org-wide read-only graph with hand-rolled layered + barycenter layout (no dagre dependency); nodes are clickable, edges show predicate / retry as inline pill badges, hover highlights adjacent relationships\n- [ ] SBOM + cosign attestation for every release — superseded by the v0.9.4 implementation; left here as a stale roadmap row pending README cleanup\n- [ ] VS Code extension — trigger runs, watch logs, view stack health, and edit `.rego` policies with OPA language-server completion directly in the editor; authenticates with the existing API token; closes the IDE-integration gap vs Spacelift\n- [ ] Multi-region run execution — geographic routing on worker pools so EU stacks run in an EU pool and US stacks in a US pool; metadata-driven routing keys on the pool; useful for data-residency-constrained customers\n- [ ] Private module registry enhancements — module dependency graph visualisation, version-constraint solver across stacks, and a `terraform test` integration that gates module publishing on the test suite passing; pairs with the existing built-in Terraform Module Registry Protocol registry\n\n## License\n\n[AGPL-3.0-or-later](LICENSE) — free to self-host forever. Commercial licenses available for proprietary or SaaS use.\n\n---\n\nBuilt by [Forged in Feathers Technology](https://www.forgedinfeatherstechnology.com) · [Crucible IAP product page](https://www.forgedinfeatherstechnology.com/crucible-iap)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fponack%2Fcrucible-iap","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fponack%2Fcrucible-iap","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fponack%2Fcrucible-iap/lists"}