{"id":47978170,"url":"https://github.com/runcycles/cycles-server-events","last_synced_at":"2026-04-26T15:01:16.734Z","repository":{"id":348371406,"uuid":"1197723750","full_name":"runcycles/cycles-server-events","owner":"runcycles","description":"Event delivery service for the Cycles ecosystem — asynchronous webhook dispatch with HMAC signing, retry, and pluggable transports","archived":false,"fork":false,"pushed_at":"2026-04-23T14:05:22.000Z","size":261,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-23T15:34:09.926Z","etag":null,"topics":["ai-agents","ai-governance","cycles-protocol","docker","event-driven","observability","webhook-server"],"latest_commit_sha":null,"homepage":"https://runcycles.io","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/runcycles.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":"AUDIT.md","citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-31T20:35:19.000Z","updated_at":"2026-04-23T14:05:17.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/runcycles/cycles-server-events","commit_stats":null,"previous_names":["runcycles/cycles-server-events"],"tags_count":9,"template":false,"template_full_name":null,"purl":"pkg:github/runcycles/cycles-server-events","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/runcycles%2Fcycles-server-events","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/runcycles%2Fcycles-server-events/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/runcycles%2Fcycles-server-events/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/runcycles%2Fcycles-server-events/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/runcycles","download_url":"https://codeload.github.com/runcycles/cycles-server-events/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/runcycles%2Fcycles-server-events/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32301330,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-26T09:34:17.070Z","status":"ssl_error","status_checked_at":"2026-04-26T09:34:00.993Z","response_time":129,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agents","ai-governance","cycles-protocol","docker","event-driven","observability","webhook-server"],"created_at":"2026-04-04T10:58:49.169Z","updated_at":"2026-04-26T15:01:16.724Z","avatar_url":"https://github.com/runcycles.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![CI](https://github.com/runcycles/cycles-server-events/actions/workflows/ci.yml/badge.svg)](https://github.com/runcycles/cycles-server-events/actions)\n[![License](https://img.shields.io/badge/license-Apache%202.0-blue)](LICENSE)\n[![Coverage](https://img.shields.io/badge/coverage-95%25+-brightgreen)](https://github.com/runcycles/cycles-server-events/actions)\n\n# Runcycles Event Server\n\nEvent delivery service for the Cycles ecosystem. Consumes events from Redis and delivers them to webhook endpoints with HMAC-SHA256 signing, exponential backoff retry, auto-disable on consecutive failures, and AES-256-GCM secret encryption at rest.\n\n**Spec:** [complete-budget-governance-v0.1.25.yaml](https://github.com/runcycles/cycles-server-admin/blob/main/complete-budget-governance-v0.1.25.yaml)\n\n## Architecture\n\n```\ncycles-server-admin                    cycles-server (runtime)\n    │ tenant/budget CRUD,                  │ reservation ops,\n    │ api_key/policy lifecycle             │ budget thresholds,\n    │                                      │ rate spike detection\n    │                                      │\n    └──────────┐              ┌────────────┘\n               ▼              ▼\n         EventService.emit() → save event + find matching subscriptions\n         WebhookDispatchService → create PENDING delivery + LPUSH dispatch:pending\n               │\n               ▼\nRedis ──BRPOP──► cycles-server-events (DispatchLoop)\n                    │\n                    ├── DeliveryHandler: load delivery + event + subscription\n                    ├── SubscriptionRepository: decrypt signing secret (AES-256-GCM)\n                    ├── WebhookTransport: HTTP POST with HMAC-SHA256 signature\n                    ├── On success: mark SUCCESS, reset consecutive failures\n                    ├── On failure + retries left: exponential backoff → RETRYING\n                    ├── On failure + retries exhausted: FAILED + increment consecutive failures\n                    └── On consecutive failures \u003e= threshold: subscription → DISABLED\n```\n\nEvent sources (per spec `source` field): `cycles-admin`, `cycles-server`, `expiry-sweeper`, `anomaly-detector`\n\n### Why a separate service?\n\n| Concern | Admin / Runtime Servers | Events Service |\n|---------|------------------------|----------------|\n| Workload | Synchronous CRUD + reservation ops | Asynchronous delivery, variable latency |\n| Scaling | Scale with API traffic | Scale with webhook volume |\n| Failure isolation | Servers stay responsive during delivery backlog | Delivery retries don't block API |\n| Concurrency | Multiple instances | Multiple instances safe (BRPOP is atomic) |\n\n## Quick Start\n\n### Full stack (with admin + runtime server)\n\n```bash\n# From cycles-server-admin directory\ndocker compose -f docker-compose.full-stack.yml up\n```\n\nServices: Redis (6379), Admin (7979), Runtime Server (7878), Events (7980)\n\n### Standalone (requires existing Redis)\n\n```bash\nREDIS_HOST=localhost REDIS_PORT=6379 java -jar target/cycles-server-events-*.jar\n```\n\n## Configuration\n\n| Variable | Default | Description |\n|----------|---------|-------------|\n| `REDIS_HOST` | localhost | Redis hostname |\n| `REDIS_PORT` | 6379 | Redis port |\n| `REDIS_PASSWORD` | (empty) | Redis password |\n| `WEBHOOK_SECRET_ENCRYPTION_KEY` | (empty) | AES-256-GCM key for signing secret encryption (base64-encoded 32 bytes). If empty, secrets stored/read as plaintext (backward compatible). |\n| `dispatch.pending.timeout-seconds` | 5 | BRPOP blocking timeout (seconds) |\n| `dispatch.retry.poll-interval-ms` | 5000 | Retry queue poll interval (ms) |\n| `RETRY_BATCH_SIZE` | 100 | Max retries to requeue per poll cycle |\n| `dispatch.http.timeout-seconds` | 30 | HTTP request timeout for webhook delivery |\n| `dispatch.http.connect-timeout-seconds` | 5 | HTTP connect timeout |\n| `MAX_DELIVERY_AGE_MS` | 86400000 | Maximum delivery age (ms). Deliveries older than this after events service outage are auto-failed instead of delivered stale. Default: 24 hours. |\n| `EVENT_TTL_DAYS` | 90 | Redis TTL for `event:{id}` keys (days). Spec: \"90 days hot.\" |\n| `DELIVERY_TTL_DAYS` | 14 | Redis TTL for `delivery:{id}` keys (days). |\n| `RETENTION_CLEANUP_INTERVAL_MS` | 3600000 | How often to trim expired ZSET index entries (ms). Default: 1 hour. |\n\n### Generating the encryption key\n\n```bash\nopenssl rand -base64 32\n```\n\nThe same key must be configured in both `cycles-server-admin` and `cycles-server-events`. Admin encrypts secrets on write; events decrypts on read.\n\n## Signing Secret Lifecycle\n\n```\n1. Client creates subscription (POST /v1/admin/webhooks)\n   └── optionally provides signing_secret, or admin auto-generates one (whsec_...)\n\n2. Admin stores encrypted secret in Redis\n   └── webhook:secret:{subscriptionId} = AES-256-GCM(secret, WEBHOOK_SECRET_ENCRYPTION_KEY)\n   └── Returns plaintext secret to client ONCE in WebhookCreateResponse (never again)\n\n3. Events service reads + decrypts secret on each delivery\n   └── CryptoService.decrypt(redis.get(\"webhook:secret:{id}\"))\n   └── Backward compatible: plaintext secrets (no \"enc:\" prefix) returned as-is\n\n4. PayloadSigner computes HMAC-SHA256(JSON payload, decrypted secret)\n   └── Sent as X-Cycles-Signature: sha256=\u003chex\u003e header\n\n5. Webhook receiver verifies signature using their copy of the secret\n```\n\n## HMAC-SHA256 Signature Verification\n\nThe `X-Cycles-Signature` header contains `sha256=\u003chex\u003e` where `\u003chex\u003e` is the HMAC-SHA256 of the raw JSON request body using the subscription's signing secret as the key.\n\n**Why HMAC?** Without it, anyone who discovers a webhook URL can send fake events. HMAC proves both identity (shared secret) and integrity (body hash). Same standard used by GitHub, Stripe, and Slack webhooks.\n\n**Verification example (Python):**\n\n```python\nimport hmac, hashlib\n\ndef verify(body: bytes, secret: str, signature: str) -\u003e bool:\n    expected = \"sha256=\" + hmac.new(\n        secret.encode(), body, hashlib.sha256\n    ).hexdigest()\n    return hmac.compare_digest(expected, signature)\n```\n\n## Webhook Delivery Headers\n\n| Header | Value | Description |\n|--------|-------|-------------|\n| `Content-Type` | `application/json` | Always JSON |\n| `X-Cycles-Signature` | `sha256=\u003chex\u003e` | HMAC-SHA256 of body (if signing secret configured) |\n| `X-Cycles-Event-Id` | `evt_abc123...` | For deduplication (at-least-once delivery) |\n| `X-Cycles-Event-Type` | `budget.exhausted` | Event type for routing |\n| `User-Agent` | `cycles-server-events/0.1.25.8` | Service identifier |\n| `X-Cycles-Trace-Id` | `\u003c32-hex-lowercase\u003e` | W3C trace-id (spec v0.1.25.27) — always present |\n| `traceparent` | `00-\u003ctrace-id\u003e-\u003c16-hex-span\u003e-\u003cflags\u003e` | W3C Trace Context v00 — always present. `\u003cflags\u003e` preserves upstream sampling when `WebhookDelivery.traceparent_inbound_valid=true` (spec v0.1.25.28), else `01` |\n| `X-Request-Id` | `\u003crequest-id\u003e` | Originating HTTP request id — present when `event.request_id` is populated |\n| Custom headers | Per subscription | From `WebhookSubscription.headers` map |\n\n## Retry Policy\n\nDefault: 5 retries with exponential backoff (1s, 2s, 4s, 8s, 16s), capped at 60s max delay.\n\n| Setting | Default | Range | Description |\n|---------|---------|-------|-------------|\n| `max_retries` | 5 | 0-10 | Retries after initial failure (6 total attempts) |\n| `initial_delay_ms` | 1000 | 100-60000 | Delay before first retry |\n| `backoff_multiplier` | 2.0 | 1.0-10.0 | Multiplier per retry |\n| `max_delay_ms` | 60000 | 1000-3600000 | Maximum delay cap |\n\nAuto-disable: after `disable_after_failures` (default 10) consecutive delivery failures, the subscription status is set to `DISABLED`. Reset to 0 on any successful delivery.\n\n## Delivery Status Lifecycle\n\n```\nPENDING ──HTTP 2xx──► SUCCESS (reset consecutive_failures)\n    │\n    └──non-2xx──► RETRYING ──retry──► SUCCESS\n                      │\n                      └──max retries exceeded──► FAILED\n                                                    │\n                                                    └──consecutive \u003e= threshold──► subscription DISABLED\n```\n\n## Redis Keys (shared with cycles-server-admin)\n\n| Key | Type | Written By | Read By | Description |\n|-----|------|-----------|---------|-------------|\n| `dispatch:pending` | LIST | Admin (LPUSH) | Events (BRPOP) | Delivery IDs awaiting processing |\n| `dispatch:retry` | ZSET | Events (ZADD) | Events (ZRANGEBYSCORE) | Retry queue (score = timestamp) |\n| `delivery:{id}` | STRING | Admin (SET) | Events (GET/SET) | Delivery record JSON |\n| `event:{id}` | STRING | Admin (SET) | Events (GET) | Event record JSON |\n| `webhook:{id}` | STRING | Admin (SET) | Events (GET/SET) | Subscription JSON |\n| `webhook:secret:{id}` | STRING | Admin (SET, encrypted) | Events (GET, decrypts) | AES-256-GCM encrypted signing secret |\n\n### Concurrent safety\n\nMultiple events service instances can safely BRPOP from the same `dispatch:pending` list — BRPOP is atomic, so each delivery is processed by exactly one consumer. No distributed locking needed.\n\n### TTL and retention\n\n| Key | TTL | Cleanup |\n|-----|-----|---------|\n| `event:{id}` | 90 days (configurable) | Auto-expire via Redis EXPIRE |\n| `delivery:{id}` | 14 days (configurable) | Auto-expire via Redis EXPIRE |\n| `events:{tenantId}`, `events:_all` | N/A (ZSET) | Hourly trim via RetentionCleanupService |\n| `deliveries:{subId}` | N/A (ZSET) | Hourly trim via RetentionCleanupService |\n| `dispatch:pending` | Self-draining | Consumed by BRPOP |\n| `dispatch:retry` | Self-draining | Entries move to pending when ready |\n\n### Resilience: events service down\n\nIf `cycles-server-events` is not running or not deployed:\n\n1. **Admin and runtime servers are unaffected** — event emission is fire-and-forget, wrapped in try-catch, never blocks the API response\n2. **Events and deliveries accumulate in Redis** — `event:{id}` keys (90-day TTL), `delivery:{id}` keys (14-day TTL), `dispatch:pending` list grows\n3. **Redis memory is bounded** — TTLs ensure keys auto-expire even if never consumed\n4. **When the events service restarts:**\n   - Stale deliveries (older than `MAX_DELIVERY_AGE_MS`, default 24h) are immediately marked FAILED — they won't be delivered late\n   - Fresh deliveries are processed normally via BRPOP\n   - RetentionCleanupService trims orphaned ZSET index entries hourly\n5. **No data loss for events** — event records persist in Redis for 90 days regardless of delivery status\n\n### Admin dual-auth on tenant webhook endpoints (informational)\n\nAs of admin-spec v0.1.25.16, six tenant-scoped webhook REST endpoints on `cycles-server-admin` accept both `ApiKeyAuth` and `AdminKeyAuth`:\n`GET /v1/webhooks`, `GET/PATCH/DELETE /v1/webhooks/{id}`, `POST /v1/webhooks/{id}/test`, `GET /v1/webhooks/{id}/deliveries`.\nAdmin-initiated updates record `actor_type=admin_on_behalf_of` in audit metadata (vs `api_key` for tenant-initiated).\n\n**No functional impact on this service** — `cycles-server-events` reads subscriptions from Redis directly and does not call those admin HTTP endpoints. Noted here for observability and ops awareness.\n\n## Event Types (41)\n\n| Category | Count | Types |\n|----------|-------|-------|\n| `budget` | 16 | created, updated, funded, debited, reset, **reset_spent**, debt_repaid, frozen, unfrozen, closed, threshold_crossed, exhausted, over_limit_entered, over_limit_exited, debt_incurred, burn_rate_anomaly |\n| `reservation` | 5 | denied, denial_rate_spike, expired, expiry_rate_spike, commit_overage |\n| `tenant` | 6 | created, updated, suspended, reactivated, closed, settings_changed |\n| `api_key` | 6 | created, revoked, expired, permissions_changed, auth_failed, auth_failure_rate_spike |\n| `policy` | 3 | created, updated, deleted |\n| `system` | 5 | store_connection_lost, store_connection_restored, high_latency, webhook_delivery_failed, webhook_test |\n\n`budget.reset_spent` (v0.1.25.6, admin-spec v0.1.25.18) is emitted for billing-period rollovers and is distinct from `budget.reset` (which is a ceiling resize that preserves spent). Consumers can route these separately. The payload's `spent_override_provided` flag indicates whether `spent` was explicitly supplied (migration / proration / correction) vs defaulted to 0 (routine rollover).\n\n## Transport Layer\n\nPluggable transport interface. Currently implements `webhook` (HTTP POST).\n\n```java\npublic interface Transport {\n    String type();\n    TransportResult deliver(Event event, Subscription subscription, String signingSecret);\n}\n```\n\n## Monitoring\n\nSpring Actuator endpoints run on a **separate management port (9980)** so they are not reachable from the public API port (7980). Keep 9980 on an internal-only ClusterIP / network; scrape from there.\n\n| Endpoint | Description |\n|----------|-------------|\n| `GET :9980/actuator/health` | Liveness check (UP/DOWN) |\n| `GET :9980/actuator/info` | Build info (version, artifact) |\n| `GET :9980/actuator/prometheus` | Prometheus-format metrics for scraping |\n\nPrometheus scrape config example:\n\n```yaml\nscrape_configs:\n  - job_name: cycles-server-events\n    metrics_path: /actuator/prometheus\n    static_configs:\n      - targets: ['localhost:9980']\n```\n\nOverride the management port via the `MANAGEMENT_PORT` env var if 9980 collides.\n\nIn addition to Spring Boot's auto-emitted `http_server_requests_seconds` (which covers the actuator endpoints, not the outbound webhook traffic), this service exposes eight domain-level meters under the `cycles_webhook_*` namespace — seven counters plus one latency timer. Operators can alert on fleet-wide failure rates, stale-delivery backlogs, subscription auto-disables, and payload-validator warnings without grepping logs.\n\nFull metric inventory, tag semantics, ready-to-paste Prometheus alert rules, SLO definitions, dashboard queries, and an incident playbook live in [`OPERATIONS.md`](OPERATIONS.md).\n\n## Webhook Payload Example\n\nThe webhook POST body is the full event JSON. Null fields are omitted.\n\n```json\n{\n  \"event_id\": \"evt_abc123\",\n  \"event_type\": \"budget.exhausted\",\n  \"category\": \"budget\",\n  \"timestamp\": \"2026-04-01T12:00:00Z\",\n  \"tenant_id\": \"t_xyz789\",\n  \"source\": \"runtime\",\n  \"data\": {\n    \"budget_id\": \"bdg_001\",\n    \"current_balance\": 0,\n    \"limit\": 10000\n  },\n  \"actor\": {\n    \"type\": \"api_key\",\n    \"key_id\": \"key_abc\",\n    \"source_ip\": \"10.0.1.42\"\n  },\n  \"correlation_id\": \"req_def456\"\n}\n```\n\n## Build \u0026 Test\n\n```bash\n# Build and run unit tests (201 unit tests, 95%+ line coverage enforced by JaCoCo)\nmvn verify\n\n# Run all tests including integration (requires Docker for Testcontainers Redis)\nmvn verify -Pintegration-tests\n\n# Run\nREDIS_HOST=localhost REDIS_PORT=6379 java -jar target/cycles-server-events-*.jar\n```\n\n## Documentation\n\n- [`CHANGELOG.md`](CHANGELOG.md) — release notes for downstream consumers (Docker / JAR / operators)\n- [`OPERATIONS.md`](OPERATIONS.md) — operator runbook: metrics inventory, alert recipes, SLOs, incident playbook\n- [`AUDIT.md`](AUDIT.md) — engineering history, audit posture, and cross-repo drift notes\n- Sibling services (same conventions, dashboards carry over):\n  - [`cycles-server`](https://github.com/runcycles/cycles-server) — runtime reservation + budget authority\n  - [`cycles-server-admin`](https://github.com/runcycles/cycles-server-admin) — admin plane (tenants, budgets, webhooks, API keys)\n\n## License\n\nApache License 2.0\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fruncycles%2Fcycles-server-events","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fruncycles%2Fcycles-server-events","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fruncycles%2Fcycles-server-events/lists"}