{"id":49191452,"url":"https://github.com/opencitations/oc_db_kyoo","last_synced_at":"2026-04-23T07:01:29.129Z","repository":{"id":341654539,"uuid":"1168660228","full_name":"opencitations/oc_db_kyoo","owner":"opencitations","description":null,"archived":false,"fork":false,"pushed_at":"2026-03-11T11:52:53.000Z","size":62,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-11T18:02:10.172Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/opencitations.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-27T16:44:52.000Z","updated_at":"2026-03-11T11:52:42.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/opencitations/oc_db_kyoo","commit_stats":null,"previous_names":["opencitations/oc_db_kyoo"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/opencitations/oc_db_kyoo","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opencitations%2Foc_db_kyoo","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opencitations%2Foc_db_kyoo/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opencitations%2Foc_db_kyoo/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opencitations%2Foc_db_kyoo/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/opencitations","download_url":"https://codeload.github.com/opencitations/oc_db_kyoo/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opencitations%2Foc_db_kyoo/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32169657,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-23T02:19:40.750Z","status":"ssl_error","status_checked_at":"2026-04-23T02:17:55.737Z","response_time":53,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-04-23T07:01:27.181Z","updated_at":"2026-04-23T07:01:29.023Z","avatar_url":"https://github.com/opencitations.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# oc_db_kyoo\n\nDatabase Queue Manager for OpenCitations. Sits between the caching layer (Varnish/Redis) and the database backends, managing per-backend request queuing, concurrency limiting, circuit breaking, and least-queue load balancing.\n\n## Overview\n\n**oc_db_kyoo** (pronounced like \"queue\") is an async HTTP reverse proxy that protects database backends from overload by:\n\n- Limiting concurrent requests per backend\n- Queuing excess requests with configurable depth and timeout\n- Routing requests to the least-loaded backend (least-queue strategy)\n- **Circuit breaker** with three states (CLOSED → OPEN → HALF_OPEN) to fast-fail when a backend is unreachable\n- **Active health checking** to detect and confirm backend recovery automatically\n- **Two-tier routing** with a fallback pool that activates when all primary backends are down\n- **Real-time dashboard** for monitoring backend state, circuit status, and queue metrics\n- Returning a friendly \"backend busy\" page when all backends are saturated\n\nEach instance of oc_db_kyoo manages **one type** of database (e.g., all Virtuoso instances, or all QLever instances). Each backend within that instance gets its own independent queue and circuit breaker.\n\n## Architecture\n\n```\nService (oc-api, oc-sparql, oc-search, ...)\n  → Varnish (HTTP cache)\n    → Redis (secondary cache)\n      → oc_db_kyoo (Virtuoso)  → [virtuoso-1, virtuoso-2, ...]  (primary pool)\n                                → [virtuoso-fb-1, ...]           (fallback pool)\n      → oc_db_kyoo (QLever)    → [qlever-1, qlever-2, ...]      (primary pool)\n                                → [qlever-fb-1, ...]             (fallback pool)\n```\n\n## Configuration\n\n### conf.json\n\n```json\n{\n  \"listen_port\": 8080,\n  \"log_level\": \"info\",\n  \"backends\": [\n    {\n      \"name\": \"virtuoso-1\",\n      \"host\": \"virtuoso-1.default.svc.cluster.local\",\n      \"port\": 8890,\n      \"path\": \"/sparql\"\n    },\n    {\n      \"name\": \"virtuoso-2\",\n      \"host\": \"virtuoso-2.default.svc.cluster.local\",\n      \"port\": 8890,\n      \"path\": \"/sparql\"\n    },\n    {\n      \"name\": \"virtuoso-3\",\n      \"host\": \"virtuoso-3.default.svc.cluster.local\",\n      \"port\": 8890,\n      \"path\": \"/sparql\"\n    }\n  ],\n  \"max_concurrent_per_backend\": 20,\n  \"max_queue_per_backend\": 100,\n  \"queue_timeout\": 60,\n  \"backend_timeout\": 60,\n  \"circuit_breaker_threshold\": 3,\n  \"circuit_breaker_recovery_time\": 15,\n  \"health_check_interval\": 10,\n  \"health_check_timeout\": 5,\n  \"health_check_query\": \"ASK WHERE { ?s ?p ?o }\",\n  \"fallback_backends\": [\n    {\n      \"name\": \"virtuoso-fb-1\",\n      \"host\": \"virtuoso-fb-1.default.svc.cluster.local\",\n      \"port\": 8890,\n      \"path\": \"/sparql\"\n    }\n  ],\n  \"fallback_max_concurrent_per_backend\": 3,\n  \"fallback_max_queue_per_backend\": 20,\n  \"fallback_queue_timeout\": 120,\n  \"fallback_backend_timeout\": 120\n}\n```\n\n### Parameter Reference\n\n| Parameter | Description | Default |\n|---|---|---|\n| `listen_port` | Port the service listens on | `8080` |\n| `log_level` | Logging level (debug, info, warning, error) | `info` |\n| `backends` | List of primary database backends | — |\n| `max_concurrent_per_backend` | Max simultaneous requests sent to each primary backend | `10` |\n| `max_queue_per_backend` | Max requests waiting in queue per primary backend | `50` |\n| `queue_timeout` | Max seconds a request can wait in queue before being dropped | `120` |\n| `backend_timeout` | Max seconds to wait for a primary backend response | `900` |\n| `circuit_breaker_threshold` | Consecutive connection failures before opening the circuit | `3` |\n| `circuit_breaker_recovery_time` | Seconds to wait before probing an OPEN backend | `15` |\n| `health_check_interval` | Seconds between health check probes on OPEN/HALF_OPEN backends | `10` |\n| `health_check_timeout` | Timeout in seconds for each health check probe | `5` |\n| `health_check_query` | SPARQL query used for health probes | `ASK WHERE { ?s ?p ?o }` |\n| `fallback_backends` | List of fallback backends (activated when all primaries are down) | `[]` |\n| `fallback_max_concurrent_per_backend` | Max simultaneous requests per fallback backend | `3` |\n| `fallback_max_queue_per_backend` | Max queue depth per fallback backend | `20` |\n| `fallback_queue_timeout` | Max seconds a request waits in fallback queue | `120` |\n| `fallback_backend_timeout` | Max seconds to wait for a fallback backend response | `120` |\n\n### Environment Variables\n\nEnvironment variables override `conf.json` values (Docker/Kubernetes pattern):\n\n```env\nLISTEN_PORT=8080\nLOG_LEVEL=info\n\n# Queue settings\nMAX_CONCURRENT_PER_BACKEND=10\nMAX_QUEUE_PER_BACKEND=250\nQUEUE_TIMEOUT=180\nBACKEND_TIMEOUT=60\n\n# Circuit breaker\nCIRCUIT_BREAKER_THRESHOLD=3\nCIRCUIT_BREAKER_RECOVERY_TIME=15\n\n# Health check\nHEALTH_CHECK_INTERVAL=10\nHEALTH_CHECK_TIMEOUT=5\nHEALTH_CHECK_QUERY=ASK WHERE { ?s ?p ?o }\n\n# Primary backends (BACKEND_N_*)\nBACKEND_0_NAME=virtuoso-1\nBACKEND_0_HOST=virtuoso-1.default.svc.cluster.local\nBACKEND_0_PORT=8890\nBACKEND_0_PATH=/sparql\nBACKEND_1_NAME=virtuoso-2\nBACKEND_1_HOST=virtuoso-2.default.svc.cluster.local\nBACKEND_1_PORT=8890\nBACKEND_1_PATH=/sparql\n\n# Fallback backends (FALLBACK_N_*)\nFALLBACK_MAX_CONCURRENT_PER_BACKEND=3\nFALLBACK_MAX_QUEUE_PER_BACKEND=20\nFALLBACK_QUEUE_TIMEOUT=120\nFALLBACK_BACKEND_TIMEOUT=120\nFALLBACK_0_NAME=virtuoso-fb-1\nFALLBACK_0_HOST=virtuoso-fb-1.default.svc.cluster.local\nFALLBACK_0_PORT=8890\nFALLBACK_0_PATH=/sparql\n```\n\nThe service discovers backends by scanning `BACKEND_N_HOST` and `FALLBACK_N_HOST` env vars (N=0,1,2,...). If env vars exist for a pool, they replace the corresponding `conf.json` entries entirely.\n\n## Circuit Breaker\n\nEach backend has an independent circuit breaker with three states:\n\n```\n        ┌─ 2nd probe ok ────────────────────────┐\n        │                                        │\n    CLOSED ──N failures──▶ OPEN ──probe ok──▶ HALF_OPEN\n        ▲                   ▲                    │\n        │                   └── probe failed ────┘\n        │                                        │\n        └──── 2nd probe ok ─────────────────────┘\n```\n\n- **CLOSED** — healthy, traffic flows normally.\n- **OPEN** — backend is unreachable. All queued requests are drained immediately (no waiting for `queue_timeout`). New requests skip this backend. A background health checker probes at regular intervals.\n- **HALF_OPEN** — first health probe succeeded, waiting for confirmation. No user traffic is routed to this backend. The health checker sends a second probe after one interval. If it succeeds → CLOSED. If it fails → back to OPEN.\n\nRecovery is driven entirely by the health checker — no real user requests are involved in the recovery process.\n\nConnection-level failures that trigger the circuit breaker include `ConnectError`, `ConnectTimeout`, `ReadTimeout`, `WriteTimeout`, and any other transport-level `httpx.TimeoutException`. HTTP 4xx/5xx responses do **not** trip the breaker (the database process is alive).\n\n## Fallback Pool\n\nWhen **all** primary backends have their circuit OPEN, the fallback pool activates:\n\n- Fallback backends have their own independent concurrency limits, queue depth, and timeouts.\n- Each fallback backend has its own circuit breaker.\n- As soon as any primary backend recovers (circuit returns to CLOSED), traffic routes back to the primary pool.\n\nThis two-tier design ensures that reserve capacity is available during full primary outages without affecting normal-operation routing.\n\n## Health Checker\n\nA background task runs every `health_check_interval` seconds and probes all OPEN and HALF_OPEN backends (both primary and fallback) by sending a real SPARQL query:\n\n```\nGET \u003cbackend_url\u003e?query=ASK WHERE { ?s ?p ?o }\nAccept: application/sparql-results+json\n```\n\nThe health checker drives the full recovery process:\n\n1. **OPEN backend**: If the probe gets a response with status \u003c 500, the backend transitions to HALF_OPEN.\n2. **HALF_OPEN backend**: If the probe gets a response with status \u003c 500, the backend transitions to CLOSED (confirmed recovery). If the probe fails, the backend goes back to OPEN.\n\nThis two-step confirmation ensures a backend is reliably responding before it receives user traffic again.\n\n\u003e **Note**: Virtuoso requires `ASK WHERE { ?s ?p ?o }` without `LIMIT`. QLever backends may use a different query via `HEALTH_CHECK_QUERY`.\n\n## Timeout Logging\n\nBackend timeouts are logged to a dedicated file `timeout_requests.log` with full detail including the SPARQL query text, client IP, and user agent, separated clearly for analysis.\n\n## Endpoints\n\n| Endpoint | Description |\n|---|---|\n| `/{path}` | Proxy — all requests are forwarded to the least-loaded backend |\n| `/health` | Liveness/readiness probe: 200 if at least one CLOSED backend (primary or fallback) can accept requests, 503 otherwise |\n| `/status` | Detailed per-backend queue and circuit breaker statistics (JSON) |\n| `/dashboard` | Real-time monitoring dashboard with auto-refresh (HTML) |\n\n### /status response example\n\n```json\n{\n  \"status\": \"ok\",\n  \"all_primaries_down\": false,\n  \"backends\": [\n    {\n      \"name\": \"virtuoso-1\",\n      \"active_requests\": 3,\n      \"queued_requests\": 0,\n      \"total_requests\": 1250,\n      \"total_completed\": 1245,\n      \"total_errors\": 2,\n      \"total_timeouts\": 3,\n      \"total_rejected\": 0,\n      \"total_circuit_breaks\": 0,\n      \"avg_response_time_ms\": 245.67,\n      \"circuit_state\": \"closed\"\n    },\n    {\n      \"name\": \"virtuoso-2\",\n      \"active_requests\": 5,\n      \"queued_requests\": 2,\n      \"total_requests\": 1180,\n      \"total_completed\": 1175,\n      \"total_errors\": 1,\n      \"total_timeouts\": 4,\n      \"total_rejected\": 0,\n      \"total_circuit_breaks\": 1,\n      \"avg_response_time_ms\": 312.45,\n      \"circuit_state\": \"closed\"\n    }\n  ],\n  \"fallback_backends\": [\n    {\n      \"name\": \"virtuoso-fb-1\",\n      \"active_requests\": 0,\n      \"queued_requests\": 0,\n      \"total_requests\": 0,\n      \"total_completed\": 0,\n      \"total_errors\": 0,\n      \"total_timeouts\": 0,\n      \"total_rejected\": 0,\n      \"total_circuit_breaks\": 0,\n      \"avg_response_time_ms\": 0.0,\n      \"circuit_state\": \"closed\"\n    }\n  ]\n}\n```\n\n### /dashboard\n\nThe dashboard shows color-coded cards for each backend with real-time circuit state, queue meters, and request counters. Fallback backends are displayed in a separate section with an activation banner indicating whether the fallback pool is active or on standby. Auto-refreshes every 2 seconds.\n\n## How it works\n\n1. A request arrives at oc_db_kyoo\n2. The **least-queue router** selects the backend with the lowest total load (active + queued) from the primary pool\n3. Only backends with a **CLOSED** circuit receive user traffic\n4. If the selected backend has capacity, the request is forwarded immediately\n5. If the backend is at max concurrency, the request enters the backend's queue\n6. If the queue is full, other available CLOSED backends (including fallback if all primaries are down) are tried\n7. If the circuit **opens** while requests are queued, they are **drained immediately** instead of waiting for the full queue timeout\n8. If all backends are saturated or down, a **503 \"Backend Busy\"** page is returned\n9. A **health checker** probes OPEN and HALF_OPEN backends periodically using a real SPARQL query\n10. Recovery requires two consecutive successful probes: OPEN → HALF_OPEN (first probe OK) → CLOSED (second probe OK). No user traffic is involved in recovery\n\n## Running\n\n### Local development\n\n```bash\n# Install uv (if not already installed)\ncurl -LsSf https://astral.sh/uv/install.sh | sh\n\n# Install dependencies and create virtual environment\nuv sync\n\n# Run the application\nuv run python app.py\n\n# Or with a custom port\nuv run python app.py --port 9090\n```\n\n### Docker\n\nThe image version is read from `pyproject.toml` (single source of truth). To publish a new image on DockerHub, update the `version` field in `pyproject.toml` and push to `main` — the GitHub Actions workflow will build and push the new tag automatically, skipping the build if that version already exists.\n\n```bash\nVERSION=$(grep -m1 '^version' pyproject.toml | cut -d'\"' -f2)\n\ndocker build -t opencitations/oc_db_kyoo:$VERSION .\ndocker run -p 8080:8080 \\\n  -e MAX_CONCURRENT_PER_BACKEND=5 \\\n  -e BACKEND_0_NAME=db1 \\\n  -e BACKEND_0_HOST=localhost \\\n  -e BACKEND_0_PORT=8890 \\\n  -e BACKEND_0_PATH=/sparql \\\n  opencitations/oc_db_kyoo:$VERSION\n```\n\n### Kubernetes\n\nSee `manifest-example.yaml` and `.env.example` for deployment templates.\n\n## Tech Stack\n\n- **Python 3.11+**\n- **[uv](https://docs.astral.sh/uv/)** — fast Python package manager\n- **FastAPI** + **uvicorn** — async HTTP framework\n- **httpx** — async HTTP client for backend forwarding\n- **asyncio.Semaphore** — concurrency control per backend\n- **Pydantic** — configuration validation","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopencitations%2Foc_db_kyoo","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fopencitations%2Foc_db_kyoo","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopencitations%2Foc_db_kyoo/lists"}