{"id":35411558,"url":"https://github.com/cloudnativeworks/elchi-gslb","last_synced_at":"2026-05-04T08:04:12.676Z","repository":{"id":331432581,"uuid":"1125088269","full_name":"CloudNativeWorks/elchi-gslb","owner":"CloudNativeWorks","description":null,"archived":false,"fork":false,"pushed_at":"2026-02-06T20:53:15.000Z","size":221,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-02-06T23:04:21.245Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/CloudNativeWorks.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-12-30T06:00:19.000Z","updated_at":"2026-02-06T20:49:41.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/CloudNativeWorks/elchi-gslb","commit_stats":null,"previous_names":["cloudnativeworks/elchi-gslb"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/CloudNativeWorks/elchi-gslb","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CloudNativeWorks%2Felchi-gslb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CloudNativeWorks%2Felchi-gslb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CloudNativeWorks%2Felchi-gslb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CloudNativeWor
ks%2Felchi-gslb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/CloudNativeWorks","download_url":"https://codeload.github.com/CloudNativeWorks/elchi-gslb/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CloudNativeWorks%2Felchi-gslb/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32599416,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-03T22:12:39.696Z","status":"online","status_checked_at":"2026-05-04T02:00:06.625Z","response_time":58,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-01-02T14:12:41.384Z","updated_at":"2026-05-04T08:04:12.668Z","avatar_url":"https://github.com/CloudNativeWorks.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Elchi GSLB - CoreDNS Plugin\n\nCoreDNS plugin for Global Server Load Balancing (GSLB) integration with Elchi Controller.\n\n## Name\n\n*elchi* - provides dynamic DNS resolution for Global Server Load Balancing zones managed by the Elchi Controller\n\n## Description\n\nThe *elchi* plugin integrates CoreDNS with the Elchi backend controller to provide dynamic DNS responses for GSLB zones. 
It maintains an in-memory cache of DNS records that is synchronized periodically from the Elchi backend using hash-based change detection.\n\n**Note**: The zone name is completely flexible - you can use any zone you control (e.g., `gslb.example.org`, `lb.mycompany.com`, `dns.internal`, etc.). The examples use `gslb.elchi` and `gslb.example.org` for illustration purposes only.\n\nThe plugin answers DNS queries from a pre-built cache synchronized from an external API, providing:\n- Hash-based change detection (only updates when backend data changes)\n- Periodic background sync (default: 5 minutes)\n- Instant updates via optional webhook endpoint\n- Thread-safe cache operations with minimal lock duration\n- Graceful degradation on backend failures (continues serving stale data)\n- Support for A and AAAA record types\n\n## Syntax\n\n~~~ txt\n*elchi* {\n    endpoint **URL**\n    secret **KEY**\n    node_ip **IP**\n    [ttl **SECONDS**]\n    [sync_interval **DURATION**]\n    [timeout **DURATION**]\n    [webhook [**ADDRESS**]]\n    [regions **REGION** ...]\n    [tls_skip_verify]\n    [fallthrough [**ZONES**...]]\n}\n~~~\n\n- **URL** is the Elchi backend API endpoint (required)\n- **KEY** is the shared secret for authentication; it must match `ELCHI_JWT_SECRET` in the backend (required, minimum 8 characters)\n- **IP** is the node's IP address, sent to the controller for node identification (required, typically `{$NODE_IP}`)\n- **SECONDS** is the default TTL for DNS records without an explicit TTL (optional, default: 300)\n- **DURATION** is a Go duration string (e.g., `5m`, `30s`) for sync_interval or timeout\n  - **sync_interval** specifies how often to check for changes (optional, default: `5m`, minimum: `5s`)\n  - **timeout** specifies HTTP request timeout (optional, default: `4s`, minimum: `1s`)\n- **REGION** is one or more region names to filter DNS records (optional). Only records belonging to the specified regions will be fetched from the controller. 
Use `all` or omit the directive to fetch all regions. Examples: `regions asya avrupa`, `regions all`\n- **tls_skip_verify** skips TLS certificate verification (optional, for self-signed certificates)\n- **ADDRESS** is the webhook server listen address (optional, default: `:8053`)\n- **ZONES** are zones to fall through for (optional, defaults to all zones if fallthrough is enabled)\n\n## Examples\n\nMinimal configuration for zone `gslb.example.org`:\n\n~~~ corefile\ngslb.example.org {\n    elchi {\n        endpoint http://elchi-backend:8080\n        secret my-shared-secret\n    }\n}\n~~~\n\nFull configuration with all options (production DaemonSet with hostNetwork):\n\n~~~ corefile\ngslb.example.org:53 {\n    bind {$NODE_IP}\n    elchi {\n        endpoint http://elchi-backend:8080\n        secret my-shared-secret\n        node_ip {$NODE_IP}\n        regions asya avrupa\n        ttl 300\n        sync_interval 1m\n        timeout 4s\n        webhook :8053\n        fallthrough\n    }\n    file /etc/coredns/gslb.example.org.db gslb.example.org\n    ready\n    prometheus :9253\n    log\n    errors\n}\n\n.:53 {\n    bind {$NODE_IP}\n    forward . 8.8.8.8 8.8.4.4\n    log\n    errors\n    cache 30\n}\n~~~\n\nMultiple zones with different backends:\n\n~~~ corefile\ngslb.prod.example.org {\n    elchi {\n        endpoint http://elchi-prod:8080\n        secret prod-secret\n    }\n}\n\ngslb.staging.example.org {\n    elchi {\n        endpoint http://elchi-staging:8080\n        secret staging-secret\n    }\n}\n~~~\n\nHTTPS with self-signed certificate:\n\n~~~ corefile\ngslb.example.org {\n    elchi {\n        endpoint https://elchi-backend:8443\n        secret my-shared-secret\n        tls_skip_verify\n    }\n}\n~~~\n\n\u003e **Warning**: `tls_skip_verify` disables TLS certificate verification. Only use this for testing or when using self-signed certificates in a trusted network. 
For production, use proper CA-signed certificates.\n\nCNAME-based regional failover:\n\n~~~ corefile\n# Asia region\nasya-gslb.elchi {\n    elchi {\n        endpoint http://elchi-asia:8080\n        secret asia-secret\n    }\n}\n\n# Europe region (backup)\navrupa-gslb.elchi {\n    elchi {\n        endpoint http://elchi-europe:8080\n        secret europe-secret\n    }\n}\n~~~\n\nWhen the Asia region is disabled (e.g., during maintenance), the controller sends:\n\n```json\n{\n  \"name\": \"service.asya-gslb.elchi\",\n  \"type\": \"A\",\n  \"ttl\": 20,\n  \"ips\": [\"10.10.1.20\", \"10.10.1.21\"],\n  \"enabled\": false,\n  \"failover\": \"service.avrupa-gslb.elchi\"\n}\n```\n\nDNS clients querying `service.asya-gslb.elchi` will receive a CNAME to `service.avrupa-gslb.elchi` and automatically resolve to the Europe region IPs.\n\n## Architecture\n\n```\n┌─────────────┐         ┌──────────────┐         ┌─────────────────┐\n│   DNS       │  Query  │   CoreDNS    │  HTTP   │  Elchi Backend  │\n│   Client    │────────▶│   + Elchi    │────────▶│   Controller    │\n│             │         │   Plugin     │         │   (/dns/*)      │\n└─────────────┘         └──────────────┘         └─────────────────┘\n                               │                          │\n                               │  Periodic Sync (5m)      │\n                               │  Check: hash changed?    │\n                               │◀─────────────────────────┘\n                               │\n                               ▼\n                        ┌──────────────┐\n                        │  DNS Record  │\n                        │  Cache       │\n                        │  (in-memory) │\n                        └──────────────┘\n```\n\n### Sync Mechanism\n\n1. **Initial Load**: On startup, fetches complete DNS snapshot from `/dns/snapshot?zone={zone}` (optionally filtered by `regions`)\n2. 
**Periodic Check**: Every 5 minutes (configurable), calls `/dns/changes?zone={zone}\u0026since={hash}` (optionally filtered by `regions`)\n3. **Change Detection**: If hash differs, fetches new snapshot and replaces cache atomically\n4. **Query Handling**: DNS queries are answered from pre-built in-memory cache (no backend calls on hot path)\n\n## Quick Start\n\n### Option 1: Docker (Recommended)\n\n```bash\n# Pull the image\ndocker pull cloudnativeworks/elchi-coredns:latest\n\n# Create Corefile\ncat \u003e Corefile \u003c\u003cEOF\ngslb.elchi {\n    elchi {\n        endpoint http://elchi-backend:8080\n        secret your-secret-key\n        sync_interval 1m\n        webhook :8053\n        fallthrough\n    }\n    log\n    errors\n}\n\n. {\n    forward . 8.8.8.8 8.8.4.4\n    cache 30\n}\nEOF\n\n# Run (the volume path is quoted so directories with spaces work)\ndocker run -d \\\n  --name elchi-coredns \\\n  -p 53:53/udp \\\n  -p 53:53/tcp \\\n  -p 8053:8053 \\\n  -v \"$(pwd)/Corefile:/etc/coredns/Corefile\" \\\n  cloudnativeworks/elchi-coredns:latest\n```\n\n### Option 2: Build from Source\n\n#### 1. Setup\n\n```bash\nmake setup\n```\n\nThis will:\n- Clone CoreDNS v1.13.2 into `./coredns/`\n- Create `Corefile` from `Corefile.example`\n- Download Go dependencies\n\n#### 2. Configure\n\nEdit `Corefile` with your settings:\n\n```\ngslb.example.org {\n    elchi {\n        endpoint http://localhost:8080\n        secret your-secret-key-here\n        ttl 300\n        sync_interval 1m\n        timeout 4s\n        fallthrough\n    }\n    log\n    errors\n}\n\n. {\n    forward . 8.8.8.8 8.8.4.4\n    log\n    errors\n    cache 30\n}\n```\n\n#### 3. Run (No Build Required!)\n\n```bash\nmake run\n```\n\nThis runs CoreDNS via `go run` with sudo (required for port 53 on macOS). 
No build step needed!\n\n**Alternative: Build Binary First (Optional)**\n\nIf you prefer to use a compiled binary:\n\n```bash\nmake build      # Creates ./coredns-elchi binary\nmake run-build  # Run from binary\n```\n\nThe build process:\n- Registers the elchi plugin in CoreDNS `plugin.cfg`\n- Adds a `replace` directive to `go.mod` so the local plugin code is used\n- Builds CoreDNS with the elchi plugin included\n- Creates `./coredns-elchi` binary\n\n#### 4. Test\n\nIn another terminal:\n\n```bash\nmake query\n# Or manually:\ndig @localhost -p 53 project1.gslb.elchi A\n```\n\n## Authentication\n\nThe plugin uses **simple secret header authentication**:\n\n```http\nX-Elchi-Secret: \u003cshared-secret\u003e\n```\n\nThe `secret` in the Corefile must match the `ELCHI_JWT_SECRET` environment variable in the Elchi backend.\n\nThis is suitable for pod-to-pod communication within Kubernetes where network traffic is already secured. HTTPS is not required.\n\n## Backend API Specification\n\nThe Elchi backend must implement these DNS API endpoints:\n\n### GET /dns/snapshot\n\nFetches the complete DNS snapshot for a zone.\n\n**Query Parameters:**\n- `zone` (required) - DNS zone name (e.g., `gslb.elchi`)\n- `regions` (optional) - Comma-separated list of regions to filter records (e.g., `asya,avrupa`). 
If omitted, all records are returned.\n\n**Request Headers:**\n```\nX-Elchi-Secret: \u003cELCHI_JWT_SECRET\u003e\nAccept: application/json\n```\n\n**Response (200 OK):**\n```json\n{\n  \"zone\": \"gslb.elchi\",\n  \"version_hash\": \"abc123def456\",\n  \"records\": [\n    {\n      \"name\": \"listener1.gslb.elchi\",\n      \"type\": \"A\",\n      \"ttl\": 300,\n      \"ips\": [\"192.168.1.10\", \"192.168.1.11\"]\n    },\n    {\n      \"name\": \"listener2.gslb.elchi\",\n      \"type\": \"AAAA\",\n      \"ttl\": 600,\n      \"ips\": [\"2001:db8::1\", \"2001:db8::2\"]\n    }\n  ]\n}\n```\n\n**Response Fields:**\n- `zone` - DNS zone name (must match request)\n- `version_hash` - Opaque hash string for change detection\n- `records` - Array of DNS records\n\n**Record Fields:**\n- `name` - Fully qualified domain name (FQDN)\n- `type` - Record type (\"A\" or \"AAAA\")\n- `ttl` - Time-to-live in seconds (0 = use default)\n- `ips` - Array of IP address strings\n\n**Error Responses:**\n- `400 Bad Request` - Invalid zone or missing parameters\n- `401 Unauthorized` - Invalid or missing secret\n- `404 Not Found` - Zone not found\n- `500 Internal Server Error` - Server error\n\n### GET /dns/changes\n\nChecks for DNS changes since a given version hash.\n\n**Query Parameters:**\n- `zone` (required) - DNS zone name\n- `since` (required) - Last known version_hash\n- `regions` (optional) - Comma-separated list of regions to filter records (e.g., `asya,avrupa`). If omitted, all records are returned.\n\n**Request Headers:** Same as `/dns/snapshot`\n\n**Response - No Changes (200 OK):**\n```json\n{\n  \"unchanged\": true\n}\n```\n\n**Response - Has Changes (200 OK):**\n```json\n{\n  \"unchanged\": false,\n  \"zone\": \"gslb.elchi\",\n  \"version_hash\": \"xyz789abc012\",\n  \"records\": [\n    // Full snapshot of current records\n  ]\n}\n```\n\n**Note:** When `unchanged=false`, the response includes the complete new snapshot, not a diff. 
The plugin replaces the entire cache atomically.\n\n**Error Responses:** Same as `/dns/snapshot`\n\n### Backend Implementation Notes\n\n1. **Version Hash Generation:**\n   - Generate a hash (e.g., SHA256) of the current DNS records\n   - Hash should change whenever records are added, removed, or modified\n   - Hash can be based on record content, timestamps, or database version\n\n2. **Authentication:**\n   - Validate `X-Elchi-Secret` header matches `ELCHI_JWT_SECRET`\n   - Zone information comes from the `zone` query parameter\n   - Return 401 if authentication fails\n\n3. **Performance:**\n   - Plugin queries `/dns/changes` every 5 minutes\n   - Optimize for quick hash comparison\n   - Only generate full snapshot when hash differs\n\n4. **Record Generation:**\n   - Records should be within the specified zone\n   - IPs must be valid IPv4 (for A) or IPv6 (for AAAA)\n   - TTL of 0 means use plugin default\n\n## Plugin Webhook Endpoints\n\nThe plugin can optionally expose webhook endpoints for instant updates, health monitoring, and record inspection. Enable with the `webhook` directive in Corefile.\n\n### Configuration\n\n```\ngslb.elchi:53 {\n    bind {$NODE_IP}\n    elchi {\n        endpoint http://localhost:8080\n        secret your-secret-key-here\n        node_ip {$NODE_IP}\n        webhook :8053  # Enable webhook server on port 8053\n    }\n}\n```\n\nThe `webhook` directive accepts an optional address parameter (default: `:8053`).\n\n### POST /notify\n\nWebhook endpoint for instant DNS record updates from the Elchi controller. 
Allows pushing changes without waiting for the periodic sync interval.\n\n**Authentication:** Requires `X-Elchi-Secret` header\n\n**Request Body (Updates):**\n```json\n{\n  \"records\": [\n    {\n      \"name\": \"new.gslb.elchi\",\n      \"type\": \"A\",\n      \"ttl\": 300,\n      \"ips\": [\"192.168.3.10\"]\n    }\n  ]\n}\n```\n\n**Request Body (Deletes):**\n```json\n{\n  \"deletes\": [\n    {\n      \"name\": \"old.gslb.elchi\",\n      \"type\": \"A\"\n    }\n  ]\n}\n```\n\n**Request Body (Mixed):**\n```json\n{\n  \"records\": [\n    {\"name\": \"updated.gslb.elchi\", \"type\": \"A\", \"ttl\": 300, \"ips\": [\"192.168.4.10\"]}\n  ],\n  \"deletes\": [\n    {\"name\": \"removed.gslb.elchi\", \"type\": \"A\"}\n  ]\n}\n```\n\n**Response (200 OK):**\n```json\n{\n  \"status\": \"ok\",\n  \"updated\": 1,\n  \"deleted\": 1\n}\n```\n\n**Usage:**\n```bash\ncurl -X POST http://localhost:8053/notify \\\n  -H \"X-Elchi-Secret: your-secret-key\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"records\": [{\"name\": \"test.gslb.elchi\", \"type\": \"A\", \"ttl\": 300, \"ips\": [\"192.168.1.10\"]}]}'\n```\n\n### GET /health\n\nHealth check endpoint that returns plugin status, cache statistics, and sync status.\n\n**Authentication:** None required (public endpoint)\n\n**Response (200 OK - Healthy):**\n```json\n{\n  \"status\": \"healthy\",\n  \"zone\": \"gslb.elchi\",\n  \"records_count\": 42,\n  \"version_hash\": \"abc123\",\n  \"last_sync\": \"2025-12-31T08:30:00Z\",\n  \"last_sync_status\": \"success\"\n}\n```\n\n**Response (503 Service Unavailable - Degraded):**\n```json\n{\n  \"status\": \"degraded\",\n  \"zone\": \"gslb.elchi\",\n  \"records_count\": 42,\n  \"version_hash\": \"abc123\",\n  \"last_sync\": \"2025-12-31T08:00:00Z\",\n  \"last_sync_status\": \"failed\",\n  \"error\": \"connection timeout\"\n}\n```\n\n**Health Status Logic:**\n- `healthy`: Last sync succeeded OR last failure was recent (\u003c 2x sync_interval)\n- `degraded`: Last sync failed AND it's 
been \u003e 2x sync_interval since last success\n\n**Usage:**\n```bash\ncurl http://localhost:8053/health\n```\n\n### GET /records\n\nReturns currently cached DNS records, with optional filtering by name or type.\n\n**Authentication:** Requires `X-Elchi-Secret` header\n\n**Query Parameters:**\n- `name` (optional) - Filter by domain name (substring match)\n- `type` (optional) - Filter by record type (A or AAAA)\n\n**Response (200 OK):**\n```json\n{\n  \"zone\": \"gslb.elchi\",\n  \"version_hash\": \"abc123\",\n  \"count\": 2,\n  \"records\": [\n    {\n      \"name\": \"test1.gslb.elchi\",\n      \"type\": \"A\",\n      \"ttl\": 300,\n      \"ips\": [\"192.168.1.10\"]\n    },\n    {\n      \"name\": \"test2.gslb.elchi\",\n      \"type\": \"A\",\n      \"ttl\": 300,\n      \"ips\": [\"192.168.2.10\"]\n    }\n  ]\n}\n```\n\n**Usage:**\n```bash\n# Get all records\ncurl -H \"X-Elchi-Secret: your-secret-key\" \"http://localhost:8053/records\"\n\n# Filter by name (URL quoted so the shell does not expand the ?)\ncurl -H \"X-Elchi-Secret: your-secret-key\" \"http://localhost:8053/records?name=test1\"\n\n# Filter by type\ncurl -H \"X-Elchi-Secret: your-secret-key\" \"http://localhost:8053/records?type=A\"\n```\n\n### Webhook Integration Workflow\n\n1. **Periodic Sync (Default):**\n   - Every 5 minutes, the plugin checks `/dns/changes` for updates\n   - Only fetches the full snapshot if the hash changed\n\n2. **Instant Updates (Optional):**\n   - Enable the webhook server with the `webhook` directive\n   - Controller can push updates immediately via POST /notify\n   - Updates are merged with the existing cache (partial updates)\n   - No need to wait for the next sync interval\n\n3. 
**Monitoring:**\n   - Use GET /health for automated health checks\n   - Monitor `last_sync_status` and `records_count`\n   - Alert on `degraded` status\n\n## Development\n\n### Project Structure\n\n```\nelchi-gslb/\n├── elchi.go              # Main plugin logic\n├── setup.go              # Configuration parsing\n├── client.go             # Elchi backend HTTP client\n├── cache.go              # Thread-safe DNS record cache\n├── webhook.go            # Webhook server and endpoints\n├── *_test.go             # Unit and integration tests\n├── coredns/              # CoreDNS clone (created by make setup)\n├── Corefile.example      # Example configuration\n├── Makefile              # Development commands\n├── go.mod                # Go module definition\n└── README.md             # This file\n```\n\n### Running Tests\n\n```bash\nmake test\n```\n\nThis runs the unit and integration tests, including:\n- Client tests (HTTP mocking)\n- Cache tests (concurrency, benchmarks)\n- Integration tests (ServeDNS)\n\n### Building\n\n```bash\nmake build\n```\n\nThis:\n1. Clones CoreDNS (if not already cloned)\n2. Registers the elchi plugin in CoreDNS `plugin.cfg` (after the `kubernetes` plugin)\n3. Adds a `replace` directive to `go.mod` so the local plugin code is used\n4. Builds CoreDNS with the elchi plugin\n5. Creates the `./coredns-elchi` binary\n\n**Note**: The build process modifies `coredns/plugin.cfg` and `coredns/go.mod` to include the elchi plugin.\n\n### Debug Logging\n\nEnable debug logs in the Corefile:\n\n```\ngslb.elchi {\n    elchi { ... 
}\n    debug   # CoreDNS prints [DEBUG] lines only when the debug plugin is loaded\n    log\n    errors\n}\n```\n\nWatch logs:\n```bash\nmake run\n# Logs show:\n# [INFO] Initial snapshot loaded: 42 records, hash=abc123\n# [DEBUG] Checking for changes since hash=abc123\n# [DEBUG] No changes detected\n```\n\n## Troubleshooting\n\n### Plugin doesn't start\n\n**Check:** Corefile syntax and zone format\n```\n# Zone must be in server block declaration\ngslb.elchi {  # ← Zone here\n    elchi {\n        # NOT here\n    }\n}\n```\n\n### \"secret is required\" error\n\n**Fix:** Add secret to Corefile:\n```\nelchi {\n    secret your-secret-key-here  # ← Must be at least 8 chars\n}\n```\n\n### \"Initial snapshot fetch failed\"\n\n**Possible causes:**\n1. Backend not running → Start Elchi backend\n2. Wrong endpoint → Check `endpoint` in Corefile\n3. Wrong secret → Verify it matches `ELCHI_JWT_SECRET`\n4. Network issue → Check connectivity\n\n**Note:** The plugin continues running and retries in the background.\n\n### No DNS responses\n\n1. **Check cache is populated:**\n   - Look for \"Initial snapshot loaded\" in logs\n   - If not, check backend API\n\n2. **Check query domain matches zone:**\n   ```bash\n   # Corefile zone: gslb.elchi\n   # Query must be: *.gslb.elchi\n   dig @localhost test.gslb.elchi A\n   ```\n\n3. **Check record exists in backend:**\n   ```bash\n   curl -H \"X-Elchi-Secret: \u003csecret\u003e\" \\\n     \"http://localhost:8080/dns/snapshot?zone=gslb.elchi\"\n   ```\n\n### \"Permission denied\" on port 53\n\n**Solution:** Use sudo\n```bash\nsudo go run cmd/coredns/main.go -conf Corefile\n# Or:\nmake run  # Already includes sudo\n```\n\n### Changes not appearing\n\n**Possible causes:**\n1. **Sync interval not elapsed** → Wait for the next sync (5 minutes by default) or restart CoreDNS\n2. **Hash unchanged** → Verify backend actually changed data\n3. 
**Backend error** → Check logs for \"Change check failed\"\n\n## How It Works\n\n### Record Mapping\n\nThe plugin maps Elchi DNS records to CoreDNS responses:\n\n**Backend Record:**\n```json\n{\n  \"name\": \"api.gslb.elchi\",\n  \"type\": \"A\",\n  \"ttl\": 300,\n  \"ips\": [\"192.168.1.10\", \"192.168.1.11\"]\n}\n```\n\n**DNS Query:**\n```bash\ndig @localhost api.gslb.elchi A\n```\n\n**DNS Response:**\n```\napi.gslb.elchi. 300 IN A 192.168.1.10\napi.gslb.elchi. 300 IN A 192.168.1.11\n```\n\n### Cache Behavior\n\n- **Pre-built Records:** DNS RR objects built during sync, not during query\n- **Atomic Updates:** Entire cache replaced atomically on changes\n- **Thread-Safe:** Concurrent queries + background sync are safe\n- **Version Tracking:** Stores current version_hash for change detection\n\n### Zone Matching\n\n- Plugin only answers queries **within configured zone**\n- Queries outside zone → passed to next plugin\n- Zone format: `gslb.elchi`\n- Supports multiple zones by configuring multiple server blocks\n\n## Ready\n\nThis plugin implements the `ready` plugin's readiness interface. It reports ready when:\n- The cache has been initialized\n- At least one successful sync has occurred (version_hash is not empty)\n\nThe plugin will report not ready during:\n- Initial startup before first successful sync\n- If the cache is nil (initialization failed)\n\nThis integrates with Kubernetes readiness probes when used with the CoreDNS `ready` plugin:\n\n~~~ corefile\ngslb.example.org {\n    elchi {\n        endpoint http://elchi-backend:8080\n        secret my-secret\n    }\n    ready\n}\n~~~\n\nThe `ready` plugin will serve readiness checks on `:8181/ready` by default. The *elchi* plugin will be considered ready once the initial DNS snapshot has been successfully loaded.\n\n## Metrics\n\nThe *elchi* plugin exports Prometheus metrics following CoreDNS naming standards. 
All metrics use the `coredns_elchi_` prefix.\n\n### DNS Query Metrics\n\n- **`coredns_elchi_requests_total{zone, type}`** (Counter)\n  - Total number of DNS requests handled by the plugin\n  - Labels: `zone` (DNS zone), `type` (query type: A, AAAA, etc.)\n\n- **`coredns_elchi_cache_hits_total{zone, type}`** (Counter)\n  - Total number of successful cache lookups\n  - Labels: `zone`, `type`\n\n- **`coredns_elchi_cache_misses_total{zone, type}`** (Counter)\n  - Total number of failed cache lookups\n  - Labels: `zone`, `type`\n\n### Synchronization Metrics\n\n- **`coredns_elchi_sync_duration_seconds{zone, type}`** (Histogram)\n  - Duration of backend sync operations in seconds\n  - Labels: `zone`, `type` (operation type: \"snapshot\" or \"changes\")\n  - Buckets: 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10\n\n- **`coredns_elchi_sync_errors_total{zone, type}`** (Counter)\n  - Total number of backend sync errors\n  - Labels: `zone`, `type` (\"snapshot\" or \"changes\")\n\n### Cache Metrics\n\n- **`coredns_elchi_cache_size{zone}`** (Gauge)\n  - Number of DNS records currently in cache\n  - Labels: `zone`\n\n### Webhook Metrics\n\n- **`coredns_elchi_webhook_requests_total{endpoint, status}`** (Counter)\n  - Total number of webhook requests received\n  - Labels: `endpoint` (\"health\", \"records\", \"notify\"), `status` (\"success\", \"error\", \"unauthorized\")\n\n### Accessing Metrics\n\nConfigure Prometheus endpoint in your Corefile:\n\n```\ngslb.elchi:53 {\n    bind {$NODE_IP}\n    elchi {\n        endpoint http://localhost:8080\n        secret your-secret-key\n        node_ip {$NODE_IP}\n    }\n    prometheus localhost:9253\n}\n```\n\nMetrics are available at `http://localhost:9253/metrics`.\n\n### Example PromQL Queries\n\n```promql\n# Cache hit rate by query type\nsum(rate(coredns_elchi_cache_hits_total[5m])) by (type) /\nsum(rate(coredns_elchi_requests_total[5m])) by (type)\n\n# Sync error rate\nrate(coredns_elchi_sync_errors_total[5m])\n\n# 95th 
percentile sync duration\nhistogram_quantile(0.95, rate(coredns_elchi_sync_duration_seconds_bucket[5m]))\n\n# Current cache size\ncoredns_elchi_cache_size\n\n# Webhook unauthorized attempts\nrate(coredns_elchi_webhook_requests_total{status=\"unauthorized\"}[5m])\n```\n\n## Production Deployment\n\n### Docker\n\nPre-built multi-architecture Docker images are available on Docker Hub:\n\n```bash\n# Pull latest version\ndocker pull cloudnativeworks/elchi-coredns:latest\n\n# Pull specific version\ndocker pull cloudnativeworks/elchi-coredns:v0.1.0\n\n# Run with custom Corefile\ndocker run -d \\\n  --name elchi-coredns \\\n  -p 53:53/udp \\\n  -p 53:53/tcp \\\n  -p 8053:8053 \\\n  -v \"$(pwd)/Corefile:/etc/coredns/Corefile\" \\\n  cloudnativeworks/elchi-coredns:latest\n```\n\n**Supported architectures:**\n- `linux/amd64`\n- `linux/arm64`\n\n### Kubernetes Example (DaemonSet with hostNetwork)\n\n```yaml\napiVersion: v1\nkind: ConfigMap\nmetadata:\n  name: coredns-elchi\ndata:\n  Corefile: |\n    gslb.elchi:53 {\n        bind {$NODE_IP}\n        elchi {\n            endpoint http://elchi-backend:8080\n            # CoreDNS substitutes environment variables written as {$VAR}\n            secret {$ELCHI_SECRET}\n            node_ip {$NODE_IP}\n            sync_interval 1m\n            webhook :8053\n            fallthrough\n        }\n        file /etc/coredns/gslb.elchi.db gslb.elchi\n        prometheus :9253\n        errors\n        log\n    }\n    .:53 {\n        bind {$NODE_IP}\n        forward . 
8.8.8.8 8.8.4.4\n        cache 30\n    }\n---\napiVersion: apps/v1\nkind: DaemonSet\nmetadata:\n  name: coredns-elchi\nspec:\n  selector:\n    matchLabels:\n      app: coredns-elchi\n  template:\n    metadata:\n      labels:\n        app: coredns-elchi\n    spec:\n      # Listen directly on the node's network, one pod per node\n      hostNetwork: true\n      # Keep cluster DNS so the elchi-backend service name still resolves\n      dnsPolicy: ClusterFirstWithHostNet\n      containers:\n      - name: coredns\n        image: cloudnativeworks/elchi-coredns:latest\n        ports:\n        - containerPort: 53\n          protocol: UDP\n        - containerPort: 53\n          protocol: TCP\n        - containerPort: 9253\n          name: metrics\n        env:\n        # NODE_IP feeds the {$NODE_IP} placeholders in the Corefile\n        - name: NODE_IP\n          valueFrom:\n            fieldRef:\n              fieldPath: status.hostIP\n        - name: ELCHI_SECRET\n          valueFrom:\n            secretKeyRef:\n              name: elchi-credentials\n              key: secret\n        volumeMounts:\n        - name: config\n          mountPath: /etc/coredns\n      volumes:\n      - name: config\n        configMap:\n          name: coredns-elchi\n```\n\n## Performance\n\n- **Query Latency:** \u003c1ms (in-memory cache lookup)\n- **Sync Overhead:** Minimal (only when hash changes)\n- **Memory:** ~100 bytes per DNS record\n- **Throughput:** \u003e10K queries/sec on modern hardware\n\n## Bugs\n\n- The plugin currently only supports A and AAAA record types. Other record types (CNAME, MX, TXT, etc.) are not supported as backend records; a CNAME is returned only through the `failover` field of a disabled record.\n- When the backend is unreachable at startup, the plugin continues with an empty cache and serves NXDOMAIN for all queries until the first successful sync.\n- The webhook server does not support TLS. 
It is designed for internal pod-to-pod communication in Kubernetes where network traffic is already secured.\n\n## Documentation\n\nComprehensive documentation is available in the [docs/](docs/) directory:\n\n### Architecture \u0026 Design\n\n- **[Architecture Decision Records (ADRs)](docs/adr/)** - Understand design decisions\n  - [In-Memory Cache Usage](docs/adr/001-in-memory-cache.md) - Why cache instead of proxying\n  - [Hash-Based Sync](docs/adr/002-hash-based-sync.md) - Efficient change detection\n  - [Webhook Architecture](docs/adr/003-webhook-architecture.md) - Instant updates\n  - [Graceful Degradation](docs/adr/004-graceful-degradation.md) - Fault tolerance\n  - [Pre-Built DNS Records](docs/adr/005-pre-built-records.md) - Performance optimization\n\n- **[Sequence Diagrams](docs/diagrams/)** - Visual flow documentation\n  - [Startup Flow](docs/diagrams/01-startup-flow.md) - Plugin initialization\n  - [Periodic Sync](docs/diagrams/02-periodic-sync.md) - Background synchronization\n  - [Webhook Updates](docs/diagrams/03-webhook-flow.md) - Instant propagation\n  - [DNS Query Handling](docs/diagrams/04-query-flow.md) - Query path\n  - [Error Recovery](docs/diagrams/05-error-recovery.md) - Failure handling\n\n### Guides\n\n- **[Performance Tuning Guide](docs/guides/performance-tuning.md)** - Optimize for your workload\n  - Configuration parameters explained\n  - Resource requirements (CPU/memory)\n  - Benchmarking instructions\n  - Troubleshooting performance issues\n\n### Quick Links\n\n- [Full Documentation Index](docs/README.md)\n\n## License\n\nApache License 2.0 - See [LICENSE](LICENSE) file for details.\n\n## Contributing\n\nContributions welcome! 
Please ensure:\n- Tests pass: `make test`\n- Code follows Go best practices\n- Documentation is updated\n\n## Support\n\n- Issues: [GitHub Issues](https://github.com/CloudNativeWorks/elchi-gslb/issues)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcloudnativeworks%2Felchi-gslb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcloudnativeworks%2Felchi-gslb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcloudnativeworks%2Felchi-gslb/lists"}