https://github.com/libraz/coredns-dynresolve
CoreDNS plugin for dynamic DNS resolution via REST API — simple, fail-safe service discovery without etcd or Kubernetes.
- Host: GitHub
- URL: https://github.com/libraz/coredns-dynresolve
- Owner: libraz
- License: apache-2.0
- Created: 2026-04-02T18:44:34.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2026-04-04T13:32:38.000Z (2 months ago)
- Last Synced: 2026-04-04T15:57:24.736Z (2 months ago)
- Language: Go
- Size: 74.2 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Support: support/deb/Dockerfile.debbuild
# coredns-dynresolve
A [CoreDNS](https://coredns.io/) plugin for dynamic DNS resolution driven by a REST API.
Enables DNS-based service failover in on-premise environments without etcd, Consul, or Kubernetes. An external controller pushes service state via authenticated HTTP endpoints; the plugin reads that state and serves DNS.
## Background
Service failover via DNS typically requires one of:
- Kubernetes DNS + Service
- CoreDNS + etcd plugin
- Consul catalog
All of these assume a distributed system is already in place. In transitional on-premise environments — where Kubernetes is not yet deployed and the operational cost of a distributed KV store is not justified — these options are too heavy.
This plugin takes a simpler approach: an external controller (a script, systemd timer, monitoring agent, or CI pipeline) pushes service state to the plugin via a REST API. Health checking, failover decisions, and state generation are outside the plugin's scope — they belong to the external controller.
Design constraints:
- ServeDNS reads from in-memory state only (non-blocking)
- DNS continues to respond when the API is unreachable (fail-safe)
- Plugin bugs do not crash CoreDNS (panic-safe)
- A records only; other query types are delegated to the next plugin
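As a rough illustration of the controller side described above, the sketch below builds the `PUT` request a controller might send once it has decided which backends are healthy. The function name `newPushRequest`, the `servicePayload` type, the base URL, and the token value are assumptions made for this example; only the endpoint path, the bearer header, and the JSON fields come from this README.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// servicePayload mirrors the single-service state format shown in this README.
// The Go type itself is hypothetical.
type servicePayload struct {
	Type    string   `json:"type"`
	Records []string `json:"records"`
	TTL     int      `json:"ttl"`
}

// newPushRequest builds (but does not send) the PUT request for one service.
// Health checking happens elsewhere; healthy is simply its result.
func newPushRequest(baseURL, token, service string, healthy []string, ttl int) (*http.Request, error) {
	body, err := json.Marshal(servicePayload{Type: "A", Records: healthy, TTL: ttl})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPut, baseURL+"/v1/services/"+service, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Placeholder address and token; substitute the values from your Corefile.
	req, err := newPushRequest("http://127.0.0.1:8080", "example-token", "valkey", []string{"10.0.0.12"}, 5)
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
	// To send it: resp, err := http.DefaultClient.Do(req)
}
```

A systemd timer or monitoring hook could call something like this after each health check, pushing only the addresses that passed.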
## Architecture
```mermaid
graph LR
    subgraph External
        C[External Controller]
        C -->|HTTP PUT| API
    end
    subgraph "CoreDNS + dynresolve"
        API["APIServer<br/>auth + rate limit + TLS"] -->|write| SS[(StateStore)]
        SS --> SD[ServeDNS]
        CA[(Cache)] <--> SD
        FB[Fallback IP] --> SD
        NP[Next Plugin] --> SD
    end
    CL[DNS Clients] -->|query| SD
    SD -->|response| CL
```
StateStore is the shared in-memory state. The APIServer writes to it; ServeDNS reads from it via the Cache.
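The shared-state idea can be sketched as a map guarded by a read/write lock, so that many DNS reads proceed concurrently while API writes take the lock exclusively. This is a minimal sketch, not the plugin's actual code: the `Service` type, the method names, and the choice of `sync.RWMutex` are all assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// Service is a hypothetical record set for one name.
type Service struct {
	Records []string
	TTL     uint32
}

// StateStore is a sketch of an in-memory store shared by the API and DNS paths.
type StateStore struct {
	mu       sync.RWMutex
	services map[string]Service
}

func NewStateStore() *StateStore {
	return &StateStore{services: make(map[string]Service)}
}

// Put is the write path (API side). The slice is copied so that later
// changes by the caller cannot alter stored state.
func (s *StateStore) Put(fqdn string, svc Service) {
	svc.Records = append([]string(nil), svc.Records...)
	s.mu.Lock()
	defer s.mu.Unlock()
	s.services[fqdn] = svc
}

// Get is the read path (DNS side). It returns a copy of the stored records.
func (s *StateStore) Get(fqdn string) (Service, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	svc, ok := s.services[fqdn]
	if !ok {
		return Service{}, false
	}
	svc.Records = append([]string(nil), svc.Records...)
	return svc, true
}

func main() {
	store := NewStateStore()
	store.Put("valkey.service.local.", Service{Records: []string{"10.0.0.12"}, TTL: 5})
	svc, ok := store.Get("valkey.service.local.")
	fmt.Println(svc.Records, svc.TTL, ok)
}
```

Copying on both paths trades a small allocation for the guarantee that readers never observe a slice being modified.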
## Fallback Chain
Each query is resolved in the following order:
```mermaid
flowchart TD
    Q[DNS Query] --> CC{Cache fresh?}
    CC -->|Yes| R1[Respond from cache]
    CC -->|No| SL{"StateStore<br/>has data?"}
    SL -->|Yes| UC[Update cache] --> R2[Respond from source]
    SL -->|No| SC{Stale cache?}
    SC -->|Yes| R3[Respond stale]
    SC -->|No| FI{Fallback IP?}
    FI -->|Yes| R4[Respond fallback]
    FI -->|No| NX[Next Plugin]
    style R1 fill:#2d6,stroke:#1a4,color:#fff
    style R2 fill:#2d6,stroke:#1a4,color:#fff
    style R3 fill:#da2,stroke:#a80,color:#fff
    style R4 fill:#da2,stroke:#a80,color:#fff
    style NX fill:#68f,stroke:#46c,color:#fff
```
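The decision order in the flowchart can be written as a small pure function. The sketch below is illustrative only: the function name `resolve`, its parameters, and the source labels are assumptions, and cache refresh is left to the caller.

```go
package main

import "fmt"

// resolve walks the documented fallback order and reports which source won.
// cached holds the last cached answer (empty if none); cacheFresh says
// whether that entry is still within the cache TTL.
func resolve(cached []string, cacheFresh bool, fromStore []string, fallback string) ([]string, string) {
	if len(cached) > 0 && cacheFresh {
		return cached, "cache"
	}
	if len(fromStore) > 0 {
		return fromStore, "store" // the caller would update the cache here
	}
	if len(cached) > 0 {
		return cached, "stale"
	}
	if fallback != "" {
		return []string{fallback}, "fallback"
	}
	return nil, "next" // hand the query to the next plugin
}

func main() {
	// Expired cache entry, empty store: the stale entry is served.
	ips, source := resolve([]string{"10.0.0.12"}, false, nil, "10.0.0.1")
	fmt.Println(ips, source)
}
```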
## Quick Start
### Corefile
```
service.local {
    dynresolve {
        api listen 127.0.0.1:8080
        api token <token>
        api allow 127.0.0.0/8
        api rate-limit 10
        ttl 5
        cache 100ms
        fallback 10.0.0.1
    }
    errors
}
```
With TLS:
```
service.local {
    dynresolve {
        api listen 0.0.0.0:8443
        api token {env.DYNRESOLVE_TOKEN}
        api allow 10.0.0.0/8
        api rate-limit 10
        api tls /etc/coredns/tls/cert.pem /etc/coredns/tls/key.pem
        ttl 5
        cache 100ms
        fallback 10.0.0.1
    }
    errors
}
```
### Build
```bash
make build
```
### Verify
```bash
# Start CoreDNS (run it in a separate terminal, or background it)
./coredns -conf Corefile
# Push state via API (short name; the zone suffix is appended automatically)
curl -X PUT -H "Authorization: Bearer <token>" \
  -d '{"type":"A","records":["10.0.0.12"],"ttl":5}' \
  http://127.0.0.1:8080/v1/services/valkey
# Resolve via DNS
dig @127.0.0.1 valkey.service.local A +short
# 10.0.0.12
```
## Configuration
### Core
| Directive | Default | Description |
|---|---|---|
| `ttl <seconds>` | `5` | Default DNS response TTL |
| `cache <duration>` | `100ms` | In-memory cache TTL (supports stale serving) |
| `fallback <ipv4>` | none | IPv4 address returned when all other sources fail |
### REST API (required)
| Directive | Default | Description |
|---|---|---|
| `api listen <addr>` | required | Listen address (e.g., `127.0.0.1:8080`) |
| `api token <token>` | required | Bearer token. Supports `$ENV_VAR` and `{env.VAR}` syntax |
| `api allow <cidr>...` | allow all | Allowed client CIDRs (e.g., `127.0.0.0/8 10.0.0.0/8`) |
| `api rate-limit <n>` | `10` | Maximum write requests per second |
| `api tls <cert> <key>` | none | TLS certificate and key paths. Enables HTTPS (TLS 1.2+) |
| `api persist <path>` | none | File path for state persistence. On each write, state is saved to disk (atomic write). On startup, state is restored from this file if it exists. Disabled by default (in-memory only) |
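The README states that persistence uses an atomic write but does not say how. One common technique, shown here purely as an assumption, is to write to a temporary file in the same directory and then rename it over the target, so readers see either the old file or the new one. The names `saveAtomic` and `roundTrip` are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// saveAtomic writes data to a temp file next to path, syncs it, and renames
// it into place. This is a sketch of one common approach, not necessarily
// what the plugin does.
func saveAtomic(path string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), ".state-*.tmp")
	if err != nil {
		return err
	}
	// After a successful rename the temp name no longer exists, so this
	// removal fails harmlessly; on error paths it cleans up the leftover.
	defer os.Remove(tmp.Name())
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// roundTrip is a demo helper: save into a scratch directory and read back.
func roundTrip(data []byte) ([]byte, error) {
	dir, err := os.MkdirTemp("", "dynresolve-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "state.json")
	if err := saveAtomic(path, data); err != nil {
		return nil, err
	}
	return os.ReadFile(path)
}

func main() {
	got, err := roundTrip([]byte(`{"services":{}}`))
	fmt.Println(string(got), err)
}
```

Creating the temp file in the target's own directory keeps both paths on the same filesystem, which the rename needs in order to replace the file in one step.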
### API Endpoints
| Method | Path | Description |
|---|---|---|
| `GET` | `/v1/services` | List all services |
| `GET` | `/v1/services/{name}` | Get a single service |
| `PUT` | `/v1/services/{name}` | Add or update a service |
| `DELETE` | `/v1/services/{name}` | Delete a service |
| `POST` | `/v1/services/{name}/records` | Add records to a service |
| `DELETE` | `/v1/services/{name}/records` | Remove records from a service |
| `PUT` | `/v1/state` | Bulk replace all services |
All endpoints require an `Authorization: Bearer <token>` header.
Write endpoints (`PUT`, `POST`, `DELETE`) are subject to rate limiting. Read endpoints (`GET`) are not.
Service names can be specified as short names (e.g., `valkey`) or full names (e.g., `valkey.service.local`). The zone suffix is appended automatically based on the CoreDNS server block. Responses include both `name` (short) and `fqdn` (fully qualified).
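The short-name expansion can be pictured with the sketch below, which also applies the label rules listed under Security (`[a-z0-9-]`, 1-63 characters per label, at most 253 in total). The helper names `expandName` and `validName` are hypothetical, and the plugin's real handling (for example of upper-case input) may differ.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// labelRe encodes the per-label rule quoted in the Security section.
var labelRe = regexp.MustCompile(`^[a-z0-9-]{1,63}$`)

// expandName accepts a short or full name and returns both forms.
// zone is the server block zone without a trailing dot.
func expandName(name, zone string) (short, fqdn string) {
	short = strings.TrimSuffix(name, "."+zone)
	return short, short + "." + zone
}

// validName checks total length and each dot-separated label.
func validName(fqdn string) bool {
	if len(fqdn) == 0 || len(fqdn) > 253 {
		return false
	}
	for _, label := range strings.Split(fqdn, ".") {
		if !labelRe.MatchString(label) {
			return false
		}
	}
	return true
}

func main() {
	for _, in := range []string{"valkey", "valkey.service.local"} {
		short, fqdn := expandName(in, "service.local")
		fmt.Println(short, fqdn, validName(fqdn))
	}
}
```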
### State Format
Used by `PUT /v1/services/{name}` and `PUT /v1/state`.
Single service:
```json
{"type": "A", "records": ["10.0.0.12"], "ttl": 5}
```
Response:
```json
{"name": "valkey", "fqdn": "valkey.service.local", "type": "A", "records": ["10.0.0.12"], "ttl": 5}
```
Bulk state request (short names):
```json
{
"services": {
"valkey": {"type": "A", "records": ["10.0.0.12"], "ttl": 5},
"web": {"type": "A", "records": ["10.0.0.20", "10.0.0.21"], "ttl": 10}
}
}
```
Bulk state response:
```json
{"zone": "service.local", "count": 2}
```
List services response (`GET /v1/services`):
```json
{
"zone": "service.local",
"services": {
"valkey": {"type": "A", "records": ["10.0.0.12"], "ttl": 5},
"web": {"type": "A", "records": ["10.0.0.20", "10.0.0.21"], "ttl": 10}
}
}
```
| Field | Specification |
|---|---|
| `type` | `"A"` only. Other types are rejected with 400 |
| `records` | IPv4 addresses. Invalid entries are rejected with 400 |
| `ttl` | Seconds. `0` falls back to the plugin-level default |
| names | Short name or FQDN (without trailing dot). Zone suffix appended automatically |
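The field rules in the table can be expressed as a small validation step. This is a hedged sketch: `serviceState`, `parseState`, and the error messages are assumptions for illustration, and an HTTP handler would map any returned error to a 400 response.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/netip"
)

// serviceState mirrors the documented JSON body; the Go type is hypothetical.
type serviceState struct {
	Type    string   `json:"type"`
	Records []string `json:"records"`
	TTL     uint32   `json:"ttl"`
}

// parseState decodes a body and applies the rules from the table:
// type must be "A", records must be IPv4, and ttl 0 means "use the default".
func parseState(body []byte, defaultTTL uint32) (serviceState, error) {
	var s serviceState
	if err := json.Unmarshal(body, &s); err != nil {
		return s, err
	}
	if s.Type != "A" {
		return s, fmt.Errorf("unsupported type %q", s.Type)
	}
	for _, r := range s.Records {
		addr, err := netip.ParseAddr(r)
		if err != nil || !addr.Is4() {
			return s, fmt.Errorf("invalid IPv4 address %q", r)
		}
	}
	if s.TTL == 0 {
		s.TTL = defaultTTL
	}
	return s, nil
}

func main() {
	s, err := parseState([]byte(`{"type":"A","records":["10.0.0.12"],"ttl":0}`), 5)
	fmt.Println(s.Records, s.TTL, err)
}
```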
### Record-Level Operations
Used by `POST /v1/services/{name}/records` (add) and `DELETE /v1/services/{name}/records` (remove).
```json
{"records": ["10.0.0.13", "10.0.0.14"], "ttl": 5}
```
| Field | Specification |
|---|---|
| `records` | IPv4 addresses to add or remove |
| `ttl` | (add only) Optional. Updates TTL if non-zero |
`POST` creates the service if it does not exist. Duplicate records are ignored.
`DELETE` removes the specified records. If all records are removed, the service is deleted (returns 204).
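The add and remove semantics above (duplicates ignored on add, service deleted once no records remain) might look like the following. Both helper names are hypothetical, and preserving the existing record order is an assumption.

```go
package main

import "fmt"

// addRecords appends new addresses, skipping any already present.
func addRecords(existing, add []string) []string {
	seen := make(map[string]bool, len(existing))
	out := make([]string, 0, len(existing)+len(add))
	for _, r := range existing {
		if !seen[r] {
			seen[r] = true
			out = append(out, r)
		}
	}
	for _, r := range add {
		if !seen[r] {
			seen[r] = true
			out = append(out, r)
		}
	}
	return out
}

// removeRecords drops the given addresses and reports whether the service
// is now empty, in which case the caller would delete it.
func removeRecords(existing, remove []string) (remaining []string, empty bool) {
	drop := make(map[string]bool, len(remove))
	for _, r := range remove {
		drop[r] = true
	}
	for _, r := range existing {
		if !drop[r] {
			remaining = append(remaining, r)
		}
	}
	return remaining, len(remaining) == 0
}

func main() {
	recs := addRecords([]string{"10.0.0.12"}, []string{"10.0.0.12", "10.0.0.13"})
	fmt.Println(recs)
	rest, empty := removeRecords(recs, []string{"10.0.0.12", "10.0.0.13"})
	fmt.Println(rest, empty)
}
```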
## Design Principles
| Principle | Detail |
|---|---|
| Persist-safe | Write errors to the persist file are logged but do not affect API responses |
| Fail-safe | Serves stale data or fallback IP on source failure. DNS never stops responding |
| Non-blocking | ServeDNS reads from in-memory StateStore only. HTTP I/O runs in a separate goroutine |
| Panic-safe | ServeDNS includes `recover()`. Plugin bugs do not crash CoreDNS |
| A records only | Other query types are delegated to the next plugin |
| Separation of concerns | Health check and failover logic belong to the external controller |
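The panic-safe row relies on Go's `recover()`. A minimal sketch of the pattern follows; the wrapper name `safely` and the decision to report SERVFAIL (RCODE 2) after a panic are assumptions, since the README does not say what the plugin returns in that case.

```go
package main

import "fmt"

// rcodeServerFailure is the standard DNS SERVFAIL response code.
const rcodeServerFailure = 2

// safely runs a handler and converts a panic into an error plus SERVFAIL,
// so the panic does not propagate into the host process.
func safely(handler func() int) (rcode int, err error) {
	defer func() {
		if r := recover(); r != nil {
			rcode = rcodeServerFailure
			err = fmt.Errorf("recovered from panic: %v", r)
		}
	}()
	return handler(), nil
}

func main() {
	rcode, err := safely(func() int {
		var m map[string]int
		m["boom"] = 1 // writing to a nil map panics
		return 0
	})
	fmt.Println(rcode, err != nil)
}
```

The named return values are what let the deferred function overwrite the result after the panic has unwound the handler.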
## Security
### DNS
Each CoreDNS server block creates an independent plugin instance with its own StateStore. Zone restriction applies to both DNS queries and API writes.
```
service.local {
    dynresolve { ... }  # Instance A: only service.local names
}
infra.local {
    dynresolve { ... }  # Instance B: only infra.local names
}
```
### REST API
| Layer | Mechanism |
|---|---|
| Transport | `api tls` enables HTTPS (TLS 1.2+). Required when listening on non-loopback addresses |
| Network | `api listen` binds to a specific address. Use `127.0.0.1` for local-only access |
| IP restriction | `api allow <cidr>` restricts by source IP. Disallowed IPs receive 403 |
| Authentication | `api token` requires an `Authorization: Bearer <token>` header. Constant-time comparison. Invalid tokens receive 401 |
| Rate limiting | `api rate-limit <n>` limits write operations per second. Excess requests receive 429 |
| Zone enforcement | Service names are automatically scoped to the CoreDNS server block zone. Short names are expanded with the zone suffix |
| Name validation | Service names must be valid DNS labels (`[a-z0-9-]`, 1-63 chars per label, max 253 chars total) |
| Record limit | Maximum 64 A records per service. Excess records are rejected with 400 |
| Body size | 1 MB (single service) / 10 MB (bulk state) |
| Audit logging | All write operations (PUT/POST/DELETE) are logged with source IP, service name, and record values |
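The authentication row can be illustrated with Go's `crypto/subtle`. This is a sketch under assumptions, not the plugin's code: `bearerTokenOK` and `requireToken` are hypothetical names, and the handler wiring is only one way to arrange it.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// bearerTokenOK checks an Authorization header value against the expected
// token. ConstantTimeCompare avoids leaking how many leading bytes matched;
// it does return early when the lengths differ.
func bearerTokenOK(header, want string) bool {
	const prefix = "Bearer "
	if want == "" || !strings.HasPrefix(header, prefix) {
		return false
	}
	got := header[len(prefix):]
	return subtle.ConstantTimeCompare([]byte(got), []byte(want)) == 1
}

// requireToken wraps a handler and answers 401 when the token is wrong.
func requireToken(want string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !bearerTokenOK(r.Header.Get("Authorization"), want) {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	h := requireToken("example-token", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusNoContent)
	}))
	req := httptest.NewRequest(http.MethodGet, "/v1/services", nil)
	req.Header.Set("Authorization", "Bearer wrong")
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	fmt.Println(rec.Code)
}
```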
### Token Management
The token can be specified as:
- Plaintext in Corefile: `api token mysecret`
- Environment variable: `api token $DYNRESOLVE_TOKEN` or `api token {env.DYNRESOLVE_TOKEN}`
The environment variable form is recommended for production to avoid storing secrets in configuration files.
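How the three token forms might be told apart is sketched below. `resolveToken` is a hypothetical helper, and treating an unset or empty variable as an error is an assumption about behavior the README does not describe.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// resolveToken interprets the raw directive argument as plaintext,
// $NAME, or {env.NAME}. lookup is injected so the logic is easy to test;
// os.LookupEnv satisfies it.
func resolveToken(raw string, lookup func(string) (string, bool)) (string, error) {
	var name string
	switch {
	case strings.HasPrefix(raw, "{env.") && strings.HasSuffix(raw, "}"):
		name = raw[len("{env.") : len(raw)-1]
	case strings.HasPrefix(raw, "$"):
		name = raw[1:]
	default:
		return raw, nil // plaintext token
	}
	val, ok := lookup(name)
	if !ok || val == "" {
		return "", fmt.Errorf("environment variable %q is not set", name)
	}
	return val, nil
}

func main() {
	tok, err := resolveToken("{env.DYNRESOLVE_TOKEN}", os.LookupEnv)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("token length:", len(tok))
}
```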
### Known Limitations
- **Single static token.** Rotating the token requires editing the Corefile and restarting CoreDNS. Per-client tokens are not supported. For multi-client access control, use a reverse proxy.
- **No replay protection.** Captured requests can be replayed. TLS mitigates network-level capture. For stronger guarantees, use mTLS via a reverse proxy.
- **`PUT /v1/state` is a full replacement.** Replaces all services within the zone. Zone validation applies to each entry. For granular changes, use `PUT /v1/services/{name}`, `POST /v1/services/{name}/records`, or `DELETE /v1/services/{name}/records`.
## Building
Default base: **CoreDNS v1.12.1**. Override with `make build COREDNS_VERSION=v1.x.x`.
```bash
git clone https://github.com/libraz/coredns-dynresolve.git
cd coredns-dynresolve
make build
```
### Makefile Targets
| Target | Description |
|---|---|
| `make build` | Build custom CoreDNS binary with dynresolve |
| `make test` | Run unit tests with race detector |
| `make lint` | Run golangci-lint |
| `make coverage` | Run tests with coverage report |
| `make integration-test` | Run integration tests (requires `make build`) |
| `make clean` | Remove build artifacts |
### RPM / DEB Packages
```bash
make pkg-rpm-el9 # AlmaLinux 9
make pkg-rpm-el10 # AlmaLinux 10
make pkg-deb-jammy # Ubuntu 22.04
make pkg-deb-noble # Ubuntu 24.04
make pkg-all # All of the above
```
## Testing
### Unit Tests
```bash
make test
```
Covers: config parsing, validation, cache, fallback chain, panic recovery, API handlers, middleware (auth/IP/rate limit), StateStore concurrency.
### Integration Tests
The 39 integration tests start a real CoreDNS process and verify the full stack via `dns.Exchange` and HTTP.
```bash
make build
make integration-test
```
| Category | Coverage |
|---|---|
| Basic operations | PUT then resolve, multiple records, fallback, TCP, EDNS0, delete |
| API functionality | GET/LIST/PUT/DELETE, bulk replace, validation errors |
| Short names | PUT/GET/DELETE with short names, full name interop, bulk state with short names, zone in responses |
| Record-level | Add records, add to new service, deduplication, partial remove, full remove (service deletion), not found, invalid IP, empty records |
| Security | Auth required, wrong token, invalid IP, invalid type |
| Concurrency | Parallel queries, concurrent API writes + DNS reads |
| Scale | 50 records per service, 200 services bulk |
| Persistence | PUT creates file, restore on restart, delete updates file |
| Edge cases | Case sensitivity, unsupported query types, cross-contamination, default TTL |
Integration tests use the `integration` build tag and are excluded from CI.
## License
[Apache License 2.0](LICENSE)