https://github.com/thomas-illiet/cerbai
CerbAI is a lightweight HTTP reverse proxy written in Go, built for corporate environments where access to LLM services is secured via JWT tokens obtained through an OAuth2 client_credentials flow over mutual TLS (mTLS).
- Host: GitHub
- URL: https://github.com/thomas-illiet/cerbai
- Owner: thomas-illiet
- Created: 2026-04-19T13:19:10.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2026-04-19T14:58:55.000Z (3 months ago)
- Last Synced: 2026-04-19T15:31:00.642Z (3 months ago)
- Topics: jwt, mtls, oauth, openai, proxy
- Language: Go
- Homepage:
- Size: 18.6 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
# CerbAI

[CI](https://github.com/thomas-illiet/cerbai/actions/workflows/ci.yml)
[Release](https://github.com/thomas-illiet/cerbai/actions/workflows/release.yml)

CerbAI is a lightweight HTTP reverse proxy written in Go, designed for corporate environments where LLM access is protected by a JWT obtained through an OAuth2 `client_credentials` flow over mTLS. It is Kubernetes-native: configured entirely through environment variables or CLI flags — no config file required.
## How it works
```none
Client (OpenAI SDK / curl)
     │
     ▼
┌─────────┐   client_credentials    ┌───────────────┐
│ CerbAI  │ ─────── mTLS ─────────► │ Token Service │
│  :8085  │ ◄──── JWT token ─────── └───────────────┘
└────┬────┘    (memory or Redis)
     │  Authorization: Bearer
     ▼
┌──────────┐
│ LLM API  │
└──────────┘
```
1. On each incoming request, CerbAI retrieves a valid JWT from its cache (memory or Redis).
2. On a cache miss, it refreshes the token via `client_credentials` over mTLS.
3. It injects the token into the `Authorization` header and forwards the request to the LLM.
4. SSE streaming responses (`text/event-stream`) are forwarded chunk by chunk with no buffering.
## Requirements
- Go 1.23+ (or Docker)
- A client certificate (cert + key) for mTLS
- Access to an internal token service and LLM endpoint
- Redis (optional, recommended for multi-replica deployments)
## Installation
### From source
```bash
git clone https://github.com/thomas-illiet/cerbai
cd cerbai
go build -o cerbai .
```
### Docker
```bash
docker pull ghcr.io/thomas-illiet/cerbai:latest
```
## Configuration
CerbAI is configured exclusively through **CLI flags** or **environment variables** — no config file needed.
### CLI flags
```
Usage:
  cerbai [flags]

Flags:
      --client-id string           OAuth2 client ID (env: CERBAI_CLIENT_ID)
      --client-secret string       OAuth2 client secret (env: CERBAI_CLIENT_SECRET)
  -h, --help                       help for cerbai
      --listen-addr string         Address to listen on (env: CERBAI_LISTEN_ADDR) (default ":8085")
      --llm-url string             Upstream LLM base URL (env: CERBAI_LLM_URL)
      --proxy-token string         Bearer token required to use the proxy, optional (env: CERBAI_PROXY_TOKEN)
      --redis-url string           Redis URL for shared token cache, optional (env: CERBAI_REDIS_URL)
      --tls-ca-file string         Custom CA certificate file, optional (env: CERBAI_TLS_CA_FILE)
      --tls-cert-file string       mTLS client certificate file path (env: CERBAI_TLS_CERT_FILE)
      --tls-key-file string        mTLS client key file path (env: CERBAI_TLS_KEY_FILE)
      --token-cache-ttl duration   Token cache TTL (env: CERBAI_TOKEN_CACHE_TTL) (default 5m0s)
      --token-endpoint string      OAuth2 token endpoint URL (env: CERBAI_TOKEN_ENDPOINT)
      --token-header string        Header name to inject the token into (env: CERBAI_TOKEN_HEADER) (default "Authorization")
      --token-prefix string        Token value prefix (env: CERBAI_TOKEN_PREFIX) (default "Bearer ")
      --log-level string           Log level: debug, info, warn, error (env: CERBAI_LOG_LEVEL) (default "info")
```
### Environment variables
| Variable | Default | Description |
| ------------------------ | --------------- | --------------------------------------------- |
| `CERBAI_LISTEN_ADDR` | `:8085` | Address to listen on |
| `CERBAI_LLM_URL` | — | Upstream LLM base URL |
| `CERBAI_TOKEN_ENDPOINT` | — | OAuth2 token endpoint URL |
| `CERBAI_CLIENT_ID` | — | OAuth2 client ID |
| `CERBAI_CLIENT_SECRET` | — | OAuth2 client secret |
| `CERBAI_TLS_CERT_FILE` | — | mTLS client certificate file path |
| `CERBAI_TLS_KEY_FILE` | — | mTLS client key file path |
| `CERBAI_TLS_CA_FILE` | — | Custom CA file (optional, uses system CAs) |
| `CERBAI_TOKEN_CACHE_TTL` | `5m` | Token cache TTL |
| `CERBAI_TOKEN_HEADER` | `Authorization` | Header name for token injection |
| `CERBAI_TOKEN_PREFIX` | `Bearer ` | Token value prefix |
| `CERBAI_PROXY_TOKEN` | — | Bearer token to access the proxy (optional) |
| `CERBAI_REDIS_URL` | — | Redis URL (optional, see Token cache section) |
| `CERBAI_LOG_LEVEL` | `info` | Log level: debug, info, warn, error |
## Running
### Minimal example
```bash
./cerbai \
  --llm-url https://llm.internal.example.com \
  --token-endpoint https://auth.internal.example.com/oauth2/token \
  --client-id my-client \
  --client-secret my-secret \
  --tls-cert-file /etc/certs/client.crt \
  --tls-key-file /etc/certs/client.key
```
### With Docker Compose
```bash
cp .env.example .env
# Edit .env with your values
# Place your mTLS certificates in ./certs/
# certs/client.crt certs/client.key certs/ca.crt
docker compose up -d
```
This starts CerbAI alongside a Redis instance used for token caching.
## Token cache
CerbAI supports two cache backends:
### In-memory (default)
Local to each instance. Simple, no external dependency.
```bash
./cerbai --token-cache-ttl 10m ...
```
### Redis (optional — recommended for multi-replica)
All replicas share the same cached token. Avoids N simultaneous requests to the token service during a scale-out event.
```bash
./cerbai --redis-url redis://redis:6379/0 ...
# With TLS:
./cerbai --redis-url rediss://redis:6379/0 ...
```
| Behaviour | Detail |
| --------------------- | --------------------------------------------------- |
| Configurable TTL | `--token-cache-ttl` (default: 5m) |
| Respects `expires_in` | Uses `min(configured TTL, expires_in - 30s)` |
| Thread-safe (memory) | `sync.RWMutex` with double-checked locking |
| Atomic (Redis) | Native `SET EX`, no mutex required |
| Startup warm-up | Token pre-fetched at boot; non-fatal if unavailable |
## Proxy authentication
When `--proxy-token` (or `CERBAI_PROXY_TOKEN`) is set, every request to the proxy must carry an `Authorization: Bearer` header containing that token. Omit the flag to disable proxy authentication entirely.
```bash
./cerbai --proxy-token my-secret-key ...
```
```bash
curl http://localhost:8085/v1/chat/completions \
  -H "Authorization: Bearer my-secret-key" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4","messages":[{"role":"user","content":"Hello!"}]}'
```
Requests without a valid token receive `401 Unauthorized`. The `/healthz` endpoint is always public.
## Health check
CerbAI exposes a health endpoint at `GET /healthz` that returns `200 ok` when the process is running.
```bash
curl http://localhost:8085/healthz
```
## Usage
Point your OpenAI-compatible client at `http://localhost:8085`:
```bash
# Non-streaming
curl http://localhost:8085/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4","messages":[{"role":"user","content":"Hello!"}]}'

# SSE streaming
curl http://localhost:8085/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4","stream":true,"messages":[{"role":"user","content":"Hello!"}]}'
```
Python (OpenAI SDK):
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8085/v1",
    api_key="not-used",  # token is managed by CerbAI
)

stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)

for chunk in stream:
    # Some chunks carry no choices or no text (e.g. the final one); skip them.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```
## Logs
CerbAI writes structured JSON logs to stdout. Set the log level with `--log-level` or `CERBAI_LOG_LEVEL`:
```bash
./cerbai --log-level debug ...
# or
export CERBAI_LOG_LEVEL=debug
```
Available levels: `debug`, `info`, `warn`, `error` (default: `info`)
### Example logs
```json
{"time":"2026-04-19T10:00:00Z","level":"INFO","msg":"starting CerbAI","version":"v1.2.0","commit":"abc1234","build_date":"2026-04-19T10:00:00Z"}
{"time":"2026-04-19T10:00:00Z","level":"INFO","msg":"config loaded","listen_addr":":8085","llm_url":"https://llm.internal.example.com","token_cache_ttl":"5m0s","redis":true}
{"time":"2026-04-19T10:00:00Z","level":"INFO","msg":"token refreshed","ttl":"4m30s","backend":"redis","duration_ms":45}
{"time":"2026-04-19T10:00:00Z","level":"INFO","msg":"starting proxy server","addr":":8085"}
```
### Debug level logs
At `debug` level, CerbAI also emits request-level entries:
```json
{"time":"2026-04-19T10:00:01Z","level":"DEBUG","msg":"incoming request","path":"/v1/chat/completions","method":"POST","remote_addr":"127.0.0.1:54321"}
{"time":"2026-04-19T10:00:01Z","level":"DEBUG","msg":"token fetched","duration_ms":2}
{"time":"2026-04-19T10:00:01Z","level":"DEBUG","msg":"request completed","path":"/v1/chat/completions","method":"POST","duration_ms":1250}
{"time":"2026-04-19T10:00:02Z","level":"WARN","msg":"auth failed","path":"/v1/chat/completions","method":"POST","remote_addr":"127.0.0.1:54322"}
```
## Performance testing
CerbAI ships with a [k6](https://k6.io) performance test suite. Tests run entirely locally via Docker Compose against a lightweight Go mock server that simulates the OAuth2 token endpoint and the LLM API — no real credentials or upstream required.
### Architecture
```none
k6 ──► cerbai :8085 ──► mock-server :9090
│ │
│ POST /token (fake OAuth2 response)
└────────► POST /v1/chat/completions (fake LLM response)
```
### Prerequisites
- Docker with Compose plugin
### Quick start
```bash
# Build the mock server image (once)
make perf-build
# Smoke test — 2 VUs for 1 minute
make perf-smoke
# Load test — 50 VUs for 2 minutes
make perf-load
# Stress test — ramp up to 200 VUs to find the saturation point
make perf-stress
# Soak test — 20 VUs for 30 minutes (detects memory leaks and degradation)
make perf-soak
# Smoke → load → stress in sequence
make perf-all
# Tear down the perf environment
make perf-down
```
### Test scenarios
| Scenario | VUs | Duration | p95 threshold | Error threshold |
| -------- | -------------- | -------- | ------------- | --------------- |
| Smoke | 2 | 1 min | < 500 ms | < 1% |
| Load | 50 | 2 min | < 1 s | < 2% |
| Stress | 0 → 200 (ramp) | 8 min | < 2 s | < 5% |
| Soak | 20 | 30 min | < 1 s | < 1% |
### Optional environment variables
| Variable | Default | Description |
| ------------- | -------------------- | -------------------------------------------- |
| `PROXY_URL` | `http://cerbai:8085` | CerbAI address as seen from the k6 container |
| `PROXY_TOKEN` | _(empty)_ | Bearer token if `--proxy-token` is set |
Pass them inline to override:
```bash
PROXY_TOKEN=my-secret make perf-smoke
```
### File layout
```none
tests/perf/
├── mock_server/
│   └── main.go            # Go mock server (OAuth2 token + LLM endpoints)
└── k6/
    ├── smoke.js
    ├── load.js
    ├── stress.js
    └── soak.js
docker-compose.perf.yml    # Override: points cerbai at mock-server (no TLS)
```
## CI / CD
| Workflow | Trigger | Action |
| ------------- | -------------------------------- | -------------------------------------- |
| `ci.yml` | Push to any branch, PR to `main` | Build, vet, test, Dockerfile lint |
| `release.yml` | Push tag `v*` | Multi-arch Docker build & push to GHCR |
To release a new version:
```bash
git tag v1.0.0
git push origin v1.0.0
```
The release workflow builds `linux/amd64` and `linux/arm64` images and publishes them to `ghcr.io/thomas-illiet/cerbai` with tags `v1.0.0`, `v1.0`, `v1`, and `sha-`.
## Architecture
```none
internal/
├── config/
│   └── config.go     # Viper config, Cobra flags, TLS config builder
├── middleware/
│   └── auth.go       # Bearer token auth middleware for proxy access
├── proxy/
│   └── handler.go    # httputil.ReverseProxy, token injection, SSE streaming
└── token/
    ├── cache.go      # In-memory cache (RWMutex double-check) + OAuth2 fetcher
    └── redis.go      # Redis cache (atomic SET EX, multi-replica safe)
main.go               # Cobra CLI, cache selection, token warmup, graceful shutdown
```