https://github.com/azerozero/grob

LLM proxy with built-in DLP and regulatory compliance. Redacts secrets before they reach the API. EU AI Act, GDPR, HDS/PCI DSS ready. Multi-provider failover, live TUI, virtual keys, fan-out. 6 MB, zero deps. Rust.
Host: GitHub
URL: https://github.com/azerozero/grob
Owner: azerozero
License: other
Created: 2026-02-22T23:31:12.000Z (4 months ago)
Default Branch: main
Last Pushed: 2026-06-08T09:49:10.000Z (26 days ago)
Last Synced: 2026-06-08T11:25:48.725Z (26 days ago)
Topics: ai-gateway, air-gapped, anthropic, audit-log, dlp, eu-ai-act, failover, fan-out, gdpr, gemini, llm-proxy, multi-provider, ollama, openai, opentelemetry, rust, secret-scanning, sovereign, streaming, virtual-keys
Language: Rust
Size: 3.91 MB
Stars: 18
Watchers: 1
Forks: 1
Open Issues: 5
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Codeowners: .github/CODEOWNERS
- Security: SECURITY.md
- Agents: AGENTS.md
- Cla: CLA.md
README

          


  
# Grob

Your LLM traffic leaks data. Grob stops it.

The only LLM proxy with built-in DLP, written in Rust, deployable air-gapped.

---

**Grob** is a high-performance LLM routing proxy that sits between your AI tools and your providers. It redacts secrets before they reach the API, fails over transparently when a provider goes down, and fits in a 6 MB container with zero dependencies.

> **~90 µs pure overhead** with full DLP + routing + caching + rate limiting -- [40x faster than LiteLLM, every feature measured individually](docs/reference/benchmarks.md).

```mermaid
flowchart LR
    CC[Claude Code] --> G
    AI[Aider] --> G
    CX[Codex CLI] --> G
    FO[Forge] --> G
    CU[Cursor] --> G
    G["Grob (DLP)<br/>6 MB · zero deps"] --> A["Anthropic (primary)"]
    G --> OR["OpenRouter (fallback)"]
    G --> GE[Gemini]
    G --> DS[DeepSeek]
    G --> OL["Ollama (local)"]
```

## Why Grob?

| Problem | How Grob solves it |
|---------|-------------------|
| API keys and secrets leak to LLM providers in prompts | **DLP engine** scans every request -- redacts, blocks, or warns before the data leaves |
| Provider goes down during a coding session | **Multi-provider failover** with circuit breakers and exponential backoff. Zero client changes |
| No visibility into what your AI tools send | **`grob watch`** -- live TUI showing every request, response, DLP action, and fallback in real time |
| Bill shock from runaway LLM usage | **Spend tracking** with per-tenant budgets, monthly caps, and alerts at 80% |
| AI agent executes destructive tool calls without review | **HIT Gateway** -- intercepts every `tool_use` block, enforces per-policy approval rules (auto-approve / require human / deny), supports multisig and quorum |
| Deploying in air-gapped / sovereign environments | **Single binary, 6 MB, zero dependencies** -- no Python, no PostgreSQL, no Redis |
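
The failover row above mentions circuit breakers and exponential backoff. As a rough illustration only, the sketch below shows one common way to combine the two. The type names, the consecutive-failure threshold, and the doubling schedule are assumptions made for this example, not Grob's actual implementation (which may, for instance, add a half-open state or jitter).

```rust
use std::time::Duration;

/// Hypothetical breaker states for illustration; a real breaker often
/// also has a half-open state that lets a probe request through.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BreakerState {
    /// Requests flow to the provider normally.
    Closed,
    /// The provider is skipped and traffic goes to the next fallback.
    Open,
}

/// Opens after `failure_threshold` consecutive failures (assumed policy).
pub struct CircuitBreaker {
    failure_threshold: u32,
    consecutive_failures: u32,
}

impl CircuitBreaker {
    pub fn new(failure_threshold: u32) -> Self {
        Self { failure_threshold, consecutive_failures: 0 }
    }

    /// Any success resets the failure streak and closes the breaker.
    pub fn record_success(&mut self) {
        self.consecutive_failures = 0;
    }

    pub fn record_failure(&mut self) {
        self.consecutive_failures = self.consecutive_failures.saturating_add(1);
    }

    pub fn state(&self) -> BreakerState {
        if self.consecutive_failures >= self.failure_threshold {
            BreakerState::Open
        } else {
            BreakerState::Closed
        }
    }
}

/// Delay before retry number `attempt` (0-based): `base * 2^attempt`,
/// capped at `max`. Overflow falls back to the cap.
pub fn backoff_delay(base: Duration, max: Duration, attempt: u32) -> Duration {
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(max).min(max)
}
```

With a 100 ms base and a 5 s cap, retries wait 100 ms, 200 ms, 400 ms, and so on until the cap is reached. A breaker created with `CircuitBreaker::new(3)` reports `Open` after three failures in a row and `Closed` again after the next success.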

## 30-second quickstart

**With Homebrew** (macOS / Linux):

```bash

brew install azerozero/tap/grob

```

**Without Homebrew** (Linux / CI):

```bash

curl -fsSL https://raw.githubusercontent.com/azerozero/grob/main/scripts/install.sh | sh

```

Then:

```bash
grob setup        # writes ~/.grob/config.toml (override with GROB_CONFIG or --config)
grob exec -- claude
```

That's it. Grob auto-starts, routes traffic, and stops when your tool exits. To check a long-running instance, run `grob status` or `curl http://[::1]:13456/health`.

## DLP -- secrets never reach the provider

Every request passes through the DLP engine before it leaves your machine, and every response passes through it before reaching your tool:

```toml
[dlp]
enabled = true

[[dlp.secrets]]
name = "custom_token"
prefix = "tok_"
pattern = "tok_[A-Za-z0-9]{40}"
action = "redact"            # API keys, tokens, credentials → [REDACTED]

[dlp.pii]
credit_cards = true
iban = true
action = "redact"            # Emails, phone numbers → redacted

[[dlp.names]]
term = "Acme Corp"
action = "pseudonym"         # Real names → consistent pseudonyms

[dlp.prompt_injection]
enabled = true
action = "block"             # Prompt injection attempts → 400

[dlp.url_exfil]
enabled = true
action = "block"             # Data exfiltration URLs → stripped
```

No other LLM proxy does this. LiteLLM, Bifrost, Portkey, Kong -- none have inline DLP on the hot path.
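
To make the `redact` action concrete, here is a minimal sketch of what matching the `custom_token` rule above could look like. It hand-rolls the `tok_[A-Za-z0-9]{40}` pattern with the standard library only. The function name and the return shape are hypothetical, and Grob's real engine presumably uses a proper regex matcher.

```rust
/// Replaces every `tok_` followed by 40 ASCII alphanumerics with
/// `[REDACTED]` and returns the cleaned text plus the redaction count.
/// Hypothetical helper for illustration, not part of Grob's API.
pub fn redact_custom_tokens(input: &str) -> (String, usize) {
    const PREFIX: &str = "tok_";
    const BODY_LEN: usize = 40;
    const TOKEN_LEN: usize = PREFIX.len() + BODY_LEN;

    let bytes = input.as_bytes();
    let mut out = String::with_capacity(input.len());
    let mut redactions = 0;
    let mut i = 0;

    while i < bytes.len() {
        let rest = &bytes[i..];
        let is_token = rest.len() >= TOKEN_LEN
            && rest.starts_with(PREFIX.as_bytes())
            && rest[PREFIX.len()..TOKEN_LEN]
                .iter()
                .all(|b| b.is_ascii_alphanumeric());

        if is_token {
            out.push_str("[REDACTED]");
            redactions += 1;
            // The matched token is pure ASCII, so `i` stays on a char boundary.
            i += TOKEN_LEN;
        } else {
            // Copy one full UTF-8 character so non-ASCII text survives intact.
            let ch = input[i..].chars().next().expect("i < len implies a char");
            out.push(ch);
            i += ch.len_utf8();
        }
    }

    (out, redactions)
}
```

A prompt containing `tok_` plus 40 alphanumerics comes back with `[REDACTED]` in its place and a count of 1, while shorter look-alikes such as `tok_abc` pass through unchanged.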

## Live traffic inspector

```bash

grob watch

```

```
┌─ Providers ──────────────────────────────────────────────────────────┐
│  anthropic ●  142ms  99.2%  │  openrouter ●  380ms  97.1%           │
├─ Live ───────────────────────────────────────────────────────────────┤
│  11:24:03  → claude-sonnet-4-6    anthropic   1.2K tok              │
│  11:24:04  ← claude-sonnet-4-6    anthropic   834 tok  1.4s  $0.02 │
│  11:24:05  DLP: 1 secret redacted (AWS key pattern)                 │
│  11:24:09  FALLBACK: anthropic 429 → openrouter                     │
│  11:24:10  ← gemini-2.5-pro       openrouter  412 tok  0.6s  $0.001│
├─ Alerts ─────────────────────────────────────────────────────────────┤
│  DLP: 3 secrets | 1 PII | 0 injections   Circuit: all OK            │
└──────────────────────────────────────────────────────────────────────┘
```

## Intelligent routing

Requests are classified by intent, then routed to the best model with automatic fallback:

```mermaid
flowchart LR
    R[Request] --> CL[Classify]
    CL --> M[Model] --> P1["Provider (P1)"]
    P1 -->|fail| P2["Provider (P2)"]
    CL -->|extended thinking?| O[Opus 4.7]
    CL -->|web_search tool?| GP[Gemini 2.5 Pro]
    CL -->|background task?| GF[Haiku 4.5]
    CL -->|regex match?| CM[custom model]
    CL -->|default| S[Sonnet 4.6]
```
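
The classification step in the diagram can be read as an ordered rule list. The sketch below is one possible reading, under several assumptions: the rule precedence follows the order the branches appear in the diagram, the struct and function names are invented for this example, and the custom rules use plain substring matching as a stand-in for the regex rules Grob actually supports.

```rust
/// Hypothetical summary of the request features the classifier looks at.
pub struct RequestTraits<'a> {
    pub extended_thinking: bool,
    pub uses_web_search: bool,
    pub background: bool,
    pub prompt: &'a str,
}

/// Returns the model label for a request. `custom_rules` pairs a needle
/// with a model name; substring matching stands in for regex here.
pub fn classify(req: &RequestTraits<'_>, custom_rules: &[(&str, &str)]) -> String {
    if req.extended_thinking {
        return "Opus 4.7".to_string();
    }
    if req.uses_web_search {
        return "Gemini 2.5 Pro".to_string();
    }
    if req.background {
        return "Haiku 4.5".to_string();
    }
    for (needle, model) in custom_rules {
        if req.prompt.contains(*needle) {
            return (*model).to_string();
        }
    }
    // Nothing matched: fall through to the default model.
    "Sonnet 4.6".to_string()
}
```

Once a model is chosen, the provider list for that model handles the `P1 -->|fail| P2` fallback edge separately.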

Presets configure everything in one command:

| Preset | What it sets up | Cost |
|--------|-----------------|------|
| **perf** | Pure Anthropic OAuth (Pro/Max) -- auto-maps `claude-*` to native | Max subscription |
| **ultra-cheap** | Stacked free tiers (Groq + Cerebras + Z.ai + OpenRouter `:free`) | ~€0-2/month |
| **gdpr** | EU-only routing -- Mistral, Scaleway, OVH (`region = "eu"`) | Pay-as-you-go |
| **eu-ai-act** | EU AI Act compliant -- EU providers + transparency headers + risk classification | Pay-as-you-go |
| **eu-eco** | Strict-EU sovereign, budget -- Scaleway FR + Nebius `eu-north1` | Pay-as-you-go |
| **eu-pro** | Strict-EU sovereign, balanced -- Hermes-4-405B + Qwen3.5-397B | Pay-as-you-go |
| **eu-max** | Strict-EU sovereign, premium -- preemptive 397B/405B everywhere | Pay-as-you-go |

```bash
grob preset apply perf
grob preset list   # see every available preset
```

## Supported providers

| Provider | Auth | Notes |
|----------|------|-------|
| **Anthropic** | API key / OAuth (Max) | Claude models |
| **OpenAI** | API key | GPT, o-series |
| **Gemini** | API key / OAuth (Pro) | Google AI Studio |
| **Vertex AI** | ADC | Google Cloud |
| **OpenRouter** | API key | 200+ models |
| **DeepSeek** | API key | DeepSeek V4, R1 |
| **Mistral** | API key | Devstral, Codestral |
| **Groq** | API key | Fast inference |
| **z.ai** | API key | GLM-4 family |
| **MiniMax** | API key | MiniMax models |
| **Kimi Coding** | API key | Kimi K2 |
| **Zenmux** | API key | Aggregated routing |
| **Ollama** | none | Local inference |

Any OpenAI-compatible API works with `provider_type = "openai"` and a custom `base_url`.

## Multi-tenant virtual keys

Distribute API keys to teams with per-key budgets, rate limits, and model restrictions:

```bash
grob key create --name "frontend-team" --tenant frontend --budget 50 --rate-limit 20
# grob_a1b2c3d4e5f6... (shown once, hashed at rest)

grob key list
# PREFIX        NAME            TENANT     BUDGET    RATE
# grob_a1b2...  frontend-team   frontend   $50/mo    20 rps
# grob_f8e7...  ml-pipeline     data       $200/mo   100 rps
```
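
A per-key limit such as `20 rps` is commonly enforced with a token bucket. The sketch below shows that technique in isolation; it is an assumption that Grob works this way, and the type, its fields, and the choice to pass time in as a plain number (to keep the example deterministic) are all made up for illustration.

```rust
/// Token bucket sized to one second of traffic (assumed policy).
pub struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last_refill_secs: f64,
}

impl TokenBucket {
    /// Starts full, so a key can burst up to `rate_per_sec` requests.
    pub fn new(rate_per_sec: f64, now_secs: f64) -> Self {
        Self {
            capacity: rate_per_sec,
            tokens: rate_per_sec,
            refill_per_sec: rate_per_sec,
            last_refill_secs: now_secs,
        }
    }

    /// Refills based on elapsed time, then tries to spend one token.
    /// Returns `false` when the request should be rejected (e.g. HTTP 429).
    pub fn try_acquire(&mut self, now_secs: f64) -> bool {
        let elapsed = (now_secs - self.last_refill_secs).max(0.0);
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        // Never move the clock backwards if callers pass a stale timestamp.
        self.last_refill_secs = self.last_refill_secs.max(now_secs);

        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

With `TokenBucket::new(20.0, 0.0)`, the first 20 calls at time 0 succeed and the 21st is rejected; half a second later, 10 more tokens are available.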

## Fan-out racing

Send the same request to multiple providers in parallel. Pick the fastest, cheapest, or best-quality response:

```toml
[[models]]
name = "best-answer"
strategy = "fan_out"

[models.fan_out]
mode = "fastest"   # or "best_quality", "weighted"
```
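
Conceptually, `mode = "fastest"` is a race: every provider gets the request, and the first successful answer wins. The following sketch illustrates that idea with standard-library threads and a channel. The `ProviderCall` alias and `race_fastest` function are hypothetical names for this example, and the provider calls are stand-in closures instead of real HTTP requests.

```rust
use std::sync::mpsc;
use std::thread;

/// A stand-in for one upstream call; a real proxy would issue an HTTP request.
pub type ProviderCall = Box<dyn FnOnce() -> Result<String, String> + Send + 'static>;

/// Runs every call on its own thread and returns the first success as
/// `(provider_name, body)`, or `None` if every provider failed.
pub fn race_fastest(providers: Vec<(String, ProviderCall)>) -> Option<(String, String)> {
    let (tx, rx) = mpsc::channel();

    for (name, call) in providers {
        let tx = tx.clone();
        thread::spawn(move || {
            // The receiver may already be gone if another provider won; ignore that.
            let _ = tx.send((name, call()));
        });
    }
    // Drop the original sender so the loop below ends once all threads finish.
    drop(tx);

    for (name, result) in rx {
        if let Ok(body) = result {
            return Some((name, body));
        }
        // Errors are skipped: a slower provider may still succeed.
    }
    None
}
```

In this sketch the losing threads simply run to completion in the background. A production implementation would normally cancel the in-flight requests to avoid paying for responses nobody reads.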

## Regulatory compliance

Grob maps its features to specific regulatory articles. Every claim is [verified against the codebase](docs/reference/features.md#implementation-verification-audited-2026-03-18).

| Regulation | Coverage |
|------------|----------|
| **EU AI Act** | Art. 12 (signed audit log with model/tokens), Art. 14 (risk scoring + escalation webhook), Art. 15 (injection detection, 28 languages), Art. 52 (transparency headers) |
| **GDPR/RGPD** | PII redaction, name pseudonymization, EU-only provider routing (`gdpr = true`), canary tokens for leak detection |
| **HDS/PCI DSS/SecNumCloud** | Hash-chained audit entries, Merkle batch signing, classification NC/C1/C2/C3, AES-256-GCM credentials at rest |
| **NIS2/DORA** | Multi-provider resilience, escalation webhooks, zero-downtime upgrades |

```bash
grob preset apply eu-ai-act   # EU AI Act + GDPR in one command
grob preset apply gdpr        # EU-only routing + DLP
```

## Also included

- **Signed audit log** -- ECDSA-P256 / Ed25519 / HMAC-SHA256, hash-chained, Merkle tree batch signing

- **Response caching** -- Dedup temperature=0 requests (saves tokens and money)

- **Native TLS + ACME** -- Built-in HTTPS with Let's Encrypt auto-certificates

- **Three API endpoints** -- `/v1/messages` (Anthropic), `/v1/chat/completions` (OpenAI), `/v1/responses` (Codex CLI)

- **Prometheus + OpenTelemetry** -- `/metrics` endpoint, OTLP distributed tracing

- **MCP tool matrix** -- JSON-RPC server for tool-calling orchestration

See the [full feature matrix](docs/reference/features.md) for rate limiting, JWT/OAuth, log export, zero-downtime upgrades, record & replay, and more.
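
The hash-chained audit log listed above rests on a simple idea: each entry commits to the hash of the previous one, so editing or dropping an entry breaks every link after it. The sketch below demonstrates only that chaining logic. It uses the standard library's `DefaultHasher`, which is **not** cryptographic and is here purely so the example has no dependencies; a real audit log would use SHA-256 and add signatures. All names are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// One audit record linked to its predecessor (illustrative layout).
#[derive(Debug, Clone)]
pub struct AuditEntry {
    pub payload: String,
    pub prev_hash: u64,
    pub hash: u64,
}

/// Stand-in digest. NOT cryptographic: swap in SHA-256 for real use.
fn digest(prev_hash: u64, payload: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    prev_hash.hash(&mut hasher);
    payload.hash(&mut hasher);
    hasher.finish()
}

/// Appends an entry whose hash covers the previous entry's hash.
/// The first entry chains from a fixed genesis value of 0.
pub fn append(log: &mut Vec<AuditEntry>, payload: &str) {
    let prev_hash = log.last().map_or(0, |entry| entry.hash);
    log.push(AuditEntry {
        payload: payload.to_string(),
        prev_hash,
        hash: digest(prev_hash, payload),
    });
}

/// Recomputes every link; returns `false` on the first broken one.
pub fn verify(log: &[AuditEntry]) -> bool {
    let mut expected_prev = 0u64;
    for entry in log {
        let link_ok = entry.prev_hash == expected_prev;
        let hash_ok = entry.hash == digest(entry.prev_hash, &entry.payload);
        if !(link_ok && hash_ok) {
            return false;
        }
        expected_prev = entry.hash;
    }
    true
}
```

Changing the payload of any stored entry, or removing one from the middle, makes `verify` return `false`. Merkle batch signing builds on the same hashes by signing one root per batch instead of every entry.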

## Configuration

```toml
[[providers]]
name = "anthropic"
provider_type = "anthropic"
auth_type = "oauth"
oauth_provider = "anthropic-max"

[[providers]]
name = "openrouter"
provider_type = "openrouter"
api_key = "$OPENROUTER_API_KEY"

[[models]]
name = "default"

[[models.mappings]]
provider = "anthropic"
actual_model = "claude-sonnet-4-6"
priority = 1

[[models.mappings]]
provider = "openrouter"
actual_model = "openai/gpt-5"
priority = 2

[router]
default = "default"
think = "claude-opus-thinking"

[server]
port = 13456
```
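
The `priority` fields in `[[models.mappings]]` suggest an ordered fallback: try the lowest number first and move on when a provider fails. The sketch below illustrates that reading. The `Mapping` struct mirrors the config keys shown above, but the `dispatch` function, its signature, and the "lower number wins" ordering are assumptions for this example, and the upstream call is injected as a closure so no network is involved.

```rust
/// Mirrors one `[[models.mappings]]` entry from the config above.
#[derive(Debug, Clone)]
pub struct Mapping {
    pub provider: String,
    pub actual_model: String,
    pub priority: u32,
}

/// Tries mappings in ascending priority order and returns the first
/// success as `(provider, body)`. If all fail, returns every error seen.
pub fn dispatch<F>(mappings: &[Mapping], mut call: F) -> Result<(String, String), Vec<String>>
where
    F: FnMut(&Mapping) -> Result<String, String>,
{
    let mut ordered: Vec<&Mapping> = mappings.iter().collect();
    ordered.sort_by_key(|m| m.priority);

    let mut errors = Vec::new();
    for mapping in ordered {
        match call(mapping) {
            Ok(body) => return Ok((mapping.provider.clone(), body)),
            Err(e) => errors.push(format!("{}: {}", mapping.provider, e)),
        }
    }
    Err(errors)
}
```

If the Anthropic call fails with a 429, the same request goes to OpenRouter with `openai/gpt-5`, and the client sees only the successful response.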

See [Configuration Reference](docs/reference/configuration.md) for all options.

## CLI

```
grob setup                    Start the interactive setup wizard
grob start [-d]               Start the server (--detach for background)
grob stop / restart           Stop or restart the server
grob exec -- <command>        Run a command behind the proxy (auto start/stop)
grob watch                    Live traffic inspector (TUI dashboard)
grob status                   Service status + spend summary
grob spend                    Monthly spend breakdown
grob key create/list/revoke   Manage virtual API keys
grob secrets add/list/test    Manage encrypted upstream secrets
grob validate                 Test all providers with real API calls
grob doctor                   Run diagnostic checks
grob preset list/apply        Manage presets
grob connect [provider]       Set up credentials interactively
```

## Container

The image listens on container port `8080` and supports the same config path contract as the binary:

```bash
docker volume create grob-data
docker run --rm -p 8080:8080 \
  -v "$HOME/.grob/config.toml:/etc/grob/config.toml:ro" \
  -v grob-data:/var/lib/grob \
  -e GROB_CONFIG=/etc/grob/config.toml \
  -e GROB_HOME=/var/lib/grob \
  ghcr.io/azerozero/grob:latest
```

To expose Grob on its native default port on the host, use `-p 13456:8080` instead. The image is 6 MB and built `FROM scratch`, with TLS bundled via rustls, so no OS layer is needed.

## Project structure

```
src/
├── server/              Axum HTTP server and dispatch pipeline
│   ├── dispatch/        Core dispatch: DLP, cache, route, provider loop
│   ├── openai_compat/   OpenAI /v1/chat/completions translation
│   ├── responses_compat/  OpenAI Responses API translation
│   ├── rpc/             JSON-RPC control plane
│   ├── watch_sse.rs     Live traffic inspector SSE backend
│   └── fan_out.rs       Parallel multi-provider dispatch
├── providers/           Provider implementations and registry
├── routing/             Request routing: classification + nature-inspired primitives
│   ├── classify/        Regex-based request classification engine (task type, tier, auto-map)
│   ├── circuit_breaker.rs  Passive per-endpoint circuit breaker (RE-1a, ADR-0018)
│   └── health_check.rs     Active per-provider health probe (RE-1b, opt-in)
├── cli/                 Config structs and CLI argument parsing
├── commands/            CLI command implementations
├── auth/                OAuth client, token store, JWT validation
├── features/
│   ├── dlp/             Secret scanning, PII, canary tokens
│   ├── policies/        HIT Gateway, per-action authorization
│   ├── token_pricing/   Pricing, spend tracking, budgets
│   ├── mcp/             MCP tool matrix, JSON-RPC server
│   ├── tap/             Webhook event emission
│   ├── harness/         Record & replay sandwich testing
│   ├── tool_layer/      Tool-calling abstraction layer
│   ├── pledge/          Pledge-based capability restrictions
│   ├── watch/           TUI dashboard and live traffic inspector support
│   └── log_export/      Encrypted audit log export
├── shared/              Cross-cutting modules (not tied to a single slice)
│   ├── acme.rs          Automatic TLS certificate provisioning via ACME
│   ├── instance.rs      Multi-instance coordination (PID + port probing)
│   ├── net.rs           Network binding with SO_REUSEPORT
│   ├── otel.rs          OpenTelemetry subscriber bootstrap
│   ├── pid.rs           PID file management for daemon mode
│   └── message_tracing/ Request/response trace pipeline (JSONL + rotation)
├── security/            Circuit breakers, rate limiting, audit log
├── storage/             Persistent storage layer: atomic files, JSONL journals (GrobStore)
├── models/              Model and message type definitions
├── cache/               Response cache layer
├── pricing.rs           Static model pricing (leaf module, breaks cycle providers↔features)
└── preset/              Preset management system
```

## Development

### Prerequisites

- Rust stable (edition 2021)

- For TUI features: a terminal with 256-color support

- [prek](https://github.com/j178/prek) for pre-commit hooks (optional but recommended)

### Build and run

```bash

cargo build

cargo run -- start

```

### Tests

```bash

cargo test

```

### Pre-commit hooks

```bash

prek install   # activates fmt, clippy, gitleaks on commit

```

### Benchmarks

```bash
cargo bench --bench routing
cargo bench --bench hotpath
```

## Documentation

| Doc | Description |
|-----|-------------|
| [Feature Matrix](docs/reference/features.md) | Complete feature list with config references |
| [Getting Started](docs/tutorials/getting-started.md) | Step-by-step tutorial |
| [Configuration Reference](docs/reference/configuration.md) | All config options |
| [DLP Reference](docs/reference/dlp.md) | Secret scanning, PII, injection, URL exfil |
| [DLP How-To](docs/how-to/dlp.md) | Recipes for each DLP feature |
| [Security Model](docs/explanation/security.md) | Rate limiting, audit, circuit breakers |
| [Architecture](docs/explanation/architecture.md) | Module layout and design decisions |
| [CLI Reference](docs/reference/cli.md) | Full command documentation |
| [OAuth Setup](docs/how-to/oauth-setup.md) | Anthropic Max, Gemini Pro |
| [Benchmarks](docs/reference/benchmarks.md) | AWS results, competitor comparison |
| [Provider Setup](docs/how-to/providers.md) | Per-provider guides |
| [Python SDK Examples](docs/examples/sdk-python.md) | Call Grob from `anthropic` and `openai` Python SDKs |
| [Node SDK Examples](docs/examples/sdk-node.md) | Call Grob from `@anthropic-ai/sdk` and `openai` Node SDKs |

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, testing, and PR guidelines.

## License

[AGPL-3.0](LICENSE) -- Commercial licensing available. See [LICENSING.md](LICENSING.md).

Built in Rust. Copyright (c) 2025-2026 [A00 SASU](https://github.com/azerozero).