{"id":45432357,"url":"https://github.com/relayplane/proxy","last_synced_at":"2026-05-04T03:08:03.335Z","repository":{"id":336232520,"uuid":"1148753470","full_name":"RelayPlane/proxy","owner":"RelayPlane","description":"Open source cost intelligence proxy for AI agents. Cut costs ~80% with smart model routing. Dashboard, policy engine, 11 providers. MIT licensed.","archived":false,"fork":false,"pushed_at":"2026-03-29T22:26:14.000Z","size":1944,"stargazers_count":110,"open_issues_count":4,"forks_count":14,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-03-30T00:56:16.822Z","etag":null,"topics":["ai","anthropic","claude","cost-optimization","llm","openai","openclaw","openclaw-skill","proxy"],"latest_commit_sha":null,"homepage":"https://relayplane.com","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/RelayPlane.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-03T10:24:42.000Z","updated_at":"2026-03-29T22:26:18.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/RelayPlane/proxy","commit_stats":null,"previous_names":["relayplane/proxy"],"tags_count":67,"template":false,"template_full_name":null,"purl":"pkg:github/RelayPlane/proxy","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RelayPlane%2Fproxy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RelayPlane%2Fproxy/tags","releases_url":"https://repos
.ecosyste.ms/api/v1/hosts/GitHub/repositories/RelayPlane%2Fproxy/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RelayPlane%2Fproxy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/RelayPlane","download_url":"https://codeload.github.com/RelayPlane/proxy/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RelayPlane%2Fproxy/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31384847,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-04T01:22:39.193Z","status":"online","status_checked_at":"2026-04-04T02:00:07.569Z","response_time":60,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","anthropic","claude","cost-optimization","llm","openai","openclaw","openclaw-skill","proxy"],"created_at":"2026-02-22T02:10:45.957Z","updated_at":"2026-05-04T03:08:03.328Z","avatar_url":"https://github.com/RelayPlane.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# @relayplane/proxy\n\n[![npm](https://img.shields.io/npm/v/@relayplane/proxy)](https://www.npmjs.com/package/@relayplane/proxy)\n[![MIT License](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/RelayPlane/proxy/blob/main/LICENSE)\n\nA **Node.js npm LLM proxy** that sits between your AI agents and providers. 
Drop-in replacement for OpenAI and Anthropic base URLs — no Docker, no Python, just `npm install`. Tracks every request, shows where the money goes, and offers configurable task-aware routing — all running **locally, for free**.\n\n**Live savings dashboard:** [relayplane.com/live](https://relayplane.com/live) — see real-time aggregate savings from developers worldwide.\n\n**The npm-native LLM proxy for Node.js developers.** Works with Claude Code, Cursor, OpenClaw, and any tool that supports `OPENAI_BASE_URL` or `ANTHROPIC_BASE_URL`.\n\n**Free, open-source proxy features:**\n- 📊 Per-request cost tracking across 11 providers\n- 💰 **Cache-aware cost tracking** - accurately tracks Anthropic prompt caching with cache read savings, creation costs, and true per-request costs\n- 🔀 Configurable task-aware routing (complexity-based, cascade, model overrides)\n- 🛡️ Circuit breaker - if the proxy fails, your agent doesn't notice\n- 📈 **Local dashboard** at `localhost:4100` - cost breakdown, savings analysis, provider health, agent breakdown\n- 💵 **Budget enforcement** - daily/hourly/per-request spend limits with block, warn, downgrade, or alert actions\n- 🔍 **Anomaly detection** - catches runaway agent loops, cost spikes, and token explosions in real time\n- 🔔 **Cost alerts** - threshold alerts at configurable percentages, webhook delivery, alert history\n- ⬇️ **Auto-downgrade** - automatically switches to cheaper models when budget thresholds are hit\n- 📦 **Aggressive cache** - exact-match response caching with gzipped disk persistence\n- 🤖 **Per-agent cost tracking** - identifies agents by system prompt fingerprint and tracks cost per agent\n- 📝 **Content logging** - dashboard shows system prompt preview, user message, and response preview per request\n- 🔐 **OAuth passthrough** - correctly forwards `user-agent` and `x-app` headers for Claude Max subscription users (OpenClaw compatible)\n- 🧠 **Osmosis mesh** - collective learning layer that shares anonymized routing signals 
across users (on by default, opt-out: `relayplane mesh off`)\n- 🔧 **systemd/launchd service** - `relayplane service install` for always-on operation with auto-restart\n- 🏥 **Health watchdog** - `/health` endpoint with uptime tracking and active probing\n- 🛡️ **Config resilience** - atomic writes, automatic backup/restore, credential separation\n\n\u003e **Cloud dashboard available separately** - see [Cloud Dashboard \u0026 Pro Features](#cloud-dashboard--pro-features) below. Your prompts always stay local.\n\n## Quick Start\n\n```bash\nnpm install -g @relayplane/proxy\nrelayplane init\nrelayplane start\n# Dashboard at http://localhost:4100\n```\n\nWorks with any agent framework that talks to OpenAI or Anthropic APIs. Point your client at `http://localhost:4100` (set `ANTHROPIC_BASE_URL` or `OPENAI_BASE_URL`) and the proxy handles the rest.\n\n### Auto-start with Claude Code\n\nAdd to `~/.claude/settings.json`:\n\n```json\n{\n  \"hooks\": {\n    \"SessionStart\": [\n      {\n        \"hooks\": [\n          {\n            \"type\": \"command\",\n            \"command\": \"relayplane ensure-running\"\n          }\n        ]\n      }\n    ]\n  }\n}\n```\n\nRelayPlane will start automatically when Claude Code opens. If it's already running (multiple sessions), the hook exits immediately. 
No duplicate processes.\n\n## What's New in v1.8.35\n\n**Recent additions:**\n\n- **[relayplane.com/live](https://relayplane.com/live)** — Atlas public proof-of-work dashboard showing real-time aggregate savings from developers worldwide\n- **Osmosis Phase 1 shipped** — local telemetry capture tracks every routing decision; mesh is on by default\n- **Service installer hardened** — detects invoking user, loads env files correctly on systemd installs\n- **Provider-aware auto-routing** — Gemini, OpenRouter, xAI supported natively without extra config\n- **Agent-native signup flow** — `relayplane login` handles device OAuth inline\n\n**Note for upgraders from pre-v1.8.14:** Telemetry and mesh are now ON by default. Disable both: `relayplane telemetry off \u0026\u0026 relayplane mesh off`\n\n## Supported Providers\n\n**Anthropic** · **OpenAI** · **Google Gemini** · **xAI/Grok** · **OpenRouter** · **DeepSeek** · **Groq** · **Mistral** · **Together** · **Fireworks** · **Perplexity**\n\n## Configuration\n\nRelayPlane reads configuration from `~/.relayplane/config.json`. Override the path with the `RELAYPLANE_CONFIG_PATH` environment variable.\n\n```bash\n# Default location\n~/.relayplane/config.json\n\n# Override with env var\nRELAYPLANE_CONFIG_PATH=/path/to/config.json relayplane start\n```\n\nA minimal config file:\n\n```json\n{\n  \"enabled\": true,\n  \"modelOverrides\": {},\n  \"routing\": {\n    \"mode\": \"cascade\",\n    \"cascade\": { \"enabled\": true },\n    \"complexity\": { \"enabled\": true }\n  }\n}\n```\n\nAll configuration is optional - sensible defaults are applied for every field. 
The proxy merges your config with its defaults via deep merge, so you only need to specify what you want to change.\n\n## Architecture\n\n```text\nClient (Claude Code / Aider / Cursor)\n        |\n        |  OpenAI/Anthropic-compatible request\n        v\n+-------------------------------------------------------+\n| RelayPlane Proxy (local)                               |\n|-------------------------------------------------------|\n| 1) Parse request                                       |\n| 2) Cache check (exact or aggressive mode)              |\n|    └─ HIT → return cached response (skip provider)    |\n| 3) Budget check (daily/hourly/per-request limits)      |\n|    └─ BREACH → block / warn / downgrade / alert       |\n| 4) Anomaly detection (velocity, cost spike, loops)     |\n|    └─ DETECTED → alert + optional block               |\n| 5) Auto-downgrade (if budget threshold exceeded)       |\n|    └─ Rewrite model to cheaper alternative             |\n| 6) Infer task/complexity (pre-request)                 |\n| 7) Select route/model                                  |\n|    - explicit model / passthrough                     |\n|    - relayplane:auto/cost/fast/quality                |\n|    - configured complexity/cascade rules               |\n| 8) Forward request to provider                         |\n| 9) Return provider response + cache it                 |\n| 10) Record telemetry + update budget tracking          |\n| 11) Mesh sync (push anonymized routing signals)        |\n+-------------------------------------------------------+\n        |\n        v\nProvider APIs (Anthropic/OpenAI/Gemini/xAI/...)\n```\n\n## How It Works\n\nRelayPlane is a local HTTP proxy. You point your agent at `localhost:4100` by setting `ANTHROPIC_BASE_URL` or `OPENAI_BASE_URL`. The proxy:\n\n1. **Intercepts** your LLM API requests\n2. **Classifies** the task using heuristics (token count, prompt patterns, keyword matching - no LLM calls)\n3. 
**Routes** to the configured model based on classification and your routing rules (or passes through to the original model by default)\n4. **Forwards** the request directly to the LLM provider (your prompts go straight to the provider, not through RelayPlane servers)\n5. **Records** token counts, latency, and cost locally for your dashboard\n\n**Default behavior is passthrough** - requests go to whatever model your agent requested. Routing (cascade, complexity-based) is configurable and must be explicitly enabled.\n\n## Complexity-Based Routing\n\nThe proxy classifies incoming requests by complexity (simple, moderate, complex) based on prompt length, token patterns, and the presence of tools. Each tier maps to a different model.\n\n```json\n{\n  \"routing\": {\n    \"complexity\": {\n      \"enabled\": true,\n      \"simple\": \"claude-3-5-haiku-latest\",\n      \"moderate\": \"claude-sonnet-4-20250514\",\n      \"complex\": \"claude-opus-4-20250514\"\n    }\n  }\n}\n```\n\n**How classification works:**\n\n- **Simple** - Short prompts, straightforward Q\u0026A, basic code tasks\n- **Moderate** - Multi-step reasoning, code review, analysis with context\n- **Complex** - Architecture decisions, large codebases, tasks with many tools, long prompts with evaluation/comparison language\n\nThe classifier scores requests based on message count, total token length, tool usage, and content patterns (e.g., words like \"analyze\", \"compare\", \"evaluate\" increase the score). This happens locally - no prompt content is sent anywhere.\n\n## Model Overrides\n\nMap any model name to a different one. Useful for silently redirecting expensive models to cheaper alternatives without changing your agent configuration:\n\n```json\n{\n  \"modelOverrides\": {\n    \"claude-opus-4-5\": \"claude-3-5-haiku\",\n    \"gpt-4o\": \"gpt-4o-mini\"\n  }\n}\n```\n\nOverrides are applied before any other routing logic. 
The original requested model is logged for tracking.\n\n## Cascade Mode\n\nStart with the cheapest model and escalate only when the response shows uncertainty or refusal. This gives you the cost savings of a cheap model with a safety net.\n\n```json\n{\n  \"routing\": {\n    \"mode\": \"cascade\",\n    \"cascade\": {\n      \"enabled\": true,\n      \"models\": [\n        \"claude-3-5-haiku-latest\",\n        \"claude-sonnet-4-20250514\",\n        \"claude-opus-4-20250514\"\n      ],\n      \"escalateOn\": \"uncertainty\",\n      \"maxEscalations\": 2\n    }\n  }\n}\n```\n\n**`escalateOn` options:**\n\n| Value | Triggers escalation when... |\n|-------|----------------------------|\n| `uncertainty` | Response contains hedging language (\"I'm not sure\", \"it's hard to say\", \"this is just a guess\") |\n| `refusal` | Model refuses to help (\"I can't assist with that\", \"as an AI\") |\n| `error` | The request fails outright |\n\n**`maxEscalations`** caps how many times the proxy will retry with a more expensive model. Default: `1`.\n\nThe cascade walks through the `models` array in order, starting from the first. 
Each escalation moves to the next model in the list.\n\n## Smart Aliases\n\nUse semantic model names instead of provider-specific IDs:\n\n| Alias | Resolves to | Via |\n|-------|------------|-----|\n| `rp:best` | `anthropic/claude-sonnet-4-5` | OpenRouter |\n| `rp:fast` | `anthropic/claude-3-5-haiku` | OpenRouter |\n| `rp:cheap` | `google/gemini-2.0-flash-001` | OpenRouter |\n| `rp:balanced` | `anthropic/claude-3-5-haiku` | OpenRouter |\n| `relayplane:auto` | Same as `rp:balanced` | - |\n| `rp:auto` | Same as `rp:balanced` | - |\n\nUse these as the `model` field in your API requests:\n\n```json\n{\n  \"model\": \"rp:fast\",\n  \"messages\": [{\"role\": \"user\", \"content\": \"Hello\"}]\n}\n```\n\n## Routing Suffixes\n\nAppend `:cost`, `:fast`, or `:quality` to any model name to hint at routing preference:\n\n```json\n{\n  \"model\": \"claude-sonnet-4:cost\",\n  \"messages\": [{\"role\": \"user\", \"content\": \"Summarize this\"}]\n}\n```\n\n| Suffix | Behavior |\n|--------|----------|\n| `:cost` | Optimize for lowest cost |\n| `:fast` | Optimize for lowest latency |\n| `:quality` | Optimize for best output quality |\n\nThe suffix is stripped before provider lookup - the base model must still be valid. 
Suffixes influence routing decisions when the proxy has multiple options.\n\n## Provider Cooldowns / Reliability\n\nWhen a provider starts failing, the proxy automatically cools it down to avoid hammering a broken endpoint:\n\n```json\n{\n  \"reliability\": {\n    \"cooldowns\": {\n      \"enabled\": true,\n      \"allowedFails\": 3,\n      \"windowSeconds\": 60,\n      \"cooldownSeconds\": 120\n    }\n  }\n}\n```\n\n| Field | Default | Description |\n|-------|---------|-------------|\n| `enabled` | `true` | Enable/disable cooldown tracking |\n| `allowedFails` | `3` | Failures within the window before cooldown triggers |\n| `windowSeconds` | `60` | Rolling window for counting failures |\n| `cooldownSeconds` | `120` | How long to avoid the provider after cooldown triggers |\n\nAfter cooldown expires, the provider is automatically retried. Successful requests clear the failure counter.\n\n## Hybrid Auth\n\nUse your Anthropic MAX subscription token for expensive models (Opus) while using standard API keys for cheaper models (Haiku, Sonnet). This lets you leverage MAX plan pricing where it matters most.\n\n```json\n{\n  \"auth\": {\n    \"anthropicMaxToken\": \"sk-ant-oat-...\",\n    \"useMaxForModels\": [\"opus\", \"claude-opus\"]\n  }\n}\n```\n\n**How it works:**\n\n- When a request targets a model matching any pattern in `useMaxForModels`, the proxy uses `anthropicMaxToken` via `x-api-key` header\n- All other Anthropic requests use the standard `ANTHROPIC_API_KEY` env var with `x-api-key` header\n- Pattern matching is case-insensitive substring match, so `\"opus\"` matches `claude-opus-4-20250514`\n- Both `sk-ant-api*` and `sk-ant-oat*` tokens are sent as `x-api-key` (Anthropic accepts all token types via this header)\n\nSet your standard key in the environment as usual:\n\n```bash\nexport ANTHROPIC_API_KEY=\"sk-ant-api03-...\"\n```\n\n## Telemetry\n\n**Telemetry is enabled by default.** This powers the cloud dashboard and helps improve routing recommendations. 
Only anonymous metadata is sent, never prompts or responses.\n\nDisable with:\n```bash\nrelayplane telemetry off\n```\n\nThe proxy sends anonymized metadata to `api.relayplane.com`:\n\n- **device_id** - Random anonymous hash (no PII)\n- **task_type** - Heuristic classification label (e.g., \"code_generation\", \"summarization\")\n- **model** - Which model was used\n- **tokens_in/out** - Token counts\n- **latency_ms** - Response time\n- **cost_usd** - Estimated cost\n\n**Never collected:** prompts, responses, file paths, or anything that could identify you or your project. Your prompts go directly to LLM providers, never through RelayPlane servers. Mesh (on by default) shares anonymized metadata: model, tokens, cost, latency, success/fail. Opt out: `relayplane mesh off`.\n\n\u003e **Cloud dashboard:** To see your data at [relayplane.com/dashboard](https://relayplane.com/dashboard), run `relayplane login`. Telemetry is already on by default. The cloud dashboard requires telemetry to function. You can disable telemetry anytime, but cloud features won't work without it.\n\nWhen the proxy connects and telemetry is enabled, it will confirm:\n```\n[RelayPlane] Cloud dashboard connected - telemetry enabled.\nYour prompts stay local. Only anonymous metadata (model, tokens, cost) is sent.\nDisable anytime: relayplane telemetry off\n```\n\n### Audit mode\n\nAudit mode buffers telemetry events in memory so you can inspect exactly what would be sent before it goes anywhere. Useful for compliance review.\n\n```bash\nrelayplane start --audit\n```\n\n### Offline mode\n\n```bash\nrelayplane start --offline\n```\n\nDisables all network calls except the actual LLM requests. No telemetry transmission, no cloud features. The proxy still tracks everything locally for your dashboard.\n\n## Dashboard\n\nThe built-in dashboard runs at [http://localhost:4100](http://localhost:4100) (or `/dashboard`). 
It shows:\n\n- Total requests, success rate, average latency\n- Cost breakdown by model and provider (with provider column to distinguish `anthropic` vs `openrouter` for same model names)\n- **Agent Cost Breakdown** - per-agent spend table identifying agents by system prompt fingerprint\n- Recent request history with agent column and expandable rows (state persists across the 5-second auto-refresh)\n- **Content previews** - system prompt preview, user message, and response preview in expandable rows\n- **Honest savings breakdown** - routing savings (RelayPlane's contribution) vs cache savings (Anthropic's feature), with tooltip explaining the calculation\n- Error detail capture - failed requests show the error message and HTTP status code\n- Provider health status\n- Wider 1600px layout for dense data views\n\n### Per-Agent Cost Tracking\n\nRelayPlane v1.7 identifies each agent by fingerprinting its system prompt. This groups all requests from the same agent together - even across sessions - so you can see exactly which agent is responsible for which costs.\n\nThe Agent Cost Breakdown table in the dashboard shows total spend, request count, and average cost per request for each distinct agent. No configuration required - fingerprinting happens automatically.\n\n### Content Logging\n\nWhen content logging is enabled, the dashboard stores and displays:\n\n- A preview of the system prompt\n- The first user message in the conversation\n- A preview of the model's response\n\nThis makes it easy to correlate a cost spike with the actual request that caused it. Content is stored locally only - nothing is sent to RelayPlane servers.\n\n### Auth Passthrough (Claude Max / OpenClaw Users)\n\nIf you use a Claude Max subscription (tokens starting with `sk-ant-oat*`), the proxy handles them correctly via the `x-api-key` header. No special configuration needed. 
The proxy also forwards `user-agent` and `x-app` headers required by Anthropic for subscription validation.\n\n**Important:** All Anthropic token types (`sk-ant-api*` and `sk-ant-oat*`) are sent via `x-api-key`. The proxy does not use `Authorization: Bearer` for Anthropic requests.\n\n## OpenClaw Integration\n\nThe simplest way to use RelayPlane with OpenClaw is to point the Anthropic provider at the proxy. This routes all Anthropic model requests through RelayPlane transparently, with no changes to model names or agent configs.\n\n### Setup\n\n1. Install and start the proxy:\n\n```bash\nnpm install -g @relayplane/proxy\nrelayplane init\nrelayplane start\n```\n\n2. Point OpenClaw's Anthropic provider at the proxy:\n\n```bash\nopenclaw config set models.providers.anthropic.baseUrl http://localhost:4100\n```\n\nThat's it. All `anthropic/*` model requests now flow through RelayPlane. Your existing model names (`anthropic/claude-sonnet-4-6`, `anthropic/claude-opus-4-6`) work unchanged.\n\n### What you get\n\n- **Cost tracking** per agent, per model, per day\n- **Complexity-based routing** (e.g., simple tasks use Sonnet, complex tasks use Opus)\n- **Budget enforcement** with automatic downgrades\n- **Dashboard** at http://localhost:4100\n\n### Complexity routing example\n\nConfigure the proxy to automatically route simple tasks to Sonnet and complex tasks to Opus:\n\n```json\n{\n  \"routing\": {\n    \"mode\": \"complexity\",\n    \"complexity\": {\n      \"enabled\": true,\n      \"simple\": \"claude-sonnet-4-6\",\n      \"moderate\": \"claude-sonnet-4-6\",\n      \"complex\": \"claude-opus-4-6\"\n    }\n  }\n}\n```\n\nOpenClaw agents request whatever model they're configured with. The proxy classifies the task and routes accordingly. No agent config changes needed.\n\n### Auth\n\nThe proxy passes through whatever API key OpenClaw sends. If you use a MAX subscription, OpenClaw sends your `sk-ant-oat*` token and the proxy forwards it directly to Anthropic. 
No extra auth configuration in the proxy is needed for passthrough mode.\n\nFor hybrid auth (MAX token for expensive models, standard key for cheap ones), see [Hybrid Auth](#hybrid-auth).\n\n### API Endpoints\n\nThe dashboard is powered by JSON endpoints you can use directly:\n\n| Endpoint | Description |\n|----------|-------------|\n| `GET /v1/telemetry/stats` | Aggregate statistics (total requests, costs, model counts) |\n| `GET /v1/telemetry/runs?limit=N` | Recent request history |\n| `GET /v1/telemetry/savings` | Cost savings from smart routing |\n| `GET /v1/telemetry/health` | Provider health and cooldown status |\n\n## Budget Enforcement\n\nSet spending limits to prevent runaway costs. The budget manager tracks spend in rolling daily and hourly windows using SQLite with an in-memory cache for \u003c5ms hot-path checks.\n\n```json\n{\n  \"budget\": {\n    \"enabled\": true,\n    \"dailyUsd\": 50,\n    \"hourlyUsd\": 10,\n    \"perRequestUsd\": 2,\n    \"onBreach\": \"downgrade\",\n    \"downgradeTo\": \"claude-sonnet-4-6\",\n    \"alertThresholds\": [50, 80, 95]\n  }\n}\n```\n\n| Field | Default | Description |\n|-------|---------|-------------|\n| `enabled` | `false` | Enable budget enforcement |\n| `dailyUsd` | `50` | Daily spend limit |\n| `hourlyUsd` | `10` | Hourly spend limit |\n| `perRequestUsd` | `2` | Max cost for a single request |\n| `onBreach` | `\"downgrade\"` | Action: `block`, `warn`, `downgrade`, or `alert` |\n| `downgradeTo` | `\"claude-sonnet-4-6\"` | Model to use when downgrading |\n| `alertThresholds` | `[50, 80, 95]` | Fire alerts at these % of daily limit |\n\n```bash\nrelayplane budget status          # See current spend vs limits\nrelayplane budget set --daily 25  # Change daily limit\nrelayplane budget set --hourly 5  # Change hourly limit\nrelayplane budget reset           # Reset spend counters\n```\n\n## Anomaly Detection\n\nCatches runaway agent loops and cost spikes using a sliding window over the last 100 
requests.\n\n```json\n{\n  \"anomaly\": {\n    \"enabled\": true,\n    \"velocityThreshold\": 50,\n    \"tokenExplosionUsd\": 5.0,\n    \"repetitionThreshold\": 20,\n    \"windowMs\": 300000\n  }\n}\n```\n\n**Detection types:**\n\n| Type | Triggers when... |\n|------|-------------------|\n| `velocity_spike` | Request rate exceeds threshold in 5-minute window |\n| `cost_acceleration` | Spend rate is doubling every minute |\n| `repetition` | Same model + similar token count \u003e20 times in 5 min |\n| `token_explosion` | Single request estimated cost exceeds $5 |\n\n## Cost Alerts\n\nGet notified when spending crosses thresholds. Alerts are deduplicated per window and stored in SQLite for history.\n\n```json\n{\n  \"alerts\": {\n    \"enabled\": true,\n    \"webhookUrl\": \"https://hooks.slack.com/...\",\n    \"cooldownMs\": 300000,\n    \"maxHistory\": 500\n  }\n}\n```\n\nAlert types: `threshold` (budget %), `anomaly` (detection triggers), `breach` (limit exceeded). Severity levels: `info`, `warning`, `critical`.\n\n```bash\nrelayplane alerts list            # Show recent alerts\nrelayplane alerts counts          # Count by type (threshold/anomaly/breach)\n```\n\n## Auto-Downgrade\n\nWhen budget hits a configurable threshold (default 80%), the proxy automatically rewrites expensive models to cheaper alternatives. Adds `X-RelayPlane-Downgraded` headers so your agent knows.\n\n```json\n{\n  \"downgrade\": {\n    \"enabled\": true,\n    \"thresholdPercent\": 80,\n    \"mapping\": {\n      \"claude-opus-4-6\": \"claude-sonnet-4-6\",\n      \"gpt-4o\": \"gpt-4o-mini\",\n      \"gemini-2.5-pro\": \"gemini-2.0-flash\"\n    }\n  }\n}\n```\n\nBuilt-in mappings cover all major Anthropic, OpenAI, and Google models. Override with your own.\n\n## Response Cache\n\nCaches LLM responses to avoid duplicate API calls. 
Each entry is keyed by the SHA-256 hash of the canonical request and persisted to disk with gzip compression.\n\n```json\n{\n  \"cache\": {\n    \"enabled\": true,\n    \"mode\": \"exact\",\n    \"maxSizeMb\": 100,\n    \"defaultTtlSeconds\": 3600,\n    \"onlyWhenDeterministic\": true\n  }\n}\n```\n\n| Mode | Behavior |\n|------|----------|\n| `exact` | Cache only identical requests (default) |\n| `aggressive` | Broader matching with shorter TTL (30 min default) |\n\nBy default, only deterministic requests (`temperature=0`) are cached, and responses that contain tool calls are skipped.\n\n```bash\nrelayplane cache status   # Entries, size, hit rate, saved cost\nrelayplane cache stats    # Detailed breakdown by model and task type\nrelayplane cache clear    # Wipe the cache\nrelayplane cache on/off   # Toggle caching\n```\n\n## Osmosis Mesh\n\nOpt-out collective learning layer. Share anonymized routing signals (model, task type, tokens, cost - never prompts) and benefit from the network's routing intelligence.\n\n```json\n{\n  \"mesh\": {\n    \"enabled\": true,\n    \"endpoint\": \"https://osmosis-mesh-dev.fly.dev\",\n    \"sync_interval_ms\": 60000,\n    \"contribute\": true\n  }\n}\n```\n\nOn by default. Opt out: `relayplane mesh off`. 
Free users receive provider health alerts; Pro users receive full routing intelligence.\n\n```bash\nrelayplane mesh status              # Atoms local/synced, last sync, endpoint\nrelayplane mesh on/off              # Enable/disable mesh\nrelayplane mesh sync                # Force sync now\nrelayplane mesh contribute on/off   # Toggle contribution\n```\n\n## System Service\n\nInstall RelayPlane as a system service for always-on operation with auto-restart on crash.\n\n```bash\n# Linux (systemd)\nsudo relayplane service install     # Install + enable + start\nsudo relayplane service uninstall   # Stop + disable + remove\nrelayplane service status           # Check service state\n\n# macOS (launchd)\nrelayplane service install          # Install as LaunchAgent\nrelayplane service uninstall        # Remove LaunchAgent\nrelayplane service status           # Check loaded state\n```\n\nThe service unit includes `WatchdogSec=30` (systemd) and `KeepAlive` (launchd) for automatic health monitoring and restart. API keys from your current environment are captured into the service definition.\n\n## Config Resilience\n\nConfiguration is protected against corruption:\n\n- **Atomic writes** - config is written to a `.tmp` file then renamed (no partial writes)\n- **Automatic backup** - `config.json.bak` is updated before every save\n- **Auto-restore** - if `config.json` is corrupt/missing, the proxy restores from backup\n- **Credential separation** - API keys live in `credentials.json`, surviving config resets\n\n## Circuit Breaker\n\nIf the proxy ever fails, all traffic automatically bypasses it - your agent talks directly to the provider. When RelayPlane recovers, traffic resumes. 
No manual intervention needed.\n\n## CLI Reference\n\n```\nrelayplane [command] [options]\n```\n\n| Command | Description |\n|---------|-------------|\n| `(default)` / `start` | Start the proxy server |\n| `init` | Initialize config and show setup instructions |\n| `status` | Show proxy status, plan, and cloud sync info |\n| `login` | Log in to RelayPlane (device OAuth flow) |\n| `logout` | Clear stored credentials |\n| `upgrade` | Open pricing page |\n| `enable` / `disable` | Toggle proxy routing in OpenClaw config |\n| `telemetry on\\|off\\|status` | Manage telemetry |\n| `stats` | Show usage statistics and savings |\n| `config [set-key \u003ckey\u003e]` | Show or update configuration |\n| `budget status\\|set\\|reset` | Manage spend limits |\n| `alerts list\\|counts` | View cost alert history |\n| `cache status\\|stats\\|clear\\|on\\|off` | Manage response cache |\n| `mesh status\\|on\\|off\\|sync\\|contribute` | Manage Osmosis mesh |\n| `service install\\|uninstall\\|status` | System service management |\n| `autostart on\\|off\\|status` | Legacy autostart (systemd) |\n| `ensure-running` | Start proxy if not running (idempotent, safe for hooks) |\n\n**Server options:**\n\n| Flag | Default | Description |\n|------|---------|-------------|\n| `--port \u003cn\u003e` | `4100` | Port to listen on |\n| `--host \u003cs\u003e` | `127.0.0.1` | Host to bind to |\n| `--offline` | - | No network calls except LLM endpoints |\n| `--audit` | - | Show telemetry payloads before sending |\n| `-v, --verbose` | - | Verbose logging |\n\n## Cloud Dashboard \u0026 Pro Features\n\nThe proxy is fully functional without a cloud account. All features above are **local and free**.\n\nCloud dashboard is **free for all signed-up users**. Just `relayplane login`. 
For extended history, full mesh intelligence, and governance, [relayplane.com](https://relayplane.com) offers:\n\n| Feature | Plan |\n|---------|------|\n| Cloud dashboard - run history, cost trends, analytics | Free (all tiers) |\n| 30-day cloud history, weekly cost digest, routing recommendations | Starter ($9/mo) |\n| Full mesh intelligence - routing signals from thousands of agents | Pro ($29/mo) |\n| 90-day history, data export, cost spike alerts | Pro |\n| Private team mesh, per-agent spend limits, approval flows | Max ($99/mo) |\n| Governance \u0026 compliance rules, audit logs | Max |\n\n**[View pricing →](https://relayplane.com/pricing)**\n\n### Connecting to Cloud\n\n```bash\nrelayplane login    # authenticate - unlocks cloud dashboard (free)\n```\n\nTelemetry is on by default. The cloud dashboard requires it to display your data. Disable anytime: `relayplane telemetry off`.\n\n\u003e **Privacy-first:** Telemetry sends only anonymous metadata - model name, token counts, cost, latency. Your prompts, inputs, and outputs are **never sent to RelayPlane servers**; they go only to the LLM provider you call. Mesh is also on by default; opt out: `relayplane mesh off`.\n\n---\n\n## Your Keys Stay Yours\n\nRelayPlane requires your own provider API keys. Your prompts go directly to LLM providers - never through RelayPlane servers. All proxy execution is local. Mesh telemetry (anonymous metadata only) is on by default. Opt out: `relayplane mesh off`.\n\n## License\n\n[MIT](https://github.com/RelayPlane/proxy/blob/main/LICENSE)\n\n---\n\n[relayplane.com](https://relayplane.com) · [GitHub](https://github.com/RelayPlane/proxy)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frelayplane%2Fproxy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frelayplane%2Fproxy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frelayplane%2Fproxy/lists"}