https://github.com/mxyhi/token_proxy
Local AI API gateway for OpenAI / Gemini / Anthropic. Runs on your machine, tracks token usage in SQLite, offers priority-based load balancing, optional OpenAI Chat↔Responses format conversion, and one-click setup for Claude Code / Codex.
- Host: GitHub
- URL: https://github.com/mxyhi/token_proxy
- Owner: mxyhi
- License: apache-2.0
- Created: 2026-01-09T16:14:54.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2026-01-16T12:20:12.000Z (about 1 month ago)
- Last Synced: 2026-01-16T18:14:07.039Z (about 1 month ago)
- Topics: api-key-managenment, claude-api, claude-code, codex, openai, openai-api, openai-responses-api, opencode, provider-management, rust, tauri, typescript
- Language: Rust
- Homepage:
- Size: 11.3 MB
- Stars: 3
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Agents: AGENTS.md
README
# Token Proxy
English | [中文](README.zh-CN.md)
Local AI API gateway for OpenAI / Gemini / Anthropic. It runs on your machine, tracks token usage in SQLite, and offers priority-based load balancing, optional API format conversion (OpenAI Chat/Responses ↔ Anthropic Messages, plus Gemini ↔ OpenAI/Anthropic, including SSE/tools/images), and one-click setup for Claude Code / Codex.
> Default listen port: **9208** (release) / **19208** (debug builds).
---
## What you get
- Multiple providers: `openai`, `openai-response`, `anthropic`, `gemini`, `kiro`
- Built-in routing + optional format conversion (OpenAI Chat ⇄ Responses; Anthropic Messages ↔ OpenAI; Gemini ↔ OpenAI/Anthropic; SSE supported)
- Per-upstream priority + two balancing strategies (fill-first / round-robin)
- Model alias mapping (exact / prefix* / wildcard*) and response model rewrite
- Local access key (Authorization) + upstream key injection
- SQLite-powered dashboard (requests, tokens, cached tokens, latency, recent)
- macOS tray live token rate (optional)
## Screenshots
**Dashboard** · **Core** · **Upstreams** · **Add upstream** (images omitted)
## Quick start (macOS)
1) Install: move `Token Proxy.app` to `/Applications`. If blocked: `xattr -cr /Applications/Token\ Proxy.app`.
2) Launch the app. The proxy starts automatically.
3) Open the **Config File** tab, edit, and save (this writes `config.jsonc` in the Tauri config directory). The defaults are usable; just paste your upstream API keys.
4) Call via curl (example with local auth):
```bash
curl -X POST \
-H "Authorization: Bearer YOUR_LOCAL_KEY" \
-H "Content-Type: application/json" \
http://127.0.0.1:9208/v1/chat/completions \
-d '{"model":"gpt-4.1-mini","messages":[{"role":"user","content":"hi"}]}'
```
You can also call using the Anthropic Messages format (useful for Claude Code clients):
```bash
curl -X POST \
-H "x-api-key: YOUR_LOCAL_KEY" \
-H "Content-Type: application/json" \
http://127.0.0.1:9208/v1/messages \
-d '{"model":"claude-3-5-sonnet-20241022","max_tokens":256,"messages":[{"role":"user","content":[{"type":"text","text":"hi"}]}]}'
```
## Workspace & CLI (Rust)
- This repo is now a Cargo workspace; the Tauri app still lives in `src-tauri/`.
- CLI crate: `crates/token_proxy_cli` (binary `token-proxy`).
- Default config path: `./config.jsonc` (override with `--config`).
```bash
# start proxy
cargo run -p token_proxy_cli -- serve
# start with custom config path
cargo run -p token_proxy_cli -- --config ./config.jsonc serve
# config helpers
cargo run -p token_proxy_cli -- config init
cargo run -p token_proxy_cli -- --config ./config.jsonc config path
```
## Configuration reference
- File: `config.jsonc` (comments + trailing commas allowed)
- Location:
  - CLI: `--config` (default: `./config.jsonc`)
  - Tauri: the **AppConfig** directory (resolved automatically by the app)
### Core fields
| Field | Default | Notes |
| --- | --- | --- |
| `host` | `127.0.0.1` | Listen address (IPv6 allowed; will be bracketed in URLs) |
| `port` | `9208` release / `19208` debug | Change if the port is taken |
| `local_api_key` | `null` | When set: local auth uses format-specific headers (see Auth rules); local auth inputs are **not** forwarded upstream. |
| `app_proxy_url` | `null` | Proxy for app updater & as placeholder for upstreams (`"$app_proxy_url"`). Supports `http/https/socks5/socks5h`. |
| `log_level` | `silent` | `silent\|error\|warn\|info\|debug\|trace`; debug/trace log request headers (auth redacted) and small bodies (≤64KiB). Release builds force `silent`. |
| `max_request_body_bytes` | `20971520` (20 MiB) | 0 = fallback to default. Protects inbound body size. |
| `tray_token_rate.enabled` | `true` | macOS tray live rate; harmless elsewhere. |
| `tray_token_rate.format` | `split` | `combined` (`total`), `split` (`↑in ↓out`), `both` (`total \| ↑in ↓out`). |
| `enable_api_format_conversion` | `true` | Allow OpenAI/Anthropic/Gemini fallback via request/response body and SSE stream conversion. |
| `upstream_strategy` | `priority_fill_first` | `priority_fill_first` (default) keeps trying the highest-priority group in list order; `priority_round_robin` rotates within each priority group. |
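Putting the core fields together, a minimal `config.jsonc` might look like the sketch below. All values are illustrative, and the nested shape of `tray_token_rate` is inferred from the dotted field names above:

```jsonc
{
  // Minimal sketch assembled from the core fields above; every value is an example.
  "host": "127.0.0.1",
  "port": 9208,
  "local_api_key": "sk-local-example",   // null disables local auth
  "log_level": "silent",
  "max_request_body_bytes": 20971520,    // 20 MiB
  "tray_token_rate": { "enabled": true, "format": "split" },
  "enable_api_format_conversion": true,
  "upstream_strategy": "priority_fill_first",
  "upstreams": []                        // see the next table
}
```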
### Upstream entries (`upstreams[]`)
| Field | Default | Notes |
| --- | --- | --- |
| `id` | required | Unique per upstream. |
| `provider` | required | One of `openai`, `openai-response`, `anthropic`, `gemini`, `kiro`. |
| `base_url` | required | Full base URL; overlapping path segments are de-duplicated. (May be empty when `provider=kiro`.) |
| `api_key` | `null` | Provider-specific bearer/key; overrides request headers. |
| `kiro_account_id` | `null` | Required when `provider=kiro`. |
| `preferred_endpoint` | `null` | `kiro` only: `ide` or `cli`. |
| `proxy_url` | `null` | Per-upstream proxy; supports `http/https/socks5/socks5h`; default is **no system proxy**. `$app_proxy_url` placeholder allowed. |
| `priority` | `0` | Higher = tried earlier. Grouped by priority then by order (or round-robin). |
| `enabled` | `true` | Disabled upstreams are skipped. |
| `model_mappings` | `{}` | Exact / `prefix*` / `*`. Priority: exact > longest prefix > wildcard. Response echoes original alias. |
| `overrides.header` | `{}` | Set/remove headers (null removes). Hop-by-hop/Host/Content-Length are always ignored. |
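For example, a single `upstreams[]` entry combining these fields might look like the following sketch (the id, URLs, keys, and model names are placeholders):

```jsonc
{
  "id": "openai-main",
  "provider": "openai",
  "base_url": "https://api.openai.com/v1",
  "api_key": "sk-upstream-example",      // overrides any key carried by the request
  "priority": 10,                        // higher-priority groups are tried first
  "enabled": true,
  "model_mappings": {
    "gpt-4.1-mini": "gpt-4o-mini",       // exact match (highest precedence)
    "claude-*": "gpt-4o",                // prefix* match
    "*": "gpt-4o-mini"                   // wildcard catch-all
  },
  "overrides": {
    "header": { "x-team": "research", "x-strip-me": null }  // null removes the header
  }
}
```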
## Routing & format conversion
- Gemini: `/v1beta/models/*:generateContent` and `*:streamGenerateContent` → `gemini` (SSE supported).
- Anthropic: `/v1/messages` (and subpaths) and `/v1/complete` → `anthropic` (Kiro shares the same format).
- OpenAI: `/v1/chat/completions` → `openai`; `/v1/responses` → `openai-response`.
- Other paths: choose the provider with the highest configured priority; tie-break is `openai` > `openai-response` > `anthropic`.
- If the preferred provider is missing but `enable_api_format_conversion=true`, the proxy auto-converts request/response bodies and streams between supported formats (including SSE); a sketch follows this list.
- If `openai` is missing for `/v1/chat/completions`: fallback can be `openai-response`, `anthropic`, or `gemini` (priority-based; tie-break prefers `openai-response`).
- For `/v1/messages`: choose between `anthropic` and `kiro` by priority; tie-break uses upstream id. If the chosen provider returns a retryable error, the proxy will fall back to the other native provider (Anthropic ↔ Kiro) when configured.
- If neither `anthropic` nor `kiro` exists for `/v1/messages` and `enable_api_format_conversion=true`: fallback can be `openai-response`, `openai`, or `gemini` (priority-based; tie-break prefers `openai-response`).
- If `openai-response` is missing for `/v1/responses`: fallback can be `openai`, `anthropic`, or `gemini` (priority-based; tie-break prefers `openai`).
- If `gemini` is missing for `/v1beta/models/*:generateContent`: fallback can be `openai-response`, `openai`, or `anthropic` (priority-based; tie-break prefers `openai-response`).
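As a concrete sketch of these fallback rules: with only an OpenAI upstream configured and conversion enabled, Anthropic- and Gemini-format requests are translated to OpenAI Chat and the responses converted back (placeholder values):

```jsonc
{
  "enable_api_format_conversion": true,
  "upstreams": [
    {
      // The only upstream: /v1/messages and /v1beta/models/*:generateContent
      // have no native provider, so they fall back here via format conversion.
      "id": "openai-only",
      "provider": "openai",
      "base_url": "https://api.openai.com/v1",
      "api_key": "sk-example"
    }
  ]
}
```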
## Auth rules (important)
- Local access: when `local_api_key` is set, requests must carry the format-specific key. These local auth inputs are stripped and **not** forwarded upstream.
  - OpenAI / Responses: `Authorization: Bearer <key>`
  - Anthropic `/v1/messages`: `x-api-key` or `x-anthropic-api-key`
  - Gemini: `x-goog-api-key` or `?key=...`
- When `local_api_key` is enabled, request headers are **not** used for upstream auth; configure `upstreams[].api_key` instead (see the sketch after this list).
- Upstream auth resolution (per request):
  - **OpenAI**: `upstream.api_key` → `x-openai-api-key` → `Authorization` (only if `local_api_key` is **not** set) → error.
  - **Anthropic**: `upstream.api_key` → `x-api-key` / `x-anthropic-api-key` → error. A missing `anthropic-version` header is auto-filled with `2023-06-01`.
  - **Gemini**: `upstream.api_key` → `x-goog-api-key` → query `?key=...` → error.
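Under these rules, a typical local-auth setup is sketched below (both keys are placeholders): the client authenticates with `local_api_key`, and the proxy injects `upstreams[].api_key` upstream.

```jsonc
{
  "local_api_key": "sk-local-example",   // what clients must send, via the format-specific header
  "upstreams": [
    {
      "id": "anthropic-main",
      "provider": "anthropic",
      "base_url": "https://api.anthropic.com",
      "api_key": "sk-ant-example"        // injected upstream; client auth headers are ignored
    }
  ]
}
```

A Claude Code client would then send `x-api-key: sk-local-example` to `/v1/messages`, while the upstream request carries `sk-ant-example`.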
## Load balancing & retries
- Priorities: higher-`priority` groups are tried first; within a group, upstreams are used in list order (fill-first) or rotated (`priority_round_robin`). See the example after this list.
- Retryable conditions: network timeout/connect errors, or status 400/403/429/307/5xx **except** 504/524. Retries stay within the same provider's priority groups.
- `/v1/messages` only: after the chosen native provider is exhausted (retryable errors), the proxy can fall back to the other native provider (`anthropic` ↔ `kiro`) if it is configured.
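For instance, a fill-first setup with one primary group and one backup might be sketched as (illustrative ids, URLs, and keys):

```jsonc
{
  "upstream_strategy": "priority_fill_first",
  "upstreams": [
    { "id": "primary-a", "provider": "openai", "base_url": "https://a.example.com/v1", "api_key": "key-a", "priority": 10 },
    { "id": "primary-b", "provider": "openai", "base_url": "https://b.example.com/v1", "api_key": "key-b", "priority": 10 },
    { "id": "backup",    "provider": "openai", "base_url": "https://c.example.com/v1", "api_key": "key-c", "priority": 0 }
  ]
}
```

With fill-first, `primary-a` is tried before `primary-b` on every request; with `priority_round_robin` the two priority-10 entries alternate. `backup` is reached only after the priority-10 group is exhausted by retryable errors.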
## Observability
- SQLite log: `data.db` in config dir. Stores per-request stats (tokens, cached tokens, latency, model, upstream).
- Token rate: macOS tray shows live total or split rates (configurable via `tray_token_rate`).
- Debug/trace log bodies capped at 64KiB.
## Dashboard
- In-app **Dashboard** page visualizes totals, per-provider stats, time series, and recent requests (page size 50, offset supported).
## One-click CLI setup
- Claude Code: writes the `env` block in `~/.claude/settings.json` (`ANTHROPIC_BASE_URL`, plus `ANTHROPIC_AUTH_TOKEN` when a local key is set); see the sketch after this list.
- Codex: writes `~/.codex/config.toml` with `model_provider="token_proxy"` and `[model_providers.token_proxy].base_url` → `http://127.0.0.1:<port>/v1` (your configured port), and writes `OPENAI_API_KEY` to `~/.codex/auth.json`.
- A `.token_proxy.bak` file is created before overwriting; restart the CLI to apply.
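For reference, the Claude Code helper should produce an `env` block along these lines (a sketch; the actual port and token depend on your config):

```jsonc
// ~/.claude/settings.json (sketch; example values)
{
  "env": {
    "ANTHROPIC_BASE_URL": "http://127.0.0.1:9208",
    "ANTHROPIC_AUTH_TOKEN": "sk-local-example"   // written only when local_api_key is set
  }
}
```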
## FAQ
- **Port already in use?** Change `port` in `config.jsonc`; remember to update your client base URL.
- **Got 401?** If `local_api_key` is set, you must send the format-specific local key (OpenAI/Responses: `Authorization`, Anthropic: `x-api-key`, Gemini: `x-goog-api-key` or `?key=`). With local auth enabled, configure upstream keys in `upstreams[].api_key`.
- **Got 504?** Upstream did not send response headers or the first body chunk within 120s. For streaming responses, a 120s idle timeout between chunks may also close the connection.
- **413 Payload Too Large?** Body exceeded `max_request_body_bytes` (default 20 MiB) or the 4 MiB transform limit for format-conversion requests.
- **Why no system proxy?** By design, `reqwest` is built with `.no_proxy()`; set per-upstream `proxy_url` if needed.