{"id":37419860,"url":"https://github.com/mxyhi/token_proxy","last_synced_at":"2026-06-29T05:01:05.046Z","repository":{"id":331549042,"uuid":"1131199099","full_name":"mxyhi/token_proxy","owner":"mxyhi","description":"Local AI API gateway for OpenAI / Gemini / Anthropic. Runs on your machine, keeps tokens counted (SQLite), offers priority-based load balancing, optional OpenAI Chat↔Responses format conversion, and one-click setup for Claude Code / Codex.","archived":false,"fork":false,"pushed_at":"2026-06-26T15:38:24.000Z","size":14895,"stargazers_count":74,"open_issues_count":2,"forks_count":14,"subscribers_count":2,"default_branch":"main","last_synced_at":"2026-06-26T17:17:37.754Z","etag":null,"topics":["api-key-managenment","claude-api","claude-code","codex","openai","openai-api","openai-responses-api","opencode","provider-management","rust","tauri","typescript"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mxyhi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-01-09T16:14:54.000Z","updated_at":"2026-06-26T15:33:42.000Z","dependencies_parsed_at":null,"dependency_job_id":"f4eaf3f9-abfb-4952-8adb-40742bcb68e7","html_url":"https://github.com/mxyhi/token_proxy","commit_stats":null,"previous_names":["mxyhi/token_proxy"],"tags_count":128,"template":false,"template_full_name":null,"purl":"pkg:github/mxyhi/token_proxy","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mxyhi%2Ftoken_proxy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mxyhi%2Ftoken_proxy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mxyhi%2Ftoken_proxy/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mxyhi%2Ftoken_proxy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mxyhi","download_url":"https://codeload.github.com/mxyhi/token_proxy/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mxyhi%2Ftoken_proxy/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34913586,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-29T02:00:05.398Z","response_time":58,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api-key-managenment","claude-api","claude-code","codex","openai","openai-api","openai-responses-api","opencode","provider-management","rust","tauri","typescript"],"created_at":"2026-01-16T06:05:28.357Z","updated_at":"2026-06-29T05:01:05.032Z","avatar_url":"https://github.com/mxyhi.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Token Proxy\n\nEnglish | [中文](README.zh-CN.md)\n\nLocal AI API gateway for OpenAI / Gemini / Anthropic. Runs on your machine, keeps tokens counted (SQLite), offers priority-based load balancing, optional API format conversion (OpenAI Chat/Responses ↔ Anthropic Messages, plus Gemini ↔ OpenAI/Anthropic, including SSE/tools/images), and one-click setup for Claude Code / Codex.\n\n\u003e Default listen port: **9208** (release) / **19208** (debug builds).\n\n---\n\n## What you get\n- Multiple providers: `openai`, `openai-response`, `anthropic`, `gemini`, `kiro`, `codex`\n- Built-in routing + optional format conversion (OpenAI Chat ⇄ Responses; Anthropic Messages ↔ OpenAI; Gemini ↔ OpenAI/Anthropic; SSE supported)\n- Per-upstream priority + two balancing strategies (fill-first / round-robin)\n- Model alias mapping (exact / prefix* / wildcard*) and response model rewrite\n- Local access key (Authorization) + upstream key injection\n- SQLite-powered dashboard (requests, tokens, cached tokens, latency, recent)\n- macOS tray live token rate (optional)\n\n## Screenshots\n|  |  |\n| --- | --- |\n| **Dashboard**\u003cbr\u003e![Dashboard](images/dashboard.png) | **Core**\u003cbr\u003e![Core settings](images/core.png) |\n| **Upstreams**\u003cbr\u003e![Upstreams](images/upstream.png) | **Add upstream**\u003cbr\u003e![Add upstream](images/add-upstream.png) |\n\n## Quick start (macOS)\n1) Install: move `Token Proxy.app` to `/Applications`. If blocked: `xattr -cr /Applications/Token\\ Proxy.app`.\n2) Launch the app. The proxy starts automatically.\n3) Open **Config File** tab, edit and save (writes `config.jsonc` in the Tauri config dir). Defaults are usable; just paste your upstream API keys. Running proxies auto-apply the new config via reload or restart when needed.\n4) Call via curl (example with local auth):\n```bash\ncurl -X POST \\\n  -H \"Authorization: Bearer YOUR_LOCAL_KEY\" \\\n  -H \"Content-Type: application/json\" \\\n  http://127.0.0.1:9208/v1/chat/completions \\\n  -d '{\"model\":\"gpt-4.1-mini\",\"messages\":[{\"role\":\"user\",\"content\":\"hi\"}]}'\n```\n\nYou can also call using the Anthropic Messages format (useful for Claude Code clients):\n```bash\ncurl -X POST \\\n  -H \"x-api-key: YOUR_LOCAL_KEY\" \\\n  -H \"Content-Type: application/json\" \\\n  http://127.0.0.1:9208/v1/messages \\\n  -d '{\"model\":\"claude-3-5-sonnet-20241022\",\"max_tokens\":256,\"messages\":[{\"role\":\"user\",\"content\":[{\"type\":\"text\",\"text\":\"hi\"}]}]}'\n```\n\n## Workspace \u0026 CLI (Rust)\n- This repo is now a Cargo workspace; the Tauri app still lives in `src-tauri/`.\n- CLI crate: `crates/token_proxy_cli` (binary `token-proxy`).\n- Default config path: `./config.jsonc` (override with `--config`).\n- GitHub Releases also publish packaged CLI archives per target:\n  - Unix: `token-proxy_cli_\u003cversion\u003e_\u003ctarget\u003e.tar.gz`\n  - Windows: `token-proxy_cli_\u003cversion\u003e_\u003ctarget\u003e.zip`\n\n```bash\n# start proxy\ncargo run -p token_proxy_cli -- serve\n\n# start with custom config path\ncargo run -p token_proxy_cli -- --config ./config.jsonc serve\n\n# config helpers\ncargo run -p token_proxy_cli -- config init\ncargo run -p token_proxy_cli -- --config ./config.jsonc config path\n```\n\n## Frontend tests\n```bash\n# watch mode\npnpm test\n\n# run once (CI-friendly)\npnpm test:run\n\n# coverage (optional)\npnpm test:coverage\n\n# TypeScript typecheck\npnpm exec tsc --noEmit\n```\n\nNotes:\n- Test files live in `src/**/*.test.{ts,tsx}`.\n- Global test setup (Tauri mocks + jsdom polyfills) is in `src/test/setup.ts`.\n- Vitest config is in `vitest.config.ts`.\n\n## Configuration reference\n- File: `config.jsonc` (comments + trailing commas allowed)\n- Location:\n  - CLI: `--config` (default: `./config.jsonc`)\n  - Tauri: **AppConfig** directory (resolved automatically by the app)\n\n### Core fields\n| Field | Default | Notes |\n| --- | --- | --- |\n| `host` | `127.0.0.1` | Listen address (IPv6 allowed; will be bracketed in URLs) |\n| `port` | `9208` release / `19208` debug | Change if the port is taken |\n| `local_api_key` | `null` | When set: local auth uses format-specific headers (see Auth rules); local auth inputs are **not** forwarded upstream. |\n| `app_proxy_url` | `null` | Proxy for app updater \u0026 as placeholder for upstreams (`\"$app_proxy_url\"`). Supports `http/https/socks5/socks5h`. |\n| `log_level` | `silent` | `silent|error|warn|info|debug|trace`; debug/trace log request headers (auth redacted) and small bodies (≤64KiB). Release builds force `silent`. |\n| `max_request_body_bytes` | `104857600` (100 MiB) | 0 = fallback to default. Protects inbound body size. |\n| `retryable_failure_cooldown_secs` | `15` | Cooldown window after retryable failures that should temporarily sideline an upstream. `0` disables cooldown. Reloading or restarting the running proxy resets current cooldown state. |\n| `codex_session_scoped_cooldown_enabled` | `false` | Only applies to Codex account-backed OpenAI Responses requests. When enabled, cooldown is isolated by `session_id`; final success clears that session, and requests without `session_id` do not share cooldown. |\n| `tray_token_rate.enabled` | `true` | macOS tray live rate; harmless elsewhere. |\n| `tray_token_rate.format` | `split` | `combined` (`total`), `split` (`↑in ↓out`), `both` (`total | ↑in ↓out`). |\n| `upstream_strategy` | `{ \"order\": \"fill_first\", \"dispatch\": { \"type\": \"serial\" } }` | Structured strategy object. `order` controls candidate ordering inside one priority group; `dispatch` controls serial / hedged / race execution. |\n\n### Upstream entries (`upstreams[]`)\n| Field | Default | Notes |\n| --- | --- | --- |\n| `id` | required | Unique per upstream. |\n| `providers` | required | One upstream can serve multiple providers. Special providers `kiro/codex` cannot be mixed with others. |\n| `base_url` | required | Full base; overlapping path parts are de-duplicated. (`providers=[\"kiro\"]` / `[\"codex\"]` can be empty.) |\n| `api_key` | `null` | Provider-specific bearer/key; overrides request headers. |\n| `kiro_account_id` | `null` | Required when `providers=[\"kiro\"]`. |\n| `preferred_endpoint` | `null` | `kiro` only (`providers=[\"kiro\"]`): `ide` or `cli`. |\n| `proxy_url` | `null` | Per-upstream proxy; supports `http/https/socks5/socks5h`; default is **no system proxy**. `$app_proxy_url` placeholder allowed. |\n| `priority` | `0` | Higher = tried earlier. Grouped by priority then by order (or round-robin). |\n| `enabled` | `true` | Disabled upstreams are skipped. |\n| `model_mappings` | `{}` | Exact / `prefix*` / `*`. Priority: exact \u003e longest prefix \u003e wildcard. Response echoes original alias. |\n| `convert_from_map` | `{}` | Explicitly allow inbound format conversion per provider. Example: `{ \"openai-response\": [\"openai_chat\", \"anthropic_messages\"] }`. |\n| `overrides.header` | `{}` | Set/remove headers (null removes). Hop-by-hop/Host/Content-Length are always ignored. |\n\n## Routing \u0026 format conversion\n- Gemini native API: `/v1beta/models/*` (including `:generateContent`, `:streamGenerateContent`, `:countTokens`, `:embedContent`, `:batchEmbedContents`), model catalog/detail, `/v1beta/files*`, `/upload/v1beta/files*`, `/v1beta/cachedContents*`, `/v1beta/tunedModels*` → `gemini`.\n- Anthropic: `/v1/messages` (and subpaths) and `/v1/complete` → `anthropic` (Kiro shares the same format).\n- OpenAI create routes: `/v1/chat/completions` → `openai`; `/v1/responses` → `openai-response`.\n- OpenAI native pass-through routes are explicitly pinned to OpenAI-compatible providers and won't fall through to `anthropic`: `chat/completions/*`, `responses/*`, `assistants*`, `threads*`, `conversations*`, `chatkit*`, `containers*`, `evals*`, `files*`, `uploads*`, `batches*`, `vector_stores*`, `images/*`, `audio/*`, `embeddings`, `moderations`, `completions`, `fine_tuning/*`, `realtime/*`, `skills*`, `videos*`.\n- For `responses/*` resources, provider preference is `openai-response` → `openai`; for other OpenAI native resources, provider preference is `openai` → `openai-response`.\n- Other paths: choose the provider with the highest configured priority; tie-break is `openai` \u003e `openai-response` \u003e `anthropic`.\n- Cross-format fallback/conversion is controlled by `upstreams[].convert_from_map` (no global switch). If a provider has no eligible upstream for the inbound format, it won't be selected.\n- If `openai` is missing for `/v1/chat/completions`: fallback can be `openai-response`, `anthropic`, or `gemini` (priority-based; tie-break prefers `openai-response`).\n- For `/v1/messages`: choose between `anthropic` and `kiro` by priority; tie-break uses upstream id. If the chosen provider returns a retryable error, the proxy will fall back to the other native provider (Anthropic ↔ Kiro) when configured.\n- If neither `anthropic` nor `kiro` exists for `/v1/messages`: other providers can be selected only when allowed for `anthropic_messages` via `convert_from_map` (e.g. `openai-response`, `openai`, `gemini`).\n- If `openai-response` is missing for `/v1/responses`: fallback can be `openai`, `anthropic`, or `gemini` (priority-based; tie-break prefers `openai`).\n- If `gemini` is missing for `/v1beta/models/*:generateContent` or `*:streamGenerateContent`: fallback can be `openai-response`, `openai`, or `anthropic` (priority-based; tie-break prefers `openai-response`).\n- Other Gemini native endpoints are pass-through only and require a configured `gemini` upstream.\n\n## Auth rules (important)\n- Local access: `local_api_key` enabled → require format-specific key. These local auth inputs are stripped and **not** forwarded upstream.\n  - Public whitelist: `GET` / `HEAD` `/v1/models` and `/v1beta/openai/models` do not require local key.\n  - OpenAI / Responses: `Authorization: Bearer \u003ckey\u003e`\n  - Anthropic `/v1/messages`: `x-api-key` or `x-anthropic-api-key`\n  - Gemini native API: `x-goog-api-key` or `?key=...`\n- When `local_api_key` is enabled, request headers are **not** used for upstream auth; configure `upstreams[].api_key` instead.\n- Upstream auth resolution (per request):\n  - **OpenAI**: `upstream.api_key` → `x-openai-api-key` → `Authorization` (only if `local_api_key` is **not** set) → error.\n  - **Anthropic**: `upstream.api_key` → `x-api-key` / `x-anthropic-api-key` → error. Missing `anthropic-version` is auto-filled with `2023-06-01`.\n  - **Gemini**: `upstream.api_key` → `x-goog-api-key` → query `?key=...` → error.\n\n## Load balancing \u0026 retries\n- Priorities: higher `priority` groups first.\n- `upstream_strategy.order` controls selection inside the same priority group:\n  - `fill_first`: keep the configured list order.\n  - `round_robin`: rotate the starting point across requests.\n- `upstream_strategy.dispatch` controls how requests are launched inside one priority group:\n  - `{\"type\":\"serial\"}`: try one candidate at a time.\n  - `{\"type\":\"hedged\",\"delay_ms\":2000,\"max_parallel\":2}`: launch the first candidate immediately, then add one more attempt after `delay_ms` if the prior attempt is still unresolved, up to `max_parallel`.\n  - `{\"type\":\"race\",\"max_parallel\":3}`: launch up to `max_parallel` candidates immediately and take the first successful result.\n- Retryable conditions: network timeout/connect errors, or status 400/401/403/404/408/422/429/307/5xx (including 504/524). Retries stay within the same provider's priority groups.\n- Cooldown conditions: `401/403/408/429/5xx` will temporarily move the failed upstream behind ready peers for `retryable_failure_cooldown_secs` (default `15`); `400/404/422/307` stay retryable but do not trigger cross-request cooldown. With `codex_session_scoped_cooldown_enabled=true`, Codex account-backed OpenAI Responses cooldown is isolated by `session_id`; final successful requests do not keep same-session cooldown, and requests without `session_id` do not share cooldown.\n- `/v1/messages` only: after the chosen native provider is exhausted (retryable errors), the proxy can fall back to the other native provider (`anthropic` ↔ `kiro`) if it is configured.\n\n## Observability\n- SQLite log: `data.db` in config dir. Stores per-request stats (tokens, cached tokens, latency, model, upstream).\n- Token rate: macOS tray shows live total or split rates (configurable via `tray_token_rate`).\n- Debug/trace log bodies capped at 64KiB.\n\n## Dashboard\n- In-app **Dashboard** page visualizes totals, per-provider stats, time series, and recent requests (page size 50, offset supported).\n- The Logs panel supports a 30-second request-detail capture window: when enabled it stores request headers/bodies during that window, always keeps error responses for failed requests, and turns off automatically afterward.\n\n## One-click CLI setup\n- Claude Code: writes `~/.claude/settings.json` `env` (`ANTHROPIC_BASE_URL`, `ANTHROPIC_MODEL=claude-sonnet-4-6`, `ANTHROPIC_AUTH_TOKEN` when local key is set).\n- Codex: writes `~/.codex/config.toml` `model=\"gpt-5.5\"`, `model_provider=\"token_proxy\"`, and `[model_providers.token_proxy].base_url` → `http://127.0.0.1:\u003cport\u003e/v1`; writes `~/.codex/auth.json` `OPENAI_API_KEY`.\n- A `.token_proxy.bak` file is created before overwriting; restart the CLI to apply.\n\n## FAQ\n- **Port already in use?** Change `port` in `config.jsonc`; remember to update your client base URL.\n- **Got 401?** If `local_api_key` is set, you must send the format-specific local key (OpenAI/Responses: `Authorization`, Anthropic: `x-api-key`, Gemini: `x-goog-api-key` or `?key=`). With local auth enabled, configure upstream keys in `upstreams[].api_key`.\n- **Got 504?** Upstream did not send response headers or the first body chunk within 120s. For streaming responses, a 120s idle timeout between chunks may also close the connection.\n- **413 Payload Too Large?** Body exceeded `max_request_body_bytes` (default 100 MiB) or the transform limit for format-conversion requests.\n- **Why no system proxy?** By design, `reqwest` is built with `.no_proxy()`; set per-upstream `proxy_url` if needed.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmxyhi%2Ftoken_proxy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmxyhi%2Ftoken_proxy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmxyhi%2Ftoken_proxy/lists"}