{"id":50567897,"url":"https://github.com/study8677/llm-router","last_synced_at":"2026-06-04T16:01:16.031Z","repository":{"id":360364443,"uuid":"1249804953","full_name":"study8677/llm-router","owner":"study8677","description":"Self-hosted OpenAI-compatible AI Gateway: uses auto / auto-coding / auto-longtext to pick a suitable model automatically, with support for streaming, tool calling, multimodal passthrough, and fallback.","archived":false,"fork":false,"pushed_at":"2026-05-26T05:38:54.000Z","size":69,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-26T06:26:37.908Z","etag":null,"topics":["ai-gateway","ai-router","auto-model","developer-tools","docker","function-calling","llm-gateway","llm-proxy","llm-router","model-routing","multimodal","nodejs","openai-api","openai-compatible","self-hosted","streaming","typescript"],"latest_commit_sha":null,"homepage":"https://github.com/study8677/llm-router#readme","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/study8677.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":"SUPPORT.md","governance":null,"roadmap":"docs/ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-26T03:46:35.000Z","updated_at":"2026-05-26T05:37:52.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/study8677/llm-router","commit_stats":null,"previous_names":["study8677/llm-router"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/study8677/llm-router","repository_url":"htt
ps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/study8677%2Fllm-router","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/study8677%2Fllm-router/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/study8677%2Fllm-router/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/study8677%2Fllm-router/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/study8677","download_url":"https://codeload.github.com/study8677/llm-router/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/study8677%2Fllm-router/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33912343,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-04T02:00:06.755Z","response_time":64,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-gateway","ai-router","auto-model","developer-tools","docker","function-calling","llm-gateway","llm-proxy","llm-router","model-routing","multimodal","nodejs","openai-api","openai-compatible","self-hosted","streaming","typescript"],"created_at":"2026-06-04T16:01:15.193Z","updated_at":"2026-06-04T16:01:16.023Z","avatar_url":"https://github.com/study8677.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv 
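align=\"center\"\u003e\n\n# LLM Router\n\n**Turn an ordinary OpenAI-compatible relay into a local AI gateway that picks the model for you.**\n\n[![CI](https://github.com/study8677/llm-router/actions/workflows/ci.yml/badge.svg)](https://github.com/study8677/llm-router/actions/workflows/ci.yml)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)\n![Node.js](https://img.shields.io/badge/node-%3E%3D20-339933)\n![OpenAI compatible](https://img.shields.io/badge/OpenAI-compatible-111827)\n![Streaming](https://img.shields.io/badge/streaming-SSE-2563eb)\n\n`auto` / `auto-coding` / `auto-longtext`\n\n[Quick Start](#3-minute-quick-start) · [Virtual Models](#virtual-models) · [Architecture](#architecture) · [Routing Strategy](#routing-strategy) · [Docker](#docker) · [Documentation](#documentation)\n\n\u003c/div\u003e\n\nLLM Router runs between your client and your upstream relay. Clients keep using the familiar `/v1/chat/completions` endpoint and only need to set the model name to `auto`, `auto-coding`, or `auto-longtext`; the router then picks a suitable real model based on the request content, model pricing, capability constraints, and the retry-on-failure policy.\n\n```text\nClient / SDK  -\u003e  LLM Router  -\u003e  your existing OpenAI-compatible relay\nmodel=auto        local policy      selected real model\n```\n\nIt suits people who already have a `base_url` and an API key but do not want to switch models by hand every time: simple tasks go to the cheapest workable model, while complex coding, long-text reasoning, architecture planning, and security review do not trade quality for savings.\n\n## What You Get\n\n- **One unified entry point**: configure `base_url=http://127.0.0.1:8787/v1` in the client once.\n- **Three virtual models**: `auto`, `auto-coding`, and `auto-longtext` cover everyday, engineering, and long-text scenarios.\n- **OpenAI compatible**: supports Chat Completions, SSE streaming, tools/function calling, and multimodal passthrough.\n- **Cost-aware routing**: the router model only makes the decision; the answer still comes from the selected real model.\n- **Observable fallback**: response headers and structured logs record the original model, the target model, the routing process, and retries.\n\n## 3-Minute Quick Start\n\n```bash\ngit clone https://github.com/study8677/llm-router.git\ncd llm-router\nnpm install\ncp .env.example .env\n```\n\nEdit `.env` and fill in your existing upstream relay:\n\n```bash\nUPSTREAM_BASE_URL=https://your-relay.example.com\nUPSTREAM_API_KEY=sk-your-upstream-key\n```\n\nStart the service:\n\n```bash\nnpm run build\nnpm start\n```\n\nOpen the local configuration page:\n\n```text\nhttp://127.0.0.1:8787/admin\n```\n\nHere you can view the upstream models and the router model currently in effect, and switch the router model used by `auto` from “automatically pick the cheapest model with a known price” to “manually specify a model”. The configuration is saved to `.llm-router.local.json`.\n\nPoint the client at:\n\n```bash\nbase_url=http://127.0.0.1:8787/v1\napi_key=any-value\nmodel=auto\n```\n\nIf you have set 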
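`ROUTER_API_KEY`, the client's `api_key` must be that local proxy key.\n\n## Virtual Models\n\n### `auto`\n\nGeneral-purpose entry point, suited to question answering, translation, rewriting, reasoning, simple analysis, and most everyday requests. Simple tasks prefer low-cost models; hard reasoning is escalated to a stronger model.\n\n### `auto-coding`\n\nEngineering entry point, suited to code generation, debugging, architecture design, repo-level planning, PR review, and security analysis. Simple coding tasks can go to a coding specialist; complex engineering tasks lean toward the strongest model.\n\n### `auto-longtext`\n\nLong-text entry point, suited to summarization, extraction, contract/document analysis, and long-context reasoning. Simple extraction prefers a low-cost long-context model; complex analysis picks a stronger reasoning model.\n\nYou can also keep passing real model IDs. Requests for a real model are forwarded directly and bypass the router model.\n\n## Architecture\n\n```mermaid\nflowchart LR\n  Client[\"Client / SDK\"] --\u003e Router[\"LLM Router\"]\n  Router --\u003e RouteModel[\"Router model\u003cbr/\u003eauto cheapest or manual\"]\n  RouteModel --\u003e Decision[\"Routing JSON\"]\n  Decision --\u003e Router\n  Router --\u003e AnswerModel[\"Selected answer model\"]\n  AnswerModel --\u003e Client\n```\n\nRouting happens in two stages:\n\n1. The router model reads the original request, the candidate models, prices, capability hints, and the current virtual mode, then outputs a structured routing decision. By default it is the cheapest model with a known price; you can also pick one manually on the local Admin page.\n2. The answer model handles the original request independently. Even when the router model and the answer model share the same ID, the model is called a second time and the routing output is not reused.\n\nOn a timeout, network error, `429`, or `5xx`, automatic routing re-routes and retries according to the configuration. A streaming response can only fall back before the upstream has emitted its first chunk; once data has reached the client, switching models is no longer safe.\n\nFor more detail, see [Architecture](docs/ARCHITECTURE.md) and [Routing Behavior](docs/ROUTING_BEHAVIOR.md).\n\n## API\n\nHealth check:\n\n```bash\ncurl http://localhost:8787/health\n```\n\nModel list:\n\n```bash\ncurl http://localhost:8787/v1/models\n```\n\nAutomatic routing:\n\n```bash\ncurl http://localhost:8787/v1/chat/completions \\\n  -H 'content-type: application/json' \\\n  -d '{\n    \"model\": \"auto\",\n    \"messages\": [{\"role\": \"user\", \"content\": \"Explain binary search in one paragraph.\"}]\n  }'\n```\n\nStreaming automatic routing:\n\n```bash\ncurl -N http://localhost:8787/v1/chat/completions \\\n  -H 'content-type: application/json' \\\n  -d '{\n    \"model\": \"auto-coding\",\n    \"stream\": true,\n    \"messages\": [{\"role\": \"user\", \"content\": \"Write a TypeScript debounce function.\"}]\n  }'\n```\n\nResponse headers expose the key routing information:\n\n```text\nx-llm-router-request-id\nx-llm-router-original-model\nx-llm-router-target-model\n```\n\n## Capability Matrix\n\n| Type | Currently supported |\n| --- | --- |\n| OpenAI-compatible API | `POST /v1/chat/completions`, `GET 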
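/v1/models`, `GET /health` |\n| Chat Completions | Non-streaming responses, SSE streaming, auto fallback |\n| Request passthrough | tools / function calling, multimodal Chat Completions |\n| Planned | `/v1/embeddings`, `/v1/responses`, Anthropic Messages API |\n\n## Routing Strategy\n\nThe default strategy leans toward “save cost on simple tasks, protect quality on complex ones”:\n\n- Simple chat, rewriting, translation, short question answering: prefer `gpt-5.4-mini` or another available low-cost model.\n- Simple code, small snippets, syntax help, straightforward bug fixes: lean toward `gpt-5.3-codex`.\n- Hard coding, architecture planning, repo-level migrations, complex debugging, PR/security review: lean toward the strongest frontier model, for example `gpt-5.5`, and use `xhigh`.\n- Simple long-text extraction or summarization: pick a low-cost model with long-context support.\n- Complex long-text reasoning, high-stakes analysis: pick the strongest frontier model and use `high` or `xhigh`.\n\nThe model pool, prices, capability tags, and retry behavior are all adjustable through configuration. See [Configuration](docs/CONFIGURATION.md) for the entry points.\n\n## Local Admin\n\nVisit `http://127.0.0.1:8787/admin` to configure the “router model” used for the first hop of auto routing:\n\n- **Automatic**: the default mode; routes with the cheapest model that has a known price in the upstream model list.\n- **Manual**: always routes with the upstream model you choose, for cases where you want the routing decision itself to be smarter.\n- If `ROUTER_API_KEY` is set, the page asks for this local key before it reads or saves the configuration.\n\nThis setting only affects the model that makes the routing decision; its output is never reused as the final answer. The final answer still comes from a separate call to the target model chosen by the routing JSON.\n\n## Multimodal and Tool Calling\n\nLLM Router keeps the final request as close as possible to the original client request:\n\n- `tools`, `tool_choice`, `parallel_tool_calls`, and the legacy `functions` and `function_call` fields are passed through to the final model.\n- Tool-calling and multimodal requests carry `required_capabilities` in the internal routing payload.\n- The final multimodal request is forwarded unchanged.\n- The internal routing request replaces base64 images, very long base64 strings, and very long URLs with metadata, so image payloads do not blow up the router model's context.\n\n## Docker\n\n```bash\ncp .env.example .env\ndocker compose up --build\n```\n\nFor Docker and production recommendations, see [Operations](docs/OPERATIONS.md).\n\n## Documentation\n\n- [Configuration](docs/CONFIGURATION.md): environment variables, model pool, fallback, and authentication settings.\n- [Routing Behavior](docs/ROUTING_BEHAVIOR.md): routing inputs, decisions, retry on failure, and edge-case behavior.\n- [Client Examples](docs/CLIENTS.md): how to connect common clients.\n- [Architecture](docs/ARCHITECTURE.md): two-stage routing and internal data flow.\n- [Operations](docs/OPERATIONS.md): deployment, logging, monitoring, and operational recommendations.\n- [FAQ](docs/FAQ.md): frequently asked questions.\n- [Roadmap](docs/ROADMAP.md): upcoming plans, including a CLI experience closer to ccswitch.\n- [Contributing](CONTRIBUTING.md): contribution guide.\n- [Security](SECURITY.md): security policy.\n\n## Development\n\n```bash\nnpm install\nnpm run build\nnpm test\n```\n\nRouting evaluation against a real upstream:\n\n```bash\nnpm run test:live-routing\n```\n\n`test:live-routing` reads the local `.env` and calls the real upstream.\n\n## Security\n\nBy default, run LLM Router as a local-machine or intranet service:\n\n- Do not commit `.env`.\n- If the service is reachable by anything other than trusted local clients, set 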
`ROUTER_API_KEY`.\n- Do not expose this service to the public internet without authentication.\n- Report security issues through GitHub private vulnerability reporting.\n\n## License\n\n[MIT](LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstudy8677%2Fllm-router","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstudy8677%2Fllm-router","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstudy8677%2Fllm-router/lists"}