{"id":47727507,"url":"https://github.com/blhsing/copilot-adapter","last_synced_at":"2026-04-21T04:05:28.223Z","repository":{"id":346966108,"uuid":"1192271599","full_name":"blhsing/copilot-adapter","owner":"blhsing","description":"An API proxy server that turns GitHub Copilot's services of chat completion / embeddings / responses into OpenAI / Anthropic / Gemini-compatible API endpoints","archived":false,"fork":false,"pushed_at":"2026-04-08T06:29:54.000Z","size":247,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-04-08T08:00:31.824Z","etag":null,"topics":["anthropic","api-proxy","gemini","github-copilot","openai"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/blhsing.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-26T03:49:28.000Z","updated_at":"2026-04-08T06:29:57.000Z","dependencies_parsed_at":null,"dependency_job_id":"f79f0d33-5f3d-4f52-b580-3cc0ffe549ce","html_url":"https://github.com/blhsing/copilot-adapter","commit_stats":null,"previous_names":["blhsing/copilot-adapter"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/blhsing/copilot-adapter","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blhsing%2Fcopilot-adapter","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blhsing%2Fcopilot-adapter/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blhsing%2Fcopilot-adapter/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blhsing%2Fcopilot-adapter/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/blhsing","download_url":"https://codeload.github.com/blhsing/copilot-adapter/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blhsing%2Fcopilot-adapter/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31825515,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-14T18:05:02.291Z","status":"online","status_checked_at":"2026-04-15T02:00:06.175Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anthropic","api-proxy","gemini","github-copilot","openai"],"created_at":"2026-04-02T20:54:41.436Z","updated_at":"2026-04-21T04:05:28.215Z","avatar_url":"https://github.com/blhsing.png","language":"Python","readme":"# copilot-adapter\n\nAn OpenAI / Anthropic / Gemini-compatible LLM API proxy server backed by GitHub Copilot.\n\nAuthenticates via a GitHub Personal Access Token (PAT) or GitHub's device flow, then proxies requests to GitHub Copilot's backend through a local server that speaks all three major LLM API formats.\n\n## Key features\n\n- [**Multi-account pooling**](#multi-account) — Rotate between multiple GitHub Copilot accounts to pool premium request quotas, with automatic exhaustion detection and account switching\n- [**Per-account plan and quota**](#per-account-plan-and-quota) — Mix accounts on different Copilot tiers with per-account quota limits that auto-derive from the plan\n- [**Smart premium request billing**](#premium-request-billing) — Automatically avoids extra premium request charges for agentic follow-ups, with no client-side changes needed\n- [**Rate limit handling**](#premium-request-billing) — Automatically retries on rate limit errors by rotating to the next available account\n- [**Three API formats**](#endpoints) — Serves OpenAI, Anthropic, and Gemini endpoints simultaneously\n- [**Forward proxy mode**](#forward-proxy-mode) — Acts as an HTTP/HTTPS proxy that intercepts Copilot API traffic and rewrites billing headers, and transparently reroutes requests for OpenAI, Anthropic, and Gemini APIs through Copilot\n- [**One-command tool setup**](#tool-configuration) — Automatically configure popular agentic coding tools (Claude Code, Codex, Gemini CLI, OpenCode) to use this proxy, with easy revert to defaults\n- [**Configurable model mapping**](#model-mapping) — Built-in Claude model-ID normalization plus optional glob-pattern overrides\n- [**Cross-provider reasoning effort mapping**](#parameter-compatibility) — Preserves Anthropic thinking / `output_config.effort` when requests are mapped to OpenAI-style models, including Responses-only targets like `gpt-5.4`\n- [**Server-side web search**](#server-side-web-search) — Converts Anthropic's built-in `web_search` tool type to a function tool, intercepts it server-side via DuckDuckGo, and returns Anthropic-native structured search results to Anthropic clients; strips other unsupported built-in types\n- **Streaming support** — Full SSE streaming across all three formats, including real-time format translation\n- [**Flexible authentication**](#authentication) — Supports multiple GitHub PATs, environment variables, cached tokens, and interactive device-flow OAuth, with automatic fallback\n- **Multi-worker support** — Spawns multiple worker processes for higher throughput\n- **Concurrent-safe token management** — Only one token refresh happens at a time under concurrent load\n- [**Docker ready**](#docker) — Pre-built image on [GHCR](https://github.com/blhsing/copilot-adapter/pkgs/container/copilot-adapter), or build locally\n- **CORS support** — Configurable allowed origins for browser-based applications\n\n## Prerequisites\n\n- Python 3.10+\n- pip\n- A GitHub account with [GitHub Copilot](https://github.com/features/copilot) access (the free tier works; paid plans provide higher premium request quotas)\n\n## Setup\n\n```bash\npip install -r requirements.txt\n```\n\n`orjson` is listed as a dependency; if it fails to install on your platform, the adapter automatically falls back to the stdlib `json` module (expect higher memory use when serializing very large tool schemas).\n\n## Usage\n\n```bash\n# Start the server with a GitHub PAT (no interactive login needed)\npython copilot_adapter.py serve --github-token ghp_xxx\n\n# Or use an environment variable\nexport COPILOT_ADAPTER_GITHUB_TOKEN=ghp_xxx\npython copilot_adapter.py serve\n\n# Interactive device-flow login (opens browser)\npython copilot_adapter.py login\npython copilot_adapter.py serve\n\n# Options\npython copilot_adapter.py serve --host 0.0.0.0 --port 18080\n\n# Multiple worker processes for higher throughput (default: 1)\npython copilot_adapter.py serve --workers 4\n\n# Remove stored credentials\npython copilot_adapter.py logout\n```\n\nToken lookup order: `--github-token` flag \u003e `COPILOT_ADAPTER_GITHUB_TOKEN` env var \u003e `GITHUB_TOKEN` env var \u003e cached tokens \u003e interactive device flow.\n\n### Per-account plan and quota\n\nWhen accounts are on different Copilot tiers, append the plan and quota limit to the token with colons:\n\n```bash\n# TOKEN:PLAN:QUOTA format\npython copilot_adapter.py serve \\\n  --github-token ghp_aaa:pro:300 \\\n  --github-token ghp_bbb:free:50\n\n# TOKEN:PLAN:QUOTA:USAGE format (specify current premium usage)\npython copilot_adapter.py serve \\\n  --github-token ghp_aaa:pro:300:150.5 \\\n  --github-token ghp_bbb:free:50:12\n\n# Bare tokens fall back to the global --plan default (pro) and its quota (300)\npython copilot_adapter.py serve \\\n  --github-token ghp_aaa:enterprise:1000 \\\n  --github-token ghp_bbb:free:50 \\\n  --github-token ghp_ccc\n```\n\n### Multi-account\n\nPool multiple GitHub Copilot accounts to extend your premium request quota:\n\n```bash\n# Add accounts via device-flow login (run multiple times)\npython copilot_adapter.py login   # adds first account\npython copilot_adapter.py login   # adds second account\n\n# Or pass multiple PATs\npython copilot_adapter.py serve --github-token ghp_aaa --github-token ghp_bbb\n\n# Or comma-separated in an env var\nexport COPILOT_ADAPTER_GITHUB_TOKEN=ghp_aaa,ghp_bbb\npython copilot_adapter.py serve\n\n# List cached accounts (shows plan, quota, and usage)\npython copilot_adapter.py accounts\n\n# Add a PAT to the cache (with optional plan/quota/usage)\npython copilot_adapter.py accounts --add ghp_xxx --plan pro --quota-limit 300 --usage 50\n\n# Update plan/quota/usage for a cached account\npython copilot_adapter.py accounts --update octocat --plan pro+ --quota-limit 1500 --usage 200\n\n# Remove a cached account\npython copilot_adapter.py accounts --remove octocat\n\n# Remove all accounts\npython copilot_adapter.py logout --all\n```\n\n**Rotation strategies** (`--strategy`):\n\n| Strategy | Behavior | Pros | Cons |\n|----------|----------|------|------|\n| `max-usage` (default) | Concentrate all usage on one account until its quota is exhausted, then move to the next | Maximizes the number of accounts kept at zero usage as reserves; best server-side cache efficiency since all requests hit the same account's session; simple and predictable | One account bears all the load; if the month resets mid-use, the reserve accounts were never needed |\n| `min-usage` | Always pick the account with the lowest usage | Spreads consumption evenly across all accounts; maximizes headroom on every account; reduces risk of hitting per-account rate limits | All accounts accumulate usage simultaneously, so none are kept clean as a reserve |\n| `round-robin` | Rotate blindly on each user-initiated request | Simple and predictable; spreads load without needing usage data | No awareness of quota — won't avoid accounts nearing their limit |\n\nAgent-initiated requests (tool-use follow-ups) always stay on the same account as the preceding user request to avoid unnecessary premium request charges.\n\n**Quota exhaustion detection**: When a Copilot account's premium request quota is exhausted, GitHub silently downgrades the response to a free fallback model (e.g. GPT-4.1) instead of returning an error. The server detects this by comparing the model in the response against the model that was requested — if they don't match, it marks the account as exhausted and automatically retries the request with the next available account. This works for both streaming and non-streaming requests.\n\nFor proactive switching *before* hitting the limit, set `--quota-limit N` or let it default from the plan. Usage is tracked in-memory with plan-aware model cost multipliers (e.g. Claude Opus 4.7 costs 3x, GPT-4o costs 0x on paid plans). You can specify each account's current usage via the `TOKEN:PLAN:QUOTA:USAGE` format, `--usage` flag, or config file to start tracking from where you left off. These defaults can be overridden per account — see [Per-account plan and quota](#per-account-plan-and-quota).\n\n**Supported plans** (`--plan`):\n\n| Plan | Monthly premium requests | Model multipliers |\n|------|------------------------:|-------------------|\n| `free` | 50 | All models cost 1x |\n| `pro` (default) | 300 | Differentiated (e.g. GPT-4o: 0x, Claude Opus: 3x) |\n| `pro+` | 1500 | Same as `pro` |\n| `business` | 300 | Same as `pro` |\n| `enterprise` | 1000 | Same as `pro` |\n\nWhen `--quota-limit` is not specified, it defaults to the plan's monthly allowance.\n\n### Config file\n\nAll settings can be placed in a JSON config file instead of (or alongside) CLI flags and environment variables. The server looks for `~/.config/copilot-adapter/config.json` by default, or you can specify a path with `--config`:\n\n```bash\npython copilot_adapter.py serve --config /path/to/config.json\n```\n\nExample `~/.config/copilot-adapter/config.json`:\n\n```json\n{\n  \"host\": \"0.0.0.0\",\n  \"port\": 18080,\n  \"strategy\": \"max-usage\",\n  \"plan\": \"pro\",\n  \"log_file\": \"/path/to/copilot-adapter.log\",\n  \"free\": false,\n  \"free_within_minutes\": 5,\n  \"proxy\": false,\n  \"proxy_user\": \"myuser\",\n  \"proxy_password\": \"mypassword\",\n  \"workers\": 4,\n  \"cors_origins\": [\"*\"],\n  \"model_map\": {\n    \"*sonnet*\": \"claude-sonnet-4.6\",\n    \"gpt-4-turbo\": \"gpt-4-0125-preview\"\n  },\n  \"api_tokens\": [\"sk-abc123...\", \"sk-def456...\"],\n  \"web_search_iterations\": 3,\n  \"accounts\": [\n    {\"token\": \"ghp_aaa\", \"plan\": \"enterprise\", \"quota_limit\": 1000, \"premium_used\": 250},\n    {\"token\": \"ghp_bbb\", \"plan\": \"free\"},\n    \"ghp_ccc:pro+:1500:100.5\",\n    \"ghp_ddd\"\n  ]\n}\n```\n\nAccount entries in the `accounts` array can be:\n- **Objects** with `token` (required), `plan`, `quota_limit`, and `premium_used` (all optional) fields\n- **Strings** in `TOKEN:PLAN:QUOTA:USAGE` format (same as the CLI `--github-token` syntax)\n- **Bare token strings** that fall back to the top-level `plan` and `quota_limit` defaults\n\n**Precedence** (highest to lowest): CLI flags \u003e environment variables \u003e config file \u003e built-in defaults.\n\n### Environment variables\n\nAll CLI options can be set via environment variables:\n\n| Flag | Environment variable | Default |\n|------|---------------------|---------|\n| `--config` | `COPILOT_ADAPTER_CONFIG` | `~/.config/copilot-adapter/config.json` |\n| `--host` | `COPILOT_ADAPTER_HOST` | `127.0.0.1` |\n| `--port` | `COPILOT_ADAPTER_PORT` | `18080` |\n| `--github-token` | `COPILOT_ADAPTER_GITHUB_TOKEN` | *(none)* |\n| `--cors-origin` | `COPILOT_ADAPTER_CORS_ORIGIN` | *(none)* |\n| `--workers` | `COPILOT_ADAPTER_WORKERS` | `1` |\n| `--strategy` | `COPILOT_ADAPTER_STRATEGY` | `max-usage` |\n| `--quota-limit` | `COPILOT_ADAPTER_QUOTA_LIMIT` | per plan |\n| `--plan` | `COPILOT_ADAPTER_PLAN` | `pro` |\n| `--log-level` | `COPILOT_ADAPTER_LOG_LEVEL` | `info` |\n| `--log-file` | `COPILOT_ADAPTER_LOG_FILE` | *(none)* |\n| `--free` | `COPILOT_ADAPTER_FREE` | *(off)* |\n| `--free-within-minutes` | `COPILOT_ADAPTER_FREE_WITHIN_MINUTES` | *(off)* |\n| `--proxy` | `COPILOT_ADAPTER_PROXY` | *(off)* |\n| `--ca-dir` | `COPILOT_ADAPTER_CA_DIR` | `~/.config/copilot-adapter` |\n| `--model-map` | `COPILOT_ADAPTER_MODEL_MAP` | *(none — Claude IDs auto-normalized)* |\n| `--proxy-user` | `COPILOT_ADAPTER_PROXY_USER` | *(none)* |\n| `--proxy-password` | `COPILOT_ADAPTER_PROXY_PASSWORD` | *(none)* |\n| `--api-token` | `COPILOT_ADAPTER_API_TOKEN` | stored tokens |\n| `--web-search-iterations` | `COPILOT_ADAPTER_WEB_SEARCH_ITERATIONS` | `3` |\n\nSet `NO_COLOR=1` to disable colored log output. Colors are auto-detected on Windows (requires Windows Terminal or VT-enabled console).\n\nUse `--log-file /path/to/copilot-adapter.log` (or `log_file` in the config file) to append the same logs to a file while keeping console output enabled.\n\n`GITHUB_TOKEN` is also accepted as a fallback for the GitHub token. Multiple tokens can be comma-separated in `COPILOT_ADAPTER_GITHUB_TOKEN` or `GITHUB_TOKEN`.\n\n### Docker\n\nA pre-built image is available on GitHub Container Registry:\n\n```bash\ndocker pull ghcr.io/blhsing/copilot-adapter:latest\n```\n\n```bash\n# Run\ndocker run -p 18080:18080 -e COPILOT_ADAPTER_GITHUB_TOKEN=ghp_xxx ghcr.io/blhsing/copilot-adapter\n\n# Multi-account with rotation\ndocker run -p 18080:18080 \\\n  -e COPILOT_ADAPTER_GITHUB_TOKEN=ghp_aaa,ghp_bbb \\\n  -e COPILOT_ADAPTER_STRATEGY=max-usage \\\n  -e COPILOT_ADAPTER_QUOTA_LIMIT=300 \\\n  ghcr.io/blhsing/copilot-adapter\n```\n\nOr build locally:\n\n```bash\ndocker build -t copilot-adapter .\ndocker run -p 18080:18080 -e COPILOT_ADAPTER_GITHUB_TOKEN=ghp_xxx copilot-adapter\n```\n\n## Endpoints\n\n### [OpenAI](https://platform.openai.com/docs/api-reference)\n\n```\nPOST /v1/chat/completions\nPOST /v1/responses\nGET  /v1/models\nPOST /v1/embeddings\n```\n\n### [Anthropic](https://docs.anthropic.com/en/api)\n\n```\nPOST /v1/messages\nPOST /v1/messages/count_tokens\n```\n\n### [Gemini](https://ai.google.dev/api)\n\n```\nPOST /v1beta/models/{model}:generateContent\nPOST /v1beta/models/{model}:streamGenerateContent\nGET  /v1beta/models\nGET  /v1beta/models/{model}\n```\n\nAll endpoints support streaming.\n\n## Premium request billing\n\nGitHub Copilot uses the `X-Initiator` header to determine whether an API call counts as a premium request:\n\n- `X-Initiator: user` — counts as a premium request\n- `X-Initiator: agent` — free (treated as an autonomous agent follow-up)\n\nThe proxy handles this automatically. When no `X-Initiator` header is provided by the caller, it inspects the request body and infers the correct value:\n\n- **OpenAI format** — `agent` if the last message has `role: \"tool\"`, or if any prior message contains tool calls or tool responses\n- **Anthropic format** — `agent` if the last message contains a `tool_result` content block, or if any prior assistant message contains a `tool_use` block\n- **Gemini format** — `agent` if the last turn contains a `functionResponse` part, or if any prior turn contains a `functionCall` or `functionResponse`\n- Otherwise — `user`\n\nThis means agentic clients like Claude Code that make multiple API calls per user turn (tool-use loops, retries, subagent spawns, auto-continues) will only consume one premium request for the initial prompt — follow-up calls are automatically marked as `agent`. No client-side changes needed.\n\nCallers can also pass `X-Initiator` explicitly to override the heuristic.\n\n### Free mode\n\nUse `--free` to mark **all** requests as agent-initiated, so nothing counts as a premium request:\n\n```bash\npython copilot_adapter.py serve --free\n```\n\nThis is useful when you want to avoid all premium billing regardless of request type. Note that GitHub Copilot may throttle or deprioritize agent-initiated requests compared to user-initiated ones.\n\n### Time-based free mode\n\nUse `--free-within-minutes N` to mark a user-initiated request as agent-initiated only if the last request to the same account was less than N minutes ago:\n\n```bash\npython copilot_adapter.py serve --free-within-minutes 5\n```\n\nThe logic: the first request in a session is billed normally (as `user`), but subsequent requests within the time window are marked as `agent` (free). Once the account has been idle for longer than N minutes, the next request is treated as a new session and billed normally again.\n\nThis is useful when you want to limit premium billing to one request per session rather than eliminating it entirely. It's mutually exclusive with `--free`.\n\nWhen using multi-account rotation, agent-initiated requests always stay on the same account as the preceding user request to avoid billing a premium request on a different account.\n\n## Forward proxy mode\n\nUse `--proxy` to enable a forward HTTP/HTTPS proxy on the same port as the API server. In this mode, the server handles both normal API requests (reverse proxy) and forwarded client traffic (forward proxy) on a single port:\n\n```bash\npython copilot_adapter.py serve --proxy\n```\n\nThe proxy intercepts HTTPS connections to the following hosts:\n\n- **`api.githubcopilot.com`** — rewrites `X-Initiator: user` to `agent` so requests are not billed as premium\n- **`api.openai.com`**, **`api.anthropic.com`**, **`generativelanguage.googleapis.com`** — LLM API requests are redirected to the local adapter and routed through Copilot; non-API requests (e.g. update checks, MCP registry) are forwarded to the original host\n\nAll other traffic is tunneled transparently. If `HTTPS_PROXY` or `HTTP_PROXY` is set, outbound connections are chained through the upstream proxy.\n\n**Client setup:**\n\n```bash\nexport HTTPS_PROXY=http://127.0.0.1:18080\nexport NODE_EXTRA_CA_CERTS=~/.config/copilot-adapter/ca.pem\n```\n\nA self-signed CA certificate is generated automatically on first use and stored in `~/.config/copilot-adapter/` (or the directory specified by `--ca-dir`). Use `ca-cert` to generate the CA ahead of time or show its path:\n\n```bash\npython copilot_adapter.py ca-cert\n# CA certificate: ~/.config/copilot-adapter/ca.pem\n#   Subject:  CN=copilot-adapter MITM CA\n#   Valid:    2026-04-07 to 2036-04-05\n```\n\nThe client must trust this CA for HTTPS interception to work:\n\n- **Node.js clients** (e.g. Claude Code): set `NODE_EXTRA_CA_CERTS` to the CA certificate path\n- **Electron apps** (e.g. Claude Desktop) and **browsers**: install the CA in the system trust store:\n  ```powershell\n  # Windows (run as Administrator)\n  certutil -addstore Root \"%USERPROFILE%\\.config\\copilot-adapter\\ca.pem\"\n  ```\n\nThis mode is useful when you want to transparently reduce premium billing for any client that supports `HTTPS_PROXY`, without changing the client's API endpoint configuration.\n\n## Model mapping\n\nModel names in incoming requests are rewritten before being sent to the Copilot API.\n\n**Built-in Claude normalization** — Copilot uses dotted version numbers for Claude models (e.g. `claude-opus-4.7`) while clients like Claude Code send hyphenated, date-suffixed names (e.g. `claude-opus-4-7-20260215`). The adapter always normalizes Claude model IDs automatically: the dash after the first major version becomes a dot, and trailing `-\u003cdigits\u003e` segments are dropped. Non-digit suffixes (e.g. `-fast`) are preserved.\n\nExamples:\n- `claude-opus-4-7-20260215` → `claude-opus-4.7`\n- `claude-opus-4-7` → `claude-opus-4.7`\n- `claude-haiku-4-5-20251001` → `claude-haiku-4.5`\n- `claude-opus-4-6-fast` → `claude-opus-4.6-fast`\n\n**User-configured mappings** — You can add your own glob-pattern mappings (for cross-provider remaps, deprecated model IDs, etc.). When a user-configured pattern matches, it takes precedence over the built-in Claude normalization.\n\nConfigure via any of these (highest precedence first):\n\n1. **CLI / env var** — repeatable `--model-map` flag or comma-separated env var:\n\n   ```bash\n   python copilot_adapter.py serve \\\n     --model-map 'gpt-4-turbo=gpt-4-0125-preview'\n\n   # Or via environment variable (comma-separated)\n   export COPILOT_ADAPTER_MODEL_MAP='gpt-4-turbo=gpt-4-0125-preview'\n   ```\n\n2. **Config file** — add a `model_map` object to the [config file](#config-file):\n\n   ```json\n   {\n     \"model_map\": {\n       \"gpt-4-turbo\": \"gpt-4-0125-preview\"\n     }\n   }\n   ```\n\nPatterns use glob syntax (`*` matches anything) and are checked in order — the first match wins. If no pattern matches, Claude-family IDs are auto-normalized (above); other models pass through unchanged. Model mapping is applied to all endpoints (chat completions, responses, embeddings, Gemini).\n\nFor Anthropic `/v1/messages` requests, the adapter also uses the final mapped model to choose the upstream Copilot endpoint:\n\n- **Anthropic target** (for example `claude-sonnet-4.6`) — proxied natively to Anthropic Messages\n- **Responses-only OpenAI target** (for example `gpt-5.4`) — converted to OpenAI `/v1/responses`\n- **Other OpenAI-compatible targets** — converted to `/v1/chat/completions`\n\nThis preserves Anthropic features like thinking / `output_config.effort` when a Claude client is mapped to a Responses-only OpenAI model.\n\n\n## Authentication\n\n### API token protection\n\nProtect the reverse API proxy with Bearer tokens so only authorized clients can use it:\n\n```bash\n# Generate a token\npython copilot_adapter.py tokens --generate\npython copilot_adapter.py tokens --generate --label \"my-laptop\"\n\n# List tokens\npython copilot_adapter.py tokens\n\n# Revoke a token by value or label\npython copilot_adapter.py tokens --revoke sk-abc123...\npython copilot_adapter.py tokens --revoke my-laptop\n```\n\nOnce tokens exist (via `tokens --generate`, `--api-token` flag, or `api_tokens` in the config file), all API endpoints except the health check (`GET /`) require `Authorization: Bearer \u003ctoken\u003e`:\n\n```bash\n# Start with stored tokens (generated via `tokens --generate`)\npython copilot_adapter.py serve\n\n# Or pass tokens explicitly\npython copilot_adapter.py serve --api-token sk-abc123...\n\n# Client usage\ncurl -H \"Authorization: Bearer sk-abc123...\" http://127.0.0.1:18080/v1/models\n```\n\nIf no tokens are configured, the API is unprotected (open access).\n\n### Forward proxy authentication\n\nProtect the forward proxy with HTTP Basic authentication:\n\n```bash\npython copilot_adapter.py serve --proxy --proxy-user myuser --proxy-password mypass\n```\n\nClients must include credentials in the proxy URL:\n\n```bash\nexport HTTPS_PROXY=http://myuser:mypass@127.0.0.1:18080\n```\n\nProxy authentication only applies to forward proxy requests (CONNECT and absolute-URL requests). Direct API requests use Bearer token authentication instead.\n\n## Tool configuration\n\nThe `config` subcommand automatically configures popular agentic coding tools to point at this proxy:\n\n```bash\n# Configure a tool to use the proxy\npython copilot_adapter.py config claude-code\npython copilot_adapter.py config codex\npython copilot_adapter.py config gemini-cli\npython copilot_adapter.py config opencode\n\n# Revert a tool back to its default provider\npython copilot_adapter.py config claude-code --revert\npython copilot_adapter.py config codex --revert\n\n# Specify a custom host/port or API token\npython copilot_adapter.py config claude-code --host 0.0.0.0 --port 8080 --api-token sk-abc123...\n```\n\nSupported tools:\n\n| Tool | Config file | What it sets |\n|------|------------|-------------|\n| `claude-code` | `~/.claude/settings.json` | `ANTHROPIC_BASE_URL`, `ANTHROPIC_API_KEY` in the `env` block |\n| `codex` | `~/.codex/config.toml` | `model_provider` and `[model_providers.copilot-adapter]` section |\n| `gemini-cli` | `~/.gemini/settings.json` | `baseUrl` and `apiKey` fields |\n| `opencode` | `~/.config/opencode/opencode.json` | `copilot-adapter` provider block |\n\nA `.copilot-adapter.bak` backup is created before modifying any config file. When reverting, the backup is restored if it exists; otherwise the added keys are removed.\n\nIf `--api-token` is not specified, the first stored API token (from `tokens --generate`) is used automatically, if any.\n\n### Manual client configuration\n\nYou can also point any OpenAI, Anthropic, or Gemini SDK client at the local server manually:\n\n```bash\n# OpenAI\nexport OPENAI_BASE_URL=http://127.0.0.1:18080/v1\nexport OPENAI_API_KEY=unused\n\n# Anthropic\nexport ANTHROPIC_BASE_URL=http://127.0.0.1:18080\nexport ANTHROPIC_API_KEY=unused\n\n# Gemini\nexport GEMINI_API_BASE=http://127.0.0.1:18080/v1beta\n```\n\n## Server-side web search\n\nWhen a model responds with a `web_search` tool call, the adapter intercepts it and executes the search server-side using [DuckDuckGo](https://github.com/deedy5/ddgs) (`ddgs` package). The search results are injected back into the conversation and the model continues generating a response.\n\nFor Anthropic clients, the adapter also emits Anthropic-native `server_tool_use` and `web_search_tool_result` content blocks so clients such as Claude Code can render structured web-search results instead of only seeing plain-text continuation output.\n\nFor non-Anthropic clients, the client still does not see the intermediate tool call.\n\n\nThis enables web search for any model routed through the adapter, even if the client doesn't support executing web search tool calls. It works with both streaming and non-streaming requests.\n\nIf the model returns `web_search` alongside other tool calls, the adapter passes all tool calls through to the client instead of intercepting.\n\nThe model may call `web_search` multiple times in a single request (e.g. refining its query). The adapter allows up to 3 iterations by default, configurable with `--web-search-iterations N` (or `web_search_iterations` in the config file). Set to 0 to disable server-side interception entirely and pass `web_search` calls through to the client.\n\n**Proxy support:** If `HTTPS_PROXY` or `HTTP_PROXY` environment variables are set, DuckDuckGo searches are routed through the proxy.\n\n## Anthropic built-in tools\n\nAnthropic clients (e.g. Claude Code) may send built-in tool types like `web_search_20250305`, `text_editor_20250124`, and `code_execution_20250522`. The Copilot API doesn't support these type-prefixed tools. The adapter handles them as follows:\n\n- **`web_search_*`** — Converted to a `web_search` function tool and [intercepted server-side](#server-side-web-search)\n- **Other built-in types** (e.g. `text_editor_*`, `code_execution_*`) — Stripped from the request, since these are handled client-side and don't need to be sent to the model\n\n## Parameter compatibility\n\nThe proxy normalizes provider-specific request parameters after model mapping so cross-provider remaps keep working.\n\n- **Token limits** — Some targets require different token-limit fields and minimums. The proxy automatically uses the correct field based on the final mapped model and endpoint: `max_tokens` for Claude and Gemini targets, `max_completion_tokens` for OpenAI chat-completions targets, and `max_output_tokens` for OpenAI Responses targets. For Responses targets, very small Anthropic `max_tokens` values are raised to the upstream minimum when required.\n- **Reasoning / thinking effort** — When an Anthropic request includes thinking settings and is mapped to an OpenAI-style target, the proxy converts that intent to the nearest OpenAI reasoning effort. For example, Claude Code effort `max` mapped to `gpt-5.4` becomes reasoning effort `xhigh`.\n- **Endpoint selection for mapped Anthropic requests** — Anthropic `/v1/messages` requests are routed to the upstream endpoint required by the mapped target model, so Responses-only models such as `gpt-5.4` keep reasoning effort and tool support instead of falling back to `/v1/chat/completions`.\n\nExamples:\n- Anthropic `output_config.effort: high` → OpenAI `reasoning_effort: high`\n- Anthropic `output_config.effort: max` → OpenAI `reasoning_effort: xhigh`\n- Anthropic client mapped to `gpt-5.4` → upstream `/v1/responses`\n\n**Effort clamping for native Anthropic passthrough** — Copilot restricts the effort levels accepted for Anthropic models more tightly than Anthropic's direct API. The proxy clamps unsupported values automatically:\n- `claude-opus-4.7`: Copilot only accepts `medium`; any other effort (`low`, `high`, `max`, `xhigh`) is clamped to `medium`.\n- Other Anthropic models: `max` and `xhigh` are clamped to `high` (the highest level Copilot accepts).\n\nThis normalization is based on the final mapped model, so it works even when model mapping redirects requests across providers.\n\n## Available models\n\nRun `python copilot_adapter.py serve` and visit `http://127.0.0.1:18080/v1/models` to see all models available through your Copilot subscription. Models include offerings from OpenAI, Anthropic, Google, and xAI.\n\nNote: some newer models (e.g. `gpt-5.4`) only support the `/v1/responses` endpoint, not `/v1/chat/completions`. The adapter handles this automatically for Anthropic `/v1/messages` requests after model mapping.\n\n## Tests\n\nThe test suite runs integration tests against the live Copilot API, plus unit tests for account rotation logic.\n\n```bash\npip install -r tests/requirements.txt\n\n# Run all tests (authenticates on first run via device flow)\npython -m pytest\n\n# Run a specific test module\npython -m pytest tests/test_client.py\npython -m pytest tests/test_adapters.py\npython -m pytest tests/test_endpoints.py\npython -m pytest tests/test_account_manager.py\n```\n\nTests are organized into four modules:\n\n- **`test_client.py`** — `CopilotClient` directly: models, chat completions, streaming, responses API, embeddings\n- **`test_adapters.py`** — format adapters end-to-end: OpenAI passthrough, Anthropic Messages, Gemini generateContent (streaming + non-streaming, multi-turn, system messages, parameter mapping)\n- **`test_endpoints.py`** — FastAPI routes via ASGI transport: all endpoints across all three API formats\n- **`test_account_manager.py`** — account rotation strategies, agent stickiness, exhaustion detection (unit tests, no auth required)\n\nTests use the cheapest available models (`gpt-4o-mini` for chat, `gpt-5-mini` for responses, `text-embedding-3-small` for embeddings) to minimize premium request usage. Model constants are centralized in `tests/conftest.py`.\n\n## How it works\n\n1. **Device flow OAuth** authenticates with GitHub and stores tokens in `~/.config/copilot-adapter/tokens.json`\n2. GitHub tokens are exchanged for short-lived Copilot API tokens via `api.github.com/copilot_internal/v2/token`, automatically refreshed every ~25 minutes with concurrent-access protection (double-checked locking ensures only one refresh happens at a time)\n3. For multi-account setups, the `AccountManager` selects which account to use based on the configured rotation strategy, sticking to the same account for agent-initiated follow-ups\n4. Incoming requests are translated (if needed) to the format Copilot expects, model names are rewritten via the configurable model map, the correct upstream endpoint is selected (`/v1/messages`, `/v1/chat/completions`, or `/v1/responses`), and responses are translated back to the client's expected format\n5. In forward proxy mode (`--proxy`), the server also accepts `CONNECT` tunnels on the same port — traffic to `api.githubcopilot.com` is MITM'd to rewrite billing headers, traffic to OpenAI/Anthropic/Gemini APIs is redirected to the local adapter, and all other traffic is tunneled transparently\n\n## Known issues\n\n### PowerShell enters debug mode after Ctrl+C (Windows 10)\n\nOn Windows 10, pressing Ctrl+C to stop the server may cause PowerShell to enter debug mode with a message like:\n\n```\nEntering debug mode. Type 'h' or '?' for help.\n```\n\nThis is a [known bug in PSReadLine 2.0.0](https://github.com/PowerShell/PSReadLine/issues/1193) bundled with Windows 10. To fix it, upgrade PSReadLine in an elevated PowerShell:\n\n```powershell\nInstall-Module PSReadLine -Force -SkipPublisherCheck\n```\n\nIf you're behind an HTTP proxy:\n\n```powershell\n[System.Net.WebRequest]::DefaultWebProxy = New-Object System.Net.WebProxy(\"http://proxy-host:port\")\n[System.Net.WebRequest]::DefaultWebProxy.Credentials = [System.Net.CredentialCache]::DefaultCredentials\nInstall-Module PSReadLine -Force -SkipPublisherCheck\n```\n\nRestart PowerShell after upgrading. This issue does not affect Windows 11, Windows Terminal, or cmd.exe.\n\n### PATs don't work for organization-managed Copilot seats\n\nThe `api.github.com/copilot_internal/v2/token` endpoint returns 404 for `ghp_` Personal Access Tokens when the Copilot seat is managed through a GitHub organization (Business or Enterprise plan). This endpoint only works with OAuth tokens obtained via the device flow (`ghu_` prefix).\n\n**Symptom:** `serve` prints \"Authenticated as \u0026lt;username\u0026gt;\" but then fails with:\n\n```\nError: Failed to get Copilot token: Client error '404 Not Found' for url 'https://api.github.com/copilot_internal/v2/token'\n```\n\n**Fix:** Use the device flow instead of a PAT:\n\n```bash\npython copilot_adapter.py login\n```\n\nIf you previously added the PAT to the cache, remove it first:\n\n```bash\npython copilot_adapter.py accounts --remove \u003cusername\u003e\npython copilot_adapter.py login\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fblhsing%2Fcopilot-adapter","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fblhsing%2Fcopilot-adapter","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fblhsing%2Fcopilot-adapter/lists"}