{"id":51277102,"url":"https://github.com/aivinay/switchboard","last_synced_at":"2026-06-29T22:00:13.978Z","repository":{"id":367159047,"uuid":"1277300215","full_name":"aivinay/switchboard","owner":"aivinay","description":"Privacy-aware, local-first router for your CLI coding agents (Codex, Claude Code) and local LLMs (Ollama) — keeps sensitive prompts on-device and cuts premium-model usage.","archived":false,"fork":false,"pushed_at":"2026-06-24T20:49:43.000Z","size":499,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-24T21:08:48.065Z","etag":null,"topics":["ai-agents","claude-code","codex","fastapi","llm","llm-orchestration","llm-routing","local-first","local-llm","model-routing","ollama","privacy","privacy-preserving-ai","python","semantic-memory"],"latest_commit_sha":null,"homepage":"https://github.com/aivinay/switchboard","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aivinay.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-06-22T19:12:52.000Z","updated_at":"2026-06-24T19:48:50.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/aivinay/switchboard","commit_stats":null,"previous_names":["aivinay/switchboard"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/aivinay/switchboard","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aivinay%2Fswitchboard","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aivinay%2Fswitchboard/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aivinay%2Fswitchboard/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aivinay%2Fswitchboard/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aivinay","download_url":"https://codeload.github.com/aivinay/switchboard/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aivinay%2Fswitchboard/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34944147,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-29T02:00:05.398Z","response_time":58,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agents","claude-code","codex","fastapi","llm","llm-orchestration","llm-routing","local-first","local-llm","model-routing","ollama","privacy","privacy-preserving-ai","python","semantic-memory"],"created_at":"2026-06-29T22:00:13.237Z","updated_at":"2026-06-29T22:00:13.971Z","avatar_url":"https://github.com/aivinay.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ch1 align=\"center\"\u003e\n  \u003cpicture\u003e\n    \u003csource media=\"(prefers-color-scheme: dark)\" srcset=\"assets/switchboard-wordmark-dark.svg\"\u003e\n    \u003cimg src=\"assets/switchboard-wordmark.svg\" alt=\"Switchboard. Simple work local. Hard work premium. Secrets stay home.\" width=\"780\"\u003e\n  \u003c/picture\u003e\n\u003c/h1\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cstrong\u003e62% fewer premium-agent calls · 4.1/5 quality vs 4.6/5 always-premium · 0 benchmark leaks observed\u003c/strong\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://github.com/aivinay/switchboard/actions/workflows/ci.yml\"\u003e\u003cimg src=\"https://github.com/aivinay/switchboard/actions/workflows/ci.yml/badge.svg\" alt=\"CI\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://pypi.org/project/switchboard-local/\"\u003e\u003cimg src=\"https://img.shields.io/pypi/v/switchboard-local.svg\" alt=\"PyPI\"\u003e\u003c/a\u003e\n  \u003cimg src=\"https://img.shields.io/badge/python-3.11%2B-blue.svg\" alt=\"Python 3.11+\"\u003e\n  \u003ca href=\"LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/badge/license-MIT-green.svg\" alt=\"License: MIT\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://doi.org/10.5281/zenodo.20836918\"\u003e\u003cimg src=\"https://img.shields.io/badge/DOI-10.5281%2Fzenodo.20836918-blue.svg\" alt=\"DOI\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"#get-started\"\u003eInstall\u003c/a\u003e ·\n  \u003ca href=\"#evaluation\"\u003eEvaluation\u003c/a\u003e ·\n  \u003ca href=\"#how-it-works\"\u003eHow it works\u003c/a\u003e ·\n  \u003ca href=\"#privacy\"\u003ePrivacy\u003c/a\u003e ·\n  \u003ca href=\"#the-paper\"\u003ePaper\u003c/a\u003e ·\n  \u003ca href=\"docs/\"\u003eDocs\u003c/a\u003e\n\u003c/p\u003e\n\n---\n\n![Switchboard automatic routing demo](auto_route_demo.gif)\n\n\u003cp align=\"center\"\u003e\u003cem\u003eOne session, three backends: local by default, Codex for code, Claude Code for reasoning.\u003c/em\u003e\u003c/p\u003e\n\nSwitchboard wraps the CLI tools you already use — no separate service, no proxy, no resold API access — and routes each prompt with deterministic rules before any learned classifier runs.\n\nIn its 100-case benchmark, Switchboard kept **62% of requests off premium\nagents** while reaching **4.1/5 quality** against a **4.6/5 always-premium\nbaseline**, with **100% answered** and **no benchmark leaks observed**. See\n[Evaluation](#evaluation) for the numbers and reproduction bundle.\n\nUse it when you want to:\n\n- **Spend premium agent quota where it matters** instead of sending every prompt\n  to the most expensive backend.\n- **Keep sensitive prompts local** with a deterministic privacy floor that\n  learned routing cannot override.\n- **Switch backends mid-session without losing context** — shared session history, semantic memory, and redaction travel with you across Ollama, Codex, and Claude Code.\n\n## What it does\n\n- **Routes** across local [Ollama](https://ollama.com) models, the **Codex** CLI, and **Claude Code** — deterministic rules first, with optional tiny learned classifiers for recall.\n- **Private mode** — a deterministic keyword/PII/secret-format floor blocks sensitive prompts from ever reaching a subscription backend, even on fallback.\n- **Grounds** answers with deterministic tools (time/date, safe calculator, unit conversion, keyless live stock \u0026 news) instead of letting a model guess.\n- **Carries context** across backend switches: recent user, assistant, and tool turns are assembled into one redacted session prompt.\n- **Compresses** long context with a Headroom-inspired layer; the model-boundary pass only summarizes recent conversation, while trusted facts, retrieved memory, and the current request survive intact.\n- **Remembers** across backends via local embedding-based semantic memory, with SQLite search available for direct memory lookup.\n- **Explains every decision** and records metadata-only telemetry (no prompt/response bodies).\n- **Ships its own evaluation** — a 100-case quality benchmark, a local LLM-as-judge, and a multi-run statistical harness.\n\n## How it works\n\n```\n  UI / CLI  ──►  Session manager (shared history across all backends)\n                      │\n                      ▼\n              Capability detector (regex) ◄──► deterministic tools\n                      │  (learned tool dispatcher recovers misses; tool verifies)\n                      ▼\n              Privacy floor  (keywords + PII + secret formats — a match is FINAL)\n                      │  (learned sensitivity escalator may only ADD protection)\n                      ▼\n              Deterministic policy   ← always wins; unknown ⇒ local\n                      │  (learned router supplies recall: tool / local / coding / reasoning)\n                      ▼\n              Context builder + redaction ◄── semantic memory\n                      │\n                      ▼\n              Compression (metadata + history-only context pass)\n                      │\n                      ▼\n        Ollama (default) │ Codex (coding) │ Claude Code (reasoning)\n                      │\n                      ▼\n              Response sanitizer ─► metadata-only telemetry\n```\n\nThe organizing invariant: **deterministic policy always precedes and overrides\nthe learned components.** Privacy, tool grounding, forced selection, and\nfallback keep working even when the local model runtime — and therefore every\nlearned component — is down.\n\n## Get started\n\n```bash\npip install switchboard-local\n```\n\n```bash\n# point it at a local model runtime (install Ollama, then pull a small model)\nollama pull llama3.2:3b\n\n# sanity-check your setup\nswitchboard doctor\n\n# ask — Switchboard routes it, grounds it, and tells you why\nswitchboard ask \"summarize this error log and suggest a fix\"\n\n# see the routing decision without running anything\nswitchboard route \"refactor the auth module and add tests\"\n\n# prefer your browser? launch the local web UI, then open http://127.0.0.1:8080/ui\nswitchboard ui\n```\n\nRequires **Python 3.11+**. Codex / Claude Code backends are optional — without\nthem, everything routes locally. See [docs/usage.md](docs/usage.md).\n\n## Context, memory, and tokens\n\nSwitchboard has two user-facing CLI surfaces:\n\n- `switchboard route ...` previews the same core backend decision without calling a model.\n- The web UI, bare `switchboard ask ...`, and `switchboard ask --backend auto ...` use the stateful core workflow: shared sessions, model switching, semantic-memory retrieval, context-boundary compression, and backend telemetry all run on the same path.\n\nExample stateful CLI session:\n\n```bash\nswitchboard ask --backend auto --new-session \"Remember: prefer local models for private notes.\"\nswitchboard ask --backend auto --session \u003csession_id\u003e --memory \"What should you remember?\"\n```\n\nLong prompts and long sessions record token estimates and savings metadata. The request-level pass can shorten an oversized raw prompt; the context-boundary pass then compresses only `\u003crecent_conversation\u003e`. The `\u003ctrusted_facts\u003e`, `\u003clong_term_memory\u003e`, and `\u003ccurrent_user_request\u003e` blocks are protected from that second pass so grounding and intent are not traded away for token budget.\n\nMemory is local. `switchboard memory add` stores the item in SQLite and, when `semantic_memory_enabled` is on and Ollama can serve `nomic-embed-text`, indexes an embedding for cross-backend retrieval. `switchboard memory search` works as local text search even when embeddings are unavailable.\n\nDetails: [docs/context-memory-compression.md](docs/context-memory-compression.md).\n\n## Evaluation\n\nA 100-case benchmark across five task categories (coding, reasoning,\nsummarization, private, grounding), run on real backends and judged by a local\nmodel, over **multiple independent runs** (means shown; full per-condition\nnumbers, confidence intervals, and significance tests are in the paper):\n\n| Policy            | Quality (1–5) | Premium usage | Privacy leaks | Answered |\n|-------------------|:-------------:|:-------------:|:-------------:|:--------:|\n| always-local      | 3.4           | 0%            | **0**         | 100%     |\n| rules             | 3.8           | 27%           | **0**         | 100%     |\n| hybrid            | 3.9           | 28%           | **0**         | 100%     |\n| **learned**       | **4.1**       | 38%           | **0**         | 100%     |\n| always-premium    | 4.6           | 100%          | **0**         | 61%¹     |\n\n\u003csub\u003e¹ The \"just use the premium agent for everything\" baseline must \u003cem\u003eblock\u003c/em\u003e every\nsensitive prompt to stay leak-free, so its coverage collapses — exactly the gap\nSwitchboard closes. \u003cstrong\u003eNo benchmark leaks were observed in any condition or run.\u003c/strong\u003e\u003c/sub\u003e\n\nThese numbers come from a real-backend benchmark whose full harness travels with the paper's [reproduction bundle on Zenodo](https://doi.org/10.5281/zenodo.20836918).\n\n## Context: why this exists (Uber, Microsoft, 2026)\n\nSome employers have begun rationing AI coding-tool spend: Uber reportedly\ncapped engineers at $1,500/month per AI tool after burning its 2026 AI budget\nin four months ([Bloomberg](https://www.bloomberg.com/news/articles/2026-06-02/uber-caps-usage-of-ai-tools-like-claude-code-to-cut-costs));\nMicrosoft's Experiences + Devices org reportedly moved off Claude Code to\nGitHub Copilot CLI ([Windows Central](https://www.windowscentral.com/microsoft/microsoft-cancels-claude-code-licenses-shifting-developers-to-github-copilot-cli-a-move-likely-driven-by-financial-motives)).\n\nA spend cap controls the invoice, but it does not decide which work actually\nneeds a premium model or which prompts should never leave the machine. A better\npattern is **routing, not blanket rationing**: decide request by request what\nbelongs local, what needs a coding agent, and what is worth premium reasoning.\n\nSwitchboard is a reference implementation of that pattern for a single\nworkstation. It is not yet an enterprise product; it is the smallest honest\nproof that local-first routing can work, with a reproducible benchmark to back\nit.\n\n## Privacy\n\nSwitchboard is local-first and privacy-aware by construction:\n\n- The **deterministic privacy floor runs before any non-local routing**; a positive verdict is final and cannot be overridden by a learned component or by prompt wording.\n- **Secret-format detection** (cloud keys, JWTs, PEM blocks, env credentials) shares its patterns with context redaction, so the routing boundary and the redactor can't drift apart.\n- **Metadata-only telemetry** — prompt and response bodies are not stored by default.\n- Semantic-memory **embeddings and the eval judge run locally**.\n\nSwitchboard deliberately does **not** resell API access, scrape web UIs, or\nbypass provider limits — subscription CLIs are invoked exactly as the\nauthenticated user could invoke them, in read-only sandbox modes. See\n[SECURITY.md](SECURITY.md) and [docs/privacy.md](docs/privacy.md).\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eWhat's inside\u003c/b\u003e\u003c/summary\u003e\n\n- **Deterministic router** — keyword rules; unknown prompts default local-first.\n- **Learned router / tool dispatcher / sensitivity escalator** — tiny softmax classifiers over a locally-computed embedding (~50 ms, pure-Python inference), each retrainable in seconds from your own thumbs-down corrections behind golden-accuracy gates. They fail closed to the deterministic path.\n- **Tools** — time/date with timezones, safe abstract-syntax-tree calculator, unit conversion, keyless live stock quotes \u0026 news.\n- **Compression** — structure-aware, deterministic, dependency-free; preserves task header, code blocks, tracebacks, and grounded facts.\n- **Semantic memory** — `nomic-embed-text` embeddings, cosine retrieval, local memory commands, and SQLite text-search fallback for direct search.\n- **Evaluation** — mock evals (CI), real-backend smoke suite, 100-case quality benchmark, adversarial tester/developer dogfooding loop.\n\n\u003c/details\u003e\n\n## Configuration\n\nSettings live in `config/personal.yaml` (ships with safe local-first defaults —\nsee `config/personal.example.yaml`). Highlights:\n\n```yaml\npreferences:\n  router_mode: \"learned\"      # rules | llm | hybrid | learned\n  private_mode: true          # block sensitive prompts from non-local backends\n  allow_cloud: false\n  compression_enabled: true\n  compression_threshold_tokens: 1000\n  semantic_memory_enabled: true\n  semantic_memory_top_k: 3\n  claude_code_web_search: true  # allow Claude Code WebSearch for live-data fallback\n  finance_provider: \"yahoo\"\n  news_provider: \"google_news_rss\"\n```\n\nProvider API keys are referenced **by environment-variable name** (e.g.\n`OPENAI_API_KEY`), never inline. See [docs/overrides.md](docs/overrides.md).\n\n## The paper\n\nSwitchboard is described in a preprint — *\"Privacy-Aware Hybrid Routing Across\nHeterogeneous AI Agents.\"* The manuscript, the multi-run\nbenchmark harness, the statistical-aggregation and figure scripts, and the\nper-case records are archived together as a reproduction bundle on Zenodo:\n[10.5281/zenodo.20836918](https://doi.org/10.5281/zenodo.20836918).\n\nThis repository ships only the software. It deliberately does not carry the\npaper's experiment-running or figure-generation tooling — that lives with the\narchival record so the code stays focused on the router itself.\n\n## Development\n\n```bash\nmake install     # .venv + editable install with dev extras\nmake check       # ruff + mypy + the full test suite\n```\n\nSee [CONTRIBUTING.md](CONTRIBUTING.md). Issues and PRs welcome — please preserve\nthe privacy invariant described there.\n\n## Citing Switchboard\n\nA preprint is available on Zenodo with a citable DOI —\n[10.5281/zenodo.20836918](https://doi.org/10.5281/zenodo.20836918). See\n[CITATION.cff](CITATION.cff) for machine-readable metadata.\n\n\u003e V. Gupta, \"Switchboard: Privacy-Aware Hybrid Routing Across Heterogeneous AI\n\u003e Agents,\" Zenodo, 2026, doi:10.5281/zenodo.20836918.\n\n## License\n\n[MIT](LICENSE) © 2026 Vinay Gupta\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faivinay%2Fswitchboard","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faivinay%2Fswitchboard","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faivinay%2Fswitchboard/lists"}