{"id":50358615,"url":"https://github.com/weisser-dev/open-model-prism","last_synced_at":"2026-05-30T00:01:26.350Z","repository":{"id":351846310,"uuid":"1212753118","full_name":"weisser-dev/open-model-prism","owner":"weisser-dev","description":"Multi-tenant OpenAI-compatible LLM gateway with intelligent auto-routing, benchmark-weighted model selection, cost tracking, and full admin UI. Self-hosted, source-available.","archived":false,"fork":false,"pushed_at":"2026-04-16T19:13:27.000Z","size":894,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-16T19:34:18.927Z","etag":null,"topics":["ai","llm","model-router","token"],"latest_commit_sha":null,"homepage":"https://weisser-dev.github.io/open-model-prism/","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/weisser-dev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":"roadmap.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-16T17:40:48.000Z","updated_at":"2026-04-16T19:13:31.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/weisser-dev/open-model-prism","commit_stats":null,"previous_names":["weisser-dev/open-model-prism"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/weisser-dev/open-model-prism","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/weisser-dev%2Fopen-model-prism","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/repositories/weisser-dev%2Fopen-model-prism/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/weisser-dev%2Fopen-model-prism/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/weisser-dev%2Fopen-model-prism/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/weisser-dev","download_url":"https://codeload.github.com/weisser-dev/open-model-prism/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/weisser-dev%2Fopen-model-prism/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33675019,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-29T02:00:06.066Z","response_time":107,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","llm","model-router","token"],"created_at":"2026-05-30T00:01:23.070Z","updated_at":"2026-05-30T00:01:26.337Z","avatar_url":"https://github.com/weisser-dev.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"docs/logo.svg\" alt=\"Open Model Prism\" width=\"140\" /\u003e\n\u003c/p\u003e\n\n# Open Model Prism\n\n[![Live Demo](https://img.shields.io/badge/Live%20Demo-weisser--dev.github.io-blue)](https://weisser-dev.github.io/open-model-prism/)\n\n\u003e **Idea \u0026 
Architecture by [weisser-dev](https://github.com/weisser-dev) — Developed 100% by Claude Sonnet \u0026 Opus**\n\nA **multi-tenant, OpenAI-API-compatible LLM gateway** with intelligent model routing, cost tracking, and a full admin UI.\n\nConnect any LLM provider (OpenAI, Anthropic, AWS Bedrock, Azure OpenAI, Ollama, vLLM, OpenRouter, and more) through a single unified endpoint. Automatically classify incoming requests and route them to the optimal model based on task type, content signals, cost tier, and capability benchmarks.\n\n---\n\n## Key Features\n\n- **Multi-provider gateway** — OpenAI, Anthropic, AWS Bedrock, Azure OpenAI, Ollama, vLLM, OpenRouter, and more through a single OpenAI-compatible API\n- **Intelligent auto-routing** — LLM classifier + keyword rules + system prompt role detection route each request to the optimal model and tier\n- **Four-step force-route strictness** \u003csup\u003ev2.1\u003c/sup\u003e — `off` / `fim_only` / `smart` / `all`. `smart` keeps the user's deliberate model choice whenever the classified category is substantial (coding, reasoning, system design, analysis …) and only re-routes trivial categories like smalltalk or chat title generation. A configurable per-request opt-out token (`--model-prism-accept-model` by default) lets senior developers bypass force-routing for a single prompt\n- **Category-specific target system prompts** \u003csup\u003ev2.1.1\u003c/sup\u003e — Each routing category can carry an optional system prompt that is injected into the forwarded request, priming the target model for the specific task type (e.g. coding categories → \"You are an expert software engineer…\"; legal → \"You are a careful legal analyst…\"). 
Configurable per category in the admin UI; 10 built-in categories ship with sensible defaults\n- **8-tier cost hierarchy** — micro, minimal, low, medium, advanced, high, ultra, critical — with configurable cost mode (economy/balanced/quality) and explicit tier boost\n- **Test Route** — dry-run any prompt through the full routing pipeline and see a step-by-step trace of every decision (signal extraction, rule matching, classifier output, overrides, model selection)\n- **Synthetic Tests** \u003csup\u003eAI\u003c/sup\u003e — generate test prompts using AI, run them against the current routing config, and evaluate results with AI-powered analysis including quality and cost optimization suggestions\n- **Routing Debug Panel** — every auto-routed request in the log shows an expandable debug view with extracted signals, pre-routing status, classifier confidence, applied overrides, and final model selection\n- **Configurable override rules** — vision upgrade, tool call minimum tier, frustration detection, conversation turn escalation, domain gate, confidence fallback — each with description tooltips\n- **Multi-tenant isolation** — per-tenant API keys, model whitelists, rate limits, quotas, and independent routing configuration\n- **Cost tracking \u0026 savings** — real-time dashboard with spending, savings vs baseline, model distribution, and daily trends\n- **Guided tour** — interactive walkthrough highlighting key UI areas including the new routing debug and synthetic test features\n\n---\n\n## Table of Contents\n\n- [Key Features](#key-features)\n- [Requirements](#requirements)\n- [Local Development](#local-development)\n- [Environment Variables](#environment-variables)\n- [Architecture](#architecture)\n- [Gateway Usage](#gateway-usage)\n- [Production Deployment](#production-deployment)\n- [Operations \u0026 Maintenance](#operations--maintenance)\n- [Troubleshooting](#troubleshooting)\n- [Documentation](#documentation)\n\n---\n\n## Requirements\n\n| Dependency | Version 
| Notes |\n|---|---|---|\n| Node.js | 22+ | Backend + frontend build |\n| Docker | 24+ | MongoDB in dev, full stack in prod |\n| Docker Compose | v2 (`docker compose`) | Not `docker-compose` v1 |\n| [just](https://github.com/casey/just) | any | Optional but recommended command runner |\n| MongoDB | 7 | Provided via Docker Compose |\n\nInstall `just` on macOS:\n```bash\nbrew install just\n```\n\nOn Linux:\n```bash\ncurl --proto '=https' --tlsv1.2 -sSf https://just.systems/install.sh | bash -s -- --to /usr/local/bin\n```\n\n---\n\n## Local Development\n\n### 1. Clone and enter the repo\n\n```bash\ngit clone https://github.com/weisser-dev/open-model-prism\ncd open-model-prism\n```\n\n### 2. Copy environment file\n\n```bash\ncp .env.example .env\n```\n\nEdit `.env` — at minimum set `JWT_SECRET` and `ENCRYPTION_KEY`:\n\n```bash\nJWT_SECRET=your-random-secret-at-least-32-chars\nENCRYPTION_KEY=exactly-32-chars-here-!!-padding\n```\n\nGenerate secure values:\n```bash\n# JWT_SECRET\nopenssl rand -hex 32\n\n# ENCRYPTION_KEY (must be exactly 32 characters)\nopenssl rand -hex 16\n```\n\n### 3. Start everything\n\n```bash\njust dev\n```\n\nThis single command:\n- Starts MongoDB in Docker\n- Waits for MongoDB to be healthy\n- Starts the backend on **http://localhost:3000** (hot reload via nodemon)\n- Starts the frontend on **http://localhost:5173** (hot reload via Vite HMR)\n\nOpen **http://localhost:5173** and follow the 4-step setup wizard.\n\n\u003e The setup wizard creates your first admin user, connects your first provider, and sets up your first tenant. Takes about 2 minutes.\n\n### 4. 
Available dev commands\n\n```bash\njust dev          # Start MongoDB + backend + frontend with hot reload\njust dev-clean    # Wipe DB, fresh start (setup wizard reappears)\njust logs         # Tail backend logs\njust mongo        # Open MongoDB shell\njust build        # Build production Docker image\njust up           # Start via Docker Compose (production-like)\njust down         # Stop Docker Compose\njust clean        # Remove all containers, volumes, node_modules\n```\n\n### 5. Manual setup (without just)\n\n```bash\n# Terminal 1 — MongoDB\ndocker compose up -d mongodb\n\n# Terminal 2 — Backend\ncd server \u0026\u0026 npm install \u0026\u0026 npm run dev\n\n# Terminal 3 — Frontend\ncd frontend \u0026\u0026 npm install \u0026\u0026 npm run dev\n```\n\n---\n\n## Environment Variables\n\nAll configuration after initial setup lives in the admin UI. Only infrastructure-level settings need env vars:\n\n| Variable | Required | Default | Description |\n|---|---|---|---|\n| `JWT_SECRET` | **yes** | — | JWT signing secret. Min 32 chars. Rotating it invalidates all active sessions. |\n| `ENCRYPTION_KEY` | **yes** | — | AES-256-GCM key for encrypting provider API keys at rest. Exactly 32 chars. 
**Changing this breaks all saved provider credentials.** |\n| `MONGODB_URI` | no | `mongodb://mongodb:27017/openmodelprism` | MongoDB connection string |\n| `PORT` | no | `3000` | Backend HTTP port |\n| `NODE_ENV` | no | `development` | `production` enables structured JSON logs |\n| `NODE_ROLE` | no | `full` | `full` / `control` / `worker` — see [Horizontal Scaling](#horizontal-scaling) |\n| `CORS_ORIGINS` | no | `*` | Comma-separated allowed origins |\n| `LOG_LEVEL` | no | `info` | `debug` / `info` / `warn` / `error` |\n| `OFFLINE` | no | `false` | `true` disables all outbound internet (air-gapped mode) |\n\n---\n\n## Architecture\n\n**Single-pod (default)** — everything in one container:\n```\nClients  →  Model Prism (NODE_ROLE=full)  →  MongoDB\n                 Admin UI + Gateway + API\n```\n\n**Scaled deployment** — control plane + worker pods:\n```\nClients (Continue · Cursor · Claude Code · Open WebUI · SDK …)\n  │\n  ├─ /api/:tenant/v1/*  ───────────────────► Worker pods  (NODE_ROLE=worker, scale freely)\n  ├─ /api/v1/*  (default tenant shorthand) ┘\n  ├─ /v1/*      (no-prefix shorthand)      ┘\n  │\n  └─ /* (admin UI, /api/prism/admin/*, auth) ────► Control plane  (NODE_ROLE=control, 1–2 pods)\n\nAll pods share one MongoDB — config changes propagate via Change Streams (\u003c500ms)\nor 15s polling fallback on standalone MongoDB.\n```\n\n**Gateway request pipeline:**\n```\nPOST /api/{tenant}/v1/chat/completions\n  ├─ Tenant Auth (API key → SHA-256 hash, expiry, enabled check)\n  ├─ Per-Tenant Rate Limiting (sliding window, in-memory)\n  ├─ Model Policy (whitelist/blacklist gate — auto-prism always passes)\n  ├─ Signal Extraction (token count, keywords, system prompt, code language)\n  ├─ Override Rules (vision, domain gate, security escalation, budget cap, …)\n  ├─ [LLM Classifier — only called when pre-routing confidence \u003c threshold]\n  ├─ Model Selection (category → benchmark-weighted price-performance matching)\n  ├─ Context Pre-flight (token 
estimate vs context window → auto-upgrade)\n  ├─ max_tokens Clamping (auto-clamp to model output limit)\n  ├─ Budget Guard (auto-economy mode when threshold reached)\n  ├─ Provider Adapter (OpenAI-compat · Bedrock · Azure · Ollama)\n  ├─ Response Enrichment (cost_info, auto_routing, context_fallback)\n  └─ Async Analytics (RequestLog, DailyStat — fire-and-forget)\n```\n\n---\n\n## Gateway Usage\n\n```bash\n# Standard OpenAI-compatible request\ncurl https://your-host/api/my-team/v1/chat/completions \\\n  -H \"Authorization: Bearer omp-your-api-key\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\": \"gpt-4o\", \"messages\": [{\"role\": \"user\", \"content\": \"Hello\"}]}'\n\n# Auto-routing — Model Prism classifies and routes to the optimal model\ncurl ... -d '{\"model\": \"auto\", \"messages\": [...]}'\n```\n\nAuto-routing responses include:\n```json\n{\n  \"auto_routing\": { \"category\": \"code_generation\", \"model_id\": \"deepseek-coder-v2\", \"confidence\": 0.91 },\n  \"cost_info\": { \"actual_cost\": 0.0004, \"baseline_cost\": 0.0031, \"saved\": 0.0027 }\n}\n```\n\n**Client tool integration** — use the per-tenant **Generate Config** button in the admin UI for ready-to-paste snippets for Continue, OpenCode, Cursor, Claude Code, Open WebUI, Python/Node.js SDKs.\n\n---\n\n## Production Deployment\n\n### Docker Compose (single pod)\n\n```bash\ngit clone https://github.com/weisser-dev/open-model-prism\ncd open-model-prism\n\n# Create .env with real secrets\ncat \u003e .env \u003c\u003cEOF\nJWT_SECRET=$(openssl rand -hex 32)\nENCRYPTION_KEY=$(openssl rand -hex 16)\nNODE_ENV=production\nEOF\n\ndocker compose up -d\n```\n\nThe app is now running on port 3000. 
Put a reverse proxy (Caddy, nginx, Traefik) in front.\n\n**Caddy example:**\n```caddy\nprism.yourdomain.com {\n    reverse_proxy localhost:3000\n}\n```\n\n### Kubernetes (Helm)\n\nA Helm chart is included in the `helm/` directory:\n\n```bash\nhelm install model-prism ./helm \\\n  --namespace model-prism \\\n  --create-namespace\n\nkubectl port-forward svc/model-prism 3000:80 -n model-prism\n# → http://localhost:3000 — setup wizard\n```\n\nSee [`helm/README.md`](helm/README.md) for the full configuration reference.\n\n### First-run checklist\n\n- [ ] `JWT_SECRET` set to a random 32+ char string\n- [ ] `ENCRYPTION_KEY` set to exactly 32 characters — **write this down, cannot be changed later without re-entering all provider API keys**\n- [ ] MongoDB data directory persisted (bind mount or named volume)\n- [ ] Reverse proxy configured with TLS\n- [ ] Setup wizard completed (admin user, first provider, first tenant)\n\n---\n\n## Operations \u0026 Maintenance\n\n### Horizontal Scaling\n\n```bash\n# Scale worker pods (gateway only)\ndocker compose -f docker-compose.yml -f docker-compose.scaled.yml up --scale worker=3 -d\n```\n\nWorker pods (`NODE_ROLE=worker`) handle gateway traffic. The control pod (`NODE_ROLE=control`) runs the admin UI and analytics. 
All pods share one MongoDB.\n\n### Logs\n\n```bash\n# Docker\ndocker logs open-model-prism -f\n\n# Just\njust logs\n\n# Adjust log level without restart — via admin UI: System → Log Level\n```\n\n### MongoDB Backup \u0026 Restore\n\n**Manual backup:**\n```bash\ndocker exec open-model-prism-mongodb \\\n  sh -lc 'mongodump --archive --gzip' \u003e backup_$(date +%Y%m%d).archive.gz\n```\n\n**Restore:**\n```bash\ncat backup_20260101.archive.gz | docker exec -i open-model-prism-mongodb \\\n  sh -lc 'mongorestore --archive --gzip --drop'\n```\n\n### Upgrading\n\n```bash\ngit pull\ndocker compose build --no-cache\ndocker compose up -d\ndocker image prune -f\n```\n\nThe GitHub Actions workflow does this automatically on push to `main`.\n\n### Rotating JWT_SECRET\n\nAll active admin sessions will be invalidated immediately. Provider API keys are **not** affected.\n\n1. Stop the app\n2. Update `JWT_SECRET` in `.env`\n3. Start the app — all users must log in again\n\n### Rotating ENCRYPTION_KEY\n\n⚠️ **This is destructive.** All saved provider API keys will become undecryptable.\n\n1. Export all provider configs before rotating\n2. Update `ENCRYPTION_KEY`\n3. Restart — re-enter all provider API keys via the admin UI\n\n---\n\n## Troubleshooting\n\n**Setup wizard doesn't appear**\n→ The app detects an existing admin user. If starting fresh, run `just dev-clean` to wipe the DB.\n\n**Provider connection fails (\"Could not reach provider\")**\n→ Use the \"Check Connection\" button — it runs a detailed probe and shows exactly which URL/path failed. Common causes: wrong base URL (should not include `/v1`), wrong API key, SSL certificate issues (enable \"Skip SSL\" for self-signed certs).\n\n**\"Cannot read properties of undefined (reading 'toLowerCase')\" on Models page**\n→ A provider is missing the `providerName` field. Re-save the provider in the admin UI to trigger model re-discovery.\n\n**Charts show no data on dashboard**\n→ Analytics are written asynchronously. 
Wait for a few requests to complete, then refresh. If still empty, check `LOG_LEVEL=debug` for analytics errors.\n\n**Context overflow / model escalation not working**\n→ Ensure the provider's models have `contextWindow` values in the model registry. Models with no context window set will not trigger auto-upgrade.\n\n**LDAP login fails**\n→ Check the LDAP config in Settings → LDAP. The `bindDN` user needs read access to the user search base. Enable `debug` log level to see the full LDAP bind attempt.\n\n---\n\n## Documentation\n\n| Document | Contents |\n|---|---|\n| [docs/getting-started.md](docs/getting-started.md) | Installation, setup wizard, first tenant, production deployment |\n| [docs/architecture.md](docs/architecture.md) | System overview, request flow, directory structure, security, RBAC |\n| [docs/routing.md](docs/routing.md) | Full routing pipeline: signal extraction, override rules, LLM classifier, categories, presets, model selection |\n| [docs/providers.md](docs/providers.md) | Provider types, connection testing, model discovery, adapters, model registry |\n| [docs/tenants.md](docs/tenants.md) | Tenant config, API key lifecycle, model access control, routing config, self-service portal |\n| [docs/analytics.md](docs/analytics.md) | Cost tracking, token stats, dashboard metrics, Prometheus |\n| [docs/api-reference.md](docs/api-reference.md) | All gateway and admin API endpoints with request/response examples |\n| [docs/operations.md](docs/operations.md) | Production deployment: scaling, capacity planning, nginx config, security hardening, backup, upgrades |\n| [CHANGELOG.md](CHANGELOG.md) | Version history |\n\n---\n\n## License\n\nLicensed under the **Apache License 2.0**.\n\nSee [LICENSE](./LICENSE) for 
details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fweisser-dev%2Fopen-model-prism","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fweisser-dev%2Fopen-model-prism","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fweisser-dev%2Fopen-model-prism/lists"}