{"id":50718178,"url":"https://github.com/ThinkWatchProject/ThinkWatch","last_synced_at":"2026-06-26T22:00:39.113Z","repository":{"id":348741001,"uuid":"1199670940","full_name":"ThinkWatchProject/ThinkWatch","owner":"ThinkWatchProject","description":"Enterprise AI bastion host for secure AI API and MCP access, with unified proxying, RBAC, audit logs, rate limiting, and cost tracking across OpenAI, Anthropic, Gemini, and self-hosted LLMs.","archived":false,"fork":false,"pushed_at":"2026-05-27T08:49:35.000Z","size":4822,"stargazers_count":1011,"open_issues_count":0,"forks_count":19,"subscribers_count":10,"default_branch":"main","last_synced_at":"2026-05-27T09:07:52.669Z","etag":null,"topics":["ai","ai-gateway","ai-security","ai-tools","mcp","mcp-gateway","mcp-security","mcp-server","security"],"latest_commit_sha":null,"homepage":"https://thinkwat.ch","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ThinkWatchProject.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-02T15:29:52.000Z","updated_at":"2026-05-27T08:49:40.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/ThinkWatchProject/ThinkWatch","commit_stats":null,"previous_names":["agentbastion/agentbastion","thinkwatchproject/thinkwatch"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/ThinkWatchProject/ThinkWatch","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/re
positories/ThinkWatchProject%2FThinkWatch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ThinkWatchProject%2FThinkWatch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ThinkWatchProject%2FThinkWatch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ThinkWatchProject%2FThinkWatch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ThinkWatchProject","download_url":"https://codeload.github.com/ThinkWatchProject/ThinkWatch/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ThinkWatchProject%2FThinkWatch/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34834415,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-26T02:00:06.560Z","response_time":106,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","ai-gateway","ai-security","ai-tools","mcp","mcp-gateway","mcp-security","mcp-server","security"],"created_at":"2026-06-09T21:00:25.962Z","updated_at":"2026-06-26T22:00:39.106Z","avatar_url":"https://github.com/ThinkWatchProject.png","language":"Rust","funding_links":[],"categories":["LLMOps","*Ops for AI"],"sub_categories":["LLM Gateways \u0026 Proxies","LLMOps"],"readme":"\u003cp align=\"center\"\u003e\n  \u003cpicture\u003e\n    \u003csource 
media=\"(prefers-color-scheme: dark)\" srcset=\"assets/logo-dark.png\"\u003e\n    \u003cimg src=\"assets/logo.png\" alt=\"ThinkWatch\" width=\"480\"\u003e\n  \u003c/picture\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/Rust-000000?style=for-the-badge\u0026logo=rust\u0026logoColor=white\" /\u003e\n  \u003cimg src=\"https://img.shields.io/badge/React-20232A?style=for-the-badge\u0026logo=react\u0026logoColor=61DAFB\" /\u003e\n  \u003cimg src=\"https://img.shields.io/badge/PostgreSQL-316192?style=for-the-badge\u0026logo=postgresql\u0026logoColor=white\" /\u003e\n  \u003cimg src=\"https://img.shields.io/badge/Redis-DC382D?style=for-the-badge\u0026logo=redis\u0026logoColor=white\" /\u003e\n  \u003cimg src=\"https://img.shields.io/badge/Docker-2496ED?style=for-the-badge\u0026logo=docker\u0026logoColor=white\" /\u003e\n  \u003cimg src=\"https://img.shields.io/badge/Kubernetes-326CE5?style=for-the-badge\u0026logo=kubernetes\u0026logoColor=white\" /\u003e\n\u003c/p\u003e\n\n# ThinkWatch\n\n**[English](README.md) | [中文](README.zh-CN.md)**\n\n**The enterprise-grade secure gateway for AI.** Secure, audit, and govern every AI API call and MCP tool invocation across your organization — from a single control plane.\n\nJust as an SSH secure gateway is the single gateway through which all server access must flow, ThinkWatch is the single gateway through which all AI access must flow. Every model request. Every tool call. Every token. 
Authenticated, authorized, rate-limited, logged, and accounted for.\n\n```\n                    ┌──────────────────────────────────────┐\n Claude Code ──────\u003e│                                      │──\u003e OpenAI\n Cursor ───────────\u003e│    Gateway  :3000                    │──\u003e Anthropic\n Custom Agent ─────\u003e│    AI API + MCP Unified Proxy        │──\u003e Google Gemini\n CI/CD Pipeline ───\u003e│                                      │──\u003e Azure OpenAI / AWS Bedrock\n                    └──────────────────────────────────────┘\n                    ┌──────────────────────────────────────┐\n Admin Browser ────\u003e│    Console  :3001                    │\n                    │    Management UI + Admin API          │\n                    └──────────────────────────────────────┘\n```\n\n## Why ThinkWatch?\n\nAs AI agents proliferate across engineering teams, organizations face a growing governance challenge:\n\n- **API keys scattered everywhere** — hardcoded in `.env` files, shared in Slack, rotated never\n- **Zero visibility** — who used which model, how many tokens, at what cost?\n- **No access control** — every developer has direct access to every model and every MCP tool\n- **Compliance gaps** — no audit trail for AI-assisted code generation or data access\n- **Cost surprises** — monthly AI bills that nobody can explain or attribute\n\nThinkWatch solves all of this with a single deployment.\n\n## Key Features\n\n### AI API Gateway\n- **Multi-format API proxy** — natively serves OpenAI Chat Completions (`/v1/chat/completions`), Anthropic Messages (`/v1/messages`), and OpenAI Responses (`/v1/responses`) APIs on a single port; works as a drop-in replacement for Cursor, Continue, Cline, Claude Code, and the OpenAI/Anthropic SDKs\n- **Multi-provider routing** — OpenAI, Anthropic, Google Gemini, Azure OpenAI, AWS Bedrock, or any OpenAI-compatible endpoint\n- **Automatic format conversion** — Anthropic Messages API, Google Gemini, Azure OpenAI, 
AWS Bedrock Converse API, and more, all behind a unified interface\n- **Provider auto-loading** — active providers are loaded from the database at startup and registered in the model router; default model prefixes (`gpt-`/`o1-`/`o3-`/`o4-` for OpenAI, `claude-` for Anthropic, `gemini-` for Google) route automatically; Azure and Bedrock require explicit model registration\n- **Streaming SSE pass-through** — zero-overhead forwarding with real-time token counting\n- **Virtual API keys** — issue scoped `tw-` keys; the same `tw-` token works on both the AI gateway and the MCP gateway via a per-key `surfaces` allowlist\n- **API key lifecycle management** — automatic rotation with grace periods, per-key inactivity timeout, expiry warnings, and background policy enforcement\n- **Composable rate limits \u0026 budgets** — multi-window sliding limits (1m / 5m / 1h / 5h / 1d / 1w) and natural-period token budgets (daily / weekly / monthly), keyed per user, per API key, per provider, or per MCP server. See [Rate limits \u0026 budgets](#rate-limits--budgets) below\n- **Per-model token weighting** — gpt-4o tokens can count more than gpt-3.5 tokens against the same quota via configurable `input_multiplier` / `output_multiplier`\n- **Circuit breaker** — three-state (Closed/Open/HalfOpen) circuit breaker with configurable failure threshold and recovery period\n- **Retry with exponential backoff** — configurable retries with jitter for network errors and upstream rate limits\n- **Real-time cost tracking** — per-model pricing with team attribution\n\n### MCP Gateway\n\nThinkWatch's MCP gateway is built on a single design choice that most MCP proxies skip: **the upstream server sees the real end user, not a shared service account.** Every other capability follows from that. 
See [MCP Gateway: how we compare](#mcp-gateway-how-we-compare) below.\n\n- **Per-user upstream identity** — every MCP request carries the calling user's own OAuth token or PAT to GitHub / Notion / Linear / Slack / Atlassian / Feishu / GitLab / Cloudflare / Google / Discord / etc. Tokens are AES-256-GCM encrypted in `mcp_user_credentials`. Most \"MCP gateways\" pin one shared admin token to the server config — so the upstream's audit log shows every action as the same service account. ThinkWatch propagates real identity end to end.\n- **Multi-account per user** — bind work and personal GitHub accounts to the same server, label them, mark one as default. The same physical user can have multiple credential rows per server.\n- **API-key → account override** — pin different `tw-` keys to different upstream accounts on the same server. Your Cursor key uses your personal GitHub; the CI key uses the service-bot. One user, multiple agents, multiple identities — without re-issuing credentials.\n- **One-paste OAuth onboarding** — paste an MCP URL, click 一键发现 (\"one-click discover\"). The probe walks the full RFC 9728 → RFC 8414 → RFC 7591 chain: triggers `WWW-Authenticate` from a JSON-RPC `initialize`, follows the `resource_metadata` hint, fetches AS metadata at the path-aware well-known location, and runs Dynamic Client Registration if the upstream advertises it. When DCR isn't supported the UI shows three concrete next steps (copy callback URL → register app upstream → paste Client ID back) with no protocol jargon.\n- **Public-client support** — detects `token_endpoint_auth_methods_supported: [\"none\"]` and propagates `is_public_client` end to end. The Client Secret input is hidden for issuers like Feishu that don't use one.\n- **Static-token vault** — for upstreams that only speak PATs / API keys (GitHub PATs, Notion integration tokens). Same per-user surface, same encrypted storage, same /connections UX. 
Static tokens are verified at paste time so users find out immediately if the token is wrong.\n- **Per-user tool catalogs** — when an upstream filters tool visibility by scope or role (Atlassian, enterprise IDPs), the user-authenticated `tools/list` is cached in `mcp_user_tools` and **only ever returned to that user**. The system-level `mcp_tools` catalog only stores anonymous-discoverable tools. No cross-user leakage, and auth-required servers no longer sit at \"0 tools\" until someone manually fixes them.\n- **Three-tier upstream subject resolution** — `/connections` shows real upstream identities (`@octocat`, `alice@acme.com`, Slack `Bob`). Resolver tries JWT decode (free) → userinfo endpoint (priority-ranked extractor: `preferred_username` → `sub` → `accountId` → `login` → `email`) → `.well-known` discovery. Pre-seeded for GitHub, Notion, Slack, Atlassian, Cloudflare, GitLab, Discord, Google.\n- **MCP Store with 23+ curated templates** — GitHub, Notion, Linear, Slack, Atlassian, Cloudflare, GitLab, Discord, Google, Feishu and more, pre-seeded with the right OAuth scopes, userinfo endpoints, and PAT help URLs. One-click install. Daily catalog refresh from the registry.\n- **Generic MCP client UX** — for users who haven't authorized yet, the gateway still serves the tool catalog but tags every entry with `_meta: { requires_user_auth: true, server_id, server_name, authorize_url }`. `tools/call` against an unauthorized server returns JSON-RPC error code `-32050` with the authorize URL, so Cursor / Claude Desktop / any compliant MCP client can prompt the user to authorize without the gateway hiding the catalog.\n- **Tool-level RBAC** — per-role tool grants on the server side, per-key `allowed_mcp_tools` allowlist on the API-key side (bounded by the issuing role's grants). A locked-down service key can hold exactly two tools and nothing else.\n- **`mcp:connect` permission** — gates the /connections page and authorize/revoke flow. 
Granted to admin / team_manager / developer by default.\n- **Cache scoped by `(user, account_label)`** — MCP response cache never serves Alice's authorized response to Bob. Direct-mode (no per-user creds) servers still get global caching.\n- **Race-free token refresh** — OAuth refresh holds a `pg_advisory_xact_lock` keyed by `(server, user, label)` so concurrent tool calls don't race two refresh attempts. Terminal refresh failure purges the row so the next call cleanly surfaces `NeedsUserCredentials`.\n- **Health probe robustness** — 401/403 from an anonymous probe is *expected* on auth-required MCPs; the server is marked `auth_required` (amber), not `disconnected` (red). The /mcp/servers list shows \"—\" tool count with a hover tooltip for that state.\n- **Step-by-step registration wizard** — auth-mode-aware edit form, per-credential Test Connection button on /connections, admin foot-gun guards (verify static tokens at paste time, no silent fall-through to default account).\n- **SSRF hardening** — discovery and OAuth probe URLs are validated through an injected URL validator; private IP ranges, link-local, and metadata-service hosts are rejected.\n- **Namespace isolation** — `github__create_issue`, `postgres__query` — no tool name collisions across upstreams.\n- **Connection pooling \u0026 health monitoring** — automatic reconnection, periodic background probes, per-server health surfaced on the dashboard.\n- **Full audit trail** — every tool invocation logged with user, account label, parameters, response, latency, and error in ClickHouse alongside the AI gateway logs.\n- **Rate limits + budgets apply to MCP** — the same engine that meters AI tokens also meters MCP tool calls; per-user, per-API-key, per-server subjects all stack. 
See [Rate limits \u0026 budgets](#rate-limits--budgets).\n- **One key, two surfaces** — the same `tw-` virtual key works on both `/v1/chat/completions` and `/mcp` via a per-key `surfaces` allowlist (`ai_gateway`, `mcp_gateway`, or both).\n\n### MCP Gateway: how we compare\n\nMost \"MCP gateways\" available today are thin reverse proxies: one shared admin token per upstream, no end-user identity, and \"auth\" means \"did this user pass the gateway's bearer token\". That model works for hobby setups and breaks the moment a real organization plugs it into GitHub / Atlassian / Linear / Slack — every tool call shows up as the same service account, scopes can't differ per user, and there's no honest answer to \"who renamed this Linear ticket?\".\n\nThinkWatch is built for the second case.\n\n| Capability | Typical MCP proxy | ThinkWatch |\n|---|---|---|\n| **Upstream sees the real user** | ❌ shared admin token / env var | ✅ per-user OAuth tokens + PAT vault, AES-256-GCM encrypted at rest |\n| **Multi-account per user** | ❌ one config = one identity | ✅ work + personal accounts, labelled, default + named |\n| **API key → account binding** | ❌ keys are opaque | ✅ Cursor → personal, cron → service-bot, all on the same user |\n| **OAuth onboarding** | ❌ hand-edit JSON / env | ✅ paste URL, one-click DCR (RFC 9728 → 8414 → 7591), public-client support |\n| **Per-user tool visibility** | ❌ assumes uniform catalog (privilege-escalation if cached) | ✅ separate `mcp_user_tools` per user, system catalog only holds anonymous-discoverable tools |\n| **Generic MCP client UX (Cursor/Claude Desktop)** | ❌ unauthorized = blank list | ✅ catalog returned with `_meta.requires_user_auth` markers + `-32050` with `authorize_url` |\n| **Tool-level RBAC** | ❌ all-or-nothing | ✅ per-role grants + per-key `allowed_mcp_tools` allowlist bounded by role |\n| **Built-in catalog** | ❌ DIY everything | ✅ 23+ templates seeded (GitHub / Notion / Linear / Slack / Atlassian / Cloudflare / GitLab / Discord / 
Google / Feishu …) |\n| **Audit / rate limits / budgets** | ❌ LLM-only or absent | ✅ same engine meters AI tokens AND MCP tool calls |\n| **Response cache safety** | ❌ shared cache leaks across users | ✅ scoped by `(user, account_label)` for OAuth/PAT servers |\n| **OAuth refresh races** | ❌ duplicate refresh attempts under concurrency | ✅ `pg_advisory_xact_lock` per `(server, user, label)` |\n| **Health classification** | ❌ 401/403 = \"unhealthy\" (false alarms) | ✅ `auth_required` is a first-class amber state |\n| **SSRF protection** | ❌ raw fetcher | ✅ injected URL validator, private/link-local/metadata IPs rejected |\n| **One key, two surfaces** | ❌ separate stacks for AI vs MCP | ✅ single `tw-` key, per-key `surfaces` allowlist |\n\nIf your only requirement is \"expose a few public MCP servers to a small team\", the simple proxies do fine. The moment you need *who did what, on whose behalf, with what scopes, billed to which cost center* — ThinkWatch is the design point.\n\n### Security \u0026 Compliance\n- **Dual-port architecture** — gateway (public-facing) and console (internal-only) on separate ports\n- **Role-based access control** — 5-tier RBAC: Super Admin, Admin, Team Manager, Developer, Viewer\n- **SSO/OIDC** — plug into Zitadel, Okta, Azure AD, or any OIDC-compliant provider\n- **AES-256-GCM encryption** — provider API keys and secrets encrypted at rest\n- **SHA-256 key hashing** — virtual API keys stored as hashes; plaintext shown exactly once\n- **Content Security Policy** — CSP headers on the console port to prevent XSS and injection attacks\n- **JWT entropy enforcement** — minimum 32-character secret with entropy validation at startup\n- **Startup dependency validation** — verifies PostgreSQL, Redis, and encryption key availability with clear error messages before accepting traffic\n- **Security headers** — X-Content-Type-Options, X-Frame-Options, CORS whitelisting, request timeouts\n- **Soft-delete** — users, providers, and API keys use 
soft-delete (`deleted_at` column) with automatic purge after 30 days\n- **Password complexity** — minimum 8 characters with required uppercase, lowercase, and digit\n- **Session IP binding** — admin sessions bound to client IP; stolen tokens cannot be replayed from a different network\n- **Distroless containers** — minimal attack surface in production (2MB runtime image, no shell)\n\n### Operations \u0026 Configuration\n- **Dynamic configuration** — most settings stored in database (`system_settings` table), configurable via Web UI (Admin \u003e Settings with 7 category tabs)\n- **First-run setup wizard** — guided `/setup` wizard creates the super_admin account, configures the site, and optionally adds the first provider and API key\n- **Configuration Guide** — built-in `/gateway/guide` page in the web console with copy-paste setup instructions for Claude Code, Cursor, Continue, Cline, OpenAI SDK, Anthropic SDK, and cURL; auto-detects the gateway URL\n- **Multi-instance sync** — configuration changes propagated across instances via Redis Pub/Sub\n- **Data retention policies** — configurable retention periods for usage records and audit logs with automatic daily purge\n\n### Observability\n- **Prometheus metrics** — `GET /metrics` endpoint on the gateway port (3000) exposing `gateway_requests_total`, `gateway_request_duration_seconds`, `gateway_tokens_total`, `gateway_rate_limited_total`, `circuit_breaker_state`, `gateway_stream_completion_total`, `audit_log_dropped_total`, and more. **Disabled by default** — set `METRICS_BEARER_TOKEN` (the secret-generation script populates it automatically) to mount the route, then pass the same value as `Authorization: Bearer \u003ctoken\u003e` from your scraper. 
When unset, the route returns 404 and the recorder isn't even installed (zero memory / CPU cost).\n- **Enhanced health checks** — `/health/live` (liveness probe), `/health/ready` (readiness probe verifying PostgreSQL, Redis, **and at least one active provider** — so K8s won't route AI traffic to a fresh pod with an empty router), `/api/health` (detailed latency and pool statistics)\n- **ClickHouse-powered audit logs** — SQL-queryable audit logs across all API calls and tool invocations, stored in ClickHouse for high-performance columnar analytics\n- **Audit log forwarding** — multi-channel delivery: UDP/TCP Syslog (RFC 5424), Kafka, and HTTP webhooks — route audit events to any SIEM, data lake, or alerting pipeline\n- **Usage analytics** — token consumption by user, team, model, and time period\n- **Cost analytics** — MTD spend, budget utilization, per-model cost breakdown\n- **Health dashboard** — real-time status of PostgreSQL, Redis, ClickHouse, and all MCP servers\n- **Unified log explorer** — search across audit, gateway, MCP, access, and platform logs from a single page with structured query syntax\n\n## Rate limits \u0026 budgets\n\nThinkWatch enforces two parallel kinds of quota at every gateway request,\nboth managed from the same admin UI:\n\n| | Sliding-window rate limits | Natural-period budget caps |\n|---|---|---|\n| **What it counts** | Requests OR weighted tokens, depending on the rule's `metric` | Weighted tokens only |\n| **Window shape** | Rolling 60-bucket window: `1m / 5m / 1h / 5h / 1d / 1w` | Calendar-aligned: `daily / weekly / monthly` (resets on the period boundary) |\n| **Backing store** | Redis ZSET-style buckets | Redis INCR counters keyed by `subject:period:bucket_id` |\n| **When it fires** | Pre-flight (requests metric) AND post-flight (tokens metric) | Post-flight only |\n| **Hard or soft?** | Hard for requests metric, soft for tokens metric | Soft cap — exactly one request can push you over before subsequent calls in the same period 
are rejected |\n\n### Subjects\n\nA single request can be subject to multiple rules and budgets at once. The\nengine resolves the request to a set of `(subject_kind, subject_id)` tuples\nand runs every enabled rule against all of them in one atomic Lua check.\n**Any rule rejecting → the request is rejected. All-or-nothing INCR.**\n\n| Subject | Rate limit rules | Budget caps |\n|---|---|---|\n| `user`        | ✅ ai_gateway / mcp_gateway | ✅ |\n| `api_key`     | ✅ ai_gateway / mcp_gateway | ✅ |\n| `provider`    | ✅ ai_gateway only          | ✅ |\n| `mcp_server`  | ✅ mcp_gateway only         | ❌ (no token cost concept) |\n| `team`        | (use user / api_key)        | ✅ |\n\nFor an AI request the engine resolves: `api_key + user + provider`. For an\nMCP request: `user + mcp_server`. Per-subject limits stack — a developer\ncan have a personal cap, AND their API key can have a tighter cap, AND the\nprovider can have a global cap, all enforced simultaneously.\n\n### Three flavors of \"tokens\"\n\nThree numbers float around the system. Don't confuse them.\n\n| Number | Source | Used for | Where it shows up |\n|---|---|---|---|\n| **Raw tokens** | `gateway_logs.input_tokens / output_tokens` | Real provider-billed token counts | Analytics, cost reports |\n| **Weighted tokens** | `raw × models.input_multiplier / output_multiplier` | Quota accounting (rate limits + budgets) | Limits panel \"X / Y used\" |\n| **USD cost** | `raw × models.input_price / output_price` | Billing | Costs page |\n\nThe two `models` columns are independent. Weighted tokens are a *relative*\nunit (gpt-3.5-turbo = 1.0 by convention); they have no global USD value.\nUSD always comes from the real per-token price. By default every model has\nmultiplier `1.0`, which means quotas count raw tokens. 
Tune the multipliers\non the model management page to make a 1M-token monthly cap actually\nsurvive a single gpt-4o burst.\n\n### Example\n\n\u003e *Operator goal:* \"developers get 60 requests/minute on the AI gateway,\n\u003e 1M weighted tokens/day, and 20M weighted tokens/month — but the entire\n\u003e OpenAI provider has a 100k requests/hour ceiling.\"\n\n```\nOn the developer USER subject:\n  rate_limit_rule  ai_gateway / requests / 60s   → 60\n  rate_limit_rule  ai_gateway / tokens   / 1d    → 1_000_000\n  budget_cap       monthly                       → 20_000_000\n\nOn the OpenAI PROVIDER subject:\n  rate_limit_rule  ai_gateway / requests / 1h    → 100_000\n```\n\nA request from any developer key against gpt-4o then has to clear:\n1. Developer's per-minute request rule\n2. OpenAI provider's per-hour request rule\n3. After the response: developer's per-day token rule\n4. After the response: developer's monthly token budget\n\nAny one of those failing → 429 with the rule label in the body\n(`user:requests/1m`, `provider:requests/1h`, etc).\n\n### Failure mode\n\nWhen Redis is unavailable the engine defaults to **fail open** and bumps\nthe `gateway_rate_limiter_fail_open_total` / `gateway_budget_fail_open_total`\nmetrics so the AI control plane keeps running through a Redis blip.\nOperators who would rather refuse traffic than miss accounting can flip\n`security.rate_limit_fail_closed = true` on the Settings page; the\ngateway then returns 429 (`rate_limiter_unavailable`) for any request\nthe engine couldn't check, and bumps `gateway_rate_limiter_fail_closed_total`.\n\n### Budget alerts\n\nCrossing 50% / 80% / 95% / 100% of any budget cap fires a structured\n`budget threshold crossed` warn log and bumps\n`gateway_budget_alert_total{subject_kind, period, threshold_pct}`.\nEach threshold fires at most once per period bucket — if a request\ntakes you from 60% straight past 100% the 80 / 95 / 100 lines all\nfire on that single response, but the next request in the 
same\nperiod won't re-fire any of them.\n\n### Streaming token accounting\n\nToken-metric rules and budget caps fire on streaming responses too,\nprovided the upstream actually surfaces usage on the SSE stream:\n\n- **OpenAI**: requires the client to set\n  `stream_options.include_usage = true` on the request body.\n- **Anthropic**: cumulative usage on the final `message_delta` event\n  is captured automatically.\n\nIf neither upstream surfaces usage on the stream the post-flight\naccounting silently no-ops for that request — the rate-limit and\nbudget counters stay accurate within the limits of what the\nupstream is willing to tell us.\n\n### PII redaction and streaming responses\n\nThe PII redactor (configured at Admin \u003e Settings \u003e PII patterns)\nruns on every prompt before it's forwarded upstream — emails,\nphone numbers, ID cards etc. are replaced with `{{EMAIL_xxx_1}}`\nstyle placeholders so the upstream never sees the original. On\n**non-streaming** responses the gateway then runs `restore_response`\non the way back, so the client sees the original PII the model\nwould have echoed.\n\nOn **streaming** (SSE) responses the gateway does NOT restore the\nplaceholders — re-stitching them across chunk boundaries is its own\nproject. As a result, streaming clients see the placeholder text\nverbatim if the model echoes user PII back in its answer. The\nprompt-side redaction still happens, so the upstream provider\nnever sees the original PII either way; this is purely a\nclient-side cosmetic gap on streaming responses. 
Switch the client\nto non-streaming if it needs the original text restored.\n\n## Tech Stack\n\n| Layer | Technology |\n|-------|-----------|\n| Backend | Rust, Axum 0.8, SQLx 0.8, fred 10 (Redis), OpenTelemetry |\n| Frontend | React 19, TypeScript 6, Vite 8, shadcn/ui, Tailwind CSS 4 |\n| Database | PostgreSQL 18 |\n| Cache \u0026 Rate Limiting | Redis 8 |\n| Audit Log Storage | ClickHouse (columnar OLAP database) |\n| SSO | Zitadel (or any OIDC provider) |\n| Containers | Distroless (2MB runtime), Helm Chart for K8s |\n\n## Quick Start\n\n```bash\n# 1. Start infrastructure\nmake infra\n\n# 2. Generate dev secrets + start backend (gateway :3000 + console :3001)\nmake dev-secrets       # writes .env from .env.example with random secrets\nmake dev-backend\n\n# 3. Start frontend dev server\ncd web \u0026\u0026 pnpm install \u0026\u0026 pnpm dev\n\n# 4. Complete the setup wizard at http://localhost:5173/setup\n```\n\nSee the **[Deployment Guide](https://thinkwat.ch/docs/deployment-guide)** for production setup with Docker Compose or Kubernetes.\n\n## Documentation\n\nFull documentation: **[thinkwat.ch/docs](https://thinkwat.ch/docs)**\n\n| Document | Description |\n|----------|-------------|\n| **[Architecture](https://thinkwat.ch/docs/architecture)** | System design, dual-port model, data flow diagrams |\n| **[Deployment Guide](https://thinkwat.ch/docs/deployment-guide)** | Docker Compose, Kubernetes Helm, SSL, production hardening |\n| **[Configuration](https://thinkwat.ch/docs/configuration)** | All environment variables and their effects |\n| **[API Reference](https://thinkwat.ch/docs/api-reference)** | Complete endpoint documentation for Gateway and Console |\n| **[Security](https://thinkwat.ch/docs/security)** | Auth model, encryption, RBAC, threat model, hardening checklist |\n| **[Secret Rotation](https://thinkwat.ch/docs/secret-rotation)** | Rotating provider keys, JWT secrets, and admin credentials |\n\n## Port Architecture\n\n| Port | Server | Exposure | 
Purpose |\n|------|--------|----------|---------|\n| `3000` | Gateway | **Public** — expose to AI clients | `/v1/chat/completions`, `/v1/messages`, `/v1/responses`, `/v1/models`, `/mcp`, `/metrics`†, `/health/*` |\n| `3001` | Console | **Internal** — behind VPN/firewall | `/api/*` management endpoints, Web UI |\n\n† `/metrics` is only mounted when `METRICS_BEARER_TOKEN` is set. Without the env var the route returns 404 and the Prometheus recorder isn't installed.\n\n\u003e In production, **only port 3000** should be reachable from the internet. Port 3001 should be restricted to your admin network.\n\n## Project Structure\n\n```\nThinkWatch/\n├── crates/\n│   ├── server/          # Dual-port Axum server (gateway + console)\n│   ├── gateway/         # AI API proxy: routing, streaming, rate limiting, cost tracking\n│   ├── mcp-gateway/     # MCP proxy: JSON-RPC, tool aggregation, access control\n│   ├── auth/            # JWT, OIDC, API key, password hashing, RBAC\n│   └── common/          # Config, DB, models, crypto, validation, audit logger\n├── db/                  # Declarative PostgreSQL schema (schema.sql + seeds.sql)\n├── web/                 # React frontend — ~20 page components\n├── deploy/\n│   ├── docker/          # Dockerfile.server (distroless), Dockerfile.web (nginx)\n│   ├── docker-compose.yml       # Production deployment\n│   ├── docker-compose.dev.yml   # Development (PG + Redis + ClickHouse + Zitadel)\n│   └── helm/think-watch/      # Kubernetes Helm chart\n└── ...\n```\n\n\u003e Documentation: **[thinkwat.ch/docs](https://thinkwat.ch/docs)**\n\n## Contributing\n\nContributions are welcome. Please open an issue to discuss before submitting a PR for major changes.\n\n## License\n\nThinkWatch is source-available under the [Business Source License 1.1](LICENSE).\nNon-production use is free. 
Production use is free up to both `10,000,000`\nBillable Tokens and `10,000` MCP Tool Calls per UTC calendar month; above\neither threshold, a commercial license is required and priced by usage tiers.\n\nSee [LICENSING.md](LICENSING.md) for the production-use thresholds, the\nBillable Token and MCP Tool Call definitions, the tiering model, and the\nchangeover to `GPL-2.0-or-later`.\n\n## Star History\n\n\u003ca href=\"https://www.star-history.com/#ThinkWatchProject/ThinkWatch\u0026Date\"\u003e\n \u003cpicture\u003e\n   \u003csource media=\"(prefers-color-scheme: dark)\" srcset=\"https://api.star-history.com/svg?repos=ThinkWatchProject/ThinkWatch\u0026type=Date\u0026theme=dark\" /\u003e\n   \u003csource media=\"(prefers-color-scheme: light)\" srcset=\"https://api.star-history.com/svg?repos=ThinkWatchProject/ThinkWatch\u0026type=Date\" /\u003e\n   \u003cimg alt=\"Star History Chart\" src=\"https://api.star-history.com/svg?repos=ThinkWatchProject/ThinkWatch\u0026type=Date\" /\u003e\n \u003c/picture\u003e\n\u003c/a\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FThinkWatchProject%2FThinkWatch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FThinkWatchProject%2FThinkWatch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FThinkWatchProject%2FThinkWatch/lists"}