{"id":47894896,"url":"https://github.com/fazer-ai/langfuse-proxy","last_synced_at":"2026-04-04T03:38:08.656Z","repository":{"id":344955340,"uuid":"1183490747","full_name":"fazer-ai/langfuse-proxy","owner":"fazer-ai","description":"Lightweight Bun + Elysia proxy for OpenAI-compatible APIs with Langfuse telemetry","archived":false,"fork":false,"pushed_at":"2026-03-17T02:23:16.000Z","size":140,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-17T13:24:24.335Z","etag":null,"topics":["anthropic","bun","elysia","gemini","langfuse","llm","observability","openai","proxy","telemetry"],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fazer-ai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-16T16:56:05.000Z","updated_at":"2026-03-17T02:22:57.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/fazer-ai/langfuse-proxy","commit_stats":null,"previous_names":["fazer-ai/langfuse-proxy"],"tags_count":3,"template":false,"template_full_name":"fazer-ai/bun-elysia-react-tailwind","purl":"pkg:github/fazer-ai/langfuse-proxy","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fazer-ai%2Flangfuse-proxy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fazer-ai%2Flangfuse-proxy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fazer-ai%2Flangfuse-proxy/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fazer-ai%2Flangfuse-proxy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fazer-ai","download_url":"https://codeload.github.com/fazer-ai/langfuse-proxy/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fazer-ai%2Flangfuse-proxy/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31387017,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-04T01:22:39.193Z","status":"online","status_checked_at":"2026-04-04T02:00:07.569Z","response_time":60,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anthropic","bun","elysia","gemini","langfuse","llm","observability","openai","proxy","telemetry"],"created_at":"2026-04-04T03:38:08.096Z","updated_at":"2026-04-04T03:38:08.646Z","avatar_url":"https://github.com/fazer-ai.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# langfuse-proxy\n\nA transparent proxy that forwards API requests to upstream LLM providers and sends telemetry to [Langfuse](https://langfuse.com) in the background. Zero latency overhead on the response path.\n\nSupports **OpenAI**, **Anthropic**, and **Google Gemini** APIs natively.\n\n## Architecture\n\n```sh\n                              +--\u003e Upstream OpenAI    (/v1/*)\nConsumer  --\u003e  Proxy  --------+--\u003e Upstream Anthropic (/v1/messages)\n                |             +--\u003e Upstream Gemini    (/v1beta/*)\n                |\n                v (background, non-blocking)\n             Langfuse\n```\n\n**How it works:**\n\n1. Consumer sends a standard API request to the proxy\n2. Proxy forwards it to the appropriate upstream provider\n3. Upstream response stream is split via `ReadableStream.tee()` — one branch goes to the consumer immediately, the other is consumed in the background for telemetry\n4. Langfuse receives a trace with full input/output, model, token usage, TTFB, and total duration\n\n**Key features:**\n\n- **Multi-provider** — native support for OpenAI, Anthropic, and Gemini APIs with provider-specific stream parsing and telemetry\n- **Passthrough auth** — consumers send their own API key, proxy forwards it upstream. No user management.\n- **OpenAI catch-all** — `ALL /v1/*` forwards any OpenAI-compatible request. Chat completions, embeddings, audio, images, assistants — all work automatically.\n- **Streaming support** — SSE streams are split and returned immediately. For OpenAI, the proxy injects `stream_options.include_usage` so Langfuse always gets token counts.\n- **Full telemetry** — every request is logged to Langfuse with input messages, output content, model, full token usage breakdown, TTFB, and total duration.\n- **Optional auth gate** — set `PROXY_API_KEY` to require consumers to authenticate with the proxy itself (timing-safe comparison).\n- **Upstream key override** — set `UPSTREAM_API_KEY` / `ANTHROPIC_API_KEY` / `GEMINI_API_KEY` to use a single key for all upstream requests regardless of what consumers send.\n- **Graceful shutdown** — SIGTERM/SIGINT stops accepting connections, waits for in-flight requests, and flushes Langfuse before exiting.\n\n## Getting Started\n\n**Prerequisites:** [Bun](https://bun.sh/) v1.0+\n\n```bash\n# Install dependencies\nbun install\n\n# Configure environment\ncp .env.example .env\n```\n\nEdit `.env` with your settings. At minimum, configure Langfuse credentials to enable telemetry:\n\n```env\nLANGFUSE_BASE_URL=https://cloud.langfuse.com\nLANGFUSE_PUBLIC_KEY=pk-lf-...\nLANGFUSE_SECRET_KEY=sk-lf-...\n```\n\nStart the server:\n\n```bash\n# Development (hot reload)\nbun dev\n\n# Production\nbun start\n```\n\n## Usage\n\n### OpenAI\n\nPoint any OpenAI-compatible SDK at the proxy:\n\n#### Python\n\n```python\nfrom openai import OpenAI\n\nclient = OpenAI(\n    base_url=\"http://localhost:3000/v1\",\n    api_key=\"sk-your-openai-key\",  # forwarded to upstream\n)\n\nresponse = client.chat.completions.create(\n    model=\"gpt-4o-mini\",\n    messages=[{\"role\": \"user\", \"content\": \"Hello!\"}],\n)\n```\n\n#### TypeScript / Node.js\n\n```typescript\nimport OpenAI from \"openai\";\n\nconst client = new OpenAI({\n  baseURL: \"http://localhost:3000/v1\",\n  apiKey: \"sk-your-openai-key\",\n});\n\nconst response = await client.chat.completions.create({\n  model: \"gpt-4o-mini\",\n  messages: [{ role: \"user\", content: \"Hello!\" }],\n});\n```\n\n### Anthropic\n\nUse the Anthropic SDK pointed at the proxy:\n\n#### Python\n\n```python\nfrom anthropic import Anthropic\n\nclient = Anthropic(\n    base_url=\"http://localhost:3000\",\n    api_key=\"sk-ant-your-key\",  # forwarded to upstream\n)\n\nmessage = client.messages.create(\n    model=\"claude-sonnet-4-20250514\",\n    max_tokens=1024,\n    messages=[{\"role\": \"user\", \"content\": \"Hello!\"}],\n)\n```\n\n#### TypeScript / Node.js\n\n```typescript\nimport Anthropic from \"@anthropic-ai/sdk\";\n\nconst client = new Anthropic({\n  baseURL: \"http://localhost:3000\",\n  apiKey: \"sk-ant-your-key\",\n});\n\nconst message = await client.messages.create({\n  model: \"claude-sonnet-4-20250514\",\n  max_tokens: 1024,\n  messages: [{ role: \"user\", content: \"Hello!\" }],\n});\n```\n\n### Gemini\n\nSend requests to the `/v1beta/*` endpoints:\n\n```bash\ncurl \"http://localhost:3000/v1beta/models/gemini-2.0-flash:generateContent\" \\\n  -H \"x-goog-api-key: your-gemini-key\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"contents\":[{\"parts\":[{\"text\":\"Hello!\"}]}]}'\n```\n\n### curl (OpenAI)\n\n```bash\n# Non-streaming\ncurl http://localhost:3000/v1/chat/completions \\\n  -H \"Authorization: Bearer sk-your-openai-key\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\":\"gpt-4o-mini\",\"messages\":[{\"role\":\"user\",\"content\":\"Hello!\"}]}'\n\n# Streaming\ncurl http://localhost:3000/v1/chat/completions \\\n  -H \"Authorization: Bearer sk-your-openai-key\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\":\"gpt-4o-mini\",\"stream\":true,\"messages\":[{\"role\":\"user\",\"content\":\"Hello!\"}]}'\n```\n\n### Health check\n\n```bash\ncurl http://localhost:3000/api/health\n```\n\n## Endpoints\n\n| Endpoint           | Description                                                      |\n| ------------------ | ---------------------------------------------------------------- |\n| `ALL /v1/messages` | Anthropic pass-through — forwards to Anthropic API               |\n| `ALL /v1beta/*`    | Gemini pass-through — forwards to Gemini API                     |\n| `ALL /v1/*`        | OpenAI catch-all — forwards any request to upstream provider     |\n| `GET /api/health`  | Health check — returns app version and per-provider reachability |\n\n\u003e Routes are matched in order: `/v1/messages` is matched before the `/v1/*` catch-all, so Anthropic requests are routed correctly.\n\nThe health endpoint returns per-provider status:\n\n```json\n{\n  \"name\": \"langfuse-proxy\",\n  \"version\": \"0.0.0\",\n  \"status\": \"ok\",\n  \"upstream\": {\n    \"openai\": \"ok\",\n    \"anthropic\": \"ok\",\n    \"gemini\": \"not_configured\"\n  }\n}\n```\n\n- `status` is `\"degraded\"` if OpenAI is unreachable or any configured provider has errors\n- Anthropic and Gemini show `\"not_configured\"` if their API key is not set\n\n## Langfuse Telemetry\n\nEvery proxied request creates a Langfuse trace with:\n\n- **Trace**: request path, input messages, output content, HTTP metadata\n- **Generation**: model name, full input/output, token usage with detailed breakdowns, timing\n\nThe `usageDetails` field includes the full OpenAI token breakdown:\n\n| Field                     | Description                                  |\n| ------------------------- | -------------------------------------------- |\n| `input`                   | Non-cached prompt tokens                     |\n| `input_cached_tokens`     | Prompt tokens served from OpenAI's cache     |\n| `input_audio_tokens`      | Audio input tokens                           |\n| `output`                  | Completion tokens                            |\n| `output_reasoning_tokens` | Reasoning/chain-of-thought tokens (o1, etc.) |\n| `output_audio_tokens`     | Audio output tokens                          |\n\nAnthropic and Gemini providers report their native token usage in the same format.\n\nTiming metadata on each generation:\n\n| Field                 | Description                                           |\n| --------------------- | ----------------------------------------------------- |\n| `startTime`           | When the proxy received the request                   |\n| `completionStartTime` | When the first byte was received from upstream (TTFB) |\n| `endTime`             | When the full response was consumed                   |\n\nSet `TELEMETRY_MAX_BODY_BYTES` to limit how much response data is buffered for telemetry (default 1MB). The consumer always gets the full response regardless of this limit.\n\nLeave `LANGFUSE_PUBLIC_KEY` and `LANGFUSE_SECRET_KEY` empty to disable telemetry entirely.\n\n## Environment Variables\n\n| Variable                   | Description                                                 | Default                                     |\n| -------------------------- | ----------------------------------------------------------- | ------------------------------------------- |\n| `NODE_ENV`                 | Environment mode                                            | `development`                               |\n| `PORT`                     | Server port                                                 | `3000`                                      |\n| `LOG_LEVEL`                | Pino log level (`debug`, `info`, `warn`, `error`, `silent`) | `info`                                      |\n| **OpenAI / catch-all**     |                                                             |                                             |\n| `UPSTREAM_BASE_URL`        | Upstream LLM provider base URL                              | `https://api.openai.com`                    |\n| `UPSTREAM_API_KEY`         | Override consumer's key for upstream (optional)             | -                                           |\n| `PROXY_API_KEY`            | Gate consumers with this key (optional)                     | -                                           |\n| `PROXY_TIMEOUT_MS`         | Upstream request timeout in ms                              | `300000` (5 min)                            |\n| `TELEMETRY_MAX_BODY_BYTES` | Max response body to buffer for telemetry                   | `1048576` (1MB)                             |\n| **Anthropic**              |                                                             |                                             |\n| `ANTHROPIC_BASE_URL`       | Anthropic API base URL                                      | `https://api.anthropic.com`                 |\n| `ANTHROPIC_API_KEY`        | Override consumer's key for Anthropic (optional)            | -                                           |\n| `ANTHROPIC_VERSION`        | Default `anthropic-version` header                          | `2023-06-01`                                |\n| **Gemini**                 |                                                             |                                             |\n| `GEMINI_BASE_URL`          | Gemini API base URL                                         | `https://generativelanguage.googleapis.com` |\n| `GEMINI_API_KEY`           | Override consumer's key for Gemini (optional)               | -                                           |\n| **Langfuse**               |                                                             |                                             |\n| `LANGFUSE_BASE_URL`        | Langfuse instance URL                                       | `https://cloud.langfuse.com`                |\n| `LANGFUSE_PUBLIC_KEY`      | Langfuse public key (empty = telemetry disabled)            | -                                           |\n| `LANGFUSE_SECRET_KEY`      | Langfuse secret key (empty = telemetry disabled)            | -                                           |\n\n## Deployment\n\n### Docker\n\n```bash\ndocker build -t langfuse-proxy .\ndocker run -p 3000:3000 --env-file .env langfuse-proxy\n```\n\nThe Dockerfile uses a multi-stage build that compiles the app to a standalone binary (~50MB image).\n\n### Coolify\n\nDeploy using the **Dockerfile** build pack — configure environment variables in the Coolify dashboard. Set the health check to `/api/health` on port 3000 for rolling updates. No database or external services required.\n\n## Development\n\n```bash\nbun install       # Install dependencies\nbun dev           # Start with hot reload\nbun test          # Run tests with coverage\nbun lint          # Lint with Biome\nbun format        # Auto-fix lint and formatting\nbun check         # Lint + type-check + tests (runs in pre-commit hook)\n```\n\n### Project Structure\n\n```sh\nsrc/\n├── api/\n│   ├── features/\n│   │   ├── anthropic/                 # ALL /v1/messages\n│   │   │   ├── anthropic.controller.ts    Anthropic handler, auth, header forwarding\n│   │   │   └── anthropic.stream.ts        Anthropic SSE parsing\n│   │   ├── gemini/                    # ALL /v1beta/*\n│   │   │   ├── gemini.controller.ts       Gemini handler, API key forwarding\n│   │   │   └── gemini.stream.ts           Gemini stream parsing\n│   │   ├── health/                    # GET /api/health\n│   │   │   └── health.controller.ts       Per-provider reachability checks\n│   │   └── proxy/                     # ALL /v1/*\n│   │       ├── proxy.controller.ts        Catch-all handler, auth gate, header forwarding\n│   │       ├── proxy.stream.ts            Stream consumption, SSE parsing, JSON parsing\n│   │       ├── proxy.telemetry.ts         Background Langfuse reporting (all providers)\n│   │       └── proxy.types.ts             TypeScript interfaces\n│   └── lib/\n│       ├── langfuse.ts                Langfuse client singleton + shutdown\n│       └── logger.ts                  Pino logger with pretty-print (dev) / JSON (prod)\n├── app.ts                             Elysia app setup (logging, error handling, routes)\n├── config.ts                          Environment configuration\n└── index.ts                           Entry point, server startup, graceful shutdown\ntests/\n└── api/features/\n    ├── anthropic/                     Anthropic controller and stream parser tests\n    ├── gemini/                        Gemini controller and stream parser tests\n    ├── health/                        Health endpoint tests\n    └── proxy/                         Proxy controller and stream parser tests\n```\n\n## License\n\n[MIT](LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffazer-ai%2Flangfuse-proxy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffazer-ai%2Flangfuse-proxy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffazer-ai%2Flangfuse-proxy/lists"}