{"id":44112340,"url":"https://github.com/chicogong/stream-relay-go","last_synced_at":"2026-02-08T16:34:17.153Z","repository":{"id":330131602,"uuid":"1120404742","full_name":"chicogong/stream-relay-go","owner":"chicogong","description":"A lightweight Go streaming relay for LLM/TTS APIs with production-grade observability and policy controls","archived":false,"fork":false,"pushed_at":"2025-12-24T06:39:37.000Z","size":542,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-12-25T01:33:19.880Z","etag":null,"topics":["anthropic","api-gateway","docker","gin","golang","grafana","llm","llm-proxy","observability","openai","prometheus","proxy","rate-limiting","siliconflow","sse","streaming","tts"],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/chicogong.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-12-21T06:18:55.000Z","updated_at":"2025-12-24T11:47:30.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/chicogong/stream-relay-go","commit_stats":null,"previous_names":["chicogong/stream-relay-go"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/chicogong/stream-relay-go","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chicogong%2Fstream-relay-go","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chicogong%2Fstream-relay-go/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chicogong%2Fstream-relay-go/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chicogong%2Fstream-relay-go/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/chicogong","download_url":"https://codeload.github.com/chicogong/stream-relay-go/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chicogong%2Fstream-relay-go/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29236900,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-08T14:18:14.570Z","status":"ssl_error","status_checked_at":"2026-02-08T14:18:14.071Z","response_time":57,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anthropic","api-gateway","docker","gin","golang","grafana","llm","llm-proxy","observability","openai","prometheus","proxy","rate-limiting","siliconflow","sse","streaming","tts"],"created_at":"2026-02-08T16:34:16.435Z","updated_at":"2026-02-08T16:34:17.148Z","avatar_url":"https://github.com/chicogong.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Stream Relay Go\n\n[![CI](https://github.com/chicogong/stream-relay-go/actions/workflows/ci.yml/badge.svg)](https://github.com/chicogong/stream-relay-go/actions/workflows/ci.yml)\n[![Go Report Card](https://goreportcard.com/badge/github.com/chicogong/stream-relay-go)](https://goreportcard.com/report/github.com/chicogong/stream-relay-go)\n[![codecov](https://codecov.io/gh/chicogong/stream-relay-go/branch/master/graph/badge.svg)](https://codecov.io/gh/chicogong/stream-relay-go)\n[![Go Version](https://img.shields.io/badge/Go-1.23+-00ADD8?style=flat\u0026logo=go)](https://golang.org)\n[![License](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)\n[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](CONTRIBUTING.md)\n\nEnglish | [简体中文](README_zh.md)\n\nA lightweight, high-performance streaming relay for LLM and TTS APIs with built-in observability.\n\n## ✨ Features\n\n- **🚀 Low-latency Streaming** - Token-by-token SSE streaming with immediate flush\n- **🔐 Auto Authentication** - Automatic Bearer token injection for upstream APIs\n- **📊 Real-time Monitoring** - Prometheus metrics + beautiful Grafana dashboards\n- **🎯 Multi-provider Support** - SiliconFlow, OpenAI, Anthropic, Azure TTS\n- **⚡ Zero Dependency** - Optional Redis/ClickHouse, works standalone\n- **🛡️ Production Ready** - Rate limiting, health checks, graceful shutdown\n\n## 🏗️ Architecture\n\n```\nClient Request\n     ↓\nAPI Key Auth\n     ↓\nRate Limiting\n     ↓\nUpstream Auth Injection\n     ↓\nSSE Streaming Proxy ← → Upstream API\n     ↓\nMetrics Collection\n     ↓\nClient Response\n```\n\n## 🚀 Quick Start\n\n### Prerequisites\n\n- Go 1.21+\n- (Optional) Docker for monitoring stack\n\n### Installation\n\n```bash\n# Clone the repository\ngit clone https://github.com/chicogong/stream-relay-go.git\ncd stream-relay-go\n\n# Build\nmake build\n\n# Run\n./bin/relay -config configs/config.yaml\n```\n\n### Configuration\n\n1. Copy the example environment file:\n```bash\ncp .env.example .env\n```\n\n2. Add your API keys to `.env`:\n```bash\nSILICONFLOW_API_KEY=sk-your-key-here\nOPENAI_API_KEY=sk-your-key-here\nANTHROPIC_API_KEY=sk-ant-your-key-here\n```\n\n3. Start the relay:\n```bash\nmake dev\n```\n\nThe relay will start on `http://localhost:8080`\n\n### Testing\n\n```bash\n# Health check\ncurl http://localhost:8080/healthz\n\n# Streaming request\ncurl -N http://localhost:8080/v1/chat/completions \\\n  -H 'Authorization: Bearer sk-relay-test-key-123' \\\n  -H 'Content-Type: application/json' \\\n  -d '{\n    \"model\": \"Qwen/Qwen2.5-7B-Instruct\",\n    \"messages\": [{\"role\": \"user\", \"content\": \"Hello\"}],\n    \"stream\": true,\n    \"max_tokens\": 20\n  }'\n```\n\n## 📊 Monitoring\n\n### Start Grafana + Prometheus\n\n```bash\ncd deployments/grafana\ndocker-compose up -d\n```\n\n### Access Dashboards\n\n- **Grafana**: http://localhost:3000 (admin/admin)\n- **Prometheus**: http://localhost:9090\n- **Metrics Endpoint**: http://localhost:8080/metrics\n\n### Beautiful Dashboard\n\n![Grafana Dashboard](docs/images/grafana-dashboard.png)\n\nThe dashboard provides real-time insights:\n- 📊 **Total Requests** - Cumulative request count\n- ✅ **Success Rate** - Real-time success percentage (color-coded: 🟢 \u003e99%, 🟡 \u003e95%, 🟠 \u003e90%, 🔴 \u003c90%)\n- 📈 **Request Rate** - Requests per minute with smooth curves\n- ⏱️ **Response Time** - p50/p95/p99 latency percentiles\n- 🔥 **Heatmap** - Visual latency distribution\n- 🚨 **Error Monitoring** - Instant error detection with alerts\n\n### 🚀 Enhanced Dashboard with Logs\n\n![Enhanced Grafana Dashboard](docs/images/enhanced-grafana-dashboard.png)\n\nThe enhanced dashboard (`enhanced-dashboard.json`) includes **15 comprehensive panels** with integrated log viewing:\n\n**Metrics Panels:**\n- 📊 Total Requests, Success Rate, Avg Response Time\n- 🔗 Active Connections, Error Count, Storage Latency\n- 📈 Request Rate Trend \u0026 Response Time Percentiles (p50/p95/p99)\n- 🎯 Requests by Route (donut chart)\n- 📊 Status Code Distribution (2xx/4xx/5xx bar gauge)\n- 🚨 Error Types Table \u0026 Active Connections Over Time\n- 🔥 Request Latency Heatmap\n- 📋 Recent Activity Log Table\n\n**Log Integration (Loki):**\n- 📝 Live application logs with filtering\n- 🔍 Search logs by level (ERROR, INFO, DEBUG)\n- 📊 Unified metrics + logs view for faster debugging\n\n**Setup:**\nThe enhanced monitoring stack includes Loki + Promtail for log aggregation. See [deployments/grafana/README.md](deployments/grafana/README.md) for full setup instructions.\n\n### Generate Demo Traffic\n\n```bash\n# Run the test script to generate sample requests\n./test_relay.sh\n\n# Or manually send requests\nfor i in {1..10}; do\n  curl -N http://localhost:8080/v1/chat/completions \\\n    -H 'Authorization: Bearer sk-relay-test-key-123' \\\n    -H 'Content-Type: application/json' \\\n    -d \"{\\\"model\\\": \\\"Qwen/Qwen2.5-7B-Instruct\\\", \\\"messages\\\": [{\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"Count to $i\\\"}], \\\"stream\\\": true, \\\"max_tokens\\\": 20}\"\ndone\n```\n\nWatch the metrics update in real-time at http://localhost:3000\n\n\u003e 💡 **Tip**: Use `./scripts/generate-demo.sh` to populate the dashboard with demo traffic!\n\n## 📁 Project Structure\n\n```\nstream-relay-go/\n├── cmd/relay/          # Application entry point\n├── internal/           # Core implementation\n│   ├── config.go       # Configuration management\n│   ├── proxy.go        # Streaming proxy logic\n│   ├── server.go       # HTTP server setup\n│   ├── metrics.go      # Prometheus metrics\n│   ├── limiter.go      # Rate limiting\n│   └── storage.go      # Optional storage layer\n├── configs/            # Configuration files\n├── deployments/        # Docker \u0026 Grafana configs\n└── docs/              # Documentation\n```\n\n## ⚙️ Configuration\n\n### Server\n\n```yaml\nserver:\n  port: 8080\n  timeout: 300s\n  max_body_size: 10485760  # 10MB\n```\n\n### Routes\n\n```yaml\nroutes:\n  - name: siliconflow\n    path: /v1/chat/completions\n    upstream: https://api.siliconflow.cn\n    auth_header: Authorization\n    auth_env: SILICONFLOW_API_KEY\n    kind: sse\n```\n\n### Rate Limiting\n\n```yaml\nrate_limit:\n  enabled: true\n  default: 100  # requests per minute per tenant\n  burst: 20\n```\n\n## 🔧 Advanced Usage\n\n### Custom Routes\n\nAdd custom routes in `configs/config.yaml`:\n\n```yaml\nroutes:\n  - name: custom-provider\n    path: /custom/path\n    upstream: https://api.custom.com\n    auth_header: X-API-Key\n    auth_env: CUSTOM_API_KEY\n    kind: sse\n```\n\n### Storage Backend\n\nEnable optional storage for detailed logging:\n\n```yaml\nstorage:\n  redis:\n    addr: localhost:6379\n    password: \"\"\n    db: 0\n```\n\n## 📈 Metrics\n\nThe relay exposes comprehensive Prometheus metrics at `/metrics` endpoint:\n\n### Core Metrics\n\n| Metric Name | Type | Description | Labels |\n|-------------|------|-------------|--------|\n| `relay_requests_total` | Counter | Total number of requests processed | `route`, `status` (2xx/4xx/5xx) |\n| `relay_duration_ms` | Histogram | Request duration in milliseconds | `route` |\n| `relay_errors_total` | Counter | Total number of errors | `route`, `type` |\n| `relay_active_connections` | Gauge | Current number of active connections | `route` |\n| `relay_storage_write_ms` | Histogram | Storage write latency in milliseconds | - |\n\n### Histogram Buckets\n\n- **Duration Buckets**: 100ms, 500ms, 1s, 2s, 5s, 10s, 30s, 60s\n- **Storage Write Buckets**: 1ms, 5ms, 10ms, 50ms, 100ms, 500ms, 1s\n\n### Example Queries\n\n```promql\n# Request rate (requests per minute)\nrate(relay_requests_total[1m]) * 60\n\n# Average latency\nrate(relay_duration_ms_sum[1m]) / rate(relay_duration_ms_count[1m])\n\n# P95 latency\nhistogram_quantile(0.95, rate(relay_duration_ms_bucket[1m]))\n\n# Success rate\nsum(relay_requests_total{status=\"2xx\"}) / sum(relay_requests_total) * 100\n\n# Error rate\nrate(relay_errors_total[1m])\n\n# Active connections by route\nrelay_active_connections\n```\n\n### Grafana Dashboard\n\nImport `deployments/grafana/beautiful-dashboard.json` for a pre-configured dashboard with:\n- Real-time request rate\n- Latency percentiles (p50, p95, p99)\n- Success rate gauge\n- Error monitoring\n- Request heatmap\n- Recent activity table\n\n## 🤝 Contributing\n\nContributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for details.\n\n## 📝 License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## 🙏 Acknowledgments\n\n- Built with [Gin](https://github.com/gin-gonic/gin)\n- Monitoring powered by [Prometheus](https://prometheus.io) and [Grafana](https://grafana.com)\n- Inspired by best practices in API gateway design\n\n## 📮 Support\n\n- 🐛 [Report Bug](https://github.com/chicogong/stream-relay-go/issues)\n- 💡 [Request Feature](https://github.com/chicogong/stream-relay-go/issues)\n- 📧 Email: your-email@example.com\n\n---\n\n**Made with ❤️ for the LLM community**\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchicogong%2Fstream-relay-go","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fchicogong%2Fstream-relay-go","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchicogong%2Fstream-relay-go/lists"}