https://github.com/aashari/go-generative-api-router
Go microservice that proxies OpenAI-compatible API calls to multiple LLM vendors (OpenAI, Gemini) using configurable selection strategies. Supports vendor filtering, streaming responses, and tool calling while maintaining transparent request/response handling.
https://github.com/aashari/go-generative-api-router
ai api api-gateway api-proxy api-router artificial-intelligence cloud-native gemini go golang large-language-model llm microservice models openai proxy routing streaming tool-calling vendor-selection
Last synced: 4 months ago
JSON representation
Go microservice that proxies OpenAI-compatible API calls to multiple LLM vendors (OpenAI, Gemini) using configurable selection strategies. Supports vendor filtering, streaming responses, and tool calling while maintaining transparent request/response handling.
- Host: GitHub
- URL: https://github.com/aashari/go-generative-api-router
- Owner: aashari
- License: mit
- Created: 2025-05-17T05:39:52.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2025-06-09T11:22:47.000Z (4 months ago)
- Last Synced: 2025-06-09T12:31:25.988Z (4 months ago)
- Topics: ai, api, api-gateway, api-proxy, api-router, artificial-intelligence, cloud-native, gemini, go, golang, large-language-model, llm, microservice, models, openai, proxy, routing, streaming, tool-calling, vendor-selection
- Language: Go
- Size: 542 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project
README
# Generative API Router
[](https://goreportcard.com/report/github.com/aashari/go-generative-api-router)
[](https://opensource.org/licenses/MIT)
[](https://github.com/aashari/go-generative-api-router)A production-ready Go microservice that provides a **unified OpenAI-compatible API** for multiple LLM vendors (OpenAI, Gemini). This transparent proxy router simplifies AI integration by offering a single interface while intelligently distributing requests across multiple vendors and preserving your original model names in responses.
## ๐๏ธ **Architecture Overview**
### **Multi-Vendor OpenAI-Compatible Router**
This service acts as a **transparent proxy** that provides a unified OpenAI-compatible API interface while routing requests to multiple LLM vendors behind the scenes:
- **OpenAI API Compatibility**: All vendors accessed through OpenAI-compatible endpoints
- **Transparent Model Handling**: Preserves your original model names in responses
- **Multi-Vendor Design**: Currently supports **19 credentials** (18 Gemini + 1 OpenAI) with **4 models**
- **Even Distribution**: Fair selection across **114 vendor-credential-model combinations**### **Enterprise-Grade Features (2024)**
Recent comprehensive improvements include:
- ๐ **Security**: AES-GCM encryption for credentials, sensitive data masking in logs
- ๐ **Reliability**: Exponential backoff retry logic, circuit breaker pattern implementation
- ๐ **Monitoring**: Comprehensive health checks with vendor connectivity monitoring
- โก **Performance**: Production-optimized logging, conditional detail levels
- ๐งน **Code Quality**: DRY principles, centralized utilities, eliminated code duplication## Features
- **Multi-Vendor Support**: Routes requests to OpenAI or Gemini using OpenAI API compatibility
- **Even Distribution Selection**: Fair distribution across all vendor-credential-model combinations
- **Vendor Filtering**: Supports explicit vendor selection via `?vendor=` query parameter
- **Transparent Proxy**: Maintains all original request/response data (except for model selection)
- **Streaming Support**: Properly handles chunked streaming responses for real-time applications
- **Tool Calling**: Supports function calling/tools for AI agents with proper validation
- **Enterprise Reliability**: Circuit breakers, retry logic, comprehensive health monitoring
- **Security**: Encrypted credential storage, sensitive data masking
- **Modular Design**: Clean separation of concerns with selector, validator, and client components
- **Configuration Driven**: Easily configure available models and credentials via JSON files
- **Health Check**: Built-in health check endpoint with service status monitoring
- **Comprehensive Testing**: Full test coverage with unit tests for all components
- **๐ Public Image URL Support**: Automatic downloading and conversion of public image URLs to base64
- **๐ Advanced File Processing**: Comprehensive document processing supporting PDF, Word, Excel, PowerPoint, ZIP archives, and more via markitdown integration## Quick Start
### Prerequisites
- Go 1.21 or higher
- API keys for OpenAI and/or Google Gemini
- Make (for build automation)
- Python 3.8+ with markitdown for file processing (automatically installed via setup)### Installation
1. **Clone the Repository**:
```bash
git clone https://github.com/aashari/go-generative-api-router.git
cd go-generative-api-router
```2. **Setup Environment**:
```bash
make setup
```
This will:
- Download Go dependencies
- Install development tools
- Create `configs/credentials.json` from the example template3. **Verify Configuration**:
```bash
# Check existing configuration (service likely has working credentials)
cat configs/credentials.json | jq length && echo "credentials configured"
cat configs/models.json | jq length && echo "models configured"
```4. **Configure Credentials** (if needed):
Edit `configs/credentials.json` with your API keys:
```json
[
{
"platform": "openai",
"type": "api-key",
"value": "sk-your-openai-key"
},
{
"platform": "gemini",
"type": "api-key",
"value": "your-gemini-key"
}
]
```5. **Configure Models** (if needed):
Edit `configs/models.json` to define which vendor-model pairs can be selected:
```json
[
{
"vendor": "gemini",
"model": "gemini-2.0-flash"
},
{
"vendor": "openai",
"model": "gpt-4o"
}
]
```6. **Run the Service**:
```bash
make run
```
The service will be available at http://localhost:8082## Selection Strategy
The router uses an **Even Distribution Selector** that ensures fair distribution across all vendor-credential-model combinations. This approach provides true fairness where each combination has exactly equal probability of being selected.
### How It Works
1. **Combination Generation**: The system creates a flat list of all valid vendor-credential-model combinations
2. **Equal Probability**: Each combination gets exactly `1/N` probability where N = total combinations
3. **Fair Distribution**: Unlike traditional two-stage selection (vendor โ model), this ensures no bias toward vendors with fewer models### Example Distribution
With the current configuration:
- **18 Gemini credentials** ร **6 models** = 108 combinations
- **1 OpenAI credential** ร **6 models** = 6 combinations
- **Total**: 114 combinationsEach combination has exactly **1/114 = 0.877%** probability:
- **Gemini overall**: 108/114 = 94.7%
- **OpenAI overall**: 6/114 = 5.3%This reflects the actual resource availability rather than artificial vendor-level balancing.
### Benefits
- โ **True Fairness**: Each credential-model combination has exactly equal probability
- โ **Resource Proportional**: Distribution reflects actual available resources
- โ **Scalable**: Automatically adapts as credentials/models are added/removed
- โ **Transparent**: Clear logging shows selection and total combination count
- โ **No Bias**: Eliminates bias toward vendors with fewer models per credential### Monitoring Selection
The service logs each selection decision for transparency:
```
Even distribution selected combination - Vendor: openai, Model: gpt-4o (from 114 total combinations)
```You can monitor the distribution by checking the server logs to verify fair selection across all combinations.
## Usage
### Basic API Usage
```bash
# Health check
curl http://localhost:8082/health# List available models
curl http://localhost:8082/v1/models# Chat completion (any model name)
curl -X POST http://localhost:8082/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "my-preferred-model",
"messages": [{"role": "user", "content": "Hello!"}]
}'# Process PDF documents
curl -X POST http://localhost:8082/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "document-analyzer",
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "Please summarize this research paper:"
},
{
"type": "file_url",
"file_url": {
"url": "https://example.com/research-paper.pdf"
}
}
]
}
]
}'# Force specific vendor
curl -X POST "http://localhost:8082/v1/chat/completions?vendor=openai" \
-H "Content-Type: application/json" \
-d '{
"model": "my-model",
"messages": [{"role": "user", "content": "Hello from OpenAI!"}]
}'
```### Using Example Scripts
Example scripts are provided for common use cases:
```bash
# Basic usage examples
./examples/curl/basic.sh# Streaming examples
./examples/curl/streaming.sh# Tool calling examples
./examples/curl/tools.sh
```### Client Libraries
Example implementations are available for multiple languages:
- **Python**: `examples/clients/python/client.py`
- **Node.js**: `examples/clients/nodejs/client.js`
- **Go**: `examples/clients/go/client.go`### Docker Deployment
Build and run using Docker:
```bash
# Build and run with Docker Compose
make docker-build
make docker-run# Stop the service
make docker-stop
```## Testing
### Multi-Vendor Testing
**IMPORTANT**: This is a multi-vendor service. Always test both vendors:
```bash
# Test OpenAI vendor
curl -X POST "http://localhost:8082/v1/chat/completions?vendor=openai" \
-H "Content-Type: application/json" \
-d '{"model": "test-openai", "messages": [{"role": "user", "content": "Hello"}]}'# Test Gemini vendor
curl -X POST "http://localhost:8082/v1/chat/completions?vendor=gemini" \
-H "Content-Type: application/json" \
-d '{"model": "test-gemini", "messages": [{"role": "user", "content": "Hello"}]}'# Monitor vendor distribution
grep "Even distribution selected combination" logs/server.log | tail -5
```### Development Testing
```bash
# Run all tests
make test# Run with coverage
make test-coverage# Full CI check
make ci-check
```## Documentation
### ๐ **Complete Documentation**
- **[Documentation Index](docs/README.md)** - Complete documentation roadmap
- **[User Guide](docs/user-guide.md)** - API usage and integration guide
- **[API Reference](docs/api-reference.md)** - Complete API documentation
- **[Development Guide](docs/development-guide.md)** - Setup and development workflow### ๐ง **Development Guides**
- **[Contributing Guide](docs/contributing-guide.md)** - How to contribute to the project
- **[Testing Guide](docs/testing-guide.md)** - Testing strategies and procedures
- **[Logging Guide](docs/logging-guide.md)** - Comprehensive logging documentation
- **[Deployment Guide](docs/deployment-guide.md)** - AWS infrastructure and deployment### ๐ **Cursor AI Context**
For Cursor AI development, see the comprehensive guides in `.cursor/rules/`:
- **[Development Guide](.cursor/rules/development_guide.mdc)** - Complete workflow, architecture, Git practices
- **[Running & Testing Guide](.cursor/rules/running_and_testing.mdc)** - Setup, testing, debugging## Architecture
### Core Components
- **Proxy Handler**: Routes requests to selected vendors, handles streaming/non-streaming responses
- **Vendor Selector**: Implements even distribution selection across vendor-credential-model combinations
- **Request Validator**: Validates OpenAI-compatible requests, preserves original model names
- **Response Processor**: Processes vendor responses while maintaining model name transparency
- **Health Monitor**: Comprehensive health checks with vendor connectivity monitoring
- **Circuit Breaker**: Reliability pattern implementation for vendor communication
- **Retry Logic**: Exponential backoff for failed vendor requests### Key Principles
1. **Transparent Proxy**: Original model names preserved in responses
2. **Vendor Agnostic**: Unified interface regardless of backend vendor
3. **Fair Distribution**: Even probability across all vendor-model combinations
4. **OpenAI Compatibility**: 100% compatible with OpenAI API format
5. **Enterprise Reliability**: Circuit breakers, retries, comprehensive monitoring## Production Deployment
The service is production-ready with:
- **AWS ECS Deployment**: Containerized deployment on AWS
- **Load Balancing**: High availability with load balancer integration
- **Monitoring**: CloudWatch integration and comprehensive logging
- **Security**: Encrypted credentials, sensitive data masking
- **Reliability**: Circuit breakers, retry logic, health monitoringSee [Deployment Guide](docs/deployment-guide.md) for complete deployment instructions.
## Contributing
We welcome contributions! Please see our [Contributing Guide](docs/contributing-guide.md) for details on:
- Development setup and workflow
- Code standards and review process
- Testing requirements
- Pull request guidelines## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Support
- **Documentation**: [Complete documentation](docs/README.md)
- **Issues**: [GitHub Issues](https://github.com/aashari/go-generative-api-router/issues)
- **Discussions**: [GitHub Discussions](https://github.com/aashari/go-generative-api-router/discussions)---
**Need help?** Check the [documentation](docs/README.md) or open an issue on GitHub.