https://github.com/riipandi/radium
[WIP] Radium is an open-source LLM proxy gateway built for resource efficiency and high performance.
https://github.com/riipandi/radium
ai ai-gateway antrophic gateway llm llm-gateway llm-proxy llmops mcp-gateway openai openai-compatible openai-proxy
Last synced: 19 days ago
JSON representation
[WIP] Radium is an open-source LLM proxy gateway built for resource efficiency and high performance.
- Host: GitHub
- URL: https://github.com/riipandi/radium
- Owner: riipandi
- License: other
- Created: 2025-06-16T10:46:50.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2026-03-30T09:17:20.000Z (2 months ago)
- Last Synced: 2026-04-26T17:34:03.173Z (about 1 month ago)
- Topics: ai, ai-gateway, antrophic, gateway, llm, llm-gateway, llm-proxy, llmops, mcp-gateway, openai, openai-compatible, openai-proxy
- Language: TypeScript
- Homepage:
- Size: 646 KB
- Stars: 4
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Agents: AGENTS.md
Awesome Lists containing this project
README
# Radium LLM Proxy Gateway
[](https://github.com/riipandi/radium/releases)
[](https://www.rust-lang.org)
[](https://deps.rs/repo/github/riipandi/radium)

[](https://github.com/riipandi/radium/graphs/contributors)
Radium is an open-source, unified HTTP API gateway for accessing multiple AI model providers. Built in Rust
as a resource-efficient, high-performance open-source LLM proxy, Radium acts as a central orchestrator,
handling every request and response with precision. It delivers uniform endpoints for text and chat
completions, intelligent fallback logic, and complete observability, transforming client-to-LLM interactions
into a seamless, elegantly managed experience.
## Key Features
- **🚀 High Performance**: Leverages Rust's speed and memory safety for low-latency, high-throughput proxying.
- **🔌 Multi-Provider Support**: Seamlessly connects to OpenAI, Anthropic, AWS Bedrock, Cohere, etc.
- **⚡ Flexible Integration**: Minimal configuration required for various LLM backends.
- **📊 Built-in Monitoring**: Prometheus metrics and comprehensive observability.
- **🛠️ Developer-Friendly**: Simple setup, clear documentation, and extensible design.
- **🔄 Fallback Support**: Automatic failover between providers for reliability.
- **🌐 CORS Support**: Configurable Cross-Origin Resource Sharing.
- **📝 Structured Logging**: Configurable logging with rotation and timestamps.
- **🐳 Docker Ready**: Container support with multi-platform builds.
- **📈 Scalable Architecture**: Connection pooling and request timeout handling.
- **📝 Open Source**: Licensed under Apache 2.0.
## Supported Providers
| Provider | Key | Configuration Section | Status |
|------------------|-------------|-----------------------|--------|
| OpenAI | `openai` | `[openai]` | ✅ |
| Anthropic | `anthropic` | `[anthropic]` | ⏳ |
| Azure OpenAI | `azure` | `[azure_openai]` | ⏳ |
| AWS Bedrock | `bedrock` | `[bedrock]` | ⏳ |
| Cohere | `cohere` | `[cohere]` | ⏳ |
| Google Vertex AI | `vertex` | `[vertext]` | ⏳ |
## Getting Started
### Prerequisites
- **Rust**: Ensure you have Rust installed (version 1.93 or later). Install via [rustup](https://rustup.rs/)
- **Git**: Required to clone the repository
- **API Keys**: Valid API keys for your chosen LLM providers
- **Optional**: Docker for containerized deployment
### Installation
1. Clone the repository:
```sh
git clone https://github.com/riipandi/radium.git && cd radium
```
2. Build the project:
```sh
# Using cargo directly
cargo build --release
# Or using just (recommended)
just build
```
3. Set up configuration:
```sh
# Copy example configuration
cp config.toml.example config.toml
# Edit with your API keys and settings
nano config.toml
```
### Configuration
Create your `config.toml` file based on the [`config.toml.example`](./config.toml.example)
### Running the Server
```sh
# Using cargo
cargo run -- serve
# Using just (with auto-reload for development)
just dev
# Using built binary
./target/release/radium serve
# With custom config path
./target/release/radium serve -config /path/to/config.toml
```
## API Endpoints
Radium provides OpenAI-compatible API endpoints:
- `POST /v1/chat/completions` - Chat completions with conversation context
- `POST /v1/text/completions` - Simple text completions
- `GET /metrics` - Prometheus metrics for monitoring
### Base URL
```
http://localhost:8000
```
## Performance Benchmarks
Here are example benchmarks using [bombardier](https://github.com/codesenberg/bombardier) HTTP benchmarking tool:
### Health Check Endpoint Performance
**Test Configuration:**
- **Concurrent connections**: 125
- **Number of requests**: 100,000
- **Target endpoint**: `GET /healthz`
- **Environment**: local development server
```sh
bombardier -c 125 -n 100000 -m GET http://localhost:8000/healthz
```
**Example Results:**
```
Statistics Avg Stdev Max
Reqs/sec 35160.01 8333.65 46008.85
Latency 3.55ms 2.04ms 44.74ms
HTTP codes:
1xx - 0, 2xx - 100000, 3xx - 0, 4xx - 0, 5xx - 0
others - 0
Throughput: 16.64MB/s
```
### Load Testing Different Scenarios
```sh
# Light load - 50 concurrent connections, 1000 requests
bombardier -n 1000 -c 50 http://localhost:8000/healthz
# Medium load - 100 concurrent connections, 2500 requests with 10s duration
bombardier -d 10s -n 2500 -c 100 http://localhost:8000/healthz
# Heavy load - 500 concurrent connections, 10000 requests
bombardier -d 10s -n 10000 -c 500 http://localhost:8000/healthz
# Sustained load test - 30 seconds duration
bombardier -c 100 -d 30s http://localhost:8000/healthz
```
### API Endpoint Benchmarks
For testing actual LLM proxy endpoints:
```sh
# Test chat completions endpoint (requires valid API key)
bombardier -n 100 -c 10 -m POST \
-H "Content-Type: application/json" \
-H "Authorization: Bearer your-api-key" \
-b '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hello"}]}' \
http://localhost:8000/v1/chat/completions
```
**Performance Notes:**
- Actual LLM endpoints performance depends on upstream provider latency
- Connection pooling and keep-alive significantly improve throughput
- Memory usage remains stable under high concurrent load
## Documentation
For detailed documentation, see:
- **[HTTP Transport Documentation](./docs/http-transport.md)** - Complete API reference
- **[OpenAPI Specification](./docs/openapi.json)** - Machine-readable API spec
- **[Example Requests](./specs)** - Sample requests using [httl](https://httl.dev)
## Monitoring
Radium provides comprehensive monitoring through Prometheus metrics at `/metrics` endpoint, including:
- Request counts by provider, model, and status
- Request latency histograms
- Token usage statistics
- Error rates and types
- Connection pool statistics
## Docker Support
Radium includes full Docker support with multi-platform builds:
```sh
# Build Docker image
just docker-build
# Run with Docker
just docker-run serve
# Using Docker Compose
just compose-up
```
## Contributing
We welcome contributions to make Radium even better!
- Read our **[Contributing Guidelines](./CONTRIBUTING.md)** for detailed guidelines
- Fork the repository and create a feature branch
- Submit a pull request with a clear title and description
- Join the discussion on [GitHub Issues](https://github.com/riipandi/radium/issues)
Join the flow. Amplify your your AI-powered applications with Radium! 🚀
## License
Radium is licensed under the [Apache License 2.0](https://choosealicense.com/licenses/apache-2.0).
See the [LICENSE](./LICENSE) file for more information.
> Unless you explicitly state otherwise, any contribution intentionally submitted
> for inclusion in this project by you shall be licensed under the Apache License 2.0,
> without any additional terms or conditions.
Copyrights in this project are retained by their contributors.
---
🤫 Psst! If you like my work you can support me via [GitHub sponsors](https://github.com/sponsors/riipandi).
[](https://x.com/intent/follow?screen_name=riipandi)