https://github.com/majiayu000/litellm-rs
A high-performance AI Gateway written in Rust — call 100+ LLM APIs using OpenAI format
ai-gateway anthropic api-client async-rust aws-bedrock embeddings gemini llm load-balancing multi-provider ollama openai rust streaming
- Host: GitHub
- URL: https://github.com/majiayu000/litellm-rs
- Owner: majiayu000
- License: mit
- Created: 2025-07-15T06:04:01.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2026-03-28T10:39:00.000Z (3 months ago)
- Last Synced: 2026-03-28T14:35:18.005Z (3 months ago)
- Topics: ai-gateway, anthropic, api-client, async-rust, aws-bedrock, embeddings, gemini, llm, load-balancing, multi-provider, ollama, openai, rust, streaming
- Language: Rust
- Size: 8.9 MB
- Stars: 34
- Watchers: 1
- Forks: 6
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md
# litellm-rs
A high-performance Rust library and gateway for calling 100+ LLM APIs in an OpenAI-compatible format.
[crates.io](https://crates.io/crates/litellm-rs)
[docs.rs](https://docs.rs/litellm-rs)
[License: MIT](https://opensource.org/licenses/MIT)
## Features
- **100+ AI Providers** - OpenAI, Anthropic, Google, Azure, AWS Bedrock, and more
- **OpenAI-Compatible API** - Drop-in replacement for the OpenAI SDK
- **High Performance** - 10,000+ requests/second, <10ms routing overhead
- **Intelligent Routing** - Load balancing, failover, cost optimization
- **Enterprise Ready** - Auth, rate limiting, caching, observability
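
As a rough illustration of the failover idea in the routing bullet above, the sketch below tries a list of candidate models in order and returns the first success. It does not depend on the crate: `first_success` and the fake provider call are hypothetical names made up for this example, and the gateway's actual routing logic may work quite differently.

```rust
/// Try each candidate in order and return the first success together with
/// the name that produced it. If every candidate fails, return all errors.
/// Hypothetical helper for illustration; not part of litellm-rs.
fn first_success<T, E>(
    candidates: &[&str],
    mut call: impl FnMut(&str) -> Result<T, E>,
) -> Result<(String, T), Vec<(String, E)>> {
    let mut errors = Vec::new();
    for &name in candidates {
        match call(name) {
            Ok(value) => return Ok((name.to_string(), value)),
            Err(err) => errors.push((name.to_string(), err)),
        }
    }
    Err(errors)
}

fn main() {
    // Stand-in for a real provider call: the first model always "fails".
    let fake_call = |model: &str| -> Result<String, String> {
        if model == "gpt-4" {
            Err("rate limited".to_string())
        } else {
            Ok(format!("reply from {model}"))
        }
    };

    match first_success(&["gpt-4", "anthropic/claude-3-opus"], fake_call) {
        Ok((model, reply)) => println!("{model}: {reply}"),
        Err(errors) => eprintln!("all {} candidates failed", errors.len()),
    }
}
```

In an async program the same loop could wrap `completion(...).await` calls; the synchronous closure here only keeps the sketch free of runtime dependencies.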
## Quick Start (5 Minutes, API-Only Recommended)
Most users consume this project as a unified API library rather than as a gateway server, so start with API-only mode.
```toml
[dependencies]
litellm-rs = { version = "0.4", default-features = false, features = ["lite"] }
```
For crate users, no `make` is required.
## Usage
### As a Library (API Integration)
```rust
use litellm_rs::{completion, system_message, user_message};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // The model name selects the provider; `None` means no extra options.
    let response = completion(
        "gpt-4",
        vec![
            system_message("You are a helpful assistant."),
            user_message("Hello!"),
        ],
        None,
    )
    .await?;

    // Print the assistant's reply from the first choice.
    println!("{}", response.choices[0].message.content.as_ref().unwrap());
    Ok(())
}
```
### As a Gateway Server
#### Run from source repository
```bash
git clone https://github.com/majiayu000/litellm-rs.git
cd litellm-rs
cp config/gateway.yaml.example config/gateway.yaml
cargo run --bin gateway
```
#### Install binary and run
```bash
cargo install litellm-rs --bin gateway
mkdir -p config
curl -L https://raw.githubusercontent.com/majiayu000/litellm-rs/main/config/gateway.yaml.example -o config/gateway.yaml
gateway
```
Notes:
- The `gateway` and `google-gateway` binaries require the `storage` feature at build time.
- The default features include `sqlite`, so a plain `cargo run` or `cargo install` satisfies this requirement.
## Installation
```toml
# Choose ONE of the following [dependencies] entries.

# Full gateway with SQLite + Redis (default)
[dependencies]
litellm-rs = "0.4"

# API-only - lightweight, no actix-web/argon2/aes-gcm/clap
[dependencies]
litellm-rs = { version = "0.4", default-features = false }

# API-only with metrics
[dependencies]
litellm-rs = { version = "0.4", default-features = false, features = ["lite"] }

# Gateway modules in library context (not standalone gateway binary runtime)
[dependencies]
litellm-rs = { version = "0.4", default-features = false, features = ["gateway"] }
```
## Supported Providers
| Provider | Chat | Embeddings | Images | Audio |
|----------|------|------------|--------|-------|
| OpenAI | ✅ | ✅ | ✅ | ✅ |
| Anthropic | ✅ | - | - | - |
| Google (Gemini) | ✅ | ✅ | ✅ | - |
| Azure OpenAI | ✅ | ✅ | ✅ | ✅ |
| AWS Bedrock | ✅ | ✅ | - | - |
| Google Vertex AI | ✅ | ✅ | ✅ | - |
| Groq | ✅ | - | - | ✅ |
| DeepSeek | ✅ | - | - | - |
| Kimi (Moonshot AI) | ✅ | - | - | - |
| GLM (Zhipu AI) | ✅ | - | - | - |
| MiniMax | ✅ | - | - | - |
| Mistral | ✅ | ✅ | - | - |
| Cohere | ✅ | ✅ | - | - |
| OpenRouter | ✅ | - | - | - |
| Together AI | ✅ | ✅ | - | - |
| Fireworks AI | ✅ | ✅ | - | - |
| Perplexity | ✅ | - | - | - |
| Replicate | ✅ | - | ✅ | - |
| Hugging Face | ✅ | ✅ | - | - |
| Ollama | ✅ | ✅ | - | - |
| And 80+ more... | | | | |
## Environment Variables
```bash
# Provider API Keys
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=...
AZURE_OPENAI_API_KEY=...
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
GROQ_API_KEY=...
DEEPSEEK_API_KEY=...
MOONSHOT_API_KEY=...
ZHIPU_API_KEY=...
MINIMAX_API_KEY=...
# Optional
LITELLM_VERBOSE=true # Enable verbose logging
```
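
Before making a first request it can help to confirm that the keys for the providers you intend to use are actually set. The following is a minimal sketch using only the standard library; `missing_keys` is a hypothetical helper written for this example and is not part of the crate, and the list of required variables is only a sample.

```rust
use std::env;

/// Return the names from `required` that are unset or blank.
/// `lookup` is injected so the check can be exercised without touching
/// the real process environment. Hypothetical helper, not part of litellm-rs.
fn missing_keys(required: &[&str], lookup: impl Fn(&str) -> Option<String>) -> Vec<String> {
    required
        .iter()
        .copied()
        .filter(|&name| lookup(name).map_or(true, |value| value.trim().is_empty()))
        .map(|name| name.to_string())
        .collect()
}

fn main() {
    // Sample selection; list only the providers you actually call.
    let required = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY"];
    let missing = missing_keys(&required, |name| env::var(name).ok());

    if missing.is_empty() {
        println!("all provider keys are set");
    } else {
        eprintln!("missing environment variables: {}", missing.join(", "));
    }
}
```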
## Examples
### Multi-Provider Routing
```rust
use litellm_rs::{completion, user_message};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Automatically routes to the right provider based on model name
    let openai = completion("gpt-4", vec![user_message("Hi")], None).await?;
    let anthropic = completion("anthropic/claude-3-opus", vec![user_message("Hi")], None).await?;
    let google = completion("gemini/gemini-pro", vec![user_message("Hi")], None).await?;
    let bedrock = completion(
        "bedrock/us.anthropic.claude-3-5-sonnet-20241022-v2:0",
        vec![user_message("Hi")],
        None,
    )
    .await?;
    Ok(())
}
```
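
The model strings above follow a `provider/model` convention, with unprefixed names such as `gpt-4` left as they are. The sketch below shows one way such a string could be split; `split_model` is a hypothetical function for illustration only, and the crate's real resolution logic (for example, how it maps an unprefixed name to a provider) is not shown in this README and may differ.

```rust
/// Split "provider/model" into (Some(provider), model).
/// Names without a usable prefix are returned unchanged as (None, name).
/// Only the first '/' is treated as the separator, so any later slashes
/// stay in the model part. Hypothetical helper, not part of litellm-rs.
fn split_model(name: &str) -> (Option<&str>, &str) {
    match name.split_once('/') {
        Some((provider, model)) if !provider.is_empty() && !model.is_empty() => {
            (Some(provider), model)
        }
        _ => (None, name),
    }
}

fn main() {
    for name in [
        "gpt-4",
        "anthropic/claude-3-opus",
        "gemini/gemini-pro",
        "bedrock/us.anthropic.claude-3-5-sonnet-20241022-v2:0",
    ] {
        let (provider, model) = split_model(name);
        println!("{name} -> provider={provider:?}, model={model}");
    }
}
```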
### Embeddings
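```rust
use litellm_rs::{embed_text, embedding};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Single text (named `single` so it does not shadow the `embedding` function)
    let single = embed_text("text-embedding-3-small", "Hello world").await?;

    // Batch
    let embeddings = embedding(
        "text-embedding-3-small",
        vec!["Hello", "World"],
        None,
    )
    .await?;
    Ok(())
}
```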
### Streaming
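```rust
use futures::StreamExt;
use litellm_rs::{completion_stream, user_message};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut stream = completion_stream(
        "gpt-4",
        vec![user_message("Tell me a story")],
        None,
    )
    .await?;

    while let Some(chunk) = stream.next().await {
        // Chunks that carry an error are skipped.
        if let Ok(chunk) = chunk {
            // Borrow the delta text rather than moving it out of the indexed element.
            if let Some(text) = &chunk.choices[0].delta.content {
                print!("{}", text);
            }
        }
    }
    Ok(())
}
```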
## Performance
- **Throughput**: 10,000+ requests/second
- **Latency**: <10ms routing overhead
- **Memory**: ~50MB base footprint
- **Concurrency**: Fully async with Tokio
## Troubleshooting
### Build/test uses too much CPU or memory
- Use API-only defaults first: `cargo test --lib --tests --no-default-features --features "lite"`
- Limit local parallelism when needed: `CARGO_BUILD_JOBS=4 cargo test --lib --tests --no-default-features --features "lite" -- --test-threads=4`
- Avoid `--all-features` unless you are doing release/nightly validation
### I only need provider API aggregation, not gateway
- Prefer `default-features = false` with `features = ["lite"]`
- Use gateway runtime commands only when you need HTTP server/auth/storage middleware
## Documentation
- [API Documentation](https://docs.rs/litellm-rs)
- [Documentation Index](./docs/README.md)
- [Configuration Guide](./config/gateway.yaml.example)
- [Examples](./examples/README.md)
## Contributing
See [CONTRIBUTING.md](./CONTRIBUTING.md) for development setup and guidelines.
## Security
See [SECURITY.md](./SECURITY.md) for security policy and vulnerability reporting.
## License
MIT License - see [LICENSE](./LICENSE) for details.
## Acknowledgments
Inspired by [LiteLLM](https://github.com/BerriAI/litellm) (Python).