https://github.com/kreuzberg-dev/liter-llm

Universal LLM API client — 142+ providers, 11 native language bindings, powered by Rust core
anthropic api-client llm machine-learning openai polyglot python rust streaming typescript

# liter-llm




**A lighter, faster, safer universal LLM API client** -- one Rust core, 11 native language bindings, 143 providers.

## Why liter-llm?

A universal LLM API client, compiled from the ground up in Rust. No interpreter, no transitive dependency tree, no supply chain surface area. One binary, 11 native language bindings, 143 providers.

- **Compiled Rust core.** No `pip install` supply chain. No `.pth` auto-execution hooks. No runtime dependency tree to compromise. The kind of [supply chain attack that hit litellm](https://www.xda-developers.com/popular-python-library-backdoor-machine/) in 2026 is structurally impossible here.
- **Secrets stay secret.** API keys are wrapped in [`secrecy::SecretString`](https://docs.rs/secrecy/) -- zeroed on drop, redacted in logs, never serialized.
- **Polyglot from day one.** Python, TypeScript, Go, Java, Ruby, PHP, C#, Elixir, WebAssembly, C/FFI -- all thin wrappers around the same Rust core. No reimplementation drift.
- **Observability built in.** Production-grade [OpenTelemetry](https://opentelemetry.io/) with GenAI semantic conventions -- not an afterthought callback system.
- **Composable middleware.** Rate limiting, caching, cost tracking, health checks, and fallback as [Tower](https://docs.rs/tower/) layers you stack like building blocks.
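
The layering idea in the last bullet can be illustrated outside of Rust. The following is a minimal Python sketch of the pattern only: it is not liter-llm's API and not Tower itself, and `with_cache`, `with_fallback`, `flaky_provider`, and `backup_provider` are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict

# A handler takes a prompt and returns a completion string.
Handler = Callable[[str], str]


def with_cache(inner: Handler) -> Handler:
    """Wrap a handler so repeated prompts are served from memory."""
    cache: Dict[str, str] = {}

    def handler(prompt: str) -> str:
        if prompt not in cache:
            cache[prompt] = inner(prompt)
        return cache[prompt]

    return handler


def with_fallback(primary: Handler, secondary: Handler) -> Handler:
    """Wrap two handlers so the second runs when the first raises."""

    def handler(prompt: str) -> str:
        try:
            return primary(prompt)
        except Exception:
            return secondary(prompt)

    return handler


def flaky_provider(prompt: str) -> str:
    # Hypothetical provider that is always down.
    raise ConnectionError("primary provider unavailable")


def backup_provider(prompt: str) -> str:
    # Hypothetical provider that always answers.
    return f"backup: {prompt}"


# Stack the layers: the cache sits outside the fallback.
client = with_cache(with_fallback(flaky_provider, backup_provider))
print(client("Hello"))
```

Each layer only knows about the handler it wraps, so layers can be reordered or removed without touching the others.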

We give credit to [litellm](https://github.com/BerriAI/litellm) for proving the category -- our provider registry was bootstrapped from theirs. See [ATTRIBUTIONS.md](ATTRIBUTIONS.md).

## Feature Comparison

An honest look at where things stand. We're newer and leaner -- litellm has breadth we haven't matched yet, and we have depth they can't easily retrofit.

| Feature | liter-llm | litellm |
|---|---|---|
| **Language** | Rust (compiled, memory-safe) | Python |
| **Bindings** | 11 native (Rust, Python, TS, Go, Java, Ruby, PHP, C#, Elixir, WASM, C) | Python (+ OpenAI-compatible proxy) |
| **Providers** | 143 (compiled at build time) | 100+ (runtime resolution) |
| **Streaming** | SSE + AWS EventStream binary protocol | SSE + AWS EventStream |
| **Observability** | Built-in OpenTelemetry (GenAI semconv) | 40+ callback integrations |
| **API key safety** | `secrecy::SecretString` (zeroed, redacted) | Plain strings |
| **Middleware** | Composable Tower stack | Built-in callback system |
| **Proxy / Gateway** | Yes (22 OpenAI-compatible endpoints, 35MB Docker) | Yes |
| **Guardrails** | -- | 10+ integrations, 4 execution modes (advanced: enterprise) |
| **Semantic caching** | -- | Redis + Qdrant backends |
| **Virtual key mgmt** | Yes (per-key model restrictions, RPM/TPM, budgets) | Yes (key rotation: enterprise) |
| **Management API** | Config-driven (REST admin API planned) | Multi-tenant (teams, budgets, keys; tiers + reporting: enterprise) |
| **Fine-tuning API** | -- | Enterprise only |
| **Load balancer** | Fallback + round-robin via Tower router | Full router with strategies |
| **Cost tracking** | Embedded pricing + OTEL spans | Per-key/team/model budgets |
| **Rate limiting** | Per-model RPM/TPM (Tower layer) | Per-key/user/team/model |
| **Caching** | In-memory LRU + 40+ backends via OpenDAL (S3, Redis, GCS, DynamoDB, disk, ...) | 7 backends (Redis, S3, GCS, disk, Qdrant) |
| **Tool calling** | Parallel tools, structured output, JSON schema | Full support |
| **Embeddings** | Yes | Yes |
| **Batch API** | Yes | Yes |
| **Audio / Speech** | Yes | Yes |
| **Lifecycle hooks** | onRequest/onResponse/onError per-client | Callback integrations |
| **Budget enforcement** | Per-model + global limits, hard/soft modes | Per-key/team budgets |
| **Health checks** | Automatic provider probes + cooldown | -- |
| **Custom providers** | Runtime API + TOML config file | Config + code-based |
| **Config files** | TOML with auto-discovery (`liter-llm.toml`) | YAML proxy config |
| **Search / OCR** | 12 search + 4 OCR providers | Yes |
| **Image generation** | Yes | Yes |

## Key Features

- **143 providers** -- OpenAI, Anthropic, Google, AWS Bedrock, Groq, Mistral, Together AI, Fireworks, Perplexity, DeepSeek, Cohere, and [130+ more](schemas/providers.json)
- **11 native bindings** -- Rust, Python, TypeScript/Node.js, Go, Java, Ruby, PHP, C#, Elixir, WebAssembly, C/FFI
- **First-class streaming** -- SSE and AWS EventStream binary protocol with zero-copy buffers
- **TOML configuration** -- `liter-llm.toml` with auto-discovery, custom providers, cache backends, middleware config
- **OpenTelemetry** -- GenAI semantic conventions, cost tracking spans, HTTP-level tracing
- **Tower middleware** -- Rate limiting, caching (40+ OpenDAL backends), cost tracking, budget enforcement, health checks, cooldowns, hooks, fallback -- all composable
- **Search & OCR** -- Web search across 12 providers, document OCR across 4 providers
- **Tool calling** -- Parallel tools, structured outputs, JSON schema validation
- **Embeddings** -- Dimension selection, base64 format, multi-provider support
- **Per-request routing** -- Automatic provider detection from model name prefix, custom provider registration at runtime
- **Schema-driven** -- Provider registry and API types compiled from JSON schemas, no runtime lookups
- **Local LLM support** -- Run models locally with Ollama, LM Studio, vLLM, llama.cpp, LocalAI, and llamafile via OpenAI-compatible APIs
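
To make the prefix-based routing above concrete, here is a small Python sketch of how a `provider/model` string could be resolved. It is an illustration under assumptions, not the library's implementation: `PROVIDERS`, `split_model`, and `resolve` are hypothetical names, and the two registry entries are examples only.

```python
from typing import Dict, Tuple

# Hypothetical registry mapping a provider prefix to a base URL.
PROVIDERS: Dict[str, str] = {
    "openai": "https://api.openai.com/v1",
    "my-provider": "https://my-llm.example.com/v1",
}


def split_model(model: str) -> Tuple[str, str]:
    """Split 'provider/model-name' into (provider, model-name)."""
    # Only the first slash separates the prefix; later slashes stay in the name.
    provider, sep, name = model.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider/model', got {model!r}")
    return provider, name


def resolve(model: str) -> Tuple[str, str]:
    """Return (base_url, model-name) for a prefixed model string."""
    provider, name = split_model(model)
    if provider not in PROVIDERS:
        raise KeyError(f"unknown provider prefix: {provider!r}")
    return PROVIDERS[provider], name


print(resolve("openai/gpt-4o"))
```

Splitting on the first slash only means that model names which themselves contain slashes are passed through to the provider unchanged.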

## Proxy Server & CLI

Drop-in replacement for litellm's proxy -- 22 OpenAI-compatible endpoints. Install the `liter-llm` CLI (which ships both the proxy server and the MCP tool server) in one of three ways:

```bash
# Homebrew (macOS / Linux)
brew install kreuzberg-dev/tap/liter-llm

# Pre-built binary (Linux x86_64/arm64, macOS arm64, Windows x86_64)
curl -L https://github.com/kreuzberg-dev/liter-llm/releases/latest/download/liter-llm-${VERSION}-${TARGET}.tar.gz | tar xz

# Docker (35MB image)
docker run -p 4000:4000 -e LITER_LLM_MASTER_KEY=sk-your-key ghcr.io/kreuzberg-dev/liter-llm
```

Then call it like OpenAI:

```bash
curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-your-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "openai/gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'
```
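
The same request can be built from Python with only the standard library. This sketch assumes the proxy from the Docker command above is listening on `localhost:4000` and that the response follows the usual OpenAI chat completion shape; `build_chat_request` is a hypothetical helper, not part of liter-llm.

```python
import json
import urllib.request


def build_chat_request(
    base_url: str, api_key: str, model: str, prompt: str
) -> urllib.request.Request:
    """Build (but do not send) a POST to the chat completions endpoint."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request(
    "http://localhost:4000", "sk-your-key", "openai/gpt-4o", "Hello"
)
print(request.get_method(), request.full_url)

# To send it against a running proxy (assumes an OpenAI-style response body):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["choices"][0]["message"]["content"])
```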

Or with a TOML config file:

```toml
# liter-llm-proxy.toml
[general]
master_key = "${LITER_LLM_MASTER_KEY}"

[[models]]
name = "gpt-4o"
provider_model = "openai/gpt-4o"
api_key = "${OPENAI_API_KEY}"

[[models]]
name = "claude-sonnet"
provider_model = "anthropic/claude-sonnet-4-20250514"
api_key = "${ANTHROPIC_API_KEY}"

[[keys]]
key = "sk-team-a"
models = ["gpt-4o"]
rpm = 100
```

**CLI:**

```bash
liter-llm api --config liter-llm-proxy.toml # Start proxy server
liter-llm mcp --transport stdio # Start MCP tool server
```

**Features:** Model routing, virtual API keys, per-key rate limiting (RPM/TPM), cost tracking, budget enforcement, response caching, SSE streaming, OpenAPI 3.1 spec at `/openapi.json`, MCP server with 22 tools, graceful shutdown.
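
For readers unfamiliar with per-key RPM limits, the following Python sketch shows one common way to implement them: a sliding 60-second window per key. The proxy's actual algorithm is not described in this README, so treat `RpmLimiter` as a hypothetical illustration of the concept rather than its behavior.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict


class RpmLimiter:
    """Allow at most `rpm` requests per key in any 60-second window."""

    def __init__(self, rpm: int, clock: Callable[[], float] = time.monotonic) -> None:
        self.rpm = rpm
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.history: Dict[str, Deque[float]] = {}

    def allow(self, key: str) -> bool:
        now = self.clock()
        stamps = self.history.setdefault(key, deque())
        # Drop timestamps that have left the 60-second window.
        while stamps and now - stamps[0] >= 60.0:
            stamps.popleft()
        if len(stamps) >= self.rpm:
            return False
        stamps.append(now)
        return True


limiter = RpmLimiter(rpm=2)
print([limiter.allow("sk-team-a") for _ in range(3)])
```

With `rpm = 100` as in the config above, the 101st request from `sk-team-a` inside one minute would be rejected under this scheme, while other keys are unaffected.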

## Architecture

```text
liter-llm/
├── crates/
│   ├── liter-llm/          # Rust core library
│   ├── liter-llm-py/       # Python (PyO3) core
│   ├── liter-llm-node/     # Node.js (NAPI-RS) core
│   ├── liter-llm-ffi/      # C-compatible FFI layer
│   ├── liter-llm-php/      # PHP (ext-php-rs) core
│   └── liter-llm-wasm/     # WebAssembly (wasm-bindgen) core
├── packages/
│   ├── python/             # Python package
│   ├── typescript/         # TypeScript/Node.js package
│   ├── go/                 # Go (cgo) module
│   ├── java/               # Java (Panama FFI) package
│   ├── ruby/               # Ruby (Magnus) gem
│   ├── elixir/             # Elixir (Rustler NIF) package
│   ├── csharp/             # .NET (P/Invoke) package
│   └── php/                # PHP (Composer) package
└── schemas/                # Provider registry and API schemas
```

## Quick Start

Install in your language of choice:

| Language | Install |
|----------|---------|
| Python | `pip install liter-llm` |
| Node.js | `pnpm add @kreuzberg/liter-llm` |
| Rust | `cargo add liter-llm` |
| Go | `go get github.com/kreuzberg-dev/liter-llm/packages/go` |
| Java | `dev.kreuzberg:liter-llm` (Maven/Gradle) |
| Ruby | `gem install liter_llm` |
| PHP | `composer require kreuzberg/liter-llm` |
| C# | `dotnet add package LiterLlm` |
| Elixir | `{:liter_llm, "~> 1.0"}` in mix.exs |
| WASM | `pnpm add @kreuzberg/liter-llm-wasm` |
| C/FFI | Build from source -- see [FFI crate](crates/liter-llm-ffi) |

### Usage

```python
import asyncio
import os

from liter_llm import LlmClient


async def main():
    client = LlmClient(api_key=os.environ["OPENAI_API_KEY"])

    # Chat with any provider using the provider/model prefix
    response = await client.chat(
        model="openai/gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)

    # Switch providers by changing the model prefix and API key;
    # the call itself is unchanged
    client2 = LlmClient(api_key=os.environ["ANTHROPIC_API_KEY"])
    response = await client2.chat(
        model="anthropic/claude-sonnet-4-20250514",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)


asyncio.run(main())
```

Or use a `liter-llm.toml` config file instead of passing everything in code:

```toml
api_key = "sk-..."
timeout_secs = 120

[cache]
max_entries = 512
ttl_seconds = 600
backend = "redis"
backend_config = { connection_string = "redis://localhost:6379" }

[budget]
global_limit = 50.0
enforcement = "hard"

[[providers]]
name = "my-provider"
base_url = "https://my-llm.example.com/v1"
model_prefixes = ["my-provider/"]
```
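
The `[budget]` table pairs a limit with an enforcement mode. The Python sketch below shows one plausible interpretation, in which `"hard"` rejects a request that would exceed the limit and `"soft"` only warns. That reading, the `Budget` and `BudgetExceeded` names, and the treatment of `global_limit` as a plain cost total are assumptions for illustration, not documented behavior.

```python
import warnings


class BudgetExceeded(Exception):
    """Raised in hard mode when spend would pass the global limit."""


class Budget:
    def __init__(self, global_limit: float, enforcement: str = "hard") -> None:
        if enforcement not in ("hard", "soft"):
            raise ValueError("enforcement must be 'hard' or 'soft'")
        self.global_limit = global_limit
        self.enforcement = enforcement
        self.spent = 0.0

    def record(self, cost: float) -> None:
        """Add a request's cost, applying the limit per the configured mode."""
        total = self.spent + cost
        if total > self.global_limit:
            message = f"budget exceeded: {total:.2f} > {self.global_limit:.2f}"
            if self.enforcement == "hard":
                # Hard mode: refuse the request and leave spend unchanged.
                raise BudgetExceeded(message)
            # Soft mode: warn, then let the request through.
            warnings.warn(message)
        self.spent = total


# Values mirror the [budget] table above.
budget = Budget(global_limit=50.0, enforcement="hard")
budget.record(49.0)
try:
    budget.record(2.0)
except BudgetExceeded as error:
    print(error)
```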

The same API is available in all 11 languages -- see the language READMEs below for idiomatic examples.

## Core API

All bindings expose a unified `chat()` function:

| Language | Usage |
| -------- | ----- |
| Rust | `DefaultClient::new(config).chat(messages, options).await` |
| Python | `LlmClient(api_key=...).chat(messages, config)` |
| Node.js | `new LlmClient({ apiKey }).chat(messages, config)` |
| Go | `client.Chat(ctx, messages, config)` |
| Java | `client.chat(messages, configJson)` |
| Ruby | `LiterLlm::LlmClient.new(api_key, config).chat(messages)` |
| Elixir | `LiterLlm.chat(messages, config)` |
| PHP | `LiterLlm\LlmClient::new($apiKey)->chat($messages, $config)` |
| C# | `new LlmClient(apiKey).ChatAsync(messages, config)` |
| WASM | `new LlmClient({ apiKey }).chat(messages, config)` |
| C FFI | `liter_llm_chat(client, messages_json, config_json)` |

## Language READMEs

| Language | README | Binding |
| -------- | ------ | ------- |
| Python | [packages/python](packages/python/README.md) | PyO3 |
| TypeScript / Node.js | [crates/liter-llm-node](crates/liter-llm-node/README.md) | NAPI-RS |
| Go | [packages/go](packages/go/README.md) | cgo |
| Java | [packages/java](packages/java/README.md) | Panama FFI |
| Ruby | [packages/ruby](packages/ruby/README.md) | Magnus |
| Elixir | [packages/elixir](packages/elixir/README.md) | Rustler NIF |
| PHP | [packages/php](packages/php/README.md) | ext-php-rs |
| .NET (C#) | [packages/csharp](packages/csharp/README.md) | P/Invoke |
| WebAssembly | [crates/liter-llm-wasm](crates/liter-llm-wasm/README.md) | wasm-bindgen |
| C/C++ (FFI) | [crates/liter-llm-ffi](crates/liter-llm-ffi) | C ABI |

## Part of kreuzberg.dev

liter-llm is built by the [kreuzberg.dev](https://kreuzberg.dev) team -- the same people behind [Kreuzberg](https://github.com/kreuzberg-dev/kreuzberg) (document extraction for 91+ formats), [tree-sitter-language-pack](https://github.com/kreuzberg-dev/tree-sitter-language-pack) (multilingual parsing), and [html-to-markdown](https://github.com/kreuzberg-dev/html-to-markdown). All our libraries share the same Rust-core, polyglot-bindings architecture. Visit [kreuzberg.dev](https://kreuzberg.dev) or find us on [GitHub](https://github.com/kreuzberg-dev).

## Contributing

Contributions are welcome! See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

Join our [Discord community](https://discord.gg/xt9WY3GnKR) for questions and discussion.

## License

MIT -- see [LICENSE](LICENSE) for details.