https://github.com/kreuzberg-dev/liter-llm

Universal LLM API client — 142+ providers, 11 native language bindings, powered by Rust core
anthropic api-client llm machine-learning openai polyglot python rust streaming typescript
Last synced: 3 months ago
Host: GitHub
URL: https://github.com/kreuzberg-dev/liter-llm
Owner: kreuzberg-dev
License: mit
Created: 2026-03-25T07:30:37.000Z (4 months ago)
Default Branch: main
Last Pushed: 2026-04-24T16:31:22.000Z (3 months ago)
Last Synced: 2026-04-24T18:10:07.621Z (3 months ago)
Topics: anthropic, api-client, llm, machine-learning, openai, polyglot, python, rust, streaming, typescript
Language: Rust
Homepage: https://kreuzberg.dev
Size: 11.5 MB
Stars: 153
Watchers: 1
Forks: 9
Open Issues: 13
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Codeowners: .github/CODEOWNERS

# liter-llm

**A lighter, faster, safer universal LLM API client** -- one Rust core, 11 native language bindings, 143 providers.

## Why liter-llm?

A universal LLM API client, compiled from the ground up in Rust. No interpreter, no transitive dependency tree, no supply chain surface area. One binary, 11 native language bindings, 143 providers.

- **Compiled Rust core.** No `pip install` supply chain. No `.pth` auto-execution hooks. No runtime dependency tree to compromise. The kind of [supply chain attack that hit litellm](https://www.xda-developers.com/popular-python-library-backdoor-machine/) in 2026 is structurally impossible here.

- **Secrets stay secret.** API keys are wrapped in [`secrecy::SecretString`](https://docs.rs/secrecy/) -- zeroed on drop, redacted in logs, never serialized.

- **Polyglot from day one.** Python, TypeScript, Go, Java, Ruby, PHP, C#, Elixir, WebAssembly, C/FFI -- all thin wrappers around the same Rust core. No reimplementation drift.

- **Observability built in.** Production-grade [OpenTelemetry](https://opentelemetry.io/) with GenAI semantic conventions -- not an afterthought callback system.

- **Composable middleware.** Rate limiting, caching, cost tracking, health checks, and fallback as [Tower](https://docs.rs/tower/) layers you stack like building blocks.
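
The "stack like building blocks" idea can be illustrated outside Rust. The following is a minimal Python sketch of layer composition, not the Tower API or liter-llm's own code: `with_cache`, `with_fallback`, and `stack` are hypothetical names, and a "service" is simply a function from prompt to reply.

```python
from typing import Callable, Dict, List

# A "service" is any callable that maps a prompt to a reply.
Service = Callable[[str], str]
# A "layer" wraps one service and returns another.
Layer = Callable[[Service], Service]


def with_cache(inner: Service) -> Service:
    """Return a service that memoizes replies by prompt (hypothetical layer)."""
    store: Dict[str, str] = {}

    def call(prompt: str) -> str:
        if prompt not in store:
            store[prompt] = inner(prompt)
        return store[prompt]

    return call


def with_fallback(backup: Service) -> Layer:
    """Build a layer that calls `backup` when the inner service raises."""
    def layer(inner: Service) -> Service:
        def call(prompt: str) -> str:
            try:
                return inner(prompt)
            except RuntimeError:
                return backup(prompt)
        return call
    return layer


def stack(service: Service, layers: List[Layer]) -> Service:
    """Wrap `service` in each layer; the last layer listed ends up outermost."""
    for layer in layers:
        service = layer(service)
    return service


calls = {"primary": 0, "backup": 0}


def flaky_primary(prompt: str) -> str:
    calls["primary"] += 1
    raise RuntimeError("provider unavailable")


def backup(prompt: str) -> str:
    calls["backup"] += 1
    return f"backup reply to {prompt!r}"


client = stack(flaky_primary, [with_fallback(backup), with_cache])
first = client("Hello")
second = client("Hello")  # answered by the cache layer, no provider call
```

Because every layer has the same shape, reordering the list changes behavior without touching any layer's code. With the cache outermost, as here, a cached reply skips both the primary and the backup.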

We give credit to [litellm](https://github.com/BerriAI/litellm) for proving the category -- our provider registry was bootstrapped from theirs. See [ATTRIBUTIONS.md](ATTRIBUTIONS.md).

## Feature Comparison

An honest look at where things stand. We're newer and leaner -- litellm has breadth we haven't matched yet, and we have depth they can't easily retrofit.

| | liter-llm | litellm |
|---|---|---|
| **Language** | Rust (compiled, memory-safe) | Python |
| **Bindings** | 11 native (Rust, Python, TS, Go, Java, Ruby, PHP, C#, Elixir, WASM, C) | Python (+ OpenAI-compatible proxy) |
| **Providers** | 143 (compiled at build time) | 100+ (runtime resolution) |
| **Streaming** | SSE + AWS EventStream binary protocol | SSE + AWS EventStream |
| **Observability** | Built-in OpenTelemetry (GenAI semconv) | 40+ callback integrations |
| **API key safety** | `secrecy::SecretString` (zeroed, redacted) | Plain strings |
| **Middleware** | Composable Tower stack | Built-in callback system |
| **Proxy / Gateway** | Yes (22 OpenAI-compatible endpoints, 35 MB Docker image) | Yes |
| **Guardrails** | -- | 10+ integrations, 4 execution modes (advanced: enterprise) |
| **Semantic caching** | -- | Redis + Qdrant backends |
| **Virtual key mgmt** | Yes (per-key model restrictions, RPM/TPM, budgets) | Yes (key rotation: enterprise) |
| **Management API** | Config-driven (REST admin API planned) | Multi-tenant (teams, budgets, keys; tiers + reporting: enterprise) |
| **Fine-tuning API** | -- | Enterprise only |
| **Load balancer** | Fallback + round-robin via Tower router | Full router with strategies |
| **Cost tracking** | Embedded pricing + OTEL spans | Per-key/team/model budgets |
| **Rate limiting** | Per-model RPM/TPM (Tower layer) | Per-key/user/team/model |
| **Caching** | In-memory LRU + 40+ backends via OpenDAL (S3, Redis, GCS, DynamoDB, disk, ...) | 7 backends (Redis, S3, GCS, disk, Qdrant) |
| **Tool calling** | Parallel tools, structured output, JSON schema | Full support |
| **Embeddings** | Yes | Yes |
| **Batch API** | Yes | Yes |
| **Audio / Speech** | Yes | Yes |
| **Lifecycle hooks** | onRequest/onResponse/onError per-client | Callback integrations |
| **Budget enforcement** | Per-model + global limits, hard/soft modes | Per-key/team budgets |
| **Health checks** | Automatic provider probes + cooldown | -- |
| **Custom providers** | Runtime API + TOML config file | Config + code-based |
| **Config files** | TOML with auto-discovery (`liter-llm.toml`) | YAML proxy config |
| **Search / OCR** | 12 search + 4 OCR providers | Yes |
| **Image generation** | Yes | Yes |

## Key Features

- **143 providers** -- OpenAI, Anthropic, Google, AWS Bedrock, Groq, Mistral, Together AI, Fireworks, Perplexity, DeepSeek, Cohere, and [130+ more](schemas/providers.json)

- **11 native bindings** -- Rust, Python, TypeScript/Node.js, Go, Java, Ruby, PHP, C#, Elixir, WebAssembly, C/FFI

- **First-class streaming** -- SSE and AWS EventStream binary protocol with zero-copy buffers

- **TOML configuration** -- `liter-llm.toml` with auto-discovery, custom providers, cache backends, middleware config

- **OpenTelemetry** -- GenAI semantic conventions, cost tracking spans, HTTP-level tracing

- **Tower middleware** -- Rate limiting, caching (40+ OpenDAL backends), cost tracking, budget enforcement, health checks, cooldowns, hooks, fallback -- all composable

- **Search & OCR** -- Web search across 12 providers, document OCR across 4 providers

- **Tool calling** -- Parallel tools, structured outputs, JSON schema validation

- **Embeddings** -- Dimension selection, base64 format, multi-provider support

- **Per-request routing** -- Automatic provider detection from model name prefix, custom provider registration at runtime

- **Schema-driven** -- Provider registry and API types compiled from JSON schemas, no runtime lookups

- **Local LLM support** -- Run models locally with Ollama, LM Studio, vLLM, llama.cpp, LocalAI, and llamafile via OpenAI-compatible APIs
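
To make the streaming bullet concrete, here is a small Python sketch of consuming an SSE chat stream. It assumes OpenAI-style chunks (`data: {...}` lines ending with `data: [DONE]`); `iter_sse_data` and `collect_text` are illustrative helpers, not part of liter-llm. It is deliberately simplified and ignores multi-line `data:` fields, which the SSE format allows.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each `data:` line until `[DONE]` is seen."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the `delta.content` fragments of a chat completion stream."""
    parts = []
    for chunk in iter_sse_data(lines):
        delta = chunk["choices"][0].get("delta", {})
        parts.append(delta.get("content") or "")
    return "".join(parts)


# Hand-written sample in the OpenAI streaming shape (not captured output).
sample_stream = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]
text = collect_text(sample_stream)
```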

## Proxy Server & CLI

Drop-in replacement for litellm's proxy -- 22 OpenAI-compatible endpoints. Install the `liter-llm` CLI (which ships both the proxy server and the MCP tool server) one of three ways:

```bash
# Homebrew (macOS / Linux)
brew install kreuzberg-dev/tap/liter-llm

# Pre-built binary (Linux x86_64/arm64, macOS arm64, Windows x86_64)
# Set VERSION and TARGET to match the release asset you want before running this.
curl -L "https://github.com/kreuzberg-dev/liter-llm/releases/latest/download/liter-llm-${VERSION}-${TARGET}.tar.gz" | tar xz

# Docker (35 MB image)
docker run -p 4000:4000 -e LITER_LLM_MASTER_KEY=sk-your-key ghcr.io/kreuzberg-dev/liter-llm
```

Then call it like OpenAI:

```bash
curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-your-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "openai/gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'
```
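
The same request can be built from Python with only the standard library. This is a sketch that assumes the proxy from the Docker example is listening on `localhost:4000` with `sk-your-key` as its key; `build_chat_request` is a hypothetical helper, and the send step is left commented out so the snippet runs without a server.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request("http://localhost:4000", "sk-your-key", "openai/gpt-4o", "Hello")

# To send it, a proxy must be listening on port 4000:
# with urllib.request.urlopen(request) as response:
#     reply = json.load(response)
#     print(reply["choices"][0]["message"]["content"])
```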

Or with a TOML config file:

```toml
# liter-llm-proxy.toml
[general]
master_key = "${LITER_LLM_MASTER_KEY}"

[[models]]
name = "gpt-4o"
provider_model = "openai/gpt-4o"
api_key = "${OPENAI_API_KEY}"

[[models]]
name = "claude-sonnet"
provider_model = "anthropic/claude-sonnet-4-20250514"
api_key = "${ANTHROPIC_API_KEY}"

[[keys]]
key = "sk-team-a"
models = ["gpt-4o"]
rpm = 100
```

**CLI:**

```bash
liter-llm api --config liter-llm-proxy.toml    # Start proxy server
liter-llm mcp --transport stdio                 # Start MCP tool server
```

**Features:** Model routing, virtual API keys, per-key rate limiting (RPM/TPM), cost tracking, budget enforcement, response caching, SSE streaming, OpenAPI 3.1 spec at `/openapi.json`, MCP server with 22 tools, graceful shutdown.
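
As an illustration of what an RPM limit such as `rpm = 100` means in practice, here is a minimal rolling-window limiter in Python. The proxy's actual algorithm is not documented here, so treat the rolling 60-second window as an assumption; `RpmLimiter` is a hypothetical class.

```python
from collections import deque


class RpmLimiter:
    """Allow at most `rpm` requests in any rolling 60-second window."""

    def __init__(self, rpm: int, window_secs: float = 60.0) -> None:
        self.rpm = rpm
        self.window_secs = window_secs
        self._stamps: deque = deque()  # timestamps of accepted requests

    def allow(self, now: float) -> bool:
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_secs:
            self._stamps.popleft()
        if len(self._stamps) >= self.rpm:
            return False  # limit reached; rejected requests are not recorded
        self._stamps.append(now)
        return True


limiter = RpmLimiter(rpm=2)
# Two requests pass, the third is rejected, and the window has cleared by t=61.
decisions = [limiter.allow(t) for t in (0.0, 1.0, 2.0, 61.0)]
```

Per-key limiting follows by keeping one limiter per virtual key, for example in a dictionary keyed by the API key string.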

## Architecture

```text
liter-llm/
├── crates/
│   ├── liter-llm/           # Rust core library
│   ├── liter-llm-py/        # Python (PyO3) core
│   ├── liter-llm-node/      # Node.js (NAPI-RS) core
│   ├── liter-llm-ffi/       # C-compatible FFI layer
│   ├── liter-llm-php/       # PHP (ext-php-rs) core
│   └── liter-llm-wasm/      # WebAssembly (wasm-bindgen) core
├── packages/
│   ├── python/               # Python package
│   ├── typescript/           # TypeScript/Node.js package
│   ├── go/                   # Go (cgo) module
│   ├── java/                 # Java (Panama FFI) package
│   ├── ruby/                 # Ruby (Magnus) gem
│   ├── elixir/               # Elixir (Rustler NIF) package
│   ├── csharp/               # .NET (P/Invoke) package
│   └── php/                  # PHP (Composer) package
└── schemas/                  # Provider registry and API schemas
```

## Quick Start

Install in your language of choice:

| Language | Install |
|----------|---------|
| Python | `pip install liter-llm` |
| Node.js | `pnpm add @kreuzberg/liter-llm` |
| Rust | `cargo add liter-llm` |
| Go | `go get github.com/kreuzberg-dev/liter-llm/packages/go` |
| Java | `dev.kreuzberg:liter-llm` (Maven/Gradle) |
| Ruby | `gem install liter_llm` |
| PHP | `composer require kreuzberg/liter-llm` |
| C# | `dotnet add package LiterLlm` |
| Elixir | `{:liter_llm, "~> 1.0"}` in mix.exs |
| WASM | `pnpm add @kreuzberg/liter-llm-wasm` |
| C/FFI | Build from source -- see [FFI crate](crates/liter-llm-ffi) |

### Usage

```python
import asyncio
import os

from liter_llm import LlmClient


async def main():
    client = LlmClient(api_key=os.environ["OPENAI_API_KEY"])

    # Chat with any provider using the provider/model prefix
    response = await client.chat(
        model="openai/gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)

    # Switch providers by changing the prefix -- no other code changes
    client2 = LlmClient(api_key=os.environ["ANTHROPIC_API_KEY"])
    response = await client2.chat(
        model="anthropic/claude-sonnet-4-20250514",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)


asyncio.run(main())
```
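
The prefix convention used above (`openai/...`, `anthropic/...`) can be pictured as a lookup from prefix to endpoint. The Python sketch below is an illustration of that idea only. `ProviderRegistry` is a hypothetical class, longest-prefix matching is an assumption, and liter-llm's real registry is compiled from JSON schemas rather than built at runtime like this.

```python
from typing import Dict, Tuple


class ProviderRegistry:
    """Map model-name prefixes to base URLs (illustrative only)."""

    def __init__(self) -> None:
        self._by_prefix: Dict[str, str] = {}

    def register(self, prefix: str, base_url: str) -> None:
        self._by_prefix[prefix] = base_url

    def resolve(self, model: str) -> Tuple[str, str]:
        """Return (base_url, bare_model) for the longest matching prefix."""
        matches = [p for p in self._by_prefix if model.startswith(p)]
        if not matches:
            raise KeyError(f"no provider registered for {model!r}")
        prefix = max(matches, key=len)
        return self._by_prefix[prefix], model[len(prefix):]


registry = ProviderRegistry()
registry.register("openai/", "https://api.openai.com/v1")
# A custom provider registered at runtime; the URL is a placeholder.
registry.register("my-provider/", "https://my-llm.example.com/v1")

base_url, bare_model = registry.resolve("openai/gpt-4o")
```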

Or use a `liter-llm.toml` config file instead of passing everything in code:

```toml
api_key = "sk-..."
timeout_secs = 120

[cache]
max_entries = 512
ttl_seconds = 600
backend = "redis"
backend_config = { connection_string = "redis://localhost:6379" }

[budget]
global_limit = 50.0
enforcement = "hard"

[[providers]]
name = "my-provider"
base_url = "https://my-llm.example.com/v1"
model_prefixes = ["my-provider/"]
```
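
The `[budget]` table pairs a `global_limit` with an `enforcement` mode. One plausible reading, sketched below in Python, is that `"hard"` rejects a request that would exceed the limit while `"soft"` lets it through and records a warning. This interpretation, the `Budget` class, and the unit of the limit are assumptions; check the library documentation for the actual behavior.

```python
class BudgetExceeded(Exception):
    """Raised in hard mode when a cost would push spending over the limit."""


class Budget:
    def __init__(self, global_limit: float, enforcement: str = "hard") -> None:
        if enforcement not in ("hard", "soft"):
            raise ValueError("enforcement must be 'hard' or 'soft'")
        self.global_limit = global_limit
        self.enforcement = enforcement
        self.spent = 0.0
        self.warnings: list = []

    def record(self, cost: float) -> None:
        """Add `cost` to the running total, applying the enforcement mode."""
        if self.spent + cost > self.global_limit:
            if self.enforcement == "hard":
                raise BudgetExceeded(f"limit {self.global_limit} would be exceeded")
            overshoot = self.spent + cost - self.global_limit
            self.warnings.append(f"over budget by {overshoot:.2f}")
        self.spent += cost


hard = Budget(global_limit=50.0, enforcement="hard")
hard.record(49.0)
try:
    hard.record(2.0)  # would reach 51.0, so hard mode refuses it
    hard_blocked = False
except BudgetExceeded:
    hard_blocked = True

soft = Budget(global_limit=50.0, enforcement="soft")
soft.record(49.0)
soft.record(2.0)  # allowed, but a warning is recorded
```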

The same API is available in all 11 languages -- see the language READMEs below for idiomatic examples.

## Core API

All bindings expose a unified `chat()` function:

| Language | Usage |
| -------- | ----- |
| Rust | `DefaultClient::new(config).chat(messages, options).await` |
| Python | `LlmClient(api_key=...).chat(messages, config)` |
| Node.js | `new LlmClient({ apiKey }).chat(messages, config)` |
| Go | `client.Chat(ctx, messages, config)` |
| Java | `client.chat(messages, configJson)` |
| Ruby | `LiterLlm::LlmClient.new(api_key, config).chat(messages)` |
| Elixir | `LiterLlm.chat(messages, config)` |
| PHP | `LiterLlm\LlmClient::new($apiKey)->chat($messages, $config)` |
| C# | `new LlmClient(apiKey).ChatAsync(messages, config)` |
| WASM | `new LlmClient({ apiKey }).chat(messages, config)` |
| C FFI | `liter_llm_chat(client, messages_json, config_json)` |

## Language READMEs

| Language | README | Binding |
| -------- | ------ | ------- |
| Python | [packages/python](packages/python/README.md) | PyO3 |
| TypeScript / Node.js | [crates/liter-llm-node](crates/liter-llm-node/README.md) | NAPI-RS |
| Go | [packages/go](packages/go/README.md) | cgo |
| Java | [packages/java](packages/java/README.md) | Panama FFI |
| Ruby | [packages/ruby](packages/ruby/README.md) | Magnus |
| Elixir | [packages/elixir](packages/elixir/README.md) | Rustler NIF |
| PHP | [packages/php](packages/php/README.md) | ext-php-rs |
| .NET (C#) | [packages/csharp](packages/csharp/README.md) | P/Invoke |
| WebAssembly | [crates/liter-llm-wasm](crates/liter-llm-wasm/README.md) | wasm-bindgen |
| C/C++ (FFI) | [crates/liter-llm-ffi](crates/liter-llm-ffi) | C ABI |

## Part of kreuzberg.dev

liter-llm is built by the [kreuzberg.dev](https://kreuzberg.dev) team -- the same people behind [Kreuzberg](https://github.com/kreuzberg-dev/kreuzberg) (document extraction for 91+ formats), [tree-sitter-language-pack](https://github.com/kreuzberg-dev/tree-sitter-language-pack) (multilingual parsing), and [html-to-markdown](https://github.com/kreuzberg-dev/html-to-markdown). All our libraries share the same Rust-core, polyglot-bindings architecture. Visit [kreuzberg.dev](https://kreuzberg.dev) or find us on [GitHub](https://github.com/kreuzberg-dev).

## Contributing

Contributions are welcome! See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

Join our [Discord community](https://discord.gg/xt9WY3GnKR) for questions and discussion.

## License

MIT -- see [LICENSE](LICENSE) for details.