

🐦 An open, blazing-fast, simple model gateway for rapid development of production GenAI apps







# Glide: Cloud-Native LLM Gateway for Seamless LLMOps



**Glide** is your go-to cloud-native LLM gateway, delivering high-performance LLMOps in a lightweight, all-in-one package.

We take the problems of managing and communicating with external providers out of your applications,
so you can focus on your core challenges.

> [!Important]
> Give us a star ⭐ to support the project and watch 👀 our repositories so you don't miss any updates. We appreciate your interest 🙏

Glide sits between your application and model providers to seamlessly handle various LLMOps tasks like
model failover, caching, key management, etc.

Take a look at the develop branch.

Check out our documentation!

> [!Warning]
> Glide is under active development right now 🛠️

## Features

- **Unified REST API** across providers. Avoid vendor lock-in and changes in your applications when you swap model providers.
- **High availability** and **resiliency** when working with external model providers. Automatic **fallbacks** on provider failures, rate limits, transient errors. Smart retries to reduce communication latency.
- Support for **popular LLM providers**.
- **High performance**. Performance is our priority. We want to keep Glide "invisible" latency-wise while providing rich functionality.
- **Production-ready observability** via OpenTelemetry: Glide emits metrics on model health and allows whitebox monitoring (coming soon).
- Straightforward maintenance and configuration, with centralized API key control, management, and rotation.
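The idea behind a unified API can be illustrated with a small Go sketch. This is not Glide's code: `ChatRequest`, `ChatProvider`, `echoProvider`, and `Ask` are hypothetical names, and the stub provider makes no network calls. It only shows how application logic can stay unchanged when the provider behind it is swapped.

```go
package main

import (
	"fmt"
	"strings"
)

// ChatRequest is a hypothetical provider-agnostic request shape.
type ChatRequest struct {
	Role    string
	Content string
}

// ChatProvider is a hypothetical interface that each provider adapter would implement.
type ChatProvider interface {
	Name() string
	Chat(req ChatRequest) (string, error)
}

// echoProvider is a stand-in for a real provider client; it makes no network calls.
type echoProvider struct{ name string }

func (p echoProvider) Name() string { return p.name }

func (p echoProvider) Chat(req ChatRequest) (string, error) {
	if strings.TrimSpace(req.Content) == "" {
		return "", fmt.Errorf("%s: empty message", p.name)
	}
	return fmt.Sprintf("[%s] reply to %q", p.name, req.Content), nil
}

// Ask holds the application logic; it never names a concrete provider.
func Ask(p ChatProvider, question string) (string, error) {
	return p.Chat(ChatRequest{Role: "user", Content: question})
}

func main() {
	// Swapping providers changes only this list, not Ask.
	providers := []ChatProvider{echoProvider{"provider-a"}, echoProvider{"provider-b"}}
	for _, p := range providers {
		reply, err := Ask(p, "Hello")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(reply)
	}
}
```

With a gateway, the same decoupling happens at the HTTP level: the application talks to one REST API and the gateway talks to the providers.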

### Large Language Models

| Provider | Supported Capabilities |
| --- | --- |
| OpenAI | ✅ Chat <br/> ✅ Streaming Chat |
| Anthropic | ✅ Chat <br/> 🏗️ Streaming Chat (coming soon) |
| Azure OpenAI | ✅ Chat <br/> 🏗️ Streaming Chat (coming soon) |
| AWS Bedrock (Titan) | ✅ Chat |
| Cohere | ✅ Chat <br/> 🏗️ Streaming Chat (coming soon) |
| Google Gemini | 🏗️ Chat (coming soon) |
| OctoML | ✅ Chat |
| Ollama | ✅ Chat |

## Get Started

### Installation

> [!Note]
> Windows users should follow the instructions in the demo README file, which explain how to do the steps without the `make` command, as Windows doesn't come with it by default.

The easiest way to deploy Glide is to use our demo repository and docker-compose.

### 1. Clone the demo repository

    git clone

### 2. Init Configs

The demo repository comes with a basic config. Additionally, you need to init your secrets by running:

    make init # from the demo root

This will create the `secrets` directory with one `.OPENAI_API_KEY` file where you need to put your key.

### 3. Start Glide

After that, start your demo environment with Docker Compose by running:

    make up

### 4. Sample API Request to `/chat` endpoint

See the API Reference for more details.

    {
      "message": {
        "role": "user",
        "content": "Where was it played?"
      },
      "messageHistory": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."}
      ]
    }
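A request body of this shape could be assembled in Go roughly as follows. This is a minimal sketch, not an official client: the type and function names are hypothetical, and wrapping the current turn in a `message` object alongside `messageHistory` is an assumption about the request schema. The endpoint URL is intentionally left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ChatMessage is a single turn in the conversation.
type ChatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// ChatRequest is a hypothetical mirror of the sample request body.
// The "message" wrapper is an assumption about the schema.
type ChatRequest struct {
	Message        ChatMessage   `json:"message"`
	MessageHistory []ChatMessage `json:"messageHistory"`
}

// BuildChatRequest serializes a user question plus prior history to JSON.
func BuildChatRequest(question string, history []ChatMessage) ([]byte, error) {
	req := ChatRequest{
		Message:        ChatMessage{Role: "user", Content: question},
		MessageHistory: history,
	}
	return json.Marshal(req)
}

func main() {
	history := []ChatMessage{
		{Role: "system", Content: "You are a helpful assistant."},
		{Role: "user", Content: "Who won the world series in 2020?"},
		{Role: "assistant", Content: "The Los Angeles Dodgers won the World Series in 2020."},
	}
	body, err := BuildChatRequest("Where was it played?", history)
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	// The resulting bytes would be sent as the POST body with
	// Content-Type: application/json.
	fmt.Println(string(body))
}
```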

### API Docs

Finally, Glide comes with OpenAPI documentation that is accessible via

That's it 🙌

Use our documentation to learn more about Glide's capabilities and configs.


Other ways to install Glide are available:

### Homebrew (macOS)

    brew tap einstack/tap
    brew install einstack/tap/glide

### Snapcraft (Linux)


    snap install glide

To upgrade the already installed package, you just need to run:

    snap refresh glide

Detailed instructions on Snapcraft installation are available for different Linux distros:

- Arch
- CentOS
- Debian
- elementaryOS
- Fedora
- KDE Neon
- Kubuntu
- Manjaro
- Pop! OS
- openSUSE
- RHEL
- Ubuntu
- Raspberry Pi

### Docker Images

Glide provides official images in our GHCR and DockerHub registries:

- Alpine 3.19:

      docker pull

- Ubuntu 22.04 LTS:

      docker pull

- Google Distroless (non-root):

      docker pull

- RedHat UBI 8.9 Micro:

      docker pull

### Helm Chart

Add the EinStack repository:

    helm repo add einstack
    helm repo update

Before installing the Helm chart, you need to create a Kubernetes secret with your API keys like:

    kubectl create secret generic api-keys --from-literal=OPENAI_API_KEY=sk-abcdXYZ

Then, you need to create a custom values.yaml file to override the secret name like:

    # save as custom.values.yaml, for example
    apiKeySecret: "api-keys"

Finally, you should be able to install Glide's chart via:

    helm upgrade glide-gateway einstack/glide --values custom.values.yaml --install

## SDKs

To let you work with Glide's API with ease, we are going to provide you with SDKs that fit your tech stack:

- Python (coming soon)
- NodeJS (coming soon)

## Core Concepts

### Routers

Routers are a core functionality of Glide. Think of a router as a group of models with some predefined logic. For example, the resilience router allows a user to define a set of backup models to use should the initial model fail. Another example is the least-latency router, which makes latency-sensitive LLM calls as efficiently as possible.
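The backup-model behavior described for the resilience router can be sketched in Go as follows. This is an illustration under stated assumptions, not Glide's implementation: `Model`, `staticModel`, `failingModel`, and `ChatWithFallback` are hypothetical names, the models are in-memory stubs, and the sketch does not track model health between requests.

```go
package main

import (
	"errors"
	"fmt"
)

// Model is a hypothetical stand-in for a configured LLM backend.
type Model struct {
	Name string
	Call func(prompt string) (string, error)
}

// staticModel always answers with the same reply (a stub for a healthy model).
func staticModel(name, reply string) Model {
	return Model{Name: name, Call: func(string) (string, error) { return reply, nil }}
}

// failingModel always fails (a stub for an outage or a rate limit).
func failingModel(name, reason string) Model {
	return Model{Name: name, Call: func(string) (string, error) { return "", errors.New(reason) }}
}

// ChatWithFallback tries the models in order and returns the name and reply
// of the first one that succeeds. If all fail, the errors are joined.
func ChatWithFallback(models []Model, prompt string) (string, string, error) {
	if len(models) == 0 {
		return "", "", errors.New("no models configured")
	}
	var errs []error
	for _, m := range models {
		reply, err := m.Call(prompt)
		if err == nil {
			return m.Name, reply, nil
		}
		errs = append(errs, fmt.Errorf("%s: %w", m.Name, err))
	}
	return "", "", errors.Join(errs...)
}

func main() {
	models := []Model{
		failingModel("primary", "rate limited"),
		staticModel("backup", "Hello from the backup model"),
	}
	used, reply, err := ChatWithFallback(models, "Hi")
	if err != nil {
		fmt.Println("all models failed:", err)
		return
	}
	fmt.Printf("%s answered: %s\n", used, reply)
}
```

The caller sees a single successful response even though the first model failed, which is the point of putting this logic in a gateway rather than in every application.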

See the Glide documentation for detailed info on routers.

#### Available Routers

| Router | Description |
| --- | --- |
| Priority | When the target model fails, the request is sent to the secondary model. The service instance keeps track of the number of failures per model, which reduces latency when a model fails. |
| Least Latency | Selects the model with the lowest average latency over time. If the least-latency model becomes unhealthy, the router picks the second best, and so on. |
| Round Robin | Splits traffic equally among the specified models. Great for A/B testing. |
| Weighted Round Robin | Splits traffic based on weights. For example, 70% of traffic goes to Model A and 30% to Model B. |
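Two of these selection strategies can be sketched in Go. The code below is a hedged illustration rather than Glide's router code: all type and function names are hypothetical, the weighted picker uses the well-known "smooth weighted round-robin" scheme as a swapped-in technique, and the latency picker assumes average latencies and health flags are already measured elsewhere. Neither is safe for concurrent use as written.

```go
package main

import "fmt"

// weightedModel pairs a model name with its configured weight.
type weightedModel struct {
	name    string
	weight  int
	current int // running score used by smooth weighted round-robin
}

// WeightedRoundRobin spreads picks across models in proportion to their weights.
// Plain round robin is the special case where all weights are equal.
type WeightedRoundRobin struct {
	models []*weightedModel
	total  int
}

// Add registers a model with the given weight.
func (r *WeightedRoundRobin) Add(name string, weight int) {
	r.models = append(r.models, &weightedModel{name: name, weight: weight})
	r.total += weight
}

// Next returns the next model name, or "" if no models were added.
func (r *WeightedRoundRobin) Next() string {
	var best *weightedModel
	for _, m := range r.models {
		m.current += m.weight
		if best == nil || m.current > best.current {
			best = m
		}
	}
	if best == nil {
		return ""
	}
	best.current -= r.total
	return best.name
}

// LatencyStats is a hypothetical snapshot of one model's observed behavior.
type LatencyStats struct {
	Name         string
	AvgLatencyMs float64
	Healthy      bool
}

// PickLeastLatency returns the healthy model with the lowest average latency.
// The boolean is false when no healthy model exists.
func PickLeastLatency(stats []LatencyStats) (string, bool) {
	var best LatencyStats
	found := false
	for _, s := range stats {
		if !s.Healthy {
			continue
		}
		if !found || s.AvgLatencyMs < best.AvgLatencyMs {
			best, found = s, true
		}
	}
	return best.Name, found
}

func main() {
	r := &WeightedRoundRobin{}
	r.Add("model-a", 7) // roughly 70% of traffic
	r.Add("model-b", 3) // roughly 30% of traffic
	for i := 0; i < 10; i++ {
		fmt.Print(r.Next(), " ")
	}
	fmt.Println()

	name, ok := PickLeastLatency([]LatencyStats{
		{Name: "model-a", AvgLatencyMs: 120, Healthy: true},
		{Name: "model-b", AvgLatencyMs: 80, Healthy: false},
		{Name: "model-c", AvgLatencyMs: 95, Healthy: true},
	})
	fmt.Println(name, ok)
}
```

The smooth variant interleaves picks instead of sending seven consecutive requests to one model, which keeps the short-term traffic split close to the configured weights.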

## Community

- Join Discord for real-time discussion

Open an issue or start a discussion
if there is a feature or an enhancement you'd like to see in Glide.

## Contribute

- Maintainers
  - Roman Hlushko, Software Engineer, Distributed Systems & MLOps
  - Max Krueger, Data & ML Engineer

Thanks to everyone who has already put in the effort to make Glide better and more feature-rich.

## License

Apache 2.0
