https://github.com/edenreich/inference-gateway

A unified Gateway for various LLM providers, including local solutions like Ollama, as well as major platforms such as Groq Cloud, Google, Cloudflare.
# Inference Gateway

CI Status | Version | License

The Inference Gateway is a proxy server designed to facilitate access to various language model APIs. It lets users interact with different language models through a unified interface, simplifying configuration and the process of sending requests and receiving responses from multiple LLMs, and making Mixture of Experts setups easy to use.

- [Key Features](#key-features)
- [Supported APIs](#supported-apis)
- [Configuration](#configuration)
- [Examples](#examples)
- [License](#license)

## Key Features

- πŸ“œ **Open Source**: Available under the MIT License.
- πŸš€ **Unified API Access**: Proxy requests to multiple language model APIs, including Groq, OpenAI, Ollama etc.
- βš™οΈ **Environment Configuration**: Easily configure API keys and URLs through environment variables.
- 🐳 **Docker Support**: Use Docker and Docker Compose for easy setup and deployment.
- ☸️ **Kubernetes Support**: Ready for deployment in Kubernetes environments.
- πŸ“Š **OpenTelemetry Tracing**: Enable tracing for the server to monitor and analyze performance.
- πŸ›‘οΈ **Production Ready**: Built with production in mind, with configurable timeouts and TLS support.
- 🌿 **Lightweight**: Includes only essential libraries and runtime, resulting in a small binary of ~8.6 MB.
- πŸ“‰ **Minimal Resource Consumption**: Designed to consume minimal resources and have a lower footprint.
- πŸ“š **Documentation**: Well documented with examples and guides.
- πŸ§ͺ **Tested**: Extensively tested with unit tests and integration tests.
- πŸ› οΈ **Maintained**: Actively maintained and developed.
- πŸ“ˆ **Scalable**: Easily scalable and can be used in a distributed environment - with HPA in Kubernetes.
- πŸ”’ **Compliance** and Data Privacy: This project does not collect data or analytics, ensuring compliance and data privacy.
- 🏠 **Self-Hosted**: Can be self-hosted for complete control over the deployment environment.

## Supported APIs

- [OpenAI](https://platform.openai.com/)
- [Ollama](https://ollama.com/)
- [Groq Cloud](https://console.groq.com/)
- [Google](https://aistudio.google.com/)
- [Cloudflare](https://www.cloudflare.com/)

## Configuration

The Inference Gateway can be configured using environment variables. The following [environment variables](./Configurations.md) are supported.

## Examples

- Using [Docker Compose](examples/docker-compose/)
- Using [Kubernetes](examples/kubernetes/)
- Using standard [REST endpoints](examples/rest-endpoints/)

## License

This project is licensed under the MIT License.