https://github.com/hizkifw/lmrouter
experimental language model router and load balancer
- Host: GitHub
- URL: https://github.com/hizkifw/lmrouter
- Owner: hizkifw
- Created: 2024-03-27T15:11:16.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2024-03-29T07:47:39.000Z (9 months ago)
- Last Synced: 2024-03-29T16:37:32.056Z (9 months ago)
- Language: Go
- Size: 60.5 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# lmrouter
Just like [AI Horde](https://stablehorde.net/) but specifically for low-latency
streaming text generation using ephemeral inference servers.

## Usage
```sh
# Build the project
go build .

# Run the server
./lmrouter server --listen :9090

# Run the agent
./lmrouter agent --hub ws://localhost:9090 --inference http://localhost:5000
```

## How it works
![diagram](.github/images/diagram.png)
lmrouter consists of two components: a server and an agent. The server acts as
the hub and routes incoming inference requests to any available agent. Each
agent runs on or close to the machine where the inference API is hosted.

Agents make an outbound WebSocket connection to the server, so there is no need
to set up port forwarding on agent nodes.

## Features
Implemented:
- `/v1/completions` endpoint
- `/v1/models` endpoint
- SSE streaming for completions endpoint
- Automatic selection of agent based on available models

To-do:
- `/v1/chat/completions` endpoint