Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/hizkifw/lmrouter

experimental language model router and load balancer

Last synced: about 1 month ago

Host: GitHub
URL: https://github.com/hizkifw/lmrouter
Owner: hizkifw
Created: 2024-03-27T15:11:16.000Z (9 months ago)
Default Branch: main
Last Pushed: 2024-03-29T07:47:39.000Z (9 months ago)
Last Synced: 2024-03-29T16:37:32.056Z (9 months ago)
Language: Go
Size: 60.5 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# lmrouter

Like [AI Horde](https://stablehorde.net/), but specifically for low-latency
streaming text generation using ephemeral inference servers.

## Usage

```sh
# Build the project
go build .

# Run the server
./lmrouter server --listen :9090

# Run the agent
./lmrouter agent --hub ws://localhost:9090 --inference http://localhost:5000
```

## How it works

![diagram](.github/images/diagram.png)

lmrouter consists of two components: a server and an agent. The server acts as
the hub and routes incoming inference requests to any available agent. Agents
run on the inference machine, close to where the inference API is hosted.

Each agent makes an outbound WebSocket connection to the server, so agent nodes
do not need port forwarding.
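
The hub's routing role can be sketched in a few lines of Go. This is a minimal illustration, not lmrouter's actual code: the `Agent`, `Hub`, `Register`, `Pick`, and `Release` names are hypothetical, the busy flag is an assumed way of tracking availability, and the WebSocket transport is left out so the example runs with the standard library only.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Agent is a hypothetical record the hub keeps per connected agent.
type Agent struct {
	ID     string
	Models []string // models served by the agent's inference API
	Busy   bool     // assumed availability flag
}

// Hub tracks connected agents. In lmrouter the agents dial out to the
// server; here registration is a plain method call.
type Hub struct {
	mu     sync.Mutex
	agents map[string]*Agent
}

// ErrNoAgent is returned when no idle agent serves the requested model.
var ErrNoAgent = errors.New("no available agent serves the requested model")

func NewHub() *Hub { return &Hub{agents: make(map[string]*Agent)} }

// Register adds an agent, as would happen when its connection is accepted.
func (h *Hub) Register(a *Agent) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.agents[a.ID] = a
}

// Unregister removes an agent, as would happen when its connection drops.
func (h *Hub) Unregister(id string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	delete(h.agents, id)
}

// Pick returns an idle agent that serves the model and marks it busy.
func (h *Hub) Pick(model string) (*Agent, error) {
	h.mu.Lock()
	defer h.mu.Unlock()
	for _, a := range h.agents {
		if a.Busy {
			continue
		}
		for _, m := range a.Models {
			if m == model {
				a.Busy = true
				return a, nil
			}
		}
	}
	return nil, ErrNoAgent
}

// Release marks an agent idle again once its request has finished.
func (h *Hub) Release(id string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if a, ok := h.agents[id]; ok {
		a.Busy = false
	}
}

func main() {
	h := NewHub()
	h.Register(&Agent{ID: "a1", Models: []string{"model-a"}})
	h.Register(&Agent{ID: "a2", Models: []string{"model-b"}})

	a, err := h.Pick("model-b")
	fmt.Println(a.ID, err)

	// The only agent serving model-b is now busy.
	_, err = h.Pick("model-b")
	fmt.Println(err)
}
```

Because agents register themselves over a connection they opened, the hub never needs to know an agent's address; it only needs the set of live connections and the models each one reported.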

## Features

Implemented:

- `/v1/completions` endpoint
- `/v1/models` endpoint
- SSE streaming for completions endpoint
- Automatic selection of agent based on available models

To-do:

- `/v1/chat/completions` endpoint