https://github.com/shanevcantwell/llauncher

A llama-server model endpoint manager with JiT swapping, a human UI, and MCP and pi extension support.
agentic-ai agentic-workflow ai llama-cpp llama-server llm mcp mcp-server pi-harness python server-admin

Host: GitHub
URL: https://github.com/shanevcantwell/llauncher
Owner: shanevcantwell
Created: 2026-04-05T10:11:37.000Z (2 months ago)
Default Branch: main
Last Pushed: 2026-05-02T14:40:57.000Z (about 1 month ago)
Last Synced: 2026-05-03T01:55:04.957Z (about 1 month ago)
Topics: agentic-ai, agentic-workflow, ai, llama-cpp, llama-server, llm, mcp, mcp-server, pi-harness, python, server-admin
Language: Python
Homepage: http://reflectiveattention.ai
Size: 746 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 18
Metadata Files:
- Readme: README.md

# llauncher

An MCP-first launcher and management tool for llama.cpp `llama-server` instances. Designed for both programmatic control via LLMs and human operators via a web UI.

## Features

### MCP Server
Full programmatic control for LLM agents and automation:
- **List models** with current status (running/stopped)
- **Start/stop servers** with validation and audit logging
- **Manage configurations** - add, update, remove model configs
- **Get server logs** for debugging and monitoring
- **Validate configurations** before applying changes

### Streamlit UI
Web-based dashboard for human operators:
- **Dashboard**: Overview of all models with quick Start/Stop buttons
- **Manager**: Add new models or edit existing configurations
- **Running**: View live logs from active servers with Stop controls

### Configuration
- **Config Persistence**: Store configurations in `~/.llauncher/config.json` (single source of truth)
- **Validation**: Model paths verified, port conflicts detected, blacklists enforced

## Installation

```bash
# Clone the repository
git clone https://github.com/shanevcantwell/llauncher
cd llauncher

# Install in development mode (with UI)
pip install -e ".[ui]"

# Optional: Install test dependencies
pip install -e ".[test]"
```

### Windows Notes

If you see warnings like `WARNING: Ignoring invalid distribution ~` during install:

```bat
REM Remove the corrupted virtual environment and reinstall
REM (adjust the path below to wherever you cloned the repository)
cd github\llauncher
rmdir /s /q .venv
python -m venv .venv
.venv\Scripts\activate
pip install -e ".[ui]"
```

## Quick Start

Use the runner scripts for easiest setup:

**Linux/macOS:**
```bash
./run.sh install # Set up virtual environment and install
./run.sh ui # Start dashboard (auto-starts agent)
./run.sh agent # Start agent in foreground
./run.sh stop # Stop running agent
```

**Windows:**
```cmd
REM Set up virtual environment and install
run.bat install
REM Start dashboard (auto-starts agent)
run.bat ui
REM Start agent in foreground
run.bat agent
REM Stop running agent
run.bat stop
```

## Usage

### MCP Server

Start the MCP server:

```bash
llauncher-mcp
```

Or configure in your MCP client (e.g., Claude Code):

```json
{
  "mcpServers": {
    "llauncher": {
      "command": "llauncher-mcp",
      "args": []
    }
  }
}
```

### Available MCP Tools

| Tool | Description |
|------|-------------|
| `list_models` | List all configured models with current status (running/stopped) |
| `get_model_config` | Get full configuration details for a specific model |
| `start_server` | Start a llama-server instance for a model (with validation) |
| `stop_server` | Stop a running server by port number |
| `swap_server` | Atomically swap models on a port with rollback guarantee |
| `server_status` | Get status summary of all running servers |
| `get_server_logs` | Fetch recent log lines from a running server |
| `update_model_config` | Update an existing model's configuration |
| `validate_config` | Validate a configuration without applying it |
| `add_model` | Add a new model configuration to the store |
| `remove_model` | Remove a model configuration (blocks if running) |
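
The `swap_server` row promises a swap with a rollback guarantee. The sketch below shows one way such a swap could be structured. It is not llauncher's actual implementation: the `start` and `stop` callables and the function name `swap_with_rollback` are hypothetical stand-ins for whatever launches and terminates `llama-server`.

```python
from typing import Callable, Optional


def swap_with_rollback(
    port: int,
    old_model: Optional[str],
    new_model: str,
    start: Callable[[str, int], bool],  # hypothetical: returns True on success
    stop: Callable[[int], None],        # hypothetical: frees the port
) -> str:
    """Replace the model on `port`; restore the old model if the new one fails.

    Returns the name of the model serving the port afterwards.
    """
    if old_model is not None:
        stop(port)  # free the port before launching the replacement
    if start(new_model, port):
        return new_model
    if old_model is None:
        raise RuntimeError(f"{new_model} failed to start on port {port}")
    # The new model failed: try to bring the previous one back.
    if start(old_model, port):
        return old_model
    raise RuntimeError(f"swap and rollback both failed on port {port}")


if __name__ == "__main__":
    running = {8081: "mistral"}

    def fake_start(model: str, port: int) -> bool:
        if model == "broken":
            return False
        running[port] = model
        return True

    def fake_stop(port: int) -> None:
        running.pop(port, None)

    # The failed swap leaves the original model in place.
    print(swap_with_rollback(8081, "mistral", "broken", fake_start, fake_stop))
```

Passing the start and stop operations in as callables keeps the rollback logic testable without a real `llama-server` binary.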

### Streamlit UI

Start the UI using the runner script (recommended):

**Linux/macOS:**
```bash
./run.sh ui
```

**Windows:**
```cmd
run.bat ui
```

The UI automatically starts a local agent if one isn't running. You can also start the agent separately with `./run.sh agent` or `run.bat agent`.

#### Dashboard Tab
- Grid view of all configured models with status indicators (🟢 Running / ⚫ Stopped)
- Quick **Start** and **Stop** buttons for each model
- **Edit** button redirects to Manager for configuration changes
- Links to API docs when server is running

#### Manager Tab
- **List Models**: View all models with expandable details (port, model path, GPU layers)
- **Add New Model**: Form to create new configurations with validation
- **Edit Model**: Pre-populated form to modify existing configurations
- **Delete Model**: Remove configurations (blocked if server is running)

#### Running Tab
- List of currently running servers with uptime
- Live log streaming for each server
- Stop button for each running instance

### CLI

llauncher provides an MCP server and Streamlit UI for model management. Use the runner scripts to start services:

```bash
./run.sh mcp # Start MCP server
./run.sh ui # Start Streamlit dashboard
```

## Configuration

Model configurations are stored in `~/.llauncher/config.json`. You can edit this file directly or manage configurations through the UI or the MCP tools.

Example config entry:

```json
{
  "mistral": {
    "name": "mistral",
    "model_path": "/path/to/model.gguf",
    "mmproj_path": null,
    "default_port": 8081,
    "n_gpu_layers": 255,
    "ctx_size": 131072,
    "threads": 8,
    "threads_batch": 8,
    "ubatch_size": 512,
    "batch_size": null,
    "flash_attn": "on",
    "no_mmap": false,
    "cache_type_k": "f32",
    "cache_type_v": "f32",
    "n_cpu_moe": null,
    "parallel": 1,
    "temperature": null,
    "top_k": null,
    "top_p": null,
    "min_p": null,
    "repeat_penalty": null,
    "reverse_prompt": null,
    "mlock": false,
    "extra_args": ""
  }
}
```
```
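
For scripting outside the UI and MCP tools, the file can be read with the standard library. This is a minimal sketch that assumes the whole file is a JSON object mapping model names to entries shaped like the example above. The helper names `load_models` and `summarize` are illustrative and not part of llauncher.

```python
import json
from pathlib import Path
from typing import Dict, List


def load_models(config_path: Path) -> Dict[str, dict]:
    """Return model entries keyed by name, or {} if the file does not exist."""
    if not config_path.exists():
        return {}
    with config_path.open(encoding="utf-8") as fh:
        return json.load(fh)


def summarize(models: Dict[str, dict]) -> List[str]:
    """One line per model: name, default port, and model path."""
    return [
        f"{name}: port={cfg.get('default_port')} path={cfg.get('model_path')}"
        for name, cfg in sorted(models.items())
    ]


if __name__ == "__main__":
    config_file = Path.home() / ".llauncher" / "config.json"
    for line in summarize(load_models(config_file)):
        print(line)
```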

## Change Management

llauncher includes validation rules to prevent problematic actions:

- **Port conflicts**: Prevents starting models on ports already in use
- **Blacklisted ports**: Default blacklist includes port 8080 (commonly used by other services)
- **Model whitelists**: Optionally restrict which models can be started
- **Caller blacklists**: Restrict which callers (UI, MCP, etc.) can perform actions
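
As an illustration of the first two rules, the sketch below checks a port against a blacklist and probes whether something is already listening on it. It is an assumption about how such checks could look, not llauncher's own code. The names `port_in_use` and `check_start` are hypothetical, and only the default blacklist entry 8080 comes from the list above.

```python
import socket
from typing import Iterable, List

DEFAULT_PORT_BLACKLIST = frozenset({8080})  # default named in the list above


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """True if something is already accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0


def check_start(
    port: int, blacklist: Iterable[int] = DEFAULT_PORT_BLACKLIST
) -> List[str]:
    """Return human-readable problems; an empty list means the port looks usable."""
    problems = []
    if port in set(blacklist):
        problems.append(f"port {port} is blacklisted")
    if port_in_use(port):
        problems.append(f"port {port} is already in use")
    return problems


if __name__ == "__main__":
    print(check_start(8080))
```

Returning a list of problems rather than raising on the first one lets a caller report every failed rule at once.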

## Project Structure

```
llauncher/
├── pyproject.toml
├── llauncher/
│   ├── __init__.py
│   ├── __main__.py
│   ├── agent/             # HTTP agent for multi-node management
│   │   ├── config.py
│   │   ├── routing.py
│   │   └── server.py
│   ├── core/
│   │   ├── config.py      # Config persistence
│   │   ├── process.py     # Process management
│   │   └── settings.py    # Global settings
│   ├── mcp/
│   │   ├── server.py      # MCP server
│   │   └── tools/         # Tool implementations
│   ├── models/
│   │   └── config.py      # Pydantic models
│   ├── remote/            # Multi-node support
│   │   ├── node.py
│   │   ├── registry.py
│   │   └── state.py
│   ├── state.py           # StateManager
│   └── ui/
│       ├── app.py         # Streamlit app
│       └── tabs/          # UI components
```

## Testing

Run the test suite:

```bash
pytest
# or with coverage
pytest --cov=llauncher --cov-report=term-missing
```

Test files are in `tests/`:
- `tests/unit/`: Unit tests for models, config, and process
- `tests/integration/`: Integration tests for state management

## Multi-Node Management (Remote)

llauncher supports managing llama-server instances across multiple machines (Windows and Linux) on a local network from a single dashboard.

### Architecture

Each managed node runs a lightweight **agent** that exposes an HTTP API. The "head" dashboard connects to these agents over the LAN:

```
┌─────────────────────────────────────┐
│           HEAD DASHBOARD            │
│  - Streamlit UI with node selector  │
│  - Connects to all agents via HTTP  │
└─────────────┬───────────────────────┘
              │ LAN (port 8765)
    ┌─────────┼─────────┐
    ▼         ▼         ▼
┌────────┐ ┌────────┐ ┌────────┐
│ Agent  │ │ Agent  │ │ Agent  │
│ Linux  │ │Windows │ │ Linux  │
│ :8765  │ │ :8765  │ │ :8765  │
└────────┘ └────────┘ └────────┘
```

### Deployment

#### 1. Install on Each Node

On every machine you want to manage (including the head):

**Linux/macOS:**
```bash
git clone https://github.com/shanevcantwell/llauncher
cd llauncher
./run.sh install
```

**Windows:**
```cmd
git clone https://github.com/shanevcantwell/llauncher
cd llauncher
run.bat install
```

#### 2. Start the Agent on Each Node

**Using runner scripts (recommended):**

**Linux/macOS:**
```bash
./run.sh agent # Foreground
./run.sh agent-bg # Background
./run.sh stop # Stop agent
```

**Windows:**
```cmd
REM Foreground
run.bat agent
REM Background
run.bat agent-bg
REM Stop agent
run.bat stop
```

**With custom configuration:**
```bash
# Linux/macOS
LAUNCHER_AGENT_PORT=9000 LAUNCHER_AGENT_NODE_NAME="my-server" ./run.sh agent

# Windows (PowerShell)
$env:LAUNCHER_AGENT_PORT="9000"
$env:LAUNCHER_AGENT_NODE_NAME="my-server"
.\run.bat agent
```

**Environment Variables:**
- `LAUNCHER_AGENT_HOST`: Host to bind to (default: `0.0.0.0`)
- `LAUNCHER_AGENT_PORT`: Port to listen on (default: `8765`)
- `LAUNCHER_AGENT_NODE_NAME`: Friendly name for the node
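
The following sketch shows how these variables and their documented defaults could be resolved in Python. It is illustrative only and is not taken from the agent's source. The `AgentSettings` type and `agent_settings` function are hypothetical. The hostname fallback for the node name is also an assumption, since no default is documented for that variable.

```python
import os
import socket
from typing import Mapping, NamedTuple


class AgentSettings(NamedTuple):
    host: str
    port: int
    node_name: str


def agent_settings(env: Mapping[str, str] = os.environ) -> AgentSettings:
    """Resolve agent settings from environment variables with defaults."""
    return AgentSettings(
        host=env.get("LAUNCHER_AGENT_HOST", "0.0.0.0"),
        port=int(env.get("LAUNCHER_AGENT_PORT", "8765")),
        # Assumption: no default node name is documented, so fall back
        # to the machine's hostname.
        node_name=env.get("LAUNCHER_AGENT_NODE_NAME", socket.gethostname()),
    )


if __name__ == "__main__":
    print(agent_settings())
```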

#### 3. Start the Dashboard on the Head Machine

**Linux/macOS:**
```bash
./run.sh ui
```

**Windows:**
```cmd
run.bat ui
```

The dashboard will automatically:
1. Show a loading screen while initializing
2. Start a local agent if one isn't running
3. Register itself as the "local" node

#### 4. Add Remote Nodes

In the dashboard:
1. Go to the **Nodes** tab
2. Click **➕ Add New Node**
3. Enter:
- **Node Name**: Friendly name (e.g., `linux-box`, `windows-server`)
- **Host**: IP address or hostname (e.g., `192.168.1.100`)
- **Port**: Agent port (default: `8765`)
4. Click **🔍 Test Connection** to verify
5. Click **➕ Add Node** to register

### Network Configuration

#### Firewall Rules

Ensure port 8765 is open on managed nodes:

**Linux (ufw):**
```bash
sudo ufw allow 8765/tcp
```

**Linux (firewalld):**
```bash
sudo firewall-cmd --permanent --add-port=8765/tcp
sudo firewall-cmd --reload
```

**Windows (PowerShell):**
```powershell
New-NetFirewallRule -DisplayName "llauncher Agent" -Direction Inbound -LocalPort 8765 -Protocol TCP -Action Allow
```

#### Security Notes

- **Trusted LAN Only**: Agents run without authentication by default. Only expose them on trusted networks.
- **Bind to Specific Interface**: Use `LAUNCHER_AGENT_HOST` to bind to a specific IP instead of `0.0.0.0`.
- **Firewall**: Restrict port 8765 to your LAN subnet.

### Usage

#### Dashboard Tab

- **Node Selector** (sidebar): Filter view by specific node or "All Nodes"
- **Running Servers**: Shows all active servers with node badges
- **Models**: Lists all configured models grouped by node
- **Start/Stop**: Control servers on any node

#### Nodes Tab

- **Registered Nodes**: List of all connected nodes with status
- **Test Connection**: Verify agent connectivity
- **Remove Node**: Unregister a node from the dashboard

### Troubleshooting

#### "Connection Failed" when adding node

1. Verify agent is running on the remote node:
```bash
curl http://<node-ip>:8765/health
```

2. Check firewall rules on the remote node

3. Verify the agent is binding to the correct interface:
```bash
# Should show 0.0.0.0:8765 or your LAN IP
netstat -tlnp | grep 8765
```
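
To run the same health check against several nodes at once, a short script can query each agent's `/health` endpoint. This is a sketch under assumptions: only the endpoint path and the default port come from this README, the function name `check_agents` is hypothetical, and the response body is ignored because its format is not documented here.

```python
import urllib.error
import urllib.request
from typing import Dict, Iterable, Tuple

# Agents are reached directly on the LAN, so ignore any system proxy settings.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def check_agents(
    nodes: Iterable[Tuple[str, int]], timeout: float = 2.0
) -> Dict[str, str]:
    """Query /health on each (host, port) pair and describe the outcome."""
    results = {}
    for host, port in nodes:
        key = f"{host}:{port}"
        url = f"http://{host}:{port}/health"
        try:
            with _opener.open(url, timeout=timeout) as resp:
                results[key] = f"ok (HTTP {resp.status})"
        except urllib.error.HTTPError as exc:
            # Something answered, but not with a success status.
            results[key] = f"answered with HTTP {exc.code}"
        except OSError as exc:
            # Covers URLError, refused connections, and timeouts.
            results[key] = f"unreachable ({exc})"
    return results
```

For example, `check_agents([("192.168.1.100", 8765)])` returns a one-entry dictionary describing whether that node responded. An "unreachable" result points to the firewall and interface checks above.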

#### Agent won't start

1. Check if port 8765 is already in use:
```bash
lsof -i :8765
# or
netstat -tlnp | grep 8765
```

2. Use a different port:
```bash
LAUNCHER_AGENT_PORT=9000 llauncher-agent
```

#### Can't connect from Windows to Linux (or vice versa)

1. Verify network connectivity:
```bash
ping <node-ip>
```

2. Check that the agent is not binding to localhost only:
- Look for `0.0.0.0:8765` in agent startup logs
- If it shows `127.0.0.1:8765`, set `LAUNCHER_AGENT_HOST=0.0.0.0`

### API Documentation

When an agent is running, visit `http://<node-ip>:8765/docs` for interactive API documentation.

## License

MIT
