https://github.com/adityak74/subagent-fleet

Run Claude Code-style subagents across your local model fleet.

claude-code cli developer-tools litellm local-ai local-first ollama python subagents

Host: GitHub
URL: https://github.com/adityak74/subagent-fleet
Owner: adityak74
License: mit
Created: 2026-06-15T05:10:12.000Z (9 days ago)
Default Branch: main
Last Pushed: 2026-06-15T18:24:02.000Z (8 days ago)
Last Synced: 2026-06-15T20:15:43.985Z (8 days ago)
Topics: claude-code, cli, developer-tools, litellm, local-ai, local-first, ollama, python, subagents
Language: Python
Size: 43.9 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Agents: AGENTS.md

# subagent-fleet

**Run Claude Code-style subagents across your local model fleet.**

`subagent-fleet` is a config-first Python CLI: you map each coding subagent to the Ollama model and machine that suits it, and the tool generates the matching LiteLLM and Claude Code-style agent configuration.

[![GitHub Repo stars](https://img.shields.io/github/stars/adityak74/subagent-fleet?style=social)](https://github.com/adityak74/subagent-fleet/stargazers)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
![Python](https://img.shields.io/badge/python-3.10%2B-blue)
![CLI](https://img.shields.io/badge/interface-CLI-4B5563)
![Ollama](https://img.shields.io/badge/ollama-compatible-111827)
![LiteLLM](https://img.shields.io/badge/litellm-ready-2563EB)
![GitHub last commit](https://img.shields.io/github/last-commit/adityak74/subagent-fleet)
![GitHub issues](https://img.shields.io/github/issues/adityak74/subagent-fleet)

[Quickstart](#quickstart) • [Configuration](#configuration) • [Generated Files](#generated-files) • [Security](#security) • [Roadmap](#roadmap)

## Overview

Local model users often have more than one useful machine: a laptop, a Mac mini, a workstation, a home server, or a spare GPU box. Most coding harnesses still point at one model endpoint.

`subagent-fleet` turns that setup into a private local subagent fleet:

```text
planner -> small fast model on a lightweight node
implementer -> larger coding model on a bigger node
reviewer -> larger coding model on a bigger node
summarizer -> small local model on the controller
```

It does not replace Ollama, LiteLLM, or Claude Code. It generates the glue between them:

```text
Claude Code / coding harness
        |
        v
LiteLLM gateway generated by subagent-fleet
        |
        +-- Ollama node: laptop
        +-- Ollama node: Mac mini 64GB
        +-- Ollama node: workstation
```

## Features

- Validate a declarative `fleet.yaml`.
- Discover models from configured Ollama nodes via `/api/tags`.
- Generate `litellm_config.yaml` with `ollama_chat/` routes.
- Generate Claude Code-style `.claude/agents/*.md` files.
- Generate `.env.subagent-fleet` for Claude Code/LiteLLM environment variables.
- Warm configured Ollama models with `keep_alive`.
- Show node health and agent routing tables.
- Keep unreachable nodes isolated so one offline machine does not crash the whole workflow.
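
The discovery and isolation behavior listed above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function names `fetch_tags` and `discover` and the result shape are hypothetical. It relies only on the fact that Ollama's `/api/tags` returns a JSON object with a `models` list whose entries carry a `name`.

```python
import json
import urllib.request
from typing import Callable, Dict


def fetch_tags(endpoint: str, timeout: float = 3.0) -> dict:
    """GET <endpoint>/api/tags and decode the JSON body."""
    url = endpoint.rstrip("/") + "/api/tags"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def discover(nodes: Dict[str, str],
             fetch: Callable[[str], dict] = fetch_tags) -> Dict[str, dict]:
    """Query every node; record a failure per node instead of raising."""
    results = {}
    for name, endpoint in nodes.items():
        try:
            payload = fetch(endpoint)
            models = [entry["name"] for entry in payload.get("models", [])]
            results[name] = {"reachable": True, "models": models}
        except (OSError, ValueError, KeyError) as exc:
            # URLError and socket timeouts are OSError subclasses;
            # malformed JSON raises ValueError; a missing "name" raises KeyError.
            results[name] = {"reachable": False, "models": [], "error": str(exc)}
    return results
```

Because each node is wrapped in its own `try` block, one offline machine only marks its own entry as unreachable while the rest of the fleet is still reported.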

## Status

MVP CLI implemented.

Available commands:

```bash
subagent-fleet init
subagent-fleet validate
subagent-fleet discover
subagent-fleet generate
subagent-fleet warmup
subagent-fleet status
subagent-fleet doctor
subagent-fleet clean
subagent-fleet skills list
subagent-fleet skills install
subagent-fleet plugins install
```

## Install

Choose one of the install paths below.

### CLI from PyPI

Install the CLI directly from PyPI:

```bash
python -m pip install subagent-fleet
```

Or install it as an isolated command with `pipx`:

```bash
pipx install subagent-fleet
```

Verify:

```bash
subagent-fleet --help
```

### Development Checkout

Use this when contributing to the project:

```bash
git clone https://github.com/adityak74/subagent-fleet.git
cd subagent-fleet
python -m pip install -e ".[dev]"
```

Run tests:

```bash
python -m pytest
```

### Claude Code Plugin First

Install the plugin first from Claude Code, then let the bundled bootstrap skill install the CLI:

```text
/plugin marketplace add https://github.com/adityak74/subagent-fleet
/plugin install subagent-fleet
```

After install, ask Claude Code:

```text
Use the subagent-fleet bootstrap skill to install the CLI and set up this repo.
```

The bootstrap skill will run or recommend:

```bash
python -m pip install subagent-fleet
subagent-fleet skills install
```

### Codex Plugin First

From a checkout of this repository, register it as a local Codex marketplace:

```bash
codex plugin marketplace add .
codex plugin add subagent-fleet@subagent-fleet
```

Then ask Codex:

```text
Use the subagent-fleet bootstrap skill to install the CLI and set up this repo.
```

## Quickstart

Create a starter config:

```bash
subagent-fleet init
```

Edit `fleet.yaml` with your Ollama node endpoints and model names, then validate it:

```bash
subagent-fleet validate
```

Check which nodes are reachable:

```bash
subagent-fleet discover
```

Generate LiteLLM, Claude agent, and environment files:

```bash
subagent-fleet generate
```

Start LiteLLM:

```bash
export LITELLM_MASTER_KEY="sk-local-dev"

litellm \
  --config ./litellm_config.yaml \
  --host 127.0.0.1 \
  --port 4000
```

Point Claude Code at the local gateway:

```bash
source .env.subagent-fleet
claude
```
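
The exact contents of `.env.subagent-fleet` are not shown in this README. As a rough sketch only, the file could be rendered from the `gateway` settings as below. The helper name `render_env` is hypothetical, and the choice of `ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN` (environment variables that Claude Code reads) is an assumption about what the generator emits.

```python
def render_env(gateway: dict) -> str:
    """Render shell exports that point Claude Code at the local gateway.

    Assumption: the generated file uses ANTHROPIC_BASE_URL and
    ANTHROPIC_AUTH_TOKEN; the real generator may emit different variables.
    """
    base_url = "http://{host}:{port}".format(**gateway)
    # Reference the key by environment variable name so no secret is written.
    token_ref = "${" + gateway["master_key_env"] + "}"
    lines = [
        'export ANTHROPIC_BASE_URL="' + base_url + '"',
        'export ANTHROPIC_AUTH_TOKEN="' + token_ref + '"',
    ]
    return "\n".join(lines) + "\n"
```

Referencing `LITELLM_MASTER_KEY` by name rather than by value keeps the file safe to treat as ordinary local configuration.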

## Configuration

`subagent-fleet` is driven by `fleet.yaml`.

```yaml
project:
  name: local-dev

gateway:
  provider: litellm
  host: 127.0.0.1
  port: 4000
  master_key_env: LITELLM_MASTER_KEY

nodes:
  m5-local:
    endpoint: http://localhost:11434
    tags: [controller, local, fast]

  m4-mini-64gb:
    endpoint: http://192.168.1.50:11434
    tags: [heavy, coder, reviewer]

  m4-mini-16gb:
    endpoint: http://192.168.1.51:11434
    tags: [small, planner, summarizer]

models:
  heavy-coder:
    node: m4-mini-64gb
    ollama_model: qwen2.5-coder:32b
    litellm_alias: claude-sonnet-local
    context: 32768
    timeout: 600
    max_parallel: 1

  small-coder:
    node: m4-mini-16gb
    ollama_model: qwen2.5-coder:7b
    litellm_alias: claude-haiku-local
    context: 8192
    timeout: 300
    max_parallel: 1

agents:
  planner:
    model: small-coder
    description: Use for planning, file discovery, task decomposition, and summarization.
    tools: [Read, Grep, Glob]
    prompt: |
      You are a fast local planning agent.
      Do not edit files.
      Return a concise response with:
      - plan
      - relevant files
      - risks
      - next recommended agent

  implementer:
    model: heavy-coder
    description: Use for implementation, bug fixes, refactors, and patch creation.
    tools: [Read, Grep, Glob, Edit, MultiEdit, Bash]

  reviewer:
    model: heavy-coder
    description: Use after implementation to review diffs, tests, regressions, and maintainability.
    tools: [Read, Grep, Glob, Bash]
```
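
The config above is a chain of references: each agent names a model, and each model names a node. A reference check of the kind `subagent-fleet validate` is described as performing could look roughly like the following. This is an illustrative sketch over an already-parsed config dictionary, and `check_references` is a hypothetical name rather than the project's API.

```python
def check_references(config: dict) -> list:
    """Return human-readable errors for dangling references and duplicate aliases."""
    errors = []
    nodes = config.get("nodes", {})
    models = config.get("models", {})

    # Every model must point at a declared node.
    for name, model in models.items():
        node = model.get("node")
        if node not in nodes:
            errors.append("model '%s' references unknown node '%s'" % (name, node))

    # LiteLLM aliases must be unique, otherwise routes would collide.
    seen = set()
    for name, model in models.items():
        alias = model.get("litellm_alias")
        if alias in seen:
            errors.append("model '%s' reuses litellm_alias '%s'" % (name, alias))
        seen.add(alias)

    # Every agent must point at a declared model.
    for name, agent in config.get("agents", {}).items():
        target = agent.get("model")
        if target not in models:
            errors.append("agent '%s' references unknown model '%s'" % (name, target))

    return errors
```

Collecting all errors before returning lets a user fix every broken reference in one pass instead of rerunning after each failure.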

## Generated Files

Running:

```bash
subagent-fleet generate
```

creates:

```text
litellm_config.yaml
.claude/agents/planner.md
.claude/agents/implementer.md
.claude/agents/reviewer.md
.env.subagent-fleet
```

Example LiteLLM route:

```yaml
model_list:
  - model_name: claude-sonnet-local
    litellm_params:
      model: ollama_chat/qwen2.5-coder:32b
      api_base: http://192.168.1.50:11434
      api_key: ollama
      timeout: 600
    model_info:
      max_input_tokens: 32768
```
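
To make the mapping from `fleet.yaml` to this route explicit, here is a small sketch that builds one `model_list` entry from a model definition and its node. The function name `build_route` is hypothetical; the field mapping is inferred from the two examples in this README, not taken from the generator's source.

```python
def build_route(model: dict, nodes: dict) -> dict:
    """Build one LiteLLM model_list entry from a fleet model definition."""
    node = nodes[model["node"]]
    return {
        "model_name": model["litellm_alias"],
        "litellm_params": {
            # The ollama_chat/ prefix selects LiteLLM's Ollama chat provider.
            "model": "ollama_chat/" + model["ollama_model"],
            "api_base": node["endpoint"],
            "api_key": "ollama",
            "timeout": model["timeout"],
        },
        "model_info": {"max_input_tokens": model["context"]},
    }
```

Applied to the `heavy-coder` model and the `m4-mini-64gb` node from the configuration section, this yields the route shown above.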

Example Claude agent:

```markdown
---
name: planner
description: Use for planning, file discovery, task decomposition, and summarization.
model: claude-haiku-local
tools: Read, Grep, Glob
---

You are a fast local planning agent.
Do not edit files.
Return a concise response with:
- plan
- relevant files
- risks
- next recommended agent
```
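
The agent file is YAML frontmatter followed by the prompt body. A rendering step along these lines would produce it; `render_agent` is a hypothetical helper written for illustration, and the assumption that the `model` field carries the LiteLLM alias comes from comparing the example above with the `small-coder` definition.

```python
def render_agent(name: str, agent: dict, models: dict) -> str:
    """Render a Claude Code-style agent file: frontmatter, blank line, prompt."""
    alias = models[agent["model"]]["litellm_alias"]
    frontmatter = [
        "---",
        "name: " + name,
        "description: " + agent["description"],
        "model: " + alias,
        "tools: " + ", ".join(agent["tools"]),
        "---",
    ]
    body = agent.get("prompt", "").rstrip()
    return "\n".join(frontmatter) + "\n\n" + body + "\n"
```

The agent refers to the model by its alias, so the agent files stay valid if the underlying Ollama model behind that alias is swapped later.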

## Commands

| Command | Purpose |
| --- | --- |
| `subagent-fleet init` | Create a starter `fleet.yaml`. |
| `subagent-fleet validate` | Validate schema, references, URLs, aliases, and agent names. |
| `subagent-fleet discover` | Query configured Ollama nodes for available models. |
| `subagent-fleet generate` | Generate LiteLLM config, Claude agents, and env file. |
| `subagent-fleet warmup` | Preload configured Ollama models with `keep_alive`. |
| `subagent-fleet status` | Show node health and agent routing. |
| `subagent-fleet doctor` | Show validation and local-network safety guidance. |
| `subagent-fleet clean` | List or remove generated files. |
| `subagent-fleet skills list` | List bundled assistant skills and supported targets. |
| `subagent-fleet skills install` | Install assistant-facing setup and operations skills. |
| `subagent-fleet plugins install` | Install Claude Code and Codex plugin marketplace bundles. |

JSON output is available for discovery and status:

```bash
subagent-fleet discover --json
subagent-fleet status --json
```

## Assistant Skills

`subagent-fleet` ships assistant-facing skills that teach Claude Code, Codex, OpenCode, and similar tools how to set up and operate the fleet from inside a repository.

List bundled skills and supported targets:

```bash
subagent-fleet skills list
```

Install all bundled skills for all supported targets:

```bash
subagent-fleet skills install
```

This writes:

```text
.claude/skills/subagent-fleet-setup/SKILL.md
.claude/skills/subagent-fleet-operations/SKILL.md
.codex/skills/subagent-fleet-setup/SKILL.md
.codex/skills/subagent-fleet-operations/SKILL.md
.opencode/skills/subagent-fleet-setup/SKILL.md
.opencode/skills/subagent-fleet-operations/SKILL.md
```

Install for a specific assistant:

```bash
subagent-fleet skills install --target codex
subagent-fleet skills install --target claude-code
subagent-fleet skills install --target opencode
```

Install one bundled skill:

```bash
subagent-fleet skills install --skill subagent-fleet-setup
```

Existing skill files are not overwritten unless you pass `--force`.
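
The overwrite rule can be expressed as a small guard around the file write. This is a sketch of the described behavior, not the project's implementation; `write_skill` and its `force` parameter are illustrative names that mirror the `--force` flag.

```python
from pathlib import Path


def write_skill(path: Path, content: str, force: bool = False) -> bool:
    """Write a skill file; return False and leave it alone if it already exists."""
    if path.exists() and not force:
        return False
    # Create .claude/skills/<name>/ and similar directories on demand.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    return True
```

Returning a boolean lets the caller report which files were written and which were skipped.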

## Plugin Marketplaces

This repository also ships plugin marketplace metadata so users can install the assistant skill first, then let that skill install and verify the Python CLI.

Included plugin artifacts:

```text
.claude-plugin/marketplace.json
.agents/plugins/marketplace.json
plugins/subagent-fleet/.claude-plugin/plugin.json
plugins/subagent-fleet/.codex-plugin/plugin.json
plugins/subagent-fleet/skills/subagent-fleet-bootstrap/SKILL.md
plugins/subagent-fleet/skills/subagent-fleet-setup/SKILL.md
plugins/subagent-fleet/skills/subagent-fleet-operations/SKILL.md
```

The bootstrap skill teaches Claude Code or Codex how to install the CLI:

```bash
python -m pip install subagent-fleet
```

and then install repo-local assistant skills:

```bash
subagent-fleet skills install
```

Claude Code plugin install flow:

```text
/plugin marketplace add https://github.com/adityak74/subagent-fleet
/plugin install subagent-fleet
```

Codex local marketplace flow:

```bash
codex plugin marketplace add .
codex plugin add subagent-fleet@subagent-fleet
```

To generate the same marketplace/plugin bundle into another directory:

```bash
subagent-fleet plugins install --out /path/to/marketplace-root
```

Install only one target:

```bash
subagent-fleet plugins install --target claude-code
subagent-fleet plugins install --target codex
```

Existing plugin marketplace files are not overwritten unless you pass `--force`.

## Ollama Worker Setup

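On each worker machine, run Ollama so that it is reachable from your controller over a private network. The commands below are for macOS (`launchctl`). Binding to `0.0.0.0` listens on every interface, so rely on a firewall or private subnet to limit access: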

```bash
launchctl setenv OLLAMA_HOST "0.0.0.0:11434"
launchctl setenv OLLAMA_KEEP_ALIVE "-1"
launchctl setenv OLLAMA_NUM_PARALLEL "1"
launchctl setenv OLLAMA_MAX_LOADED_MODELS "1"

killall Ollama
open -a Ollama
```

From the controller:

```bash
curl http://NODE_IP:11434/api/tags
```
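
Once a node answers on `/api/tags`, a model can be preloaded from the controller. The sketch below builds such a request, assuming warmup is done through Ollama's `/api/generate` endpoint, which loads a model when called without a prompt and accepts a `keep_alive` field. Whether `subagent-fleet warmup` uses this exact endpoint is not stated here, and `build_warmup_request` is a hypothetical name.

```python
import json
import urllib.request
from typing import Union


def build_warmup_request(endpoint: str, model: str,
                         keep_alive: Union[int, str] = -1) -> urllib.request.Request:
    """Build a POST to /api/generate that loads `model` without generating text.

    keep_alive may be a number of seconds or a duration string such as "10m";
    a negative number asks Ollama to keep the model loaded indefinitely.
    """
    body = json.dumps({"model": model, "keep_alive": keep_alive}).encode("utf-8")
    return urllib.request.Request(
        endpoint.rstrip("/") + "/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending it is a single call, for example `urllib.request.urlopen(build_warmup_request("http://NODE_IP:11434", "qwen2.5-coder:7b"), timeout=600)`, where the long timeout allows for the initial model load.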

## Security

`subagent-fleet` assumes private local networking.

Do:

- Use LAN, firewall rules, Tailscale, WireGuard, or a private subnet.
- Keep `LITELLM_MASTER_KEY` set for LiteLLM access.
- Treat generated `.env.subagent-fleet` files as local developer configuration.

Do not:

- Expose Ollama directly to the public internet.
- Expose LiteLLM without authentication.
- Commit real API keys, LAN secrets, or machine-specific private `.env` files.

Run:

```bash
subagent-fleet doctor
```

for local setup and safety reminders.
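
One simple safety check in this spirit is to flag node endpoints that are not on a loopback or private address. The sketch below uses only the standard library; `is_private_endpoint` is a hypothetical helper, and it is not a description of what `subagent-fleet doctor` actually inspects.

```python
import ipaddress
from urllib.parse import urlparse


def is_private_endpoint(endpoint: str) -> bool:
    """Return True when the endpoint host is localhost or a private/loopback IP."""
    host = urlparse(endpoint).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name cannot be classified without resolving it, so stay cautious.
        return False
    return address.is_private or address.is_loopback
```

Note that Python's `ipaddress` module does not report the `100.64.0.0/10` range used by Tailscale as private, so a real check would need to allow that range explicitly.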

## Development

Install dev dependencies:

```bash
python -m pip install -e ".[dev]"
```

Run tests:

```bash
python -m pytest
```

Run a focused test:

```bash
python -m pytest tests/test_config.py
```

Check CLI wiring:

```bash
python -m subagent_fleet.cli --help
```

## Project Layout

```text
src/subagent_fleet/
  cli.py
  config.py
  discovery.py
  plugins.py
  warmup.py
  status.py
  skills.py
  generators/
  skill_templates/
  templates/

examples/
plugins/
tests/
```

## Roadmap

MVP:

- [x] `fleet.yaml` schema
- [x] Ollama node health checks
- [x] Ollama model discovery via `/api/tags`
- [x] LiteLLM config generation
- [x] Claude Code agent generation
- [x] Environment file generation
- [x] Model warmup with `keep_alive`
- [x] Status and routing tables

Next:

- [ ] Latency benchmarking
- [ ] Recommended agent-to-node assignment
- [ ] Role-based routing templates
- [ ] Tailscale-aware node discovery
- [ ] OpenAI-compatible harness examples
- [ ] Release packaging

Later:

- [ ] Dynamic routing by task type
- [ ] Fallback model generation
- [ ] Queue-aware scheduling
- [ ] Agent execution trace viewer
- [ ] Support for vLLM, LM Studio, llama.cpp, OpenRouter, and cloud APIs

## Contributing

Issues and pull requests are welcome.

Good first areas:

- More generator tests
- Additional example fleets
- Better status formatting
- More robust Ollama error reporting
- Documentation for real multi-machine setups

Before opening a PR:

```bash
python -m pytest
```

## What This Is Not

`subagent-fleet` is not:

- an inference engine
- a replacement for Ollama
- a replacement for LiteLLM
- a model sharding framework
- Kubernetes for local LLMs
- a public model hosting platform

It is a small workflow layer for private local subagent orchestration.

## License

MIT. See [LICENSE](LICENSE).
