https://github.com/adityak74/subagent-fleet
Run Claude Code-style subagents across your local model fleet.
- Host: GitHub
- URL: https://github.com/adityak74/subagent-fleet
- Owner: adityak74
- License: mit
- Created: 2026-06-15T05:10:12.000Z (9 days ago)
- Default Branch: main
- Last Pushed: 2026-06-15T18:24:02.000Z (8 days ago)
- Last Synced: 2026-06-15T20:15:43.985Z (8 days ago)
- Topics: claude-code, cli, developer-tools, litellm, local-ai, local-first, ollama, python, subagents
- Language: Python
- Size: 43.9 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Agents: AGENTS.md
# subagent-fleet
**Run Claude Code-style subagents across your local model fleet.**
`subagent-fleet` is a config-first Python CLI for mapping coding subagents to the best Ollama model and machine you own, then generating LiteLLM and Claude Code-style agent configuration.

[Quickstart](#quickstart) • [Configuration](#configuration) • [Generated Files](#generated-files) • [Security](#security) • [Roadmap](#roadmap)
## Overview
Local model users often have more than one useful machine: a laptop, a Mac mini, a workstation, a home server, or a spare GPU box. Most coding harnesses still point at one model endpoint.
`subagent-fleet` turns that setup into a private local subagent fleet:
```text
planner -> small fast model on a lightweight node
implementer -> larger coding model on a bigger node
reviewer -> larger coding model on a bigger node
summarizer -> small local model on the controller
```
It does not replace Ollama, LiteLLM, or Claude Code. It generates the glue between them:
```text
Claude Code / coding harness
        |
        v
LiteLLM gateway generated by subagent-fleet
        |
        +-- Ollama node: laptop
        +-- Ollama node: Mac mini 64GB
        +-- Ollama node: workstation
```
## Features
- Validate a declarative `fleet.yaml`.
- Discover models from configured Ollama nodes via `/api/tags`.
- Generate `litellm_config.yaml` with `ollama_chat/` routes.
- Generate Claude Code-style `.claude/agents/*.md` files.
- Generate `.env.subagent-fleet` for Claude Code/LiteLLM environment variables.
- Warm configured Ollama models with `keep_alive`.
- Show node health and agent routing tables.
- Keep unreachable nodes isolated so one offline machine does not crash the whole workflow.
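The discovery and isolation behavior listed above can be illustrated with a minimal sketch. It is not the project's implementation: the function names (`parse_tags`, `fetch_tags`, `discover`) and the result shape are hypothetical. The only external fact it relies on is that Ollama's `/api/tags` endpoint returns a JSON object with a `models` list whose entries carry a `name`.

```python
import json
import urllib.error
import urllib.request


def parse_tags(payload: dict) -> list[str]:
    """Extract model names from an Ollama /api/tags response body."""
    return [m["name"] for m in payload.get("models", [])]


def fetch_tags(endpoint: str, timeout: float = 3.0) -> dict:
    """GET <endpoint>/api/tags and decode the JSON body."""
    url = endpoint.rstrip("/") + "/api/tags"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def discover(nodes: dict[str, str], fetch=fetch_tags) -> dict[str, dict]:
    """Query every node; record a failure per node instead of raising.

    `nodes` maps a node name to its endpoint URL (hypothetical shape).
    """
    results = {}
    for name, endpoint in nodes.items():
        try:
            results[name] = {"ok": True, "models": parse_tags(fetch(endpoint))}
        except (urllib.error.URLError, OSError, ValueError, KeyError) as exc:
            # One offline or misbehaving node must not abort the whole run.
            results[name] = {"ok": False, "error": str(exc), "models": []}
    return results
```

Passing `fetch` as a parameter keeps the per-node error handling testable without a live Ollama server.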
## Status
MVP CLI implemented.
Available commands:
```bash
subagent-fleet init
subagent-fleet validate
subagent-fleet discover
subagent-fleet generate
subagent-fleet warmup
subagent-fleet status
subagent-fleet doctor
subagent-fleet clean
subagent-fleet skills list
subagent-fleet skills install
subagent-fleet plugins install
```
## Install
Choose one of the install paths below.
### CLI from PyPI
Install the CLI directly from PyPI:
```bash
python -m pip install subagent-fleet
```
Or install it as an isolated command with `pipx`:
```bash
pipx install subagent-fleet
```
Verify:
```bash
subagent-fleet --help
```
### Development Checkout
Use this when contributing to the project:
```bash
git clone https://github.com/adityak74/subagent-fleet.git
cd subagent-fleet
python -m pip install -e ".[dev]"
```
Run tests:
```bash
python -m pytest
```
### Claude Code Plugin First
Install the plugin first from Claude Code, then let the bundled bootstrap skill install the CLI:
```text
/plugin marketplace add https://github.com/adityak74/subagent-fleet
/plugin install subagent-fleet
```
After install, ask Claude Code:
```text
Use the subagent-fleet bootstrap skill to install the CLI and set up this repo.
```
The bootstrap skill will run or recommend:
```bash
python -m pip install subagent-fleet
subagent-fleet skills install
```
### Codex Plugin First
Install this repository as a local Codex marketplace:
```bash
codex plugin marketplace add .
codex plugin add subagent-fleet@subagent-fleet
```
Then ask Codex:
```text
Use the subagent-fleet bootstrap skill to install the CLI and set up this repo.
```
## Quickstart
Create a starter config:
```bash
subagent-fleet init
```
Edit `fleet.yaml` with your Ollama node endpoints and model names, then validate it:
```bash
subagent-fleet validate
```
Check which nodes are reachable:
```bash
subagent-fleet discover
```
Generate LiteLLM, Claude agent, and environment files:
```bash
subagent-fleet generate
```
Start LiteLLM:
```bash
export LITELLM_MASTER_KEY="sk-local-dev"
litellm \
  --config ./litellm_config.yaml \
  --host 127.0.0.1 \
  --port 4000
```
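Before pointing a harness at the gateway, it can help to confirm that it answers. The sketch below builds an authenticated request for the proxy's OpenAI-compatible `/v1/models` route using only the standard library. The helper name `gateway_models_request` is hypothetical, and the host, port, and key variable are taken from the example values in this README.

```python
import json
import os
import urllib.request


def gateway_models_request(host: str = "127.0.0.1", port: int = 4000,
                           key_env: str = "LITELLM_MASTER_KEY") -> urllib.request.Request:
    """Build a GET request for the gateway's model list with a bearer key."""
    key = os.environ.get(key_env, "")
    return urllib.request.Request(
        f"http://{host}:{port}/v1/models",
        headers={"Authorization": f"Bearer {key}"},
    )


def list_gateway_models(timeout: float = 5.0) -> list[str]:
    """Return the model ids the gateway reports (requires a running gateway)."""
    with urllib.request.urlopen(gateway_models_request(), timeout=timeout) as resp:
        payload = json.load(resp)
    return [entry["id"] for entry in payload.get("data", [])]
```

If the gateway is up, `list_gateway_models()` should include the aliases from your generated `litellm_config.yaml`.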
Point Claude Code at the local gateway:
```bash
source .env.subagent-fleet
claude
```
## Configuration
`subagent-fleet` is driven by `fleet.yaml`.
```yaml
project:
  name: local-dev
gateway:
  provider: litellm
  host: 127.0.0.1
  port: 4000
  master_key_env: LITELLM_MASTER_KEY
nodes:
  m5-local:
    endpoint: http://localhost:11434
    tags: [controller, local, fast]
  m4-mini-64gb:
    endpoint: http://192.168.1.50:11434
    tags: [heavy, coder, reviewer]
  m4-mini-16gb:
    endpoint: http://192.168.1.51:11434
    tags: [small, planner, summarizer]
models:
  heavy-coder:
    node: m4-mini-64gb
    ollama_model: qwen2.5-coder:32b
    litellm_alias: claude-sonnet-local
    context: 32768
    timeout: 600
    max_parallel: 1
  small-coder:
    node: m4-mini-16gb
    ollama_model: qwen2.5-coder:7b
    litellm_alias: claude-haiku-local
    context: 8192
    timeout: 300
    max_parallel: 1
agents:
  planner:
    model: small-coder
    description: Use for planning, file discovery, task decomposition, and summarization.
    tools: [Read, Grep, Glob]
    prompt: |
      You are a fast local planning agent.
      Do not edit files.
      Return a concise response with:
      - plan
      - relevant files
      - risks
      - next recommended agent
  implementer:
    model: heavy-coder
    description: Use for implementation, bug fixes, refactors, and patch creation.
    tools: [Read, Grep, Glob, Edit, MultiEdit, Bash]
  reviewer:
    model: heavy-coder
    description: Use after implementation to review diffs, tests, regressions, and maintainability.
    tools: [Read, Grep, Glob, Bash]
```
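The configuration above is a chain of references: each agent names a model, and each model names a node. A minimal sketch of the kind of cross-reference check that `validate` describes could look like the following. It works on an already-parsed dictionary, the function name `validate_references` is hypothetical, and the duplicate-alias rule is an assumption about what "aliases" validation means, not confirmed behavior.

```python
def validate_references(fleet: dict) -> list[str]:
    """Return a list of human-readable reference errors (empty if none)."""
    errors = []
    nodes = fleet.get("nodes", {})
    models = fleet.get("models", {})
    agents = fleet.get("agents", {})

    seen_aliases = {}  # alias -> first model that claimed it
    for model_name, model in models.items():
        node = model.get("node")
        if node not in nodes:
            errors.append(f"model {model_name!r}: unknown node {node!r}")
        alias = model.get("litellm_alias")
        if alias in seen_aliases:
            # Assumption: two models should not share one gateway alias.
            errors.append(
                f"model {model_name!r}: alias {alias!r} "
                f"already used by {seen_aliases[alias]!r}"
            )
        else:
            seen_aliases[alias] = model_name

    for agent_name, agent in agents.items():
        target = agent.get("model")
        if target not in models:
            errors.append(f"agent {agent_name!r}: unknown model {target!r}")
    return errors
```

Collecting every error instead of stopping at the first one lets a single run report all broken references in a fleet file.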
## Generated Files
Running:
```bash
subagent-fleet generate
```
creates:
```text
litellm_config.yaml
.claude/agents/planner.md
.claude/agents/implementer.md
.claude/agents/reviewer.md
.env.subagent-fleet
```
Example LiteLLM route:
```yaml
model_list:
  - model_name: claude-sonnet-local
    litellm_params:
      model: ollama_chat/qwen2.5-coder:32b
      api_base: http://192.168.1.50:11434
      api_key: ollama
      timeout: 600
    model_info:
      max_input_tokens: 32768
```
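The route above lines up field by field with the `heavy-coder` model and its node in the example `fleet.yaml`. The sketch below shows one way that mapping could be expressed. The function name `litellm_entry` is hypothetical, and the field correspondence (`context` to `max_input_tokens`, `timeout` to `timeout`) is inferred from the two examples rather than taken from the generator's source.

```python
def litellm_entry(model: dict, nodes: dict) -> dict:
    """Build one LiteLLM `model_list` entry from a fleet model definition."""
    node = nodes[model["node"]]
    return {
        "model_name": model["litellm_alias"],
        "litellm_params": {
            # The ollama_chat/ prefix selects LiteLLM's Ollama chat provider.
            "model": "ollama_chat/" + model["ollama_model"],
            "api_base": node["endpoint"],
            "api_key": "ollama",
            "timeout": model["timeout"],
        },
        "model_info": {"max_input_tokens": model["context"]},
    }


nodes = {"m4-mini-64gb": {"endpoint": "http://192.168.1.50:11434"}}
heavy_coder = {
    "node": "m4-mini-64gb",
    "ollama_model": "qwen2.5-coder:32b",
    "litellm_alias": "claude-sonnet-local",
    "context": 32768,
    "timeout": 600,
}
entry = litellm_entry(heavy_coder, nodes)
```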
Example Claude agent:
```markdown
---
name: planner
description: Use for planning, file discovery, task decomposition, and summarization.
model: claude-haiku-local
tools: Read, Grep, Glob
---
You are a fast local planning agent.
Do not edit files.
Return a concise response with:
- plan
- relevant files
- risks
- next recommended agent
```
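The agent file above is YAML-style frontmatter followed by the prompt text, with the agent's `model` resolved to the LiteLLM alias of the fleet model it references. A rough sketch of that rendering step, assuming parsed dictionaries shaped like the example `fleet.yaml` (the function name `render_agent` is hypothetical):

```python
def render_agent(name: str, agent: dict, models: dict) -> str:
    """Render a Claude Code-style agent file: frontmatter, then the prompt."""
    alias = models[agent["model"]]["litellm_alias"]
    frontmatter = [
        "---",
        f"name: {name}",
        f"description: {agent['description']}",
        f"model: {alias}",
        "tools: " + ", ".join(agent["tools"]),
        "---",
    ]
    # Agents without a prompt get frontmatter followed by an empty body.
    body = agent.get("prompt", "").strip()
    return "\n".join(frontmatter) + "\n" + body + "\n"


models = {"small-coder": {"litellm_alias": "claude-haiku-local"}}
planner = {
    "model": "small-coder",
    "description": "Use for planning, file discovery, task decomposition, and summarization.",
    "tools": ["Read", "Grep", "Glob"],
    "prompt": "You are a fast local planning agent.\nDo not edit files.\n",
}
text = render_agent("planner", planner, models)
```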
## Commands
| Command | Purpose |
| --- | --- |
| `subagent-fleet init` | Create a starter `fleet.yaml`. |
| `subagent-fleet validate` | Validate schema, references, URLs, aliases, and agent names. |
| `subagent-fleet discover` | Query configured Ollama nodes for available models. |
| `subagent-fleet generate` | Generate LiteLLM config, Claude agents, and env file. |
| `subagent-fleet warmup` | Preload configured Ollama models with `keep_alive`. |
| `subagent-fleet status` | Show node health and agent routing. |
| `subagent-fleet doctor` | Show validation and local-network safety guidance. |
| `subagent-fleet clean` | List or remove generated files. |
| `subagent-fleet skills list` | List bundled assistant skills and supported targets. |
| `subagent-fleet skills install` | Install assistant-facing setup and operations skills. |
| `subagent-fleet plugins install` | Install Claude Code and Codex plugin marketplace bundles. |
JSON output is available for discovery and status:
```bash
subagent-fleet discover --json
subagent-fleet status --json
```
## Assistant Skills
`subagent-fleet` ships assistant-facing skills that teach Claude Code, Codex, OpenCode, and similar tools how to set up and operate the fleet from inside a repository.
List bundled skills and supported targets:
```bash
subagent-fleet skills list
```
Install all bundled skills for all supported targets:
```bash
subagent-fleet skills install
```
This writes:
```text
.claude/skills/subagent-fleet-setup/SKILL.md
.claude/skills/subagent-fleet-operations/SKILL.md
.codex/skills/subagent-fleet-setup/SKILL.md
.codex/skills/subagent-fleet-operations/SKILL.md
.opencode/skills/subagent-fleet-setup/SKILL.md
.opencode/skills/subagent-fleet-operations/SKILL.md
```
Install for a specific assistant:
```bash
subagent-fleet skills install --target codex
subagent-fleet skills install --target claude-code
subagent-fleet skills install --target opencode
```
Install one bundled skill:
```bash
subagent-fleet skills install --skill subagent-fleet-setup
```
Existing skill files are not overwritten unless you pass `--force`.
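The no-overwrite rule can be pictured as a small guard around each file write. This is a sketch of the idea only, with a hypothetical helper name (`write_if_absent`), not the installer's actual code:

```python
from pathlib import Path


def write_if_absent(path: Path, content: str, force: bool = False) -> bool:
    """Write `content` to `path` unless it exists; return True if written."""
    if path.exists() and not force:
        return False  # keep the user's existing (possibly edited) file
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    return True
```

Returning a boolean lets the caller report which files were written and which were skipped.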
## Plugin Marketplaces
This repository also ships plugin marketplace metadata so users can install the assistant skill first, then let that skill install and verify the Python CLI.
Included plugin artifacts:
```text
.claude-plugin/marketplace.json
.agents/plugins/marketplace.json
plugins/subagent-fleet/.claude-plugin/plugin.json
plugins/subagent-fleet/.codex-plugin/plugin.json
plugins/subagent-fleet/skills/subagent-fleet-bootstrap/SKILL.md
plugins/subagent-fleet/skills/subagent-fleet-setup/SKILL.md
plugins/subagent-fleet/skills/subagent-fleet-operations/SKILL.md
```
The bootstrap skill teaches Claude Code or Codex how to install the CLI:
```bash
python -m pip install subagent-fleet
```
and then install repo-local assistant skills:
```bash
subagent-fleet skills install
```
Claude Code plugin install flow:
```text
/plugin marketplace add https://github.com/adityak74/subagent-fleet
/plugin install subagent-fleet
```
Codex local marketplace flow:
```bash
codex plugin marketplace add .
codex plugin add subagent-fleet@subagent-fleet
```
To generate the same marketplace/plugin bundle into another directory:
```bash
subagent-fleet plugins install --out /path/to/marketplace-root
```
Install only one target:
```bash
subagent-fleet plugins install --target claude-code
subagent-fleet plugins install --target codex
```
Existing plugin marketplace files are not overwritten unless you pass `--force`.
## Ollama Worker Setup
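On each worker machine, make Ollama reachable from your controller over your private network. The commands below are for macOS (`launchctl`). Note that `OLLAMA_HOST="0.0.0.0:11434"` binds Ollama to all network interfaces, so rely on a firewall or private network to limit who can reach it: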
```bash
launchctl setenv OLLAMA_HOST "0.0.0.0:11434"
launchctl setenv OLLAMA_KEEP_ALIVE "-1"
launchctl setenv OLLAMA_NUM_PARALLEL "1"
launchctl setenv OLLAMA_MAX_LOADED_MODELS "1"
killall Ollama
open -a Ollama
```
From the controller:
```bash
curl http://NODE_IP:11434/api/tags
```
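The `keep_alive` setting also exists per request: Ollama's `/api/generate` endpoint loads a model into memory when called with a model name and no prompt, and accepts a `keep_alive` value in the body. The sketch below builds such a warmup request with the standard library. The helper names (`build_warmup_request`, `warm`) are hypothetical, and this is not necessarily how the `warmup` command is implemented.

```python
import json
import urllib.request


def build_warmup_request(endpoint: str, model: str,
                         keep_alive=-1) -> urllib.request.Request:
    """Build a POST to /api/generate that loads `model` without a prompt."""
    body = json.dumps({"model": model, "keep_alive": keep_alive}).encode("utf-8")
    return urllib.request.Request(
        endpoint.rstrip("/") + "/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def warm(endpoint: str, model: str, timeout: float = 600.0) -> bool:
    """Send the warmup request; return False if the node is unreachable."""
    req = build_warmup_request(endpoint, model)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, HTTPError, and timeouts
        return False
```

A `keep_alive` of `-1` asks Ollama to keep the model loaded indefinitely, matching the `OLLAMA_KEEP_ALIVE` value set above.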
## Security
`subagent-fleet` assumes private local networking.
Do:
- Use LAN, firewall rules, Tailscale, WireGuard, or a private subnet.
- Keep `LITELLM_MASTER_KEY` set for LiteLLM access.
- Treat generated `.env.subagent-fleet` files as local developer configuration.
Do not:
- Expose Ollama directly to the public internet.
- Expose LiteLLM without authentication.
- Commit real API keys, LAN secrets, or machine-specific private `.env` files.
Run:
```bash
subagent-fleet doctor
```
for local setup and safety reminders.
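One simple safety check in this spirit is to flag node endpoints that are not on a private or loopback address. The sketch below uses Python's `ipaddress` module. The function name `is_private_endpoint` is hypothetical, and whether `doctor` performs a check like this is an assumption.

```python
import ipaddress
from urllib.parse import urlparse


def is_private_endpoint(url: str) -> bool:
    """Return True if the URL's host is localhost or a private/loopback IP."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name: cannot be classified without resolving it.
        return False
    return addr.is_private or addr.is_loopback
```

Hostnames other than `localhost` are reported as not private because classifying them would require a DNS lookup; a fuller check would resolve them first.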
## Development
Install dev dependencies:
```bash
python -m pip install -e ".[dev]"
```
Run tests:
```bash
python -m pytest
```
Run a focused test:
```bash
python -m pytest tests/test_config.py
```
Check CLI wiring:
```bash
python -m subagent_fleet.cli --help
```
## Project Layout
```text
src/subagent_fleet/
  cli.py
  config.py
  discovery.py
  plugins.py
  warmup.py
  status.py
  skills.py
  generators/
  skill_templates/
  templates/
examples/
plugins/
tests/
```
## Roadmap
MVP:
- [x] `fleet.yaml` schema
- [x] Ollama node health checks
- [x] Ollama model discovery via `/api/tags`
- [x] LiteLLM config generation
- [x] Claude Code agent generation
- [x] Environment file generation
- [x] Model warmup with `keep_alive`
- [x] Status and routing tables
Next:
- [ ] Latency benchmarking
- [ ] Recommended agent-to-node assignment
- [ ] Role-based routing templates
- [ ] Tailscale-aware node discovery
- [ ] OpenAI-compatible harness examples
- [ ] Release packaging
Later:
- [ ] Dynamic routing by task type
- [ ] Fallback model generation
- [ ] Queue-aware scheduling
- [ ] Agent execution trace viewer
- [ ] Support for vLLM, LM Studio, llama.cpp, OpenRouter, and cloud APIs
## Contributing
Issues and pull requests are welcome.
Good first areas:
- More generator tests
- Additional example fleets
- Better status formatting
- More robust Ollama error reporting
- Documentation for real multi-machine setups
Before opening a PR:
```bash
python -m pytest
```
## What This Is Not
`subagent-fleet` is not:
- an inference engine
- a replacement for Ollama
- a replacement for LiteLLM
- a model sharding framework
- Kubernetes for local LLMs
- a public model hosting platform
It is a small workflow layer for private local subagent orchestration.
## License
MIT. See [LICENSE](LICENSE).