https://github.com/deeflect/vasted

GPU inference in one command. Auto-picks a cheap Vast.ai GPU, loads any GGUF model, gives you an OpenAI-compatible endpoint.
https://github.com/deeflect/vasted

gguf gpu inference llm openai python vast-ai

Last synced: 2 days ago
JSON representation

GPU inference in one command. Auto-picks a cheap Vast.ai GPU, loads any GGUF model, gives you an OpenAI-compatible endpoint.

Host: GitHub
URL: https://github.com/deeflect/vasted
Owner: deeflect
License: mit
Created: 2026-03-01T22:35:25.000Z (3 months ago)
Default Branch: main
Last Pushed: 2026-04-04T22:50:55.000Z (2 months ago)
Last Synced: 2026-04-05T00:40:22.442Z (2 months ago)
Topics: gguf, gpu, inference, llm, openai, python, vast-ai
Language: Python
Homepage: https://github.com/deeflect/vasted
Size: 20.1 MB
Stars: 1
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md
- Agents: AGENTS.md

Awesome Lists containing this project

README

# vasted

[![CI](https://github.com/deeflect/vasted/actions/workflows/ci.yml/badge.svg)](https://github.com/deeflect/vasted/actions/workflows/ci.yml)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](./LICENSE)

`vasted` is a CLI that launches on-demand Vast.ai GPU workers for `llama.cpp` GGUF inference and exposes a stable OpenAI-compatible `/v1` endpoint.

## Demo

![vasted demo](docs/assets/demo.gif)

## Why `vasted`

- Stable client endpoint while worker URLs rotate.
- Setup wizard for local machine and VPS deployments.
- Non-interactive automation mode for agents/CI.
- OpenAI-compatible proxy for tools that expect `/v1` APIs.
- Session usage and cost tracking.
- Optional Telegram bot control commands.

## Requirements

- Python `3.12+`
- [`uv`](https://docs.astral.sh/uv/)
- Vast.ai account + API key
- Optional: Telegram bot token (`telegram` extra)

## Install

### From PyPI (recommended)

```bash
uv tool install vasted
vasted --version
```

Upgrade:

```bash
uv tool upgrade vasted
```

### From source (development)

```bash
git clone https://github.com/deeflect/vasted.git
cd vasted
uv sync --extra dev
```

Run CLI commands from the repo:

```bash
uv run vasted --help
```

### Git install (latest main)

```bash
uv tool install "git+https://github.com/deeflect/vasted.git"
```

## Quick Start

If installed as a tool:

```bash
vasted setup
vasted up
vasted status --verbose
```

From source checkout:

```bash
uv run vasted setup
uv run vasted up
uv run vasted status --verbose
```

Client connection values after setup:

- Base URL: `http://:/v1`
- Auth header: `Authorization: Bearer `

When `proxy_host` is `0.0.0.0`, use your real machine/VPS IP or domain in clients.

## Automation / Unattended Mode

Use non-interactive commands to avoid prompts:

```bash
uv run vasted setup --non-interactive \
--vast-api-key "$VASTED_API_KEY" \
--bearer-token "$VASTED_BEARER_TOKEN" \
--client openclaw \
--deployment-mode local_pc \
--model qwen3-coder-30b \
--quality balanced \
--gpu-mode auto

uv run vasted up --non-interactive --yes --jinja --model qwen3-coder-30b --quality balanced --gpu-mode auto --no-serve
uv run vasted status --verbose
uv run vasted usage
uv run vasted down --force
```

Environment variables accepted by `setup --non-interactive`:

- `VASTED_API_KEY`
- `VASTED_BEARER_TOKEN`
- `VASTED_CLIENT` (`openclaw`, `opencode`, `custom`)
- `VASTED_LLAMA_JINJA` (`true`/`false`)
- `VASTED_MODEL`, `VASTED_QUALITY`, `VASTED_GPU_MODE`, `VASTED_GPU_PRESET`
- `VASTED_DEPLOYMENT_MODE`, `VASTED_PROXY_HOST`, `VASTED_PROXY_PORT`, `VASTED_PUBLIC_HOST`

## Client Profiles and Jinja Behavior

`setup` supports client presets that define default `llama.cpp --jinja` behavior:

- `--client openclaw`: jinja on by default
- `--client opencode`: jinja off by default
- `--client custom`: keep/manual behavior

Per launch override is still available:

```bash
uv run vasted up --jinja
uv run vasted up --no-jinja
```

## Command Reference

```bash
vasted setup [--non-interactive] [--manual] [--client openclaw|opencode|custom]
vasted up [--model ...] [--quality ...] [--gpu-mode auto|manual] [--gpu-preset ...] [--profile ...] [--max-price ...] [--jinja|--no-jinja] [--yes] [--non-interactive] [--serve|--no-serve]
vasted down [--force]
vasted status [--verbose]
vasted logs [--instance-id N] [--tail N]
vasted usage
vasted token show [--full]
vasted token rotate
vasted rotate-token
vasted config show
vasted profile list|add|use|remove
vasted completions
```

## Telegram Bot (Optional)

Install telegram extra and run:

```bash
uv sync --extra telegram
uv run python bot.py
```

## Development

```bash
uv run ruff check .
uv run mypy app tests bot.py
uv run pytest -q
```

## Project Layout

- `app/commands/*`: CLI command handlers
- `app/service.py`: worker lifecycle + launch policy
- `app/proxy.py`: OpenAI-compatible reverse proxy
- `app/vast.py`: Vast API integration + startup script generation
- `app/usage.py`: token/time/cost accounting
- `app/user_config.py`: persistent config + keyring integration
- `app/state.py`: runtime state persistence
- `bot.py`: optional Telegram control plane

## Security

- Keep Vast API keys and bearer tokens private.
- Prefer localhost binds unless remote access is required.
- See [SECURITY.md](./SECURITY.md) for disclosure policy.

## Contributing

See [CONTRIBUTING.md](./CONTRIBUTING.md) and run the validation commands before opening a PR.

## License

MIT — see [LICENSE](./LICENSE).

---

### Built by

Built by [Dee](https://deeflect.com). Started as "just spin up a GPU for an hour" and grew a control plane, a Telegram bot, and an OpenAI-compatible proxy. Such is life.

Star if vasted spared you the AWS GPU pricing calculator. [Open an issue](https://github.com/deeflect/vasted/issues) if Vast.ai changes its API again (they will).

[deeflect.com](https://deeflect.com) · [Wikidata](https://www.wikidata.org/entity/Q138828544) · [LinkedIn](https://www.linkedin.com/in/dkargaev/) · [X](https://x.com/deeflectcom)

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/deeflect/vasted

Awesome Lists containing this project

README