{"id":50747510,"url":"https://github.com/deeflect/vasted","last_synced_at":"2026-06-10T22:30:50.796Z","repository":{"id":341473859,"uuid":"1170243324","full_name":"deeflect/vasted","owner":"deeflect","description":"GPU inference in one command. Auto-picks a cheap Vast.ai GPU, loads any GGUF model, gives you an OpenAI-compatible endpoint.","archived":false,"fork":false,"pushed_at":"2026-04-04T22:50:55.000Z","size":21101,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-05T00:40:22.442Z","etag":null,"topics":["gguf","gpu","inference","llm","openai","python","vast-ai"],"latest_commit_sha":null,"homepage":"https://github.com/deeflect/vasted","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/deeflect.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-03-01T22:35:25.000Z","updated_at":"2026-04-04T22:50:58.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/deeflect/vasted","commit_stats":null,"previous_names":["deeflect/vasted"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/deeflect/vasted","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deeflect%2Fvasted","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deeflect%2Fvasted/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deeflect%2Fvasted/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deeflect%2Fvasted/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/deeflect","download_url":"https://codeload.github.com/deeflect/vasted/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deeflect%2Fvasted/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34174148,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-10T02:00:07.152Z","response_time":89,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gguf","gpu","inference","llm","openai","python","vast-ai"],"created_at":"2026-06-10T22:30:49.763Z","updated_at":"2026-06-10T22:30:50.765Z","avatar_url":"https://github.com/deeflect.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# vasted\n\n[![CI](https://github.com/deeflect/vasted/actions/workflows/ci.yml/badge.svg)](https://github.com/deeflect/vasted/actions/workflows/ci.yml)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](./LICENSE)\n\n`vasted` is a CLI that launches on-demand Vast.ai GPU workers for `llama.cpp` GGUF inference and exposes a stable OpenAI-compatible `/v1` endpoint.\n\n## Demo\n\n![vasted demo](docs/assets/demo.gif)\n\n## Why `vasted`\n\n- Stable client endpoint while worker URLs rotate.\n- Setup wizard for local machine and VPS deployments.\n- Non-interactive automation mode for agents/CI.\n- OpenAI-compatible proxy for tools that expect `/v1` APIs.\n- Session usage and cost tracking.\n- Optional Telegram bot control commands.\n\n## Requirements\n\n- Python `3.12+`\n- [`uv`](https://docs.astral.sh/uv/)\n- Vast.ai account + API key\n- Optional: Telegram bot token (`telegram` extra)\n\n## Install\n\n### From PyPI (recommended)\n\n```bash\nuv tool install vasted\nvasted --version\n```\n\nUpgrade:\n\n```bash\nuv tool upgrade vasted\n```\n\n### From source (development)\n\n```bash\ngit clone https://github.com/deeflect/vasted.git\ncd vasted\nuv sync --extra dev\n```\n\nRun CLI commands from the repo:\n\n```bash\nuv run vasted --help\n```\n\n### Git install (latest main)\n\n```bash\nuv tool install \"git+https://github.com/deeflect/vasted.git\"\n```\n\n## Quick Start\n\nIf installed as a tool:\n\n```bash\nvasted setup\nvasted up\nvasted status --verbose\n```\n\nFrom source checkout:\n\n```bash\nuv run vasted setup\nuv run vasted up\nuv run vasted status --verbose\n```\n\nClient connection values after setup:\n\n- Base URL: `http://\u003chost\u003e:\u003cport\u003e/v1`\n- Auth header: `Authorization: Bearer \u003ctoken\u003e`\n\nWhen `proxy_host` is `0.0.0.0`, use your real machine/VPS IP or domain in clients.\n\n## Automation / Unattended Mode\n\nUse non-interactive commands to avoid prompts:\n\n```bash\nuv run vasted setup --non-interactive \\\n  --vast-api-key \"$VASTED_API_KEY\" \\\n  --bearer-token \"$VASTED_BEARER_TOKEN\" \\\n  --client openclaw \\\n  --deployment-mode local_pc \\\n  --model qwen3-coder-30b \\\n  --quality balanced \\\n  --gpu-mode auto\n\nuv run vasted up --non-interactive --yes --jinja --model qwen3-coder-30b --quality balanced --gpu-mode auto --no-serve\nuv run vasted status --verbose\nuv run vasted usage\nuv run vasted down --force\n```\n\nEnvironment variables accepted by `setup --non-interactive`:\n\n- `VASTED_API_KEY`\n- `VASTED_BEARER_TOKEN`\n- `VASTED_CLIENT` (`openclaw`, `opencode`, `custom`)\n- `VASTED_LLAMA_JINJA` (`true`/`false`)\n- `VASTED_MODEL`, `VASTED_QUALITY`, `VASTED_GPU_MODE`, `VASTED_GPU_PRESET`\n- `VASTED_DEPLOYMENT_MODE`, `VASTED_PROXY_HOST`, `VASTED_PROXY_PORT`, `VASTED_PUBLIC_HOST`\n\n## Client Profiles and Jinja Behavior\n\n`setup` supports client presets that define default `llama.cpp --jinja` behavior:\n\n- `--client openclaw`: jinja on by default\n- `--client opencode`: jinja off by default\n- `--client custom`: keep/manual behavior\n\nPer launch override is still available:\n\n```bash\nuv run vasted up --jinja\nuv run vasted up --no-jinja\n```\n\n## Command Reference\n\n```bash\nvasted setup [--non-interactive] [--manual] [--client openclaw|opencode|custom]\nvasted up [--model ...] [--quality ...] [--gpu-mode auto|manual] [--gpu-preset ...] [--profile ...] [--max-price ...] [--jinja|--no-jinja] [--yes] [--non-interactive] [--serve|--no-serve]\nvasted down [--force]\nvasted status [--verbose]\nvasted logs [--instance-id N] [--tail N]\nvasted usage\nvasted token show [--full]\nvasted token rotate\nvasted rotate-token\nvasted config show\nvasted profile list|add|use|remove\nvasted completions \u003cbash|zsh|fish\u003e\n```\n\n## Telegram Bot (Optional)\n\nInstall telegram extra and run:\n\n```bash\nuv sync --extra telegram\nuv run python bot.py\n```\n\n## Development\n\n```bash\nuv run ruff check .\nuv run mypy app tests bot.py\nuv run pytest -q\n```\n\n## Project Layout\n\n- `app/commands/*`: CLI command handlers\n- `app/service.py`: worker lifecycle + launch policy\n- `app/proxy.py`: OpenAI-compatible reverse proxy\n- `app/vast.py`: Vast API integration + startup script generation\n- `app/usage.py`: token/time/cost accounting\n- `app/user_config.py`: persistent config + keyring integration\n- `app/state.py`: runtime state persistence\n- `bot.py`: optional Telegram control plane\n\n## Security\n\n- Keep Vast API keys and bearer tokens private.\n- Prefer localhost binds unless remote access is required.\n- See [SECURITY.md](./SECURITY.md) for disclosure policy.\n\n## Contributing\n\nSee [CONTRIBUTING.md](./CONTRIBUTING.md) and run the validation commands before opening a PR.\n\n## License\n\nMIT — see [LICENSE](./LICENSE).\n\n---\n\n### Built by\n\nBuilt by [Dee](https://deeflect.com). Started as \"just spin up a GPU for an hour\" and grew a control plane, a Telegram bot, and an OpenAI-compatible proxy. Such is life.\n\nStar if vasted spared you the AWS GPU pricing calculator. [Open an issue](https://github.com/deeflect/vasted/issues) if Vast.ai changes its API again (they will).\n\n[deeflect.com](https://deeflect.com) · [Wikidata](https://www.wikidata.org/entity/Q138828544) · [LinkedIn](https://www.linkedin.com/in/dkargaev/) · [X](https://x.com/deeflectcom)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeeflect%2Fvasted","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdeeflect%2Fvasted","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeeflect%2Fvasted/lists"}