{"id":45660510,"url":"https://github.com/scitrera/sparkrun","last_synced_at":"2026-02-24T09:02:01.828Z","repository":{"id":338896957,"uuid":"1159456624","full_name":"scitrera/sparkrun","owner":"scitrera","description":"sparkrun - launch, manage, and stop LLM inference workloads on NVIDIA DGX Spark systems","archived":false,"fork":false,"pushed_at":"2026-02-18T18:46:55.000Z","size":33006,"stargazers_count":2,"open_issues_count":4,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-02-18T22:27:03.870Z","etag":null,"topics":["dgx-spark","inference","llama-cpp","sglang","vllm"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/scitrera.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-16T18:48:55.000Z","updated_at":"2026-02-18T18:48:36.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/scitrera/sparkrun","commit_stats":null,"previous_names":["scitrera/oss-spark-run","scitrera/sparkrun"],"tags_count":14,"template":false,"template_full_name":null,"purl":"pkg:github/scitrera/sparkrun","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scitrera%2Fsparkrun","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scitrera%2Fsparkrun/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scitrera%2Fsparkrun/releases","manifests_url":"htt
ps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scitrera%2Fsparkrun/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/scitrera","download_url":"https://codeload.github.com/scitrera/sparkrun/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scitrera%2Fsparkrun/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29777606,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-24T04:54:30.205Z","status":"ssl_error","status_checked_at":"2026-02-24T04:53:58.628Z","response_time":75,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dgx-spark","inference","llama-cpp","sglang","vllm"],"created_at":"2026-02-24T09:01:33.180Z","updated_at":"2026-02-24T09:02:01.820Z","avatar_url":"https://github.com/scitrera.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# sparkrun\n\n**One command to rule them all**\n\nLaunch, manage, and stop inference workloads on one or more NVIDIA DGX Spark systems — no Slurm, no Kubernetes, no fuss.\n\nsparkrun is a unified CLI for running LLM inference on DGX Spark. Point it at your hosts, pick a recipe, and go. 
It\nhandles container orchestration, InfiniBand/RDMA detection, model distribution, and multi-node tensor parallelism across\nyour Spark cluster automatically.\n\nsparkrun does not need to run on a member of the cluster. You can coordinate one or more DGX Sparks from any Linux\nmachine with SSH access.\n\n```bash\n# uv is preferred mechanism for managing python environments\n# To install uv:\ncurl -LsSf https://astral.sh/uv/install.sh | sh\n\n# automatic installation via uvx (manages virtual environment and\n# creates alias in your shell, sets up autocomplete too!)\nuvx sparkrun setup install\n```\n\n\u003cdetails\u003e\n\u003csummary\u003eAlternative: manual pip install\u003c/summary\u003e\n\n```bash\npip install sparkrun\n# or\nuv pip install sparkrun\n```\n\nWith a manual install you will need to run `sparkrun setup completion` separately for tab completion.\n\n\u003c/details\u003e\n\n## Quick Start\n\n### Tab completion\n\n\u003e **Note:** If you installed via `sparkrun setup install`, tab completion is already set up — you can skip this step.\n\n```bash\nsparkrun setup completion          # auto-detects your shell\nsparkrun setup completion --shell zsh\n```\n\nAfter restarting your shell, recipe names, cluster names, and subcommands all tab-complete.\n\n### Save a cluster config\n\n```bash\n# Save your hosts once\nsparkrun cluster create mylab --hosts 192.168.11.13,192.168.11.14 -d \"My DGX Spark lab\"\nsparkrun cluster set-default mylab\n\n# Now just run — hosts are automatic\nsparkrun run nemotron3-nano-30b-nvfp4-vllm\n```\n\n### Run an inference job\n\n```bash\n# Single node vLLM (Note that minimum nodes / parallelism is configured by the recipe)\nsparkrun run qwen3-1.7b-vllm\n\n# Multi-node (2-node tensor parallel) -- using your default two node cluster\nsparkrun run qwen3-1.7b-vllm --tp 2\n\n# Override settings on the fly\nsparkrun run qwen3-1.7b-vllm --hosts 192.168.11.14 --port 9000 --gpu-mem 0.8\nsparkrun run qwen3-1.7b-vllm --tp 2 -H 
192.168.11.13,192.168.11.14 -o max_model_len=8192\n\n# GGUF quantized models via llama.cpp\nsparkrun run qwen3-1.7b-llama-cpp\n```\n\nsparkrun always launches jobs in the background (detached containers) and then follows logs. **Ctrl+C detaches from\nlogs — it never kills your inference job.** Your model keeps serving.\n\n### Inspect a recipe\n\n```bash\nsparkrun show nemotron3-nano-30b-nvfp4-vllm\n```\n\n```\nName:         nemotron3-nano-30b-nvfp4\nDescription:  NVIDIA Nemotron 3 Nano 30B (upstream NVFP4) -- cluster or solo\nRuntime:      vllm\nModel:        nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4\nContainer:    scitrera/dgx-spark-vllm:0.16.0-t5\nNodes:        1 - unlimited\nRepository:   Local\n\nDefaults:\n  gpu_memory_utilization: 0.8\n  max_model_len: 200000\n  port: 8000\n  served_model_name: nemotron3-30b-a3b\n  tensor_parallel: 1\n\nVRAM Estimation:\n  Model dtype:      nvfp4\n  Model params:     30,000,000,000\n  KV cache dtype:   bfloat16\n  Architecture:     52 layers, 2 KV heads, 128 head_dim\n  Model weights:    19.56 GB\n  KV cache:         9.92 GB (max_model_len=200,000)\n  Tensor parallel:  1\n  Per-GPU total:    29.48 GB\n  DGX Spark fit:    YES\n\n  GPU Memory Budget:\n    gpu_memory_utilization: 80%\n    Usable GPU memory:     96.8 GB (121 GB x 80%)\n    Available for KV:      77.2 GB\n    Max context tokens:    1,557,583\n    Context multiplier:    7.8x (vs max_model_len=200,000)\n```\n\nThe VRAM estimator auto-detects model architecture from HuggingFace and tells you whether your configuration fits within\nDGX Spark's 128 GB unified memory before you launch.\n\n### Custom recipe registries\n\n```bash\n# See what's configured\nsparkrun recipe registries\n\n# Add a community or private registry\nsparkrun recipe add-registry myteam \\\n  --url https://github.com/myorg/spark-recipes.git \\\n  --subpath recipes\n\n# Update all registries\nsparkrun recipe update\n\n# Search across all registries\nsparkrun search qwen3\n```\n\n### Manage running 
workloads\n\n```bash\n# Re-attach to logs (Ctrl+C is always safe) -- NOTE: finds cluster by combination of hosts, model, and runtime\nsparkrun logs nemotron3-nano-30b-nvfp4-vllm --cluster mylab\n\n# Stop a workload -- NOTE: finds cluster by combination of hosts, model, and runtime\nsparkrun stop nemotron3-nano-30b-nvfp4-vllm --cluster mylab\n\n# If you launched with --tp (modifying the recipe default), e.g.:\nsparkrun run nemotron3-nano-30b-nvfp4-vllm --tp 2\n# then pass --tp so stop/logs resolve the same cluster ID as run:\nsparkrun stop nemotron3-nano-30b-nvfp4-vllm --tp 2\nsparkrun logs nemotron3-nano-30b-nvfp4-vllm --tp 2\n# TIP: you can just press up and modify \"run\" to \"stop\"\n```\n\n## Supported Runtimes\n\n### vLLM\n\nFirst-class support for [vLLM](https://github.com/vllm-project/vllm). Solo and multi-node clustering via Ray. Works with\nready-built images (e.g. `scitrera/dgx-spark-vllm`). Also works with other images including those built from eugr's repo\nand/or NVIDIA images.\n\n### SGLang\n\nFirst-class support for [SGLang](https://github.com/sgl-project/sglang). Solo and multi-node clustering via SGLang's\nnative distributed backend (`--dist-init-addr`, `--nnodes`, `--node-rank`). Works with ready-built images (e.g.\n`scitrera/dgx-spark-sglang`). Should also work with other sglang images, but there seem to be a lot fewer sglang images\naround than vllm images.\n\n### llama.cpp\n\nSupport for [llama.cpp](https://github.com/ggml-org/llama.cpp) via `llama-server`. Solo mode with GGUF quantized models.\nLoads models directly from HuggingFace (e.g. `Qwen/Qwen3-1.7B-GGUF:Q4_K_M`). Lightweight alternative to vLLM/SGLang\nfor smaller models or constrained environments.\n\nGGUF models use colon syntax to select a quantization variant: `model: Qwen/Qwen3-1.7B-GGUF:Q8_0`. 
sparkrun\npre-downloads only the matching quant files and resolves the local cache path so the container doesn't need to\nre-download at serve time.\n\n**Experimental**: Multi-node inference via llama.cpp's RPC backend. Worker nodes run `rpc-server` and\nthe head node connects via `--rpc`. This is still evolving both upstream and in sparkrun and should be considered \nexperimental. Note that the fastest DGX Spark interconnect communication will be via NCCL and RoCE -- and the \nllama.cpp RPC mechanism involves a lot more overhead. \n\n### eugr-vllm\n\nExtends the native `vllm` runtime with [eugr/spark-vllm-docker](https://github.com/eugr/spark-vllm-docker) container\nbuilds and mod support. All sparkrun orchestration features work natively — multi-node Ray clustering, container\ndistribution, InfiniBand detection, log following, and cluster stop — while eugr-specific features (local container\nbuilds via `build-and-copy.sh`, mod application) are integrated into the launch pipeline.\n\nUse this when you need a nightly vLLM build, custom modifications, or anything that requires building containers locally\nfrom eugr's repo. For prebuilt images without mods or custom builds, use the standard `vllm` runtime instead.\n\nThe recipe format for sparkrun is designed to be mostly compatible with eugr's format — sparkrun translates any\nvariations automatically. Recipes with `recipe_version: \"1\"` or eugr-specific fields (`build_args`, `mods`) are\nauto-detected as `eugr-vllm`.\n\n```yaml\n# eugr-vllm recipe example\nruntime: eugr-vllm\nmodel: my-org/custom-model\ncontainer: vllm-node\nruntime_config:\n  mods: [ my-custom-mod ]\n  build_args: [ --some-flag ]\n```\n\n## How It Works\n\n**Recipes** are YAML files that describe an inference workload: the model, container image, runtime, and default\nparameters. sparkrun ships bundled recipes and supports custom registries (any git repo with YAML files). 
sparkrun\nincludes a limited set of recipes and also adds the eugr repo as a default registry. Recipes in eugr's format are\nauto-detected and run natively through sparkrun's orchestration pipeline. The long-term goal is to merge recipes\nfrom multiple registries into a single unified catalog and to run them even if they were designed for different\nruntimes (e.g. vLLM vs SGLang), without having to worry about the underlying command differences. See the\n[RECIPES](./RECIPES.md) specification file for more details.\n\n**Runtimes** are plugins that know how to launch a specific inference engine. sparkrun discovers them via Python entry\npoints, so custom runtimes can be added by installing a package.\n\n**Orchestration** is handled over SSH. sparkrun detects InfiniBand/RDMA interfaces on your hosts, distributes container\nimages and models from the control machine to the remote hosts (using the Ethernet interfaces associated with the RDMA\ninterfaces for fast transfers when available), configures NCCL environment variables, and launches containers with the\nright networking.\n\nEach DGX Spark has one GPU, so tensor parallelism maps directly to node count: `--tp 2` means 2 hosts.\n\n### SSH Prerequisites\n\nAll multi-node orchestration relies on SSH. At minimum, you need **passwordless SSH from your control machine\nto every cluster node**. 
sparkrun pulls container images and models locally and pushes them to each node\ndirectly, so node-to-node SSH is not strictly required for the default workflow.\n\nThat said, setting up a **full SSH mesh** (every host can reach every other host) is recommended — it enables\nalternative distribution strategies and is generally useful for cluster administration.\n\nThe easiest way to set this up is `sparkrun setup ssh`, which creates a full mesh across your cluster\nhosts **and** the control machine (included automatically via `--include-self`, on by default):\n\n```bash\n# Set up passwordless SSH mesh across your cluster + this machine\nsparkrun setup ssh --hosts 192.168.11.13,192.168.11.14 --user ubuntu\n\n# Or use a saved cluster\nsparkrun setup ssh --cluster mylab\n\n# Or if you've set your default cluster -- it'll just use that\nsparkrun setup ssh\n\n# Add extra hosts beyond the cluster (e.g. a jump host)\nsparkrun setup ssh --cluster mylab --extra-hosts 10.0.0.99\n\n# Exclude the control machine from the mesh\nsparkrun setup ssh --cluster mylab --no-include-self\n```\n\nYou will be prompted for passwords on first connection to each host. After that, every host in the\nmesh can SSH to every other host without passwords.\n\n\u003cdetails\u003e\n\u003csummary\u003eManual SSH setup (without sparkrun setup ssh)\u003c/summary\u003e\n\nIf you prefer to set up SSH yourself, you need key-based auth from your control machine to each node:\n\n```bash\n# Generate a key if you don't have one\nssh-keygen -t ed25519\n\n# Copy to each node\nssh-copy-id 192.168.11.13\nssh-copy-id 192.168.11.14\n```\n\n\u003c/details\u003e\n\n**SSH user**: By default sparkrun uses your current OS user for SSH. 
You can set a per-cluster user\nwith `sparkrun cluster create --user dgxuser` or `sparkrun cluster update --user dgxuser`, or override\nper-command with `--user`.\n\n\u003cdetails\u003e\n\u003csummary\u003eFor more advanced SSH configuration (non-default ports, identity files), use `~/.ssh/config`.\u003c/summary\u003e\n\n```\nHost spark1\n    HostName 192.168.11.13\n    User dgxuser\n\nHost spark2\n    HostName 192.168.11.14\n    User dgxuser\n```\n\n\u003c/details\u003e\n\nSolo mode (`--solo`) runs on a single host and still uses SSH unless the target is `localhost`.\n\n### Docker Group\n\nsparkrun launches containers via `docker` on each host. The SSH user must be a member of the `docker` group\non every cluster node:\n\n```bash\nsudo usermod -aG docker \"$USER\"\n```\n\n## Recipes\n\nA recipe is a YAML file:\n\n```yaml\nmodel: nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4\nruntime: vllm\nmin_nodes: 2\ncontainer: scitrera/dgx-spark-vllm:0.16.0-t5\n\nmetadata:\n  description: NVIDIA Nemotron 3 Nano 30B (upstream NVFP4)\n  maintainer: scitrera.ai \u003copen-source-team@scitrera.com\u003e\n\ndefaults:\n  port: 8000\n  tensor_parallel: 1\n  gpu_memory_utilization: 0.8\n  max_model_len: 200000\n  served_model_name: nemotron3-30b-a3b\n\ncommand: |\n  vllm serve {model} \\\n      --served-model-name {served_model_name} \\\n      --max-model-len {max_model_len} \\\n      --gpu-memory-utilization {gpu_memory_utilization} \\\n      -tp {tensor_parallel} \\\n      --host {host} --port {port}\n```\n\nAny default can be overridden at launch time with `-o key=value` or dedicated flags like `--port`, `--tp`, `--gpu-mem`.\n\nRecipes can also include an `env` block for environment variables injected into the container. Shell variable\nreferences like `${HF_TOKEN}` are expanded from the control machine's environment, so you can forward secrets\nwithout hardcoding them. 
See [RECIPES.md](./RECIPES.md) for the full recipe format specification.\n\n### GGUF recipes (llama.cpp)\n\nGGUF recipes use the `llama-cpp` runtime and specify a quantization variant with colon syntax:\n\n```yaml\nmodel: Qwen/Qwen3-1.7B-GGUF:Q8_0\nruntime: llama-cpp\nmin_nodes: 1\nmax_nodes: 1\ncontainer: scitrera/dgx-spark-llama-cpp:latest\n\ndefaults:\n  port: 8000\n  host: 0.0.0.0\n  n_gpu_layers: 99\n  ctx_size: 8192\n\ncommand: |\n  llama-server \\\n      -hf {model} \\\n      --host {host} --port {port} \\\n      --n-gpu-layers {n_gpu_layers} \\\n      --ctx-size {ctx_size} \\\n      --flash-attn on --jinja --no-webui\n```\n\nWhen model pre-sync is enabled (the default), sparkrun downloads only the matching quant files locally, distributes\nthem to target hosts, and rewrites `-hf` to `-m` with the resolved container cache path so the container serves from\nthe local copy without re-downloading.\n\n## CLI Reference\n\n### Global options\n\n| Option             | Description                 |\n|--------------------|-----------------------------|\n| `-v` / `--verbose` | Enable verbose/debug output |\n| `--version`        | Show version and exit       |\n| `--help`           | Show help for any command   |\n\n### Workload commands\n\n| Command                  | Description                  |\n|--------------------------|------------------------------|\n| `sparkrun run \u003crecipe\u003e`  | Launch an inference workload |\n| `sparkrun stop \u003crecipe\u003e` | Stop a running workload      |\n| `sparkrun logs \u003crecipe\u003e` | Re-attach to workload logs   |\n\n**`sparkrun run` options:**\n\n| Option                       | Description                                              |\n|------------------------------|----------------------------------------------------------|\n| `--hosts` / `-H`             | Comma-separated host list (first = head)                 |\n| `--hosts-file`               | File with hosts (one per line, `#` comments)             |\n| 
`--cluster`                  | Use a saved cluster by name                              |\n| `--solo`                     | Force single-node mode                                   |\n| `--port`                     | Override serve port                                      |\n| `--tp` / `--tensor-parallel` | Override tensor parallelism                              |\n| `--gpu-mem`                  | Override GPU memory utilization (0.0-1.0)                |\n| `--image`                    | Override container image  (not recommended)              |\n| `--cache-dir`                | HuggingFace cache directory                              |\n| `--option` / `-o`            | Override any recipe default: `-o key=value` (repeatable) |\n| `--dry-run` / `-n`           | Show what would be done without executing                |\n| `--foreground`               | Run in foreground (don't detach)                         |\n| `--no-follow`                | Don't follow container logs after launch                 |\n| `--skip-ib`                  | Skip InfiniBand detection     (not recommended)          |\n| `--ray-port`                 | Ray GCS port (default: 46379)  (vllm)                    |\n| `--init-port`                | SGLang distributed init port (default: 25000)            |\n| `--dashboard`                | Enable Ray dashboard on head node (vllm)                 |\n| `--dashboard-port`           | Ray dashboard port (default: 8265)                       |\n\n**`sparkrun stop` options:**\n\n| Option                       | Description                  |\n|------------------------------|------------------------------|\n| `--hosts` / `-H`             | Comma-separated host list    |\n| `--hosts-file`               | File with hosts              |\n| `--cluster`                  | Use a saved cluster by name  |\n| `--tp` / `--tensor-parallel` | Match host trimming from run |\n| `--dry-run` / `-n`           | Show what would be done      |\n\n**`sparkrun logs` 
options:**\n\n| Option                       | Description                                         |\n|------------------------------|-----------------------------------------------------|\n| `--hosts` / `-H`             | Comma-separated host list                           |\n| `--hosts-file`               | File with hosts                                     |\n| `--cluster`                  | Use a saved cluster by name                         |\n| `--tp` / `--tensor-parallel` | Match host trimming from run                        |\n| `--tail`                     | Number of existing log lines to show (default: 100) |\n\n### Recipe commands\n\n| Command                             | Description                                       |\n|-------------------------------------|---------------------------------------------------|\n| `sparkrun list [query]`             | List available recipes (alias)                    |\n| `sparkrun show \u003crecipe\u003e`            | Show recipe details + VRAM estimate (alias)       |\n| `sparkrun search \u003cquery\u003e`           | Search recipes by name/model/description (alias)  |\n| `sparkrun recipe list [query]`      | List available recipes from all registries        |\n| `sparkrun recipe show \u003crecipe\u003e`     | Show detailed recipe information                  |\n| `sparkrun recipe search \u003cquery\u003e`    | Search for recipes by name, model, or description |\n| `sparkrun recipe validate \u003crecipe\u003e` | Validate a recipe file                            |\n| `sparkrun recipe vram \u003crecipe\u003e`     | Estimate VRAM usage for a recipe                  |\n\n**`sparkrun recipe vram` options:**\n\n| Option                       | Description                               |\n|------------------------------|-------------------------------------------|\n| `--tp` / `--tensor-parallel` | Override tensor parallelism               |\n| `--max-model-len`            | Override max sequence length              |\n| 
`--gpu-mem`                  | Override gpu_memory_utilization (0.0-1.0) |\n| `--no-auto-detect`           | Skip HuggingFace model auto-detection     |\n\n### Registry commands\n\n| Command                                  | Description                       |\n|------------------------------------------|-----------------------------------|\n| `sparkrun recipe registries`             | List configured recipe registries |\n| `sparkrun recipe add-registry \u003cname\u003e`    | Add a custom recipe registry      |\n| `sparkrun recipe remove-registry \u003cname\u003e` | Remove a recipe registry          |\n| `sparkrun recipe update`                 | Update registries from git        |\n\n### Cluster commands\n\n| Command                               | Description                                         |\n|---------------------------------------|-----------------------------------------------------|\n| `sparkrun cluster create \u003cname\u003e`      | Create a new named cluster (`--user` sets SSH user) |\n| `sparkrun cluster update \u003cname\u003e`      | Update hosts, description, or user of a cluster     |\n| `sparkrun cluster list`               | List all saved clusters                             |\n| `sparkrun cluster show \u003cname\u003e`        | Show details of a saved cluster                     |\n| `sparkrun cluster delete \u003cname\u003e`      | Delete a saved cluster                              |\n| `sparkrun cluster set-default \u003cname\u003e` | Set the default cluster                             |\n| `sparkrun cluster unset-default`      | Remove the default cluster setting                  |\n| `sparkrun cluster default`            | Show the current default cluster                    |\n| `sparkrun cluster status`             | Show running containers, pending operations, and IP mappings |\n| `sparkrun status`                     | Alias for `sparkrun cluster status`                          |\n\nThe first host in a cluster definition is used 
as the **head node** for multi-node jobs. Order the remaining\nhosts however you like — they become workers.\n\n### Setup commands\n\n| Command                            | Description                                       |\n|------------------------------------|---------------------------------------------------|\n| `sparkrun setup install`           | Install sparkrun as a uv tool + tab-completion    |\n| `sparkrun setup completion`        | Install shell tab-completion (bash/zsh/fish)      |\n| `sparkrun setup update`            | Update sparkrun to the latest version             |\n| `sparkrun setup ssh`               | Set up passwordless SSH mesh across hosts         |\n| `sparkrun setup cx7`               | Detect and configure ConnectX-7 NICs across hosts |\n| `sparkrun setup fix-permissions`   | Fix root-owned HF cache files on cluster hosts    |\n| `sparkrun setup clear-cache`       | Drop Linux page cache on cluster hosts             |\n\n## Roadmap\n\n- Additional bundled recipes for popular models\n- Health checks and status monitoring for running workloads\n\n## About\n\nsparkrun provides a unified tool for running inference on DGX Spark systems without Slurm or Kubernetes coordination. It\nis intended to be donated to a future community organization.\n\n## License\n\nApache License 2.0 — see [LICENSE](LICENSE) for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fscitrera%2Fsparkrun","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fscitrera%2Fsparkrun","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fscitrera%2Fsparkrun/lists"}