https://github.com/makazhanalpamys/soup

Soup turns the pain of LLM fine-tuning into a simple workflow. One config, one command, done.
https://github.com/makazhanalpamys/soup

artificial-intelligence cli dpo fine-tuning finetuning gguf huggingface llm llmops local-llm lora machine-learning model-finetuning ollama peft python pytorch qlora sft transformers

Last synced: about 2 months ago
JSON representation

Soup turns the pain of LLM fine-tuning into a simple workflow. One config, one command, done.

Host: GitHub
URL: https://github.com/makazhanalpamys/soup
Owner: MakazhanAlpamys
License: apache-2.0
Created: 2026-02-20T11:05:01.000Z (5 months ago)
Default Branch: main
Last Pushed: 2026-05-30T08:02:36.000Z (about 2 months ago)
Last Synced: 2026-05-30T10:05:51.207Z (about 2 months ago)
Topics: artificial-intelligence, cli, dpo, fine-tuning, finetuning, gguf, huggingface, llm, llmops, local-llm, lora, machine-learning, model-finetuning, ollama, peft, python, pytorch, qlora, sft, transformers
Language: Python
Homepage: https://trysoup.dev
Size: 3.49 MB
Stars: 65
Watchers: 1
Forks: 14
Open Issues: 113
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Codeowners: CODEOWNERS
- Security: SECURITY.md
- Notice: NOTICE
- Agents: AGENTS.md

Awesome Lists containing this project

README

          


  



Soup




  Fine-tune LLMs in one command. No SSH, no config hell.





  Website ·

  Quick Start ·

  Config ·

  Docs ·

  Commands ·

  Models





  

  

  

  

  

  

  



---

Soup turns the pain of LLM fine-tuning into a simple workflow. One config, one command, done.

```bash

pip install 'soup-cli[train]'   # add [train] to fine-tune; bare `soup-cli` is the light CLI

soup init --template chat

soup train

```

## Why Soup?

Training LLMs is still painful. Even experienced teams spend 30-50% of their time fighting

infrastructure instead of improving models. Soup fixes that.

- **Zero SSH.** Never SSH into a broken GPU box again.

- **One config.** A simple YAML file is all you need.

- **Auto everything.** Batch size, GPU detection, quantization — handled.

- **Works locally.** Train on your own GPU with QLoRA. No cloud required.

## What's New

**v0.71.0 — Lighter install.** The heavy training stack (PyTorch, Transformers, PEFT, TRL,

datasets, bitsandbytes, accelerate) moved into a `[train]` extra. `pip install soup-cli` is now a

light CLI + data-tools install with no PyTorch; `pip install 'soup-cli[train]'` adds everything you

need to fine-tune. **Breaking:** existing users who train must reinstall with `[train]`.

Full history: [CHANGELOG.md](CHANGELOG.md) · [GitHub Releases](https://github.com/MakazhanAlpamys/Soup/releases).

## Quick Start

### 1. Install

```bash

pip install soup-cli            # light: CLI + config + data tools (no PyTorch)

pip install 'soup-cli[train]'   # add the training stack (torch, transformers, peft, trl, …)

pip install git+https://github.com/MakazhanAlpamys/Soup.git   # latest dev

```

`soup init`, `soup data …`, and the other data/inspection commands work on the light install.

Fine-tuning (`soup train`) needs the `[train]` extra.

### 2. Create a config

```bash

soup init                       # interactive wizard

soup init --template chat       # or start from a template

```

Templates: `chat`, `code`, `tool-calling`, `medical`, `reasoning`, `vision`, `kto`, `orpo`,

`simpo`, `ipo`, `bco`, `rlhf`, `pretrain`, `moe`, `longcontext`, `embedding`, `audio`.

### 3. Train, test, ship

```bash

soup train --config soup.yaml                 # LoRA, quantization, batching — all handled

soup chat  --model ./output                    # talk to your model

soup push  --model ./output --repo you/my-model

soup merge  --adapter ./output                              # merge LoRA into the base

soup export --model ./output --format gguf --quant q4_k_m   # GGUF for Ollama / llama.cpp

```

More export targets (ONNX, TensorRT, AWQ, GPTQ, BitNet) and deployment options live in

[`docs/serving-and-export.md`](docs/serving-and-export.md).

## Configuration

A complete `soup.yaml`:

```yaml

base: meta-llama/Llama-3.1-8B-Instruct

task: sft

# backend: unsloth  # 2-5x faster, pip install 'soup-cli[fast]'

data:

  train: ./data/train.jsonl

  format: alpaca

  val_split: 0.1

training:

  epochs: 3

  lr: 2e-5

  batch_size: auto

  lora:

    r: 64

    alpha: 16

  quantization: 4bit

output: ./output

```

`config/schema.py` is the single source of truth for every field. Advanced data, training,

and PEFT options are documented under [Documentation](#documentation).

## Documentation

The full feature reference lives in [`docs/`](docs/). Start here:

| Guide | Covers |

|---|---|

| [Training tasks & methods](docs/training.md) | SFT, DPO/GRPO/PPO/KTO/ORPO/SimPO/IPO/BCO, tool-calling, PRM, pre-training, distillation, classification, vision/audio/TTS, unlearning, RAFT/RA-DIT, loop-hardening detectors |

| [PEFT, long context & efficiency](docs/peft-and-efficiency.md) | DoRA, LoRA+, rsLoRA, VeRA, OLoRA, NEFTune, PiSSA, ReLoRA, optimizer & PEFT zoo, LLaMA Pro, GaLore, YaRN/LongLoRA, packing, curriculum, auto-tuning |

| [Performance & quantization](docs/performance-and-quantization.md) | QAT, FP8, Quant Menu (I + II), KV-cache, NVFP4, save formats, Cut Cross-Entropy, gradient checkpointing, kernels, activation offloading, multi-GPU / DeepSpeed / FSDP |

| [Data engineering](docs/data.md) | Formats, the Axolotl/LF-parity pipeline, data tools, synthetic generation & forge, quality scorecards, trace tooling, remote datasets, mixing, recipe DAGs |

| [Evaluation & probes](docs/evaluation.md) | Eval design/gate, eval-gated training, benchmarks, NLG metrics, calibration, Elo arena, diagnose, post-train X-ray probes, A/B, drift, tunability, `soup advise` |

| [Serving & export](docs/serving-and-export.md) | OpenAI-compatible server, batch inference, benchmarking, merge/export, Anthropic Messages endpoint, speculative decoding, deploy autopilot, Web UI, Agent Forge |

| [Adapters, registry & governance](docs/adapters-and-governance.md) | Adapter lifecycle/management, model registry, Soup Cans, the data flywheel (`soup loop`), knowledge editing, steering, supply-chain controls (scan/sign/BOM/attest/audit/airgap) |

| [Backends, platform & ops](docs/backends-and-ops.md) | MLX/Unsloth backends, alternative hubs, HF Hub integration, autopilot, experiment tracking, plan/apply, env lockfiles, hardware-fit, completions, plugins, utility commands |

| [Command reference](docs/commands.md) | The full `soup` command list |

| [Supported models & extras](docs/models.md) | Recommended model families, the VRAM size guide, the pip extras matrix |

## Data Formats

All formats are auto-detected from JSONL, JSON, CSV, Parquet, or TXT:

- **alpaca** — `{"instruction": ..., "input": ..., "output": ...}`

- **sharegpt** — `{"conversations": [{"from": "human", "value": ...}, ...]}`

- **chatml** — `{"messages": [{"role": "user", "content": ...}, ...]}`

- **dpo / orpo / simpo / ipo** — `{"prompt": ..., "chosen": ..., "rejected": ...}`

- **kto** — `{"prompt": ..., "completion": ..., "label": true}`

- **llava / sharegpt4v** (vision), **audio**, **plaintext** (pre-training), **embedding**,

  **prm**, **pre_tokenized**, **video**, **multimodal**

Full schemas and the Axolotl/LlamaFactory-parity data pipeline (remote URIs, streaming,

sharding, interleaving, vocab expansion, document ingestion) are in

[`docs/data.md`](docs/data.md).

## Common Commands

```bash

soup train  --config soup.yaml        # train (SFT/DPO/GRPO/PPO/KTO/ORPO/SimPO/IPO/...)

soup infer  --model ./output --input prompts.jsonl   # batch inference

soup chat   --model ./output          # interactive chat

soup serve  --model ./output          # OpenAI-compatible API server

soup merge  --adapter ./output        # merge LoRA into the base model

soup export --model ./output --format gguf           # export for deployment

soup eval   benchmark --model ./output               # evaluate

soup data   inspect ./data/train.jsonl               # dataset stats

soup recipes list                     # 100+ ready-made model recipes

soup autopilot --model  --data d.jsonl --goal chat  # zero-config

soup doctor                           # check GPU / deps / environment

```

The complete command list is in [`docs/commands.md`](docs/commands.md).

## Supported Models

Soup works with **any** text-generation model on the

[HuggingFace Hub](https://huggingface.co/models?pipeline_tag=text-generation) — if it loads with

`AutoModelForCausalLM`, it works, zero config changes. Llama 3.x/4, Qwen 2.5/3, Gemma 3, Mistral,

Mixtral, DeepSeek R1/V3, Phi-4, and 100+ others ship as ready-made recipes (`soup recipes list`).

| VRAM | Max model (QLoRA 4-bit) | Example |

|---|---|---|

| 8 GB | ~7B | Llama-3.1-8B, Mistral-7B |

| 16 GB | ~14B | Phi-4-14B, Qwen2.5-14B |

| 24 GB | ~34B | CodeLlama-34B, Yi-1.5-34B |

| 48 GB | ~70B | Llama-3.3-70B |

| 80 GB+ | 70B+ (full) or MoE | Mixtral-8x22B, DeepSeek-V3 |

Full model + vision tables and the optional-extras matrix are in [`docs/models.md`](docs/models.md).

## Docker

Run Soup without installing CUDA or PyTorch locally (image published to GHCR on every release):

```bash

docker pull ghcr.io/makazhanalpamys/soup:latest

docker run --gpus all -v $(pwd):/workspace ghcr.io/makazhanalpamys/soup train --config soup.yaml

docker compose up   # or build locally

```

## Requirements

- Python 3.10+

- GPU with CUDA (recommended), Apple Silicon (MPS), or CPU (experimental — very slow)

- 8 GB+ VRAM for 7B models with QLoRA

All training tasks run on CPU for testing (quantization auto-disabled). Optional extras

(`train`, `all`, `fast`, `vision`, `qat`, `serve`, `serve-fast`, `ui`, `eval`, `deepspeed`,

`liger`, `mlx`, `onnx`, `tensorrt`, …) are listed in

[`docs/models.md`](docs/models.md#optional-extras).

## Troubleshooting

```bash

soup doctor    # GPU, system resources, dependencies, and version in one place

```

- **`ImportError: DLL load failed while importing _C` (Windows)** — reinstall PyTorch for your

  CUDA version: `pip install torch --index-url https://download.pytorch.org/whl/cu121`.

- **`soup version` ≠ `pip show soup-cli`** — multiple Python installs; use a virtualenv.

## Development

```bash

git clone https://github.com/MakazhanAlpamys/Soup.git

cd Soup

pip install -e ".[dev]"

ruff check src/soup_cli/ tests/    # lint

pytest tests/ -v                   # unit tests (fast, no GPU)

pytest tests/ -m smoke -v          # smoke tests (downloads a tiny model, trains)

pre-commit install                 # optional: ruff lint+format on commit

```

See [CONTRIBUTING.md](CONTRIBUTING.md) for the full workflow and [SECURITY.md](SECURITY.md) to

report a vulnerability.

## License

[Apache-2.0](LICENSE). Copyright © the Soup contributors.

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/makazhanalpamys/soup

Awesome Lists containing this project

README

Soup