LLM Training Framework
https://github.com/raumberg/myllm
- Host: GitHub
- URL: https://github.com/raumberg/myllm
- Owner: Raumberg
- License: apache-2.0
- Created: 2025-02-03T08:01:21.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2025-04-21T09:58:06.000Z (6 months ago)
- Last Synced: 2025-04-21T10:51:22.515Z (6 months ago)
- Topics: deep-neural-networks, deepspeed, framework, huggingface, huggingface-transformers, llm, llm-training, python, reinforcement-learning, torch
- Language: Python
- Homepage:
- Size: 1.43 MB
- Stars: 12
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
# MyLLM
# LLM Framework | Toolkit for various training stages
Initially derived from [Effective LLM Alignment](https://github.com/VikhrModels/effective_llm_alignment/) by VikhrModels.
Many credits go to the Vikhr Team.

## 🚀 [Methods and Stages supported]:
- Supervised Finetuning (Full / LoRA / QLoRA)
- Distillation (KL divergence, MSE, cosine similarity, and others)
- Reinforcement Learning (GRPO, DPO, PPO)
- Adapter merging
- Tokenizer extensions

## 🛠️ [Technical details]:
- Built on top of PyTorch, Transformers, TRL, and PEFT. No 'magic' libraries like Unsloth.
- Distributed training via Accelerate, FSDP, and DeepSpeed (ZeRO Stages 2 and 3).
- Acceleration with vLLM, FlashAttention, Liger kernels, and kernel fusion.
- Logging options: wandb, ClearML.
- Convenient config management using TOML

## 🧠 [Training an LLM]
- Everything is available from the root (MyLLM) folder.
- Launch any desired training script with accelerate:
```bash
# ~/../myllm >
accelerate launch --config_file
# example SFT:
accelerate launch --config_file configs/accelerate/stage3_config.yaml src/train/sft.py configs/train/sft/full-sft-watari.toml
# example GRPO:
accelerate launch src/train/grpo.py configs/train/grpo/rl-grpo-zariman-no-vllm.toml
```
- Example: launching GRPO with vLLM support:
```bash
> CUDA_VISIBLE_DEVICES=1 trl vllm-serve --model --tensor_parallel_size 1 --max_model_len 4096
> CUDA_VISIBLE_DEVICES=0 accelerate launch src/train/grpo.py configs/train/grpo/.toml
```
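The core trick in GRPO is to score each completion against the other completions sampled for the same prompt, normalizing rewards within the group. A simplified, self-contained sketch of that step (an illustration only, not the framework's actual implementation):

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own group's mean and std (GRPO-style)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# One group of rewards for completions sampled from the same prompt:
advs = group_relative_advantages([1.0, 2.0, 3.0])
```

Completions better than their group's average get a positive advantage, worse ones a negative advantage, so no separate value model is needed.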
> **⚠️ Disclaimer:**
> GRPO scripts can be unstable; work on them is still in progress. If you encounter any errors, please open an issue.

## 📟 [Useful scripts]:
The folder `myllm/src/helpers` contains useful scripts that you can utilize for your models:
- Merge your LoRA adapters with the original model using `adapters.py`:
```bash
cd myllm/src/helpers
python adapters.py merge --source ../../models/attn-signs-watari-32/checkpoint-5500/ --output ../../models/attn-signs-watari-32/watari-32-merged --dtype bf16
```
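Conceptually, merging folds the low-rank update back into the base weights: W' = W + (alpha / r) · B · A. A minimal pure-Python sketch of that arithmetic (illustrative only; the real script operates on saved PEFT checkpoints rather than raw matrices):

```python
def merge_lora(W, A, B, alpha, r):
    """Return W + (alpha / r) * B @ A for nested-list matrices."""
    scale = alpha / r
    rows, cols = len(W), len(W[0])
    rank = len(A)
    merged = [row[:] for row in W]  # copy the base weights
    for i in range(rows):
        for j in range(cols):
            update = sum(B[i][k] * A[k][j] for k in range(rank))
            merged[i][j] += scale * update
    return merged

# 2x2 base weight with a rank-1 adapter:
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]           # shape r x cols
B = [[0.5], [0.5]]         # shape rows x r
merged = merge_lora(W, A, B, alpha=2.0, r=1)
```

After merging, the adapter weights are no longer needed at inference time, which is why the merged checkpoint can be served as a plain model.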
- Extend the model tokenizer using `tokenizer.py`.

# Latest changes:
- Added `lora-sft-watari-32-stage-n.toml` training configs from [Attention Signs HuggingFace Page](https://huggingface.co/attn-signs/Watari-32b-v0)
- Added a new `[fusion]` TOML group for fused kernels. Example:
```toml
[fusion]
use_liger = true
patch_dyntanh = true # Nightly function, may be unstable
```
- Added new modules: `stdout`, `data_processors`, and `liger`.
- **stdout:** prints your model config, script arguments, and training config as tables. Example:
```
Model Inspection:
+----------------------+----------------------------+
| Config key | Config value |
+======================+============================+
| Model Architecture | Qwen2ForCausalLM |
+----------------------+----------------------------+
| Total Parameters | 0 |
+----------------------+----------------------------+
| Trainable Parameters | 0 |
+----------------------+----------------------------+
| Dtype | torch.bfloat16 |
+----------------------+----------------------------+
| Device | cuda:0 |
+----------------------+----------------------------+
| Tokenizer Vocab Size | 147200 |
+----------------------+----------------------------+
| Model Embedding Size | 0 |
+----------------------+----------------------------+
| Padding Token | <|endoftext|> (ID: 147075) |
+----------------------+----------------------------+
| EOS Token | <|im_end|> (ID: 147077) |
+----------------------+----------------------------+
| Max Sequence Length | 32768 |
+----------------------+----------------------------+
| Architecture | Qwen2ForCausalLM |
+----------------------+----------------------------+
| Hidden Size | 5120 |
+----------------------+----------------------------+
| Attention Heads | 40 |
+----------------------+----------------------------+
```
- **data_processors**: moved all tokenizer processing functions to a separate module. Added support for default processing and history processing.
- **liger**: moved all Liger kernels to a separate module.
- `resume_from` is now a **boolean flag** instead of a string holding a model path. When `resume_from=true`, `model_name_or_path` should point to your local checkpoint.
- Added a `construct_history` **boolean flag** that builds chat history out of the dataset. If `construct_history=false`, the script uses the `default_row_processor` function.

Overall, the training scripts are becoming easier to read and more user-friendly, with the difficult work moved under the hood.
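The difference between the two processing modes can be illustrated with toy row processors (hypothetical function shapes; MyLLM's `data_processors` module defines the real ones):

```python
def default_row_processor(row):
    """Treat the row as a single prompt/answer pair."""
    return [{"role": "user", "content": row["prompt"]},
            {"role": "assistant", "content": row["answer"]}]

def history_row_processor(row):
    """Fold prior (user, assistant) turns in before the final pair."""
    messages = []
    for user_msg, assistant_msg in row.get("history", []):
        messages.append({"role": "user", "content": user_msg})
        messages.append({"role": "assistant", "content": assistant_msg})
    messages += default_row_processor(row)
    return messages

row = {"prompt": "And now?", "answer": "Done.",
       "history": [("Hi", "Hello!")]}
msgs = history_row_processor(row)  # prior turn plus the final pair
```

With `construct_history=false`, only the final prompt/answer pair is tokenized; with it on, earlier turns become part of the training context.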
# Nightly | Development functions:
- Added a `fusion` module in which native custom CUDA/Triton kernels will be developed.
- Added a fused Dynamic Tanh kernel, a Torch interface, and a patching function.
What is **Dynamic Tanh**?
Dynamic Tanh (DyT) is a recent proposal from Meta that replaces LayerNorm with a learnable element-wise tanh to speed up training and reduce the total parameter count.
DyT is a novel approach and can therefore be unstable here (until we release the final version); it also remains a debated method for now.
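In the paper's formulation, DyT replaces a normalization layer with the element-wise function DyT(x) = γ · tanh(αx) + β, where α, γ, and β are learnable. A scalar sketch of the forward pass in plain Python (the actual kernel operates on tensors):

```python
import math

def dyt(x, alpha=0.5, gamma=1.0, beta=0.0):
    """Dynamic Tanh forward pass for a single activation value."""
    return gamma * math.tanh(alpha * x) + beta

# tanh squashes extreme activations instead of normalizing over a batch,
# so no mean/variance statistics are needed:
small = dyt(0.1)    # roughly linear near zero
large = dyt(100.0)  # saturates near gamma + beta
```

The saturation behavior is what lets DyT mimic normalization's effect on activation scale without computing any batch statistics.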
Based on the arXiv paper: https://www.alphaxiv.org/abs/2503.10622
Based on the authors' code: https://github.com/jiachenzhu/DyT/tree/main

> [!IMPORTANT]
> Thank you for your interest in MyLLM! We look forward to your contributions and feedback! 🚀