{"id":48934701,"url":"https://github.com/ByteDance-Seed/VeOmni","last_synced_at":"2026-05-03T14:00:41.873Z","repository":{"id":285892632,"uuid":"956315778","full_name":"ByteDance-Seed/VeOmni","owner":"ByteDance-Seed","description":"VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo","archived":false,"fork":false,"pushed_at":"2026-04-27T21:54:31.000Z","size":12027,"stargazers_count":1866,"open_issues_count":88,"forks_count":183,"subscribers_count":15,"default_branch":"main","last_synced_at":"2026-04-27T23:11:38.410Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ByteDance-Seed.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2025-03-28T03:42:42.000Z","updated_at":"2026-04-27T12:35:22.000Z","dependencies_parsed_at":"2025-05-12T04:24:40.701Z","dependency_job_id":"42a10565-f864-4688-9010-ff05a8ee9ad8","html_url":"https://github.com/ByteDance-Seed/VeOmni","commit_stats":null,"previous_names":["bytedance-seed/veomni"],"tags_count":11,"template":false,"template_full_name":null,"purl":"pkg:github/ByteDance-Seed/VeOmni","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ByteDance-Seed%2FVeOmni","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ByteDance-Seed%2FVeOmni/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ByteDa
nce-Seed%2FVeOmni/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ByteDance-Seed%2FVeOmni/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ByteDance-Seed","download_url":"https://codeload.github.com/ByteDance-Seed/VeOmni/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ByteDance-Seed%2FVeOmni/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32571456,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-03T06:36:36.687Z","status":"ssl_error","status_checked_at":"2026-05-03T06:36:09.306Z","response_time":103,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-04-17T11:01:26.453Z","updated_at":"2026-05-03T14:00:41.862Z","avatar_url":"https://github.com/ByteDance-Seed.png","language":"Python","funding_links":[],"categories":["微调 Fine-Tuning","Python","7. 
Training \u0026 Fine-tuning Ecosystem"],"sub_categories":[],"readme":"\n\u003cdiv align=\"center\"\u003e\n\n\u003cimg src=\"./docs/assets/logo.png\" width=\"50%\"\u003e\n\n\u003cdiv align=\"center\"\u003e\n    VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo\n    \u003cbr\u003e\n    \u003cbr\u003e\n\u003c/div\u003e\n\n[![GitHub Repo stars](https://img.shields.io/github/stars/ByteDance-Seed/VeOmni)](https://github.com/ByteDance-Seed/VeOmni/stargazers)\n[![Paper](https://img.shields.io/badge/Paper-red)](https://arxiv.org/abs/2508.02317)\n[![Documentation](https://img.shields.io/badge/Documentation-blue)](https://veomni.readthedocs.io/en/latest/)\n[![WeChat](https://img.shields.io/badge/WeChat-green?logo=wechat\u0026amp)](https://raw.githubusercontent.com/ByteDance-Seed/VeOmni/refs/heads/main/docs/assets/wechat.png)\n\n\u003c/div\u003e\n\n## 🍪 Overview\nVeOmni is a versatile framework for both single- and multi-modal pre-training and post-training. It empowers users to seamlessly scale models of any modality across various accelerators, offering both flexibility and user-friendliness.\n\nOur guiding principles when building VeOmni are:\n- **Flexibility and Modularity**: VeOmni is built with a modular design, allowing users to decouple most components and replace them with their own implementations as needed.\n- **Trainer-free**: VeOmni supports linear training scripts that avoid rigid, structured trainer classes (e.g., [PyTorch-Lightning](https://github.com/Lightning-AI/pytorch-lightning) or [HuggingFace](https://huggingface.co/docs/transformers/v4.50.0/en/main_classes/trainer#transformers.Trainer) Trainer). These training scripts expose the entire training logic to users for maximum transparency and control. 
In addition, VeOmni provides a basic trainer for text-only and VLM/omni model training, and an RL trainer that serves as a training backend for reinforcement learning.\n\n- **Omni model native**: VeOmni enables users to effortlessly scale any omni-model across devices and accelerators.\n- **Torch native**: VeOmni is designed to leverage PyTorch’s native functions to the fullest extent, ensuring maximum compatibility and performance.\n\n\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"./docs/assets/system.png\" width=\"90%\"\u003e\n\u003c/div\u003e\n\n## 🔥 Latest News\n- [2025/11] Our paper [OmniScale: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo](https://arxiv.org/abs/2508.02317) was accepted by AAAI 2026.\n- [2025/09] We published the first official release, [v0.1.0](https://github.com/ByteDance-Seed/VeOmni/pull/75), of VeOmni.\n- [2025/08] We released the [VeOmni tech report](https://arxiv.org/abs/2508.02317) and opened the [WeChat group](./docs/assets/wechat.png). Feel free to join us!\n- [2025/04] We released VeOmni!\n\n\n## 📚 Key Features\n- **FSDP** and **FSDP2** backends for training.\n- **Sequence Parallelism** with [DeepSpeed Ulysses](https://arxiv.org/abs/2309.14509), supporting both non-async and async modes.\n- **Expert Parallelism** for training large MoE models such as [Qwen3-MoE](https://veomni.readthedocs.io/en/latest/key_features/ep_fsdp2.html).\n- Efficient **GroupGemm** kernel for MoE models, [Liger-Kernel](https://github.com/linkedin/Liger-Kernel).\n- Compatible with HuggingFace Transformers models. 
[Qwen3](https://veomni.readthedocs.io/en/latest/examples/qwen3.html), [Qwen3-VL](https://veomni.readthedocs.io/en/latest/examples/qwen3_vl.html), Qwen3-Moe, etc\n- Dynamic batching strategy, Omnidata processing\n- [**Torch Distributed Checkpoint**](https://docs.pytorch.org/docs/stable/distributed.checkpoint.html) for checkpoint.\n- Support for both Nvidia-GPU and Ascend-NPU training.\n- Experiment tracking with wandb\n\n## 📝 Upcoming Features and Changes\n\n- VeOmni v0.2 Roadmap https://github.com/ByteDance-Seed/VeOmni/issues/268, https://github.com/ByteDance-Seed/VeOmni/issues/271\n- Vit balance tool https://github.com/ByteDance-Seed/VeOmni/issues/280\n- Validation dataset during training https://github.com/ByteDance-Seed/VeOmni/issues/247\n- RL post training for omni-modality models with VeRL https://github.com/ByteDance-Seed/VeOmni/issues/262\n\n\n## 🚀 Getting Started\n\n\u003ca href=\"https://veomni.readthedocs.io/en/latest/index.html\"\u003e\u003cb\u003eDocumentation\u003c/b\u003e\u003c/a\u003e\n\n### Quick Start\n  - [Installation](https://veomni.readthedocs.io/en/latest/get_started/installation/install.html)\n  - [Quick Start with Qwen3](https://veomni.readthedocs.io/en/latest/examples/qwen3.html)\n\n\n## ✏️ Supported Models\n\n| Model                                                    | Model size                    | Example config File                                                   |\n| -------------------------------------------------------- | ----------------------------- | ----------------------------------------------------------------------|\n| [DeepSeek2.5/3/R1](https://huggingface.co/deepseek-ai)   | 236B/671B                     | [deepseek.yaml](configs/text/deepseek.yaml)                           |\n| [Llama3-3.3](https://huggingface.co/meta-llama)          | 1B/3B/8B/70B                  | [llama3.yaml](configs/text/llama3.yaml)                               |\n| [Qwen2-3](https://huggingface.co/Qwen)                   | 
0.5B/1.5B/3B/7B/14B/32B/72B   | [qwen2_5.yaml](configs/text/qwen2_5.yaml)                             |\n| [Qwen2-3 VL/QVQ](https://huggingface.co/Qwen)            | 2B/3B/7B/32B/72B              | [qwen3_vl_dense.yaml](configs/multimodal/qwen3_vl/qwen3_vl_dense.yaml)|\n| [Qwen3-VL MoE](https://huggingface.co/Qwen)              | 30BA3B/235BA22B               | [qwen3_vl_moe.yaml](configs/multimodal/qwen3_vl/qwen3_vl_moe.yaml)    |\n| [Qwen3-MoE](https://huggingface.co/Qwen)                 | 30BA3B/235BA22B               | [qwen3-moe.yaml](configs/text/qwen3-moe.yaml)                         |\n| [Qwen2-3 Omni](https://huggingface.co/Qwen)              | 7B/30BA3B                     | [qwen25_omni.yaml](configs/multimodal/qwen25_omni/qwen25_omni.yaml)   |\n| [Wan](https://huggingface.co/Wan-AI)                     | Wan2.1-I2V-14B-480P           | [wan_sft.yaml](configs/dit/wan_sft.yaml)                              |\n| Omni Model                                               | Any Modality Training         | [seed_omni.yaml](configs/multimodal/omni/seed_omni.yaml)              |\n\nTo add support for new models in VeOmni, see [Support New Models](https://veomni.readthedocs.io/en/latest/usage/support_new_models/guide_and_checklist.html).\n\n## ⛰️ Performance\n\n\u003cdiv align=\"left\"\u003e\n\u003cimg src=\"./docs/assets/performance.png\" width=\"90%\"\u003e\n\u003c/div\u003e\n\nFor more details, please refer to our [paper](https://arxiv.org/abs/2508.02317).\n\n## 💡 Awesome work using VeOmni\n- [dFactory: Easy and Efficient dLLM Fine-Tuning](https://github.com/inclusionAI/dFactory)\n- [LMMs-Engine](https://github.com/EvolvingLMMs-Lab/lmms-engine)\n- [UI-TARS: Pioneering Automated GUI Interaction with Native Agents](https://github.com/bytedance/UI-TARS)\n- [OpenHA: A Series of Open-Source Hierarchical Agentic Models in Minecraft](https://arxiv.org/pdf/2509.13347)\n- [UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement 
Learning](https://arxiv.org/abs/2509.02544)\n- [Open-dLLM: Open Diffusion Large Language Models](https://github.com/pengzhangzhi/Open-dLLM)\n- [LingBot-VLA: A Pragmatic VLA Foundation Model](https://github.com/Robbyant/lingbot-vla)\n\n## 🎨 Contributing\n\nContributions from the community are welcome! Please check out [CONTRIBUTING.md](CONTRIBUTING.md) and our project roadmap (to be updated).\n\n\n## 📝 Citation and Acknowledgement\n\nIf you find VeOmni useful for your research and applications, feel free to give us a star ⭐ or cite us using:\n\n```bibtex\n@article{ma2025veomni,\n  title={VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo},\n  author={Ma, Qianli and Zheng, Yaowei and Shi, Zhelun and Zhao, Zhongkai and Jia, Bin and Huang, Ziyue and Lin, Zhiqi and Li, Youjie and Yang, Jiacheng and Peng, Yanghua and others},\n  journal={arXiv preprint arXiv:2508.02317},\n  year={2025}\n}\n```\n\nThanks to the following projects for their excellent work:\n\n- [ByteCheckpoint](https://arxiv.org/abs/2407.20143)\n- [veScale](https://github.com/volcengine/veScale)\n- [Liger-Kernel](https://github.com/linkedin/Liger-Kernel)\n- [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory)\n- [torchtitan](https://github.com/pytorch/torchtitan/)\n- [torchtune](https://github.com/pytorch/torchtune)\n\n## Star History\n\n[![Star History Chart](https://api.star-history.com/svg?repos=ByteDance-Seed/VeOmni\u0026type=date\u0026legend=top-left)](https://www.star-history.com/#ByteDance-Seed/VeOmni\u0026type=date\u0026legend=top-left)\n\n\n## 🌱 About [ByteDance Seed Team](https://team.doubao.com/)\n\n\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"https://github.com/user-attachments/assets/c42e675e-497c-4508-8bb9-093ad4d1f216\" width=\"100%\"\u003e\n\u003c/div\u003e\n\nFounded in 2023, ByteDance Seed Team is dedicated to crafting the industry's most advanced AI foundation models. 
The team aspires to become a world-class research team and make significant contributions to the advancement of science and society. You can get to know Bytedance Seed better through the following channels👇\n\u003cdiv\u003e\n  \u003ca href=\"https://team.doubao.com/\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/Website-%231e37ff?style=for-the-badge\u0026logo=bytedance\u0026logoColor=white\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/user-attachments/assets/469535a8-42f2-4797-acdf-4f7a1d4a0c3e\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/WeChat-07C160?style=for-the-badge\u0026logo=wechat\u0026logoColor=white\"\u003e\u003c/a\u003e\n \u003ca href=\"https://www.xiaohongshu.com/user/profile/668e7e15000000000303157d?xsec_token=ABl2-aqekpytY6A8TuxjrwnZskU-6BsMRE_ufQQaSAvjc%3D\u0026xsec_source=pc_search\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/Xiaohongshu-%23FF2442?style=for-the-badge\u0026logo=xiaohongshu\u0026logoColor=white\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://www.zhihu.com/org/dou-bao-da-mo-xing-tuan-dui/\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/zhihu-%230084FF?style=for-the-badge\u0026logo=zhihu\u0026logoColor=white\"\u003e\u003c/a\u003e\n\n\u003c/div\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FByteDance-Seed%2FVeOmni","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FByteDance-Seed%2FVeOmni","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FByteDance-Seed%2FVeOmni/lists"}