{"id":27958695,"url":"https://github.com/xingchensong/touchnet","last_synced_at":"2025-10-27T15:10:39.501Z","repository":{"id":289016161,"uuid":"934037183","full_name":"xingchensong/TouchNet","owner":"xingchensong","description":"A native-PyTorch library for large scale M-LLM (text/audio) training with tp/cp/dp/pp.","archived":false,"fork":false,"pushed_at":"2025-07-04T05:24:04.000Z","size":4748,"stargazers_count":76,"open_issues_count":1,"forks_count":4,"subscribers_count":8,"default_branch":"main","last_synced_at":"2025-07-04T05:29:35.238Z","etag":null,"topics":["audio","large-scale","mllm","pytorch","text"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/xingchensong.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-02-17T07:02:34.000Z","updated_at":"2025-07-04T04:22:46.000Z","dependencies_parsed_at":"2025-07-04T05:22:45.302Z","dependency_job_id":"e599c20d-ad6c-4343-b6f0-7c635c142daa","html_url":"https://github.com/xingchensong/TouchNet","commit_stats":null,"previous_names":["xingchensong/touchnet"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/xingchensong/TouchNet","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xingchensong%2FTouchNet","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xingchensong%2FTouchNet/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xingchensong%2FTouchNet/releases","manifests_url":"https://repos.ecosyste.ms/a
pi/v1/hosts/GitHub/repositories/xingchensong%2FTouchNet/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/xingchensong","download_url":"https://codeload.github.com/xingchensong/TouchNet/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xingchensong%2FTouchNet/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":281288165,"owners_count":26475484,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-27T02:00:05.855Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["audio","large-scale","mllm","pytorch","text"],"created_at":"2025-05-07T18:25:57.227Z","updated_at":"2025-10-27T15:10:39.494Z","avatar_url":"https://github.com/xingchensong.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\n\u003cimg src=\"./assets/Touchnet_16_9.jpg\" width=\"50%\" alt=\"TouchNet\"\u003e\n\n# 👆 TouchNet [WIP]\n\n#### A PyTorch native N-D parallel library for large-scale multimodal LLM (text/audio) training\n\n[![integration 
tests](https://github.com/xingchensong/TouchNet/actions/workflows/unit_test_cpu.yaml/badge.svg?branch=main)](https://github.com/xingchensong/TouchNet/actions/workflows/unit_test_cpu.yaml?query=branch%3Amain)\n[![docs](https://img.shields.io/badge/docs-latest-blue.svg)](docs/)\n[![license](https://img.shields.io/badge/license-Apache_2-lightgrey.svg)](./LICENSE)\n\n\u003c/div\u003e\n\n\n## Latest News 🔥\n- [2025/07/07] We now support fine-tuning `Qwen2-Audio-7B` \u0026 `Kimi-Audio-7B` on the ASR task! See [WenetSpeech results](examples/audio/sft//asr/README.md) for details.\n\n\n## Overview\n\n`👆 touchnet` is heavily inspired by `torchtitan`. Both are clean, minimal codebases for large-scale LLM training using native PyTorch. What differentiates `👆 touchnet` from `torchtitan` is its focus on multimodal LLM training, where special data pipelines and model structures are needed. Please note that `👆 touchnet` is currently in a pre-release state and under extensive development.\n\nOur guiding principles when building `👆 touchnet` are:\n\n1. ⚡️ Blazing-fast checkpointable data loader with modular preprocessing and **fully random access** for large-scale **multimodal** data\n    - [[New Storage Format]](https://github.com/xingchensong/TouchNet/blob/main/docs/data.md) optimized for random access on sequentially saved tar files\n    - Efficient [[Sequence Packing]](https://huggingface.co/blog/sirluk/llm-sequence-packing) powered by [[Flex Attention]](https://pytorch.org/docs/main/nn.attention.flex_attention.html#module-torch.nn.attention.flex_attention)\n2. 
🤗 Native integration with `transformers` models while getting rid of structured trainer classes (e.g., [[PyTorch-Lightning]](https://github.com/Lightning-AI/pytorch-lightning) or [[HuggingFace Trainer]](https://huggingface.co/docs/transformers/v4.50.0/en/main_classes/trainer#transformers.Trainer))\n    - Reuses only the model definitions in `transformers` and leaves other parts untouched\n    - The entire training logic is exposed in a single file [[touchnet/bin/train.py]](https://github.com/xingchensong/TouchNet/blob/main/touchnet/bin/train.py), so everything is under your control\n3. 🛠️ Built-in profilers (CPU/GPU/memory) with flight recorder diagnostics.\n    - [[Nsys-like Profiler]](https://github.com/pytorch/kineto/blob/main/tb_plugin/README.md) to get optimization recommendations\n    - [[Memory Monitor]](https://pytorch.org/blog/understanding-gpu-memory-1/) to debug OOM errors and improve memory usage\n4. 🎯 N-D parallelism enabled through the **PyTorch native API** with minimal changes to model code.\n    - [[FSDP2]](https://pytorch.org/docs/stable/distributed.fsdp.fully_shard.html), [why FSDP1 -\u003e FSDP2?](https://github.com/pytorch/torchtitan/blob/main/docs/fsdp.md)\n    - [[Tensor Parallel]](https://pytorch.org/docs/stable/distributed.tensor.parallel.html), [[Context Parallel]](https://discuss.pytorch.org/t/distributed-w-torchtitan-breaking-barriers-training-long-context-llms-with-1m-sequence-length-in-pytorch-using-context-parallel/215082), [[Pipeline Parallel]](https://discuss.pytorch.org/t/distributed-w-torchtitan-training-with-zero-bubble-pipeline-parallelism/214420) (PP WIP🚧), [[Distributed Checkpoint]](https://pytorch.org/docs/stable/distributed.checkpoint.html)\n5. 
✨ Intuitive API design for rapid adoption \u0026 customization in minutes.\n    - Supported tasks: [[text/pretrain]](https://github.com/xingchensong/TouchNet/tree/main/examples/text/pretrain), [[audio/pretrain]](https://github.com/xingchensong/TouchNet/tree/main/examples/audio/pretrain), [[audio/sft/asr]](https://github.com/xingchensong/TouchNet/tree/main/examples/audio/sft/asr), more tasks coming soon\n    - Supported models: [[Llama]](https://github.com/xingchensong/TouchNet/tree/main/touchnet/models/llama), [[LlamaForASR]](./docs/LlamaForASR.md), more models coming soon\n\n\n## Quick Glance at 👆 TouchNet\n\n\u003cdiv align=\"center\"\u003e\n\nhttps://github.com/user-attachments/assets/9e530ad6-2d8d-41b4-9223-8ad7c838e6e4\n\nLoss, Accuracy, Memory, Throughput, TFLOPs, and MFU logged via both stdout and TensorBoard.\n\nhttps://github.com/user-attachments/assets/dc089589-a355-4abc-a2b3-5e0f768b89a0\n\nDetailed CPU/GPU profiling that can be visualized in TensorBoard. Enjoy your optimization journey ~\n\nhttps://github.com/user-attachments/assets/10cbf4ce-5f96-4699-b4f4-72c88ce89802\n\nMemory profiling identifies GPU memory allocation patterns to guide tuning strategies.\n\n\u003c/div\u003e\n\n## Dive into the code\n\nHere is an end-to-end workflow for a training job in `👆 TouchNet`:\n\n1. `stage-1`: Download the dataset. We use the `load_dataset` API from HuggingFace `datasets` to download specific data.\n2. `stage-2`: Convert the dataset format to `TouchDataset`. See [[touchnet/bin/make_data.py]](./touchnet/bin/make_data.py)\n3. `stage-3`: (optional) Convert an hf-format ckpt to a torch distributed ckpt. See [[touchnet/bin/convert_hf_to_dcp.py]](./touchnet/bin/convert_hf_to_dcp.py)\n4. `stage-4`: Start training, either from scratch or from a pretrained ckpt that has been converted in stage-3. See [[touchnet/bin/train.py]](./touchnet/bin/train.py)\n5. `stage-5`: Convert the torch distributed ckpt to hf-format to use the HuggingFace ecosystem for inference and deployment. 
See [[touchnet/bin/convert_dcp_to_hf.py]](./touchnet/bin/convert_dcp_to_hf.py)\n\nFor a more concrete example running those stages one by one, see [[examples/audio/sft/asr/aishell/run.sh]](./examples/audio/sft/asr/aishell/run.sh)\n\n\n## Installation\n\n```sh\n# NOTE(xcsong): Ensure that the linux system's glibc version is greater than or equal to 2.17 (see `ldd --version`)\n#               (for example, Ubuntu 22.04 and later versions).\nconda create -n touchnet python=3.10\nconda activate touchnet\nconda install -c conda-forge sox ffmpeg -y\n\n# (Optional) install CUDA + cuDNN if they are not already available; change `prefix` to your install path.\n# bash install_cuda_cudnn.sh\n\n# Install TouchNet with GPU support (CUDA 12.6 - recommended)\npip install -e . --index-url https://download.pytorch.org/whl/cu126\n\n# Or install with CUDA 11.8 support\n# pip install -e . --index-url https://download.pytorch.org/whl/cu118\n\n# For development with GPU support\n# pip install -e '.[dev]' --index-url https://download.pytorch.org/whl/cu126\n```\n\n## Citation\n\n```txt\n@misc{touchnet,\n  title={TouchNet: A PyTorch native N-D parallel library for large-scale multimodal LLM (text/audio) training},\n  author={Xingchen Song},\n  year={2025},\n  url={https://github.com/xingchensong/TouchNet},\n}\n```\n\n## Acknowledgements\n\n1. This repo is heavily inspired by [torchtitan](https://github.com/pytorch/torchtitan), and we borrowed a lot of code from it.\n2. This repo also benefits from [Megatron-LM](https://github.com/NVIDIA/Megatron-LM), [WeNet](https://github.com/wenet-e2e/wenet), and [flame](https://github.com/fla-org/flame).\n\nThanks to their authors for the wonderful work.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxingchensong%2Ftouchnet","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fxingchensong%2Ftouchnet","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxingchensong%2Ftouchnet/lists"}