{"id":31753717,"url":"https://github.com/servicenow/fast-llm","last_synced_at":"2025-10-09T17:53:50.320Z","repository":{"id":257843697,"uuid":"871324801","full_name":"ServiceNow/Fast-LLM","owner":"ServiceNow","description":"Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research","archived":false,"fork":false,"pushed_at":"2025-10-02T23:16:12.000Z","size":12421,"stargazers_count":237,"open_issues_count":70,"forks_count":36,"subscribers_count":19,"default_branch":"main","last_synced_at":"2025-10-03T01:21:44.740Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://servicenow.github.io/Fast-LLM/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ServiceNow.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-10-11T18:09:30.000Z","updated_at":"2025-10-02T17:21:29.000Z","dependencies_parsed_at":"2024-11-10T19:35:32.835Z","dependency_job_id":"a168cc45-f196-45f2-a83a-b17ef9d679d3","html_url":"https://github.com/ServiceNow/Fast-LLM","commit_stats":null,"previous_names":["servicenow/fast-llm"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ServiceNow/Fast-LLM","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ServiceNow%2FFast-LLM","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ServiceNow%2FFast-LLM/tags","releases_url":"https://repos.ecosyste.ms/api/v1/
hosts/GitHub/repositories/ServiceNow%2FFast-LLM/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ServiceNow%2FFast-LLM/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ServiceNow","download_url":"https://codeload.github.com/ServiceNow/Fast-LLM/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ServiceNow%2FFast-LLM/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279001805,"owners_count":26083197,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-09T02:00:07.460Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-10-09T17:53:49.297Z","updated_at":"2025-10-09T17:53:50.315Z","avatar_url":"https://github.com/ServiceNow.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\" style=\"margin-bottom: 1em;\"\u003e\n\n\u003cimg width=50% src=\"docs/assets/images/logo.svg\" alt=\"Fast-LLM\"\u003e\u003c/img\u003e\n\n[![Docker][ci-badge]][ci-workflow]\n[![Documentation][docs-badge]][docs-workflow]\n[![License][license-badge]][license]\n\n*Accelerating your LLM training to full speed*\n\nMade with ❤️ by [ServiceNow Research][servicenow-research]\n\n\u003c/div\u003e\n\n## Overview\n\nFast-LLM is a cutting-edge open-source library for training 
large language models with exceptional speed, scalability, and flexibility. Built on [PyTorch][pytorch] and [Triton][triton], Fast-LLM empowers AI teams to push the limits of generative AI, from research to production.\n\nOptimized for training models of all sizes, from small 1B-parameter models to 70B+ parameter models on massive clusters, Fast-LLM delivers faster training, lower costs, and seamless scalability. Its fine-tuned kernels, advanced parallelism techniques, and efficient memory management make it the go-to choice for diverse training needs.\n\nAs a truly open-source project, Fast-LLM allows full customization and extension without proprietary restrictions. Developed transparently by a community of professionals on GitHub, the library benefits from collaborative innovation, with every change discussed and reviewed in the open to ensure trust and quality. Fast-LLM combines professional-grade tools with unified support for GPT-like architectures, offering the cost efficiency and flexibility that serious AI practitioners demand.\n\n\u003e [!NOTE]\n\u003e Fast-LLM is not affiliated with Fast.AI, FastHTML, FastAPI, FastText, or other similarly named projects. Our library's name refers to its speed and efficiency in language model training.\n\n## Why Fast-LLM?\n\n1. 🚀 **Fast-LLM is Blazingly Fast**:\n    - ⚡️ Optimized kernel efficiency and reduced overheads.\n    - 🔋 Optimized memory usage for best performance.\n    - ⏳ Minimizes training time and cost.\n\n2. 
📈 **Fast-LLM is Highly Scalable**:\n    - 📡 Distributed training across multiple GPUs and nodes using 3D parallelism (Data, Tensor, and Pipeline).\n    - 🔗 Supports sequence length parallelism to handle longer sequences effectively.\n    - 🧠 ZeRO-1, ZeRO-2, and ZeRO-3 implementations for improved memory efficiency.\n    - 🎛️ Mixed precision training support for better performance.\n    - 🏋️‍♂️ Large batch training and gradient accumulation support.\n    - 🔄 Reproducible training with deterministic behavior.\n\n3. 🎨 **Fast-LLM is Incredibly Flexible**:\n    - 🤖 Compatible with all common language model architectures in a unified class.\n    - ⚡ Efficient dropless Mixture-of-Experts (MoE) implementation with SoTA performance.\n    - 🧩 Customizable language model architectures, data loaders, loss functions, and optimizers (in progress).\n    - 🤗 Seamless integration with [Hugging Face Transformers][transformers].\n\n4. 🎯 **Fast-LLM is Super Easy to Use**:\n    - 📦 [Pre-built Docker images](https://github.com/ServiceNow/Fast-LLM/pkgs/container/fast-llm) for quick deployment.\n    - 📝 Simple YAML configuration for hassle-free setup.\n    - 💻 Command-line interface for easy launches.\n    - 📊 Detailed logging and real-time monitoring features.\n    - 📚 Extensive [documentation][docs] and practical tutorials (in progress).\n\n5. 🌐 **Fast-LLM is Truly Open Source**:\n    - ⚖️ Licensed under [Apache 2.0][license] for maximum freedom to use Fast-LLM at work, in your projects, or for research.\n    - 💻 Transparently developed on GitHub with public [roadmap][roadmap] and [issue tracking][issues].\n    - 🤝 Contributions and collaboration are always welcome!\n\n## Usage\n\nWe'll walk you through how to use Fast-LLM to train a large language model on a cluster with multiple nodes and GPUs. We'll show an example setup using a Slurm cluster and a Kubernetes cluster.\n\nFor this demo, we will train a Mistral-7B model from scratch for 100 steps on random data. 
The config file `examples/mistral-4-node-benchmark.yaml` is pre-configured for a multi-node setup with 4 DGX nodes, each with 8 A100-80GB or H100-80GB GPUs.\n\n\u003e [!NOTE]\n\u003e Fast-LLM scales from a single GPU to large clusters. You can start small and expand based on your resources.\n\nExpect to see a significant speedup in training time compared to other libraries! For training Mistral-7B, Fast-LLM is expected to achieve a throughput of **9,800 tokens/s/H100** (batch size 32, sequence length 8k) on a 4-node cluster with 32 H100s.\n\n### Running Fast-LLM on a Slurm Cluster\n\n#### Prerequisites\n\n- A [Slurm](https://slurm.schedmd.com/) cluster with at least 4 DGX nodes with 8 A100-80GB or H100-80GB GPUs each.\n- CUDA 12.1 or higher.\n- Dependencies: [PyTorch][pytorch], [Triton][triton], and [Apex](https://github.com/NVIDIA/apex) installed on all nodes.\n\n#### Steps\n\n1. Deploy the [nvcr.io/nvidia/pytorch:24.07-py3](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch) Docker image to all nodes (recommended), because it contains all the necessary dependencies.\n2. Install Fast-LLM on all nodes:\n\n    ```bash\n    sbatch \u003c\u003cEOF\n    #!/bin/bash\n    #SBATCH --nodes=$(scontrol show node | grep -c NodeName)\n    #SBATCH --ntasks-per-node=1\n    #SBATCH --ntasks=$(scontrol show node | grep -c NodeName)\n    #SBATCH --exclusive\n\n    srun bash -c 'pip install --no-cache-dir -e \"git+https://github.com/ServiceNow/Fast-LLM.git#egg=llm[CORE,OPTIONAL,DEV]\"'\n    EOF\n    ```\n\n3. Use the example Slurm job script [examples/fast-llm.sbat](examples/fast-llm.sbat) to submit the job to the cluster:\n\n    ```bash\n    sbatch examples/fast-llm.sbat\n    ```\n\n4. Monitor the job's progress:\n\n    - Logs: Follow `job_output.log` and `job_error.log` in your working directory for logs.\n    - Status: Use `squeue -u $USER` to see the job status.\n\nNow, you can sit back and relax while Fast-LLM trains your model at full speed! 
☕\n\n### Running Fast-LLM on a Kubernetes Cluster\n\n#### Prerequisites\n\n- A [Kubernetes](https://kubernetes.io/) cluster with at least 4 DGX nodes with 8 A100-80GB or H100-80GB GPUs each.\n- [KubeFlow](https://www.kubeflow.org/) installed.\n- Locked memory limit set to unlimited at the host level on all nodes. Ask your cluster admin to do this if needed.\n\n#### Steps\n\n1. Create a Kubernetes [PersistentVolumeClaim](https://kubernetes.io/docs/concepts/storage/persistent-volumes/) (PVC) named `fast-llm-home` that will be mounted to `/home/fast-llm` in the container using [examples/fast-llm-pvc.yaml](examples/fast-llm-pvc.yaml):\n\n    ```bash\n    kubectl apply -f examples/fast-llm-pvc.yaml\n    ```\n\n2. Create a [PyTorchJob](https://www.kubeflow.org/docs/components/training/user-guides/pytorch/) resource using the example configuration file [examples/fast-llm.pytorchjob.yaml](examples/fast-llm.pytorchjob.yaml):\n\n    ```bash\n    kubectl apply -f examples/fast-llm.pytorchjob.yaml\n    ```\n\n3. Monitor the job status:\n\n    - Use `kubectl get pytorchjobs` to see the job status.\n    - Use `kubectl logs -f fast-llm-master-0 -c pytorch` to follow the logs.\n\nThat's it! You're now up and running with Fast-LLM on Kubernetes. 🚀\n\n## Next Steps\n\n📖 **Want to learn more?** Check out our [documentation][docs] for more information on how to use Fast-LLM.\n\n🔨 **We welcome contributions to Fast-LLM!** Have a look at our [contribution guidelines](CONTRIBUTING.md).\n\n🐞 **Something doesn't work?** Open an [issue](https://github.com/ServiceNow/Fast-LLM/issues)!\n\n## License\n\nFast-LLM is licensed by ServiceNow, Inc. under the Apache 2.0 License. See [LICENSE][license] for more information.\n\n## Vulnerability Reporting\n\nFor security issues, email [disclosure@servicenow.com](mailto:disclosure@servicenow.com). 
See our [security policy](SECURITY.md).\n\n[roadmap]: https://github.com/ServiceNow/Fast-LLM/milestones\n[issues]: https://github.com/ServiceNow/Fast-LLM/issues\n[ci-badge]: https://github.com/ServiceNow/Fast-LLM/actions/workflows/ci.yaml/badge.svg\n[ci-workflow]: https://github.com/ServiceNow/Fast-LLM/actions/workflows/ci.yaml\n[docs-badge]: https://github.com/ServiceNow/Fast-LLM/actions/workflows/docs.yaml/badge.svg\n[docs-workflow]: https://github.com/ServiceNow/Fast-LLM/actions/workflows/docs.yaml\n[docs]: https://servicenow.github.io/Fast-LLM\n[license-badge]: https://img.shields.io/badge/License-Apache%202.0-blue.svg\n[license]: LICENSE\n[servicenow-research]: https://www.servicenow.com/research/\n[pytorch]: https://pytorch.org/\n[triton]: https://triton-lang.org\n[transformers]: https://huggingface.co/transformers\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fservicenow%2Ffast-llm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fservicenow%2Ffast-llm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fservicenow%2Ffast-llm/lists"}