Projects in Awesome Lists tagged with fsdp
A curated list of projects in awesome lists tagged with fsdp.
https://github.com/lambdalabsml/distributed-training-guide
Best practices and guides for writing distributed PyTorch training code
cluster cuda deepspeed distributed-training fsdp gpu gpu-cluster kuberentes lambdalabs mpi nccl pytorch sharding slurm
Last synced: 16 May 2025
https://github.com/gurpreetkaurjethra/meta-llama3-genai-usecases-end-to-end-implementation-guides
Meta Llama 3 GenAI real-world use cases: end-to-end implementation guides
chromadb fine-tuning fsdp generativeai huggingface langchain-python llama3 llama3-70b-8192 llama3-finetune llama3-meta-ai llama3-prompts llama3-rag ollama prompt-tuning pytorch qlora rag sagemaker streamlit
Last synced: 22 Nov 2024
https://github.com/abhilash1910/framework-optimization
Framework, model, and kernel optimizations for distributed deep learning (Data Hack Summit)
codegen ddp deepspeed fsdp inductor pipelineparallel pytorch tensorparallel triton
Last synced: 15 Mar 2025
https://github.com/hrolive/large-language-models-on-supercomputers
A comprehensive exploration of LLMs, covering techniques and tools such as parameter-efficient fine-tuning (PEFT), quantization, the Zero Redundancy Optimizer (ZeRO), fully sharded data parallelism (FSDP), DeepSpeed, and Hugging Face Accelerate.
deepspeed evaluation-metrics fsdp high-performance-computing hpc huggingface huggingface-transformers jupyter llm llm-inference llm-training monitoring peft python quantization slurm tokenization transformer unsloth
Last synced: 23 Feb 2025
https://github.com/hyunnnchoi/google-t5-fsdp-kubeflow
A foundational repository for setting up distributed training jobs using Kubeflow and PyTorch FSDP.
distributed-deep-learning fsdp kubeflow pytorch
Last synced: 08 Apr 2025