Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with nccl
A curated list of projects in awesome lists tagged with nccl.
https://github.com/coreylowman/cudarc
Safe Rust wrapper around the CUDA toolkit
cublas cuda cuda-kernels cuda-programming cuda-toolkit cudnn curand gpu gpu-acceleration nccl nvrtc rust
Last synced: 19 Dec 2024
https://github.com/huggingface/large_language_model_training_playbook
An open collection of implementation tips, tricks and resources for training large language models
cuda large-language-models llm nccl nlp performance python pytorch scalability troubleshooting
Last synced: 11 Nov 2024
https://github.com/lambdalabsml/distributed-training-guide
Best practices and guides on how to write distributed PyTorch training code
cluster cuda deepspeed distributed-training fsdp gpu gpu-cluster kuberentes lambdalabs mpi nccl pytorch sharding slurm
Last synced: 16 Dec 2024
https://github.com/Bluefog-Lib/bluefog
Distributed and decentralized training framework for PyTorch over graph
asynchronous decentralized deeplearning distributed-computing machine-learning mpi nccl one-sided pytorch
Last synced: 27 Nov 2024
https://github.com/microsoft/msrflute
Federated Learning Utilities and Tools for Experimentation
distributed-learning federated-learning gloo machine-learning nccl personalization privacy-tools pytorch simulation transformers-models
Last synced: 05 Nov 2024
https://github.com/juliagpu/nccl.jl
A Julia wrapper for the NVIDIA Collective Communications Library.
Last synced: 12 Nov 2024
https://github.com/lanl/pydnmfk
Python Distributed Non-Negative Matrix Factorization with custom clustering
cupy distributed-computing hpc latent-features machine-learning mpi4py nccl nonnegative-matrix-factorization outofmemory python tensorfactorization
Last synced: 09 Dec 2024
https://github.com/1duo/nccl-examples
NCCL Examples from Official NVIDIA NCCL Developer Guide.
deep-learning distributed-systems nccl nvidia
Last synced: 17 Dec 2024
https://github.com/asprenger/distributed-training-patterns
Experiments with low-level communication patterns that are useful for distributed training.
distributed-training horovod mpi mpi4py nccl tensorflow
Last synced: 23 Nov 2024
https://github.com/superlinear-ai/scipy-notebook-gpu
jupyter/scipy-notebook with CUDA Toolkit, cuDNN, NCCL, and TensorRT
cuda cudnn docker nccl scipy-notebook tensorflow tensorrt
Last synced: 11 Nov 2024
https://github.com/tybrucechen/tutorial-conda-cudnn-nccl-installation-for-pytorch
A tutorial for installing CUDA (v11.8) and cuDNN (8.6.9) to enable GPU programming with PyTorch. It also covers NCCL for distributed GPU DNN model training.
cuda-installation nccl pytorch-installation ubuntu ubuntu-server windows-10
Last synced: 22 Dec 2024