Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with nccl

A curated list of projects in awesome lists tagged with nccl.

https://github.com/huggingface/large_language_model_training_playbook

An open collection of implementation tips, tricks and resources for training large language models

cuda large-language-models llm nccl nlp performance python pytorch scalability troubleshooting

Last synced: 11 Nov 2024

https://github.com/lambdalabsml/distributed-training-guide

Best practices & guides on how to write distributed pytorch training code

cluster cuda deepspeed distributed-training fsdp gpu gpu-cluster kuberentes lambdalabs mpi nccl pytorch sharding slurm

Last synced: 16 Dec 2024

https://github.com/Bluefog-Lib/bluefog

Distributed and decentralized training framework for PyTorch over graph

asynchronous decentralized deeplearning distributed-computing machine-learning mpi nccl one-sided pytorch

Last synced: 27 Nov 2024

https://github.com/LambdaLabsML/distributed-training-guide

Best practices & guides on how to write distributed pytorch training code

cluster cuda deepspeed distributed-training fsdp gpu gpu-cluster kuberentes lambdalabs mpi nccl pytorch sharding slurm

Last synced: 21 Oct 2024

https://github.com/juliagpu/nccl.jl

A Julia wrapper for the NVIDIA Collective Communications Library.

cuda gpu julia nccl

Last synced: 12 Nov 2024

https://github.com/lanl/pydnmfk

Python Distributed Non Negative Matrix Factorization with custom clustering

cupy distributed-computing hpc latent-features machine-learning mpi4py nccl nonnegative-matrix-factorization outofmemory python tensorfactorization

Last synced: 09 Dec 2024

https://github.com/1duo/nccl-examples

NCCL Examples from Official NVIDIA NCCL Developer Guide.

deep-learning distributed-systems nccl nvidia

Last synced: 17 Dec 2024

https://github.com/asprenger/distributed-training-patterns

Experiments with low level communication patterns that are useful for distributed training.

distributed-training horovod mpi mpi4py nccl tensorflow

Last synced: 23 Nov 2024

https://github.com/superlinear-ai/scipy-notebook-gpu

jupyter/scipy-notebook with CUDA Toolkit, cuDNN, NCCL, and TensorRT

cuda cudnn docker nccl scipy-notebook tensorflow tensorrt

Last synced: 11 Nov 2024

https://github.com/tybrucechen/tutorial-conda-cudnn-nccl-installation-for-pytorch

A tutorial for installing CUDA (v11.8) and cuDNN (8.6.9) to enable GPU programming with PyTorch. It also covers the implementation of NCCL for distributed GPU DNN model training.

cuda-installation nccl pytorch-installation ubuntu ubuntu-server windows-10

Last synced: 22 Dec 2024

https://github.com/sub-mod/nccl-builds

NCCL built on CentOS 6

nccl

Last synced: 05 Nov 2024