Projects in Awesome Lists by NVIDIA

https://github.com/NVIDIA/nvidia-docker

Build and run Docker containers leveraging NVIDIA GPUs

cuda docker gpu nvidia-docker

Last synced: 14 Mar 2025

https://github.com/nvidia/nvidia-docker

Build and run Docker containers leveraging NVIDIA GPUs

cuda docker gpu nvidia-docker

Last synced: 24 Jan 2025

https://github.com/nvidia/open-gpu-kernel-modules

NVIDIA Linux open GPU kernel module source

Last synced: 14 May 2025

https://github.com/NVIDIA/open-gpu-kernel-modules

NVIDIA Linux open GPU kernel module source

Last synced: 15 Mar 2025

https://github.com/nvidia/deeplearningexamples

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

computer-vision deep-learning drug-discovery forecasting large-language-models mxnet nlp paddlepaddle pytorch recommender-systems speech-recognition speech-synthesis tensorflow tensorflow2 translation

Last synced: 23 Apr 2025

https://github.com/NVIDIA/DeepLearningExamples

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

computer-vision deep-learning drug-discovery forecasting large-language-models mxnet nlp paddlepaddle pytorch recommender-systems speech-recognition speech-synthesis tensorflow tensorflow2 translation

Last synced: 17 Mar 2025

https://github.com/nvidia/nemo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

asr deeplearning generative-ai large-language-models machine-translation multimodal neural-networks speaker-diariazation speaker-recognition speech-synthesis speech-translation tts

Last synced: 12 May 2025

https://github.com/nvidia/megatron-lm

Ongoing research training transformer models at scale

large-language-models model-para transformers

Last synced: 13 May 2025

https://github.com/nvidia/tensorrt

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

deep-learning gpu-acceleration inference nvidia tensorrt

Last synced: 12 May 2025

https://github.com/NVIDIA/NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

asr deeplearning generative-ai large-language-models machine-translation multimodal neural-networks speaker-diariazation speaker-recognition speech-synthesis speech-translation tts

Last synced: 14 Mar 2025

https://github.com/nvidia/fastphotostyle

Style transfer, deep learning, feature transform

Last synced: 14 May 2025

https://github.com/NVIDIA/FastPhotoStyle

Style transfer, deep learning, feature transform

Last synced: 30 Mar 2025

https://github.com/nvidia/tensorrt-llm

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.

Last synced: 12 May 2025

https://github.com/NVIDIA/TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

deep-learning gpu-acceleration inference nvidia tensorrt

Last synced: 20 Mar 2025

https://nvidia.github.io/TensorRT-LLM/

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Last synced: 24 Mar 2025

https://github.com/NVIDIA/TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Last synced: 14 Mar 2025

https://github.com/NVIDIA/Megatron-LM

Ongoing research training transformer models at scale

large-language-models model-para transformers

Last synced: 20 Mar 2025

https://github.com/nvidia/vid2vid

PyTorch implementation of our method for high-resolution (e.g. 2048x1024) photorealistic video-to-video translation.

Last synced: 14 May 2025

https://github.com/NVIDIA/vid2vid

PyTorch implementation of our method for high-resolution (e.g. 2048x1024) photorealistic video-to-video translation.

Last synced: 28 Mar 2025

https://github.com/nvidia/apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch

Last synced: 13 May 2025

https://github.com/NVIDIA/apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch

Last synced: 09 Apr 2025

https://github.com/nvidia/cosmos

New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos

Last synced: 12 May 2025

https://github.com/nvidia/cuda-samples

Samples for CUDA developers that demonstrate features of the CUDA Toolkit

cuda cuda-driver-api cuda-kernels cuda-opengl

Last synced: 12 May 2025

https://github.com/nvidia/cutlass

CUDA Templates for Linear Algebra Subroutines

cpp cuda deep-learning deep-learning-library gpu nvidia

Last synced: 13 May 2025

https://github.com/NVIDIA/cuda-samples

Samples for CUDA developers that demonstrate features of the CUDA Toolkit

cuda cuda-driver-api cuda-kernels cuda-opengl

Last synced: 18 Mar 2025

https://github.com/NVIDIA/cutlass

CUDA Templates for Linear Algebra Subroutines

cpp cuda deep-learning deep-learning-library gpu nvidia

Last synced: 14 Mar 2025

https://github.com/nvidia/pix2pixhd

Synthesizing and manipulating 2048x1024 images with conditional GANs

computer-graphics computer-vision deep-learning deep-neural-networks gan generative-adversarial-network image-to-image-translation pix2pix pytorch

Last synced: 15 May 2025

https://github.com/NVIDIA/pix2pixHD

Synthesizing and manipulating 2048x1024 images with conditional GANs

computer-graphics computer-vision deep-learning deep-neural-networks gan generative-adversarial-network image-to-image-translation pix2pix pytorch

Last synced: 05 Apr 2025

https://github.com/nvidia/fastertransformer

Transformer related optimization, including BERT, GPT

bert gpt pytorch transformer

Last synced: 13 May 2025

https://github.com/NVIDIA/FasterTransformer

Transformer related optimization, including BERT, GPT

bert gpt pytorch transformer

Last synced: 16 Mar 2025

https://github.com/nvidia/dali

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

audio-processing data-augmentation data-processing deep-learning fast-data-pipeline gpu gpu-tensorflow image-augmentation image-processing machine-learning mxnet neural-network paddle python pytorch

Last synced: 13 May 2025

https://github.com/NVIDIA/DALI

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

audio-processing data-augmentation data-processing deep-learning fast-data-pipeline gpu gpu-tensorflow image-augmentation image-processing machine-learning mxnet neural-network paddle python pytorch

Last synced: 15 Mar 2025

https://github.com/nvidia/tacotron2

Tacotron 2 - PyTorch implementation with faster-than-realtime inference

Last synced: 14 May 2025

https://github.com/NVIDIA/tacotron2

Tacotron 2 - PyTorch implementation with faster-than-realtime inference

Last synced: 27 Mar 2025

https://github.com/nvidia/warp

A Python framework for high performance GPU simulation and graphics

Last synced: 14 May 2025

https://github.com/nvidia/thrust

[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl

algorithms cpp cpp11 cpp14 cpp17 cpp20 cuda cxx cxx11 cxx14 cxx17 cxx20 gpu gpu-computing nvidia nvidia-hpc-sdk thrust

Last synced: 30 Mar 2025

https://github.com/thrust/thrust

[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl

algorithms cpp cpp11 cpp14 cpp17 cpp20 cuda cxx cxx11 cxx14 cxx17 cxx20 gpu gpu-computing nvidia nvidia-hpc-sdk thrust

Last synced: 17 Mar 2025

https://github.com/NVIDIA/thrust

[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl

algorithms cpp cpp11 cpp14 cpp17 cpp20 cuda cxx cxx11 cxx14 cxx17 cxx20 gpu gpu-computing nvidia nvidia-hpc-sdk thrust

Last synced: 15 Mar 2025

https://github.com/NVIDIA/warp

A Python framework for high performance GPU simulation and graphics

Last synced: 03 Apr 2025

https://github.com/nvidia/nemo-guardrails

NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.

Last synced: 14 May 2025

https://github.com/NVIDIA/NeMo-Guardrails

NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.

Last synced: 25 Mar 2025

https://github.com/NVIDIA/garak

the LLM vulnerability scanner

ai llm-evaluation llm-security security-scanners vulnerability-assessment

Last synced: 13 May 2025

https://github.com/nvidia/garak

the LLM vulnerability scanner

ai llm-evaluation llm-security security-scanners vulnerability-assessment

Last synced: 11 May 2025

https://github.com/NVIDIA/DIGITS

Deep Learning GPU Training System

caffe deep-learning gpu machine-learning torch

Last synced: 14 Mar 2025

https://github.com/nvidia/digits

Deep Learning GPU Training System

caffe deep-learning gpu machine-learning torch

Last synced: 20 Mar 2025

https://github.com/nvidia/nccl

Optimized primitives for collective multi-GPU communication

Last synced: 13 May 2025

https://github.com/NVIDIA/nccl

Optimized primitives for collective multi-GPU communication

Last synced: 21 Apr 2025

https://github.com/nvidia/isaac-gr00t

NVIDIA Isaac GR00T N1 is the world's first open foundation model for generalized humanoid robot reasoning and skills.

Last synced: 14 May 2025

https://github.com/nvidia/flownet2-pytorch

PyTorch implementation of FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks

Last synced: 14 May 2025

https://github.com/NVIDIA/flownet2-pytorch

PyTorch implementation of FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks

Last synced: 20 Mar 2025

https://github.com/nvidia/nvidia-container-toolkit

Build and run containers leveraging NVIDIA GPUs

Last synced: 11 May 2025

https://github.com/nvidia/k8s-device-plugin

NVIDIA device plugin for Kubernetes

kubernetes

Last synced: 13 May 2025

https://github.com/nvidia/generativeaiexamples

Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.

gpu-acceleration large-language-models llm llm-inference microservice nemo rag retrieval-augmented-generation tensorrt triton-inference-server

Last synced: 13 May 2025

https://github.com/nvidia/chatrtx

A developer reference project for creating Retrieval Augmented Generation (RAG) chatbots on Windows using TensorRT-LLM

Last synced: 14 May 2025

https://github.com/NVIDIA/trt-llm-rag-windows

A developer reference project for creating Retrieval Augmented Generation (RAG) chatbots on Windows using TensorRT-LLM

Last synced: 24 Mar 2025

https://github.com/NVIDIA/GenerativeAIExamples

Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.

gpu-acceleration large-language-models llm llm-inference microservice nemo rag retrieval-augmented-generation tensorrt triton-inference-server

Last synced: 28 Mar 2025

https://github.com/NVIDIA/k8s-device-plugin

NVIDIA device plugin for Kubernetes

kubernetes

Last synced: 03 Apr 2025

https://github.com/NVIDIA/ChatRTX

A developer reference project for creating Retrieval Augmented Generation (RAG) chatbots on Windows using TensorRT-LLM

Last synced: 26 Nov 2024

https://github.com/nvidia/nv-ingest

NVIDIA Ingest is an early access set of microservices for parsing hundreds of thousands of complex, messy unstructured PDFs and other enterprise documents into metadata and text to embed into retrieval systems.

Last synced: 13 May 2025

https://github.com/nvidia/minkowskiengine

Minkowski Engine is an auto-diff neural network library for high-dimensional sparse tensors

3d-convolutional-network 3d-vision 4d-convolutional-neural-network auto-differentiation computer-vision convolutional-neural-networks cuda deep-learning high-dimensional-data high-dimensional-inference minkowski-engine neural-network pytorch semantic-segmentation space-time sparse-convolution sparse-tensor-network sparse-tensors spatio-temporal-analysis trilateral-filter

Last synced: 14 May 2025

https://nvidia.github.io/MinkowskiEngine/

Minkowski Engine is an auto-diff neural network library for high-dimensional sparse tensors

3d-convolutional-network 3d-vision 4d-convolutional-neural-network auto-differentiation computer-vision convolutional-neural-networks cuda deep-learning high-dimensional-data high-dimensional-inference minkowski-engine neural-network pytorch semantic-segmentation space-time sparse-convolution sparse-tensor-network sparse-tensors spatio-temporal-analysis trilateral-filter

Last synced: 08 May 2025

https://github.com/nvidia/cuda-python

CUDA Python: Performance meets Productivity

Last synced: 11 May 2025

https://github.com/NVIDIA/MinkowskiEngine

Minkowski Engine is an auto-diff neural network library for high-dimensional sparse tensors

3d-convolutional-network 3d-vision 4d-convolutional-neural-network auto-differentiation computer-vision convolutional-neural-networks cuda deep-learning high-dimensional-data high-dimensional-inference minkowski-engine neural-network pytorch semantic-segmentation space-time sparse-convolution sparse-tensor-network sparse-tensors spatio-temporal-analysis trilateral-filter

Last synced: 20 Mar 2025

https://github.com/nvidia/transformerengine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory utilization in both training and inference.

cuda deep-learning fp8 gpu jax machine-learning python pytorch

Last synced: 14 May 2025

https://github.com/NVIDIA/nvidia-container-toolkit

Build and run containers leveraging NVIDIA GPUs

Last synced: 06 Apr 2025

https://github.com/nvidia/waveglow

A Flow-based Generative Network for Speech Synthesis

Last synced: 14 Apr 2025

https://github.com/NVIDIA/waveglow

A Flow-based Generative Network for Speech Synthesis

Last synced: 27 Mar 2025

https://github.com/NVIDIA/libcudacxx

[ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl

cpp cpp11 cpp14 cpp17 cpp20 cpp23 cuda cxx cxx11 cxx14 cxx17 cxx20 cxx23 gpu libcxx llvm nvidia nvidia-hpc-sdk standard std

Last synced: 21 Apr 2025

https://nvidia.github.io/libcudacxx/

[ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl

cpp cpp11 cpp14 cpp17 cpp20 cpp23 cuda cxx cxx11 cxx14 cxx17 cxx20 cxx23 gpu libcxx llvm nvidia nvidia-hpc-sdk standard std

Last synced: 31 Mar 2025

https://github.com/nvidia/libcudacxx

[ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl

cpp cpp11 cpp14 cpp17 cpp20 cpp23 cuda cxx cxx11 cxx14 cxx17 cxx20 cxx23 gpu libcxx llvm nvidia nvidia-hpc-sdk standard std

Last synced: 22 Jan 2025

https://github.com/nvidia/gpu-operator

NVIDIA GPU Operator creates, configures, and manages GPUs in Kubernetes

cuda gpu kubernetes nvidia

Last synced: 12 May 2025

https://github.com/NVIDIA/gpu-operator

NVIDIA GPU Operator creates, configures, and manages GPUs in Kubernetes

cuda gpu kubernetes nvidia

Last synced: 21 Apr 2025

https://github.com/nvidia/stable-diffusion-webui-tensorrt

TensorRT Extension for Stable Diffusion Web UI

Last synced: 15 May 2025

https://github.com/NVIDIA/Stable-Diffusion-WebUI-TensorRT

TensorRT Extension for Stable Diffusion Web UI

Last synced: 28 Mar 2025

https://github.com/nvidia/cudalibrarysamples

CUDA Library Samples

cuda cudss cufft curand cusolver cusparse cutensor gpu linear-algebra mathdx npp cublas nvcomp nvjpeg nvjpeg2000 nvtiff

Last synced: 10 May 2025

https://github.com/nvidia/stdexec

`std::execution`, the proposed C++ framework for asynchronous and parallel programming.

Last synced: 14 May 2025

https://github.com/NVIDIA/stdexec

`std::execution`, the proposed C++ framework for asynchronous and parallel programming.

Last synced: 21 Apr 2025

https://github.com/NVIDIA/semantic-segmentation

NVIDIA Semantic Segmentation monorepo

Last synced: 23 Apr 2025

https://github.com/nvidia/semantic-segmentation

NVIDIA Semantic Segmentation monorepo

Last synced: 08 Apr 2025

https://github.com/nvidia/cub

[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl

algorithms cpp cpp11 cpp14 cpp17 cpp20 cub cuda cxx cxx11 cxx14 cxx17 cxx20 gpu nvidia nvidia-hpc-sdk

Last synced: 17 Jan 2025

https://github.com/NVIDIA/DeepRecommender

Deep learning for recommender systems

collaborative-filtering deep-autoencoders deep-learning gpu recommendation-engine

Last synced: 27 Nov 2024

https://github.com/NVIDIA/cub

[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl

algorithms cpp cpp11 cpp14 cpp17 cpp20 cub cuda cxx cxx11 cxx14 cxx17 cxx20 gpu nvidia nvidia-hpc-sdk

Last synced: 20 Mar 2025

https://github.com/nvidia/cccl

CUDA Core Compute Libraries

accelerated-computing cpp cpp-programming cuda cuda-cpp cuda-kernels cuda-library cuda-programming gpu gpu-acceleration gpu-computing gpu-programming hpc modern-cpp nvidia nvidia-gpu parallel-algorithm parallel-computing parallel-programming

Last synced: 13 May 2025

https://github.com/NVIDIA/CUDALibrarySamples

CUDA Library Samples

Last synced: 14 May 2025

https://github.com/nvidia/trt-samples-for-hackathon-cn

Simple samples for TensorRT programming

Last synced: 14 May 2025

https://github.com/NVIDIA/trt-samples-for-hackathon-cn

Simple samples for TensorRT programming

Last synced: 20 Mar 2025

https://github.com/nvidia/cosmos-tokenizer

A suite of image and video neural tokenizers

diffusion tokenization transformers

Last synced: 15 Feb 2025

https://github.com/nvidia/openseq2seq

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP

deep-learning float16 language-model mixed-precision multi-gpu multi-node neural-machine-translation seq2seq sequence-to-sequence speech-recognition speech-synthesis speech-to-text tensorflow text-to-speech

Last synced: 18 Jan 2025

https://github.com/NVIDIA/OpenSeq2Seq

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP

deep-learning float16 language-model mixed-precision multi-gpu multi-node neural-machine-translation seq2seq sequence-to-sequence speech-recognition speech-synthesis speech-to-text tensorflow text-to-speech

Last synced: 27 Nov 2024

https://github.com/nvidia/aistore

AIStore: scalable storage for AI applications

batch-jobs distributed-shuffle erasure-coding etl-offload kubernetes linear-scalability multiple-backends network-of-clusters object-storage sds software-defined

Last synced: 13 May 2025

https://github.com/nvidia/physicsnemo

Open-source deep-learning framework for building, training, and fine-tuning deep learning models using state-of-the-art Physics-ML methods

deep-learning machine-learning nvidia-gpu physics pytorch

Last synced: 14 May 2025

https://github.com/NVIDIA/aistore

AIStore: scalable storage for AI applications

batch-jobs distributed-shuffle erasure-coding etl-offload kubernetes linear-scalability multiple-backends network-of-clusters object-storage sds software-defined

Last synced: 26 Mar 2025

https://github.com/NVIDIA/modulus

Open-source deep-learning framework for building, training, and fine-tuning deep learning models using state-of-the-art Physics-ML methods

deep-learning machine-learning nvidia-gpu physics pytorch

Last synced: 28 Apr 2025

https://github.com/nvidia/videoprocessingframework

A set of Python bindings to C++ libraries that provide full hardware acceleration for video decoding and encoding, as well as GPU-accelerated color space and pixel format conversions

Last synced: 18 Jan 2025

https://github.com/NVIDIA/MatX

An efficient C++17 GPU numerical computing library with Python-like syntax

cuda gpgpu gpu gpu-computing hpc

Last synced: 26 Mar 2025

https://github.com/NVIDIA/open-gpu-doc

Documentation of NVIDIA chip/hardware interfaces

Last synced: 08 Apr 2025

https://github.com/nvidia/open-gpu-doc

Documentation of NVIDIA chip/hardware interfaces

Last synced: 14 May 2025