CUDA
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
- GitHub: https://github.com/topics/cuda
- Wikipedia: https://en.wikipedia.org/wiki/CUDA
- Created by: Nvidia
- Released: June 23, 2007
- Related Topics: nvcc
- Last updated: 2026-06-15 00:07:19 UTC
https://github.com/mihaibujanca/dynamicfusion
Implementation of Newcombe et al. CVPR 2015 DynamicFusion paper
3d-reconstruction computer-vision cuda non-rigid opencv
Last synced: 30 Jan 2026
https://github.com/IBM/aihwkit
IBM Analog Hardware Acceleration Kit
ai analog-devices cuda neural-networks pytorch
Last synced: 11 May 2025
https://github.com/JuliaGPU/CUDAnative.jl
Julia support for native CUDA programming
cuda cuda-toolkit julia julia-library
Last synced: 22 Jul 2025
https://github.com/nosferalatu/simplegpuhashtable
A simple GPU hash table implemented in CUDA using lock free techniques
cuda cuda-programming data-structures gpu gpu-cuda-programs
Last synced: 27 Dec 2025
https://github.com/rapidsai/cucim
cuCIM - RAPIDS GPU-accelerated image processing library
computer-vision cuda digital-pathology gpu image-analysis image-data image-processing medical-imaging microscopy multidimensional-image-processing nvidia segmentation
Last synced: 11 Apr 2025
https://github.com/dfm/extending-jax
Extending JAX with custom C++ and CUDA code
Last synced: 05 Apr 2025
https://github.com/engcang/vins-application
VINS-Fusion, VINS-Fisheye, OpenVINS, EnVIO, ROVIO, S-MSCKF, ORB-SLAM2, and NVIDIA Elbrus applied to different sets of cameras and IMUs on different boards, including desktop and Jetson boards
cuda nvidia ros ros2 slam vio visual-slam
Last synced: 11 Oct 2025
https://github.com/fixstars/cuda-bundle-adjustment
A CUDA implementation of Bundle Adjustment
bundle-adjustment cuda g2o slam structure-from-motion visual-slam
Last synced: 05 Apr 2025
https://github.com/NVIDIA/cuQuantum
Home for cuQuantum Python & NVIDIA cuQuantum SDK C++ samples
cuda cuquantum custatevec cutensornet nvidia quantum-computing
Last synced: 02 Apr 2025
https://github.com/nosferalatu/SimpleGPUHashTable
A simple GPU hash table implemented in CUDA using lock free techniques
cuda cuda-programming data-structures gpu gpu-cuda-programs
Last synced: 06 May 2025
https://github.com/Bruce-Lee-LY/cuda_hgemm
Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruction.
cublas cuda gemm gpu hgemm matrix-multiply nvidia tensor-core
Last synced: 14 May 2025
https://github.com/alpaka-group/alpaka
Abstraction Library for Parallel Kernel Acceleration 🦙
cpp cpp17 cuda gpu header-only heterogeneous-parallel-programming hip hpc openacc openmp rocm tbb
Last synced: 15 May 2025
https://github.com/osai-ai/tensor-stream
A library for real-time video stream decoding to CUDA memory
c-plus-plus cuda python pytorch video video-processing
Last synced: 07 May 2025
https://github.com/luoyetx/mini-caffe
Minimal runtime core of Caffe, Forward only, GPU support and Memory efficiency.
android caffe cuda cudnn forward-only linux mini-caffe openblas windows
Last synced: 15 Mar 2025
https://github.com/bruce-lee-ly/cuda_hgemm
Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruction.
cublas cuda gemm gpu hgemm matrix-multiply nvidia tensor-core
Last synced: 05 Apr 2025
https://github.com/glotzerlab/hoomd-blue
Molecular dynamics and Monte Carlo soft matter simulation on GPUs.
conda-forge cuda docker gpu hard-particle hoomd-blue molecular-dynamics monte-carlo-simulation particle-system python simulation singularity
Last synced: 15 May 2025
https://github.com/uncomplicate/bayadera
High-performance Bayesian Data Analysis on the GPU in Clojure
bayesian bayesian-data-analysis bayesian-inference clojure clojure-library cuda gpu gpu-acceleration gpu-computing high-performance-computing machine-learning markov-chain-monte-carlo mcmc opencl statistics
Last synced: 09 Apr 2025
https://github.com/nersc/timemory
Modular C++ Toolkit for Performance Analysis and Logging. Profiling API and Tools for C, C++, CUDA, Fortran, and Python. The C++ template API is essentially a framework for creating tools: it is designed to provide a unifying interface for recording various performance measurements alongside data logging and interfaces to other tools.
analysis c cplusplus cpp cross-language cross-platform cuda cupti gotcha hardware-counters instrumentation-api memory-measurements modular-design mpi papi performance performance-measurement python roofline
Last synced: 04 Oct 2025
https://github.com/cyberia-to/go-cyber
Your 🔵 Superintelligence
ai blockchain computation-graphs cosmos cosmos-sdk cuda cyber cyber-rank fuckgoogle great-web ipfs knowledge-graph protocol search search-engine soft3 supercomputer tendermint universe-mirror web3
Last synced: 16 Feb 2026
https://github.com/LambdaLabsML/distributed-training-guide
Best practices & guides on how to write distributed pytorch training code
cluster cuda deepspeed distributed-training fsdp gpu gpu-cluster kuberentes lambdalabs mpi nccl pytorch sharding slurm
Last synced: 08 Mar 2025
https://github.com/NERSC/timemory
Modular C++ Toolkit for Performance Analysis and Logging. Profiling API and Tools for C, C++, CUDA, Fortran, and Python. The C++ template API is essentially a framework for creating tools: it is designed to provide a unifying interface for recording various performance measurements alongside data logging and interfaces to other tools.
analysis c cplusplus cpp cross-language cross-platform cuda cupti gotcha hardware-counters instrumentation-api memory-measurements modular-design mpi papi performance performance-measurement python roofline
Last synced: 06 May 2025
https://github.com/cybercongress/go-cyber
Your 🔵 Superintelligence
ai blockchain computation-graphs cosmos cosmos-sdk cuda cyber cyber-rank fuckgoogle great-web ipfs knowledge-graph protocol search search-engine soft3 supercomputer tendermint universe-mirror web3
Last synced: 15 May 2025
https://github.com/uob-hpc/babelstream
STREAM, for lots of devices written in many programming models
benchmark cuda gpgpu gpu hpc kokkos memory-bandwidth openacc opencl openmp parallel-processing raja sycl
Last synced: 21 Oct 2025
https://github.com/zjhellofss/KuiperLLama
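A good project for campus recruitment, autumn and spring hiring, and internships: build from scratch a large-model inference framework that supports LLama2/3 and Qwen2.5.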
cpp cuda inference-engine llama2 llama3 llm llm-inference qwen qwen2
Last synced: 08 Sep 2025
https://github.com/harrism/hemi
Simple utilities to enable code reuse and portability between CUDA C/C++ and standard C/C++.
c-plus-plus cuda cuda-device cuda-kernels gpu hemi
Last synced: 06 Apr 2025
https://github.com/zjhellofss/kuiperllama
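A good project for campus recruitment, autumn and spring hiring, and internships: build from scratch a large-model inference framework that supports LLama2/3 and Qwen2.5.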
cpp cuda inference-engine llama2 llama3 llm llm-inference qwen qwen2
Last synced: 16 May 2025
https://github.com/omlins/parallelstencil.jl
Package for writing high-level code for parallel high-performance stencil computations that can be deployed on both GPUs and CPUs
cuda gpu julia multi-gpu multi-xpu parallel staggered-grids stencil stencil-codes xpu
Last synced: 28 Jan 2026
https://github.com/kerneltuner/kernel_tuner
Kernel Tuner
auto-tuning autotuning c cplusplus cuda cuda-kernels gpu gpu-computing kernel-tuner machine-learning opencl opencl-kernels optimization python software-development testing
Last synced: 15 May 2025
https://github.com/omlins/ParallelStencil.jl
Package for writing high-level code for parallel high-performance stencil computations that can be deployed on both GPUs and CPUs
cuda gpu julia multi-gpu multi-xpu parallel staggered-grids stencil stencil-codes xpu
Last synced: 27 Mar 2025
https://github.com/nvidia/cuda-checkpoint
CUDA checkpoint and restore utility
Last synced: 16 May 2025
https://github.com/UoB-HPC/BabelStream
STREAM, for lots of devices written in many programming models
benchmark cuda gpgpu gpu hpc kokkos memory-bandwidth openacc opencl openmp parallel-processing raja sycl
Last synced: 21 Apr 2025
https://github.com/mrshaw01/software-engineer
A curated learning repository focused on High-Performance Computing (HPC) — covering fundamentals to advanced topics in CUDA, MPI, C++, and Python-C++ interoperability.
cpp cuda high-performance-computing hip python
Last synced: 16 Jul 2025
https://github.com/agenium-scale/nsimd
Agenium Scale vectorization library for CPUs and GPUs
aarch64 avx avx2 avx512 cpp20 cpp20-library cuda hpc neon neon128 rocm simd simd-instructions simd-library simd-programming sse2 sse42 sve vectorization-library
Last synced: 09 Apr 2025
https://github.com/lmnt-com/haste
Haste: a fast, simple, and open RNN library
algorithm api cpp cuda deep-learning gru lstm machine-learning python pytorch rnn rnn-implementations rnn-layers tensorflow
Last synced: 04 Apr 2025
https://github.com/a2flo/floor
A C++ Compute/Graphics Library and Toolchain enabling same-source CUDA/Host/Metal/OpenCL/Vulkan C++ programming and execution.
c-plus-plus compiler compute cuda graphics ios linux macos metal opencl openxr rendering spir spir-v virtual-reality vulkan windows
Last synced: 16 May 2025
https://github.com/nvidia/tilus
Tilus is a tile-level kernel programming language with explicit control over shared memory and registers.
Last synced: 06 Sep 2025
https://github.com/MFlowCode/MFC
Exascale multiphase flow solver — 2025 Gordon Bell Prize Finalist | 200T grid points on 43K+ GPUs
amd-gpu cfd computational-fluid-dynamics cuda exascale fluid-dynamics fortran gpu gpu-computing hpc mpi multiphase nvidia-gpu openacc openmp parallel-computing physics-simulation rocm scientific-computing simulation
Last synced: 01 Mar 2026
https://github.com/knightcrawler25/optix-pathtracer
Simple physically based path tracer based on Nvidia's Optix Ray Tracing Engine
brdf cuda disney gpu optix pathtracing raytracing
Last synced: 07 Apr 2025
https://github.com/QMCPACK/qmcpack
Main repository for QMCPACK, an open-source production level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids with full performance portable GPU support
c-plus-plus cuda electronic-structure gpu high-performance-computing hpc mpi oneapi quantum-chemistry quantum-monte-carlo rocm
Last synced: 26 Mar 2025
https://github.com/rkinas/triton-resources
A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.
Last synced: 16 May 2025
https://github.com/charles-r-earp/autograph
A machine learning library for Rust.
cuda machine-learning neural-networks rust
Last synced: 16 May 2025
https://github.com/lattice/quda
QUDA is a library for performing calculations in lattice QCD on GPUs.
c c-plus-plus cuda gpu mpi multi-gpu qcd
Last synced: 15 May 2025
https://github.com/favreau/Sol-R
Open-Source CUDA/OpenCL Speed Of Light Ray-tracer
3d 3d-graphics-engine cuda gpgpu gpu-acceleration gpu-computing graphics-engine interactive opencl path-tracing pathtracing ray-tracing raytracer raytracing raytracing-engine realtime-rendering rendering science virtual-reality vr
Last synced: 30 Apr 2025
https://github.com/gezp/docker-ubuntu-desktop
Docker image for Ubuntu Desktop that supports hardware GPU-accelerated GUI apps. You can access the container with SSH or remote desktop, just like a cloud VM.
cuda docker kasmvnc nomachine nvidia-gpu opengl remote-desktop ubuntu virtualgl
Last synced: 13 Apr 2025
https://github.com/nvidia-genomics-research/genomeworks
SDK for GPU accelerated genome assembly and analysis
alignment cuda genomics gpu mapping nvidia partial-order-alignment poa python-api
Last synced: 06 Apr 2026
https://github.com/sekwiatkowski/Komputation
Komputation is a neural network framework for the Java Virtual Machine written in Kotlin and CUDA C.
artificial-intelligence convolutional-neural-networks cuda framework gpu jvm kotlin machine-learning neural-networks nlp nvidia recurrent-neural-networks seq2seq
Last synced: 01 Apr 2025
https://github.com/NVIDIA-Genomics-Research/GenomeWorks
SDK for GPU accelerated genome assembly and analysis
alignment cuda genomics gpu mapping nvidia partial-order-alignment poa python-api
Last synced: 09 May 2025
https://github.com/pcb9382/FaceAlgorithm
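Face detection and face recognition, including face detection (retinaface, yolov5face, yolov7face, yolov8face), face detection tracking (ByteTracker), face angle calculation (Face_Angle), face alignment correction (Face_Aligner), face recognition (Arcface), mask detection (MaskRecognitiion), age and gender detection (Gender_age), silent liveness detection (Silent_Face_Anti_Spoofing), and FaceAlignment (106keypoints)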
cuda face-alignment face-detection face-recognition tensorrt yolov5face yolov7face yolov8face
Last synced: 18 Mar 2025
https://github.com/GoodAI/BrainSimulator
Brain Simulator is a platform for visual prototyping of artificial intelligence architectures.
ai brain-simulator cuda machine-learning
Last synced: 08 Jul 2025
https://github.com/andrewkchan/yalm
Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O
cpp cuda inference-engine llama llamacpp llm llm-inference machine-learning mistral
Last synced: 12 Apr 2025
https://github.com/JuliaGPU/CuArrays.jl
A Curious Cumulation of CUDA Cuisine
Last synced: 22 Jul 2025
https://github.com/rentainhe/pytorch-distributed-training
Simple tutorials on Pytorch DDP training
apex cuda ddp-training deep-learning pytorch
Last synced: 06 Mar 2026
https://github.com/LLNL/blt
A streamlined CMake build system foundation for developing HPC software
blt build-system build-tools cmake cpp cuda hpc radiuss testing
Last synced: 21 Apr 2025
https://github.com/marian-nmt/marian-dev
Fast Neural Machine Translation in C++ - development repository
cpp11 cuda fast gpu-acceleration neural-machine-translation
Last synced: 15 May 2025
https://github.com/llnl/blt
A streamlined CMake build system foundation for developing HPC software
blt build-system build-tools cmake cpp cuda hpc radiuss testing
Last synced: 15 May 2025
https://github.com/koide3/gtsam_points
A collection of GTSAM factors and optimizers for point cloud SLAM
bundle-adjustment continuous-time cuda factor-graph gpu gtsam kdtree localization mapping point-cloud registration slam voxelmap
Last synced: 12 Apr 2025
https://github.com/trinkle23897/fast-poisson-image-editing
A fast poisson image editing implementation that can utilize multi-core CPU or GPU to handle a high-resolution image input.
cpp cuda high-performance-computing image-processing jacobi-iteration jacobi-method mpi numpy openmp parallel-computing poisson-image-editing pybind11 python
Last synced: 05 Apr 2025
https://github.com/bwohlberg/sporco
Sparse Optimisation Research Code
admm convolutional-dictionary-learning convolutional-sparse-coding cuda dictionary-learning fista optimization optimization-algorithms plug-and-play-priors python robust-pca sparse-coding sparse-representations sparsity total-variation total-variation-minimization
Last synced: 15 May 2025
https://github.com/Trinkle23897/Fast-Poisson-Image-Editing
A fast poisson image editing implementation that can utilize multi-core CPU or GPU to handle a high-resolution image input.
cpp cuda high-performance-computing image-processing jacobi-iteration jacobi-method mpi numpy openmp parallel-computing poisson-image-editing pybind11 python
Last synced: 02 Apr 2025
https://github.com/owensgroup/rxmesh
GPU-accelerated triangle mesh processing
3d 3d-graphics cuda data-structure geometry geometry-processing gpu mesh mesh-processing parallel-computing surface-mesh
Last synced: 15 Jun 2025
https://github.com/tumaer/JAXFLUIDS
Differentiable Fluid Dynamics Package
automatic-differentiation cfd compressible-flows computational-fluid-dynamics cuda deep-learning fluid-dynamics gpu gpu-computing high-performance hpc jax jaxfluids machine-learning multi-phase-flows tpu turbulence
Last synced: 26 Oct 2025
https://github.com/crazyguitar/cppcheatsheet
C/C++ Cheat Sheet
c cheatsheet cpp cpp20 cpp23 cuda rust
Last synced: 14 Feb 2026
https://github.com/owensgroup/RXMesh
GPU-accelerated triangle mesh processing
3d 3d-graphics cuda data-structure geometry geometry-processing gpu mesh mesh-processing parallel-computing surface-mesh
Last synced: 25 Apr 2025
https://github.com/slicer/light-the-torch
Install PyTorch distributions with computation backend auto-detection
Last synced: 10 May 2026
https://github.com/openucx/ucc
Unified Collective Communication Library
collectives cuda deep-learning hpc infiniband mpi openshmem pgas pytorch roce sharp
Last synced: 16 May 2025
https://github.com/sergeneren/Volumetric-Path-Tracer
☁️ Volumetric path tracer using CUDA
cloud cuda gpu-rendering nvidia open-vdb path-tracer vdb volume-rendering volumetric
Last synced: 25 Nov 2025
https://github.com/pmeier/light-the-torch
Install PyTorch distributions with computation backend auto-detection
Last synced: 24 Mar 2025
https://github.com/hybridizer-io/hybridizer-basic-samples
Examples of C# code compiled to GPU by hybridizer
avx avx2 avx512 compiler cuda dotnet gpu hybridizer-essentials optimization parallel vectorization visual-studio
Last synced: 16 Jan 2026
https://github.com/asmirnou/watsor
Object detection for video surveillance
camera coral cuda detection ffmpeg gpu hardware-acceleration homeassistant ip mpegts mqtt person-detector python realtime stream surveillance tensorrt tensrflow video zones
Last synced: 05 Apr 2025
https://github.com/modelscope/dash-infer
DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including CUDA, x86 and ARMv9.
cpu cuda guided-decoding llm llm-inference native-engine
Last synced: 12 Apr 2025
https://github.com/AmusementClub/vs-mlrt
Efficient CPU/GPU/Vulkan ML Runtimes for VapourSynth (with built-in support for waifu2x, DPIR, RealESRGANv2/v3, Real-CUGAN, RIFE, SCUNet and more!)
artificial-intelligence cuda deep-learning directml dpir gpu migraphx ncnn neural-network onnx onnxruntime openvino real-cugan real-esrgan rife tensorrt vapoursynth vulkan waifu2x
Last synced: 24 Mar 2025
https://github.com/ritchieng/dlami
A Deep Learning Amazon Web Service (AWS) AMI that is open, free and works. Run in less than 5 minutes. TensorFlow, Keras, PyTorch, Theano, MXNet, CNTK, Caffe and all dependencies.
ami aws cuda cudnn5 keras python tensorflow ubuntu
Last synced: 08 May 2025
https://github.com/matteo-ronchetti/torch-radon
Computational Tomography in PyTorch
cuda hacktoberfest inverse-problems pytorch radon-transform shearlet-transform tomography
Last synced: 15 Apr 2025
https://github.com/shapelets/khiva
An open-source library of algorithms to analyse time series in GPU and CPU.
clustering cpp cuda data-series discords distances gpu khiva kshape matrix-profile motifs multicore opencl shapelets snippets time-series timeseries
Last synced: 08 May 2025
https://github.com/zjin-lcf/HeCBench
benchmark cuda gpu-computing hip hpc-applications openmp scientific-computing sycl test-driven-development
Last synced: 04 Apr 2025
https://github.com/opendilab/di-hpc
OpenDILab RL HPC OP Lib, including CUDA and Triton kernel
cuda hpc lstm pytorch reinforcement-learning triton
Last synced: 09 Apr 2025
https://github.com/ceed/libceed
CEED Library: Code for Efficient Extensible Discretizations
api ceed cuda ecp exascale-computing gpu high-order high-performance-computing hpc julia linear-algebra
Last synced: 15 May 2025
https://github.com/marnovo/macos-egpu-cuda-guide
Set up CUDA for machine learning (and gaming) on macOS using an NVIDIA eGPU
apple cuda deep-learning egpu gaming gpu guide hacktoberfest mac machine-learning macos nvidia
Last synced: 17 Aug 2025
https://github.com/CEED/libCEED
CEED Library: Code for Efficient Extensible Discretizations
api ceed cuda ecp exascale-computing gpu high-order high-performance-computing hpc julia linear-algebra
Last synced: 07 May 2025
https://github.com/marnovo/macOS-eGPU-CUDA-guide
Set up CUDA for machine learning (and gaming) on macOS using an NVIDIA eGPU
apple cuda deep-learning egpu gaming gpu guide hacktoberfest mac machine-learning macos nvidia
Last synced: 14 Jul 2025
https://github.com/wangzyon/NVIDIA_SGEMM_PRACTICE
Step-by-step optimization of CUDA SGEMM
Last synced: 04 Apr 2025
https://github.com/Hellisotherpeople/CX_DB8
A contextual, biasable, word-or-sentence-or-paragraph extractive summarizer powered by the latest in text embeddings (BERT, Universal Sentence Encoder, Flair)
contextual-summarization cuda debate-evidence embeddings extractive-summarization flair python semantic-search semantic-summarization summarization summarizer token-level-summarization universal-sentence-encoder
Last synced: 13 Jul 2025
https://github.com/bh107/bohrium
Automatic parallelization of Python/NumPy, C, and C++ codes on Linux and MacOSX
cuda gpu gpu-acceleration multi-core numpy opencl parallel-computing
Last synced: 21 Oct 2025
https://github.com/bytedance/abq-llm
An acceleration library that supports arbitrary bit-width combinatorial quantization operations
cuda llm-inference mlsys quantized-networks research
Last synced: 04 Apr 2025
https://github.com/unitaryfoundation/qrack
Comprehensive, GPU accelerated framework for developing universal virtual quantum processors
cuda distributed-quantum-computing gpu hpc integrated-graphics intel-hd-graphics near-clifford opencl physics physics-simulation quantum quantum-computer-simulator quantum-computing quantum-information quantum-simulator qubits
Last synced: 01 Apr 2026
https://github.com/llnl/hiop
HPC solver for nonlinear optimization problems
acopf bfgs constrained-optimization cuda gpu-support hpc interior-point-method interior-point-optimizer math-physics mpi nonlinear-optimization nonlinear-programming nonlinear-programming-algorithms nonsmooth-optimization optimization parallel-programming quasi-newton radiuss rocm solver
Last synced: 16 May 2025
https://github.com/ifl-camp/supra
SUPRA: Software Defined Ultrasound Processing for Real-Time Applications - An Open Source 2D and 3D Pipeline from Beamforming to B-Mode
2d 3d cuda openigtlink pipeline real-time software-defined supra tum ultrasound ultrasound-imaging ultrasound-pipeline
Last synced: 05 Jul 2025
https://github.com/leimao/cuda-gemm-optimization
CUDA Matrix Multiplication Optimization
Last synced: 20 Feb 2026
https://github.com/1ytic/warp-rnnt
CUDA-Warp RNN-Transducer
cuda forward-backward pytorch rnn-transducer tensorflow warp
Last synced: 05 Apr 2025
https://github.com/DeMoriarty/TorchPQ
Approximate nearest neighbor search with product quantization on GPU in pytorch and cuda
cuda nearest-neighbor-search pytorch
Last synced: 01 Apr 2025
https://github.com/demoriarty/torchpq
Approximate nearest neighbor search with product quantization on GPU in pytorch and cuda
cuda nearest-neighbor-search pytorch
Last synced: 05 Apr 2025
https://github.com/alpine-dav/ascent
A flyweight in situ visualization and analysis runtime for multi-physics HPC simulations
analysis cuda data-viz hpc mpi parallel-computing radiuss rendering scientific-computing
Last synced: 25 Jun 2025
https://github.com/helmut-hoffer-von-ankershoffen/jetson
Helmut Hoffer von Ankershoffen experimenting with arm64 based NVIDIA Jetson (Nano and AGX Xavier) edge devices running Kubernetes (K8s) for machine learning (ML) including Jupyter Notebooks, TensorFlow Training and TensorFlow Serving using CUDA for smart IoT.
ansible archiconda cuda docker edge-devices hoffer-von-ankershoffen jupyter k8s kubeflow kubernetes kustomize machine-learning ml nvidia-jetson-nano nvidia-jetson-xavier skaffold smart-iot software-engineering tensorflow-serving virtualbox
Last synced: 14 Apr 2025
https://github.com/nvidia/dl4agx
Deep Learning tools and applications for NVIDIA AGX platforms.
autonomous-driving computer-vision cuda deep-learning drive-agx embedded
Last synced: 12 Apr 2025