CUDA
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
- GitHub: https://github.com/topics/cuda
- Wikipedia: https://en.wikipedia.org/wiki/CUDA
- Created by: Nvidia
- Released: June 23, 2007
- Related Topics: nvcc
- Last updated: 2026-06-29 00:07:23 UTC
https://github.com/lanl/stcuda
StCUDA allows Smalltalk to call CUDA Driver APIs to do GPU computing
Last synced: 12 Apr 2025
https://github.com/bencardoen/singularity_slurm_cuda
Example of how to get started with Singularity and CUDA on a SLURM cluster
cuda nvidia singularity-container slurm-cluster tensorflow
Last synced: 15 Oct 2025
https://github.com/Eve-ning/glcm-cupy
GLCM in CUDA
computer-vision cuda cupy feature-engineering glcm python
Last synced: 15 Mar 2025
https://github.com/nyo16/llama_cpp_ex
Elixir bindings for llama.cpp: run LLMs locally with Metal, CUDA, Vulkan, or CPU. Streaming, chat templates, embeddings, structured output, and concurrent batched inference.
Last synced: 04 Jun 2026
https://github.com/aresio/cupsoda
cupSODA is a CUDA-powered coarse-grain deterministic simulator of mass-action kinetics models
biochemical cuda gpu-computing mass-action simulation
Last synced: 21 Feb 2026
https://github.com/abus-aikorea/aria-coversong
The best Gradio web UI for creating cover songs using MDX-Net and RVC. Easy one-click installation. Fully portable.
cuda demucs gradio karaoke mdx-net nvidia python pytorch rvc song-covers uvr vocal-remover voice-conversion
Last synced: 25 Apr 2025
https://github.com/giovaneiwamoto/cuda-shortest-paths
🧩 Cuda Shortest Paths - Parallel Dijkstra and Floyd algorithms using Nvidia CUDA to calculate All-Pairs Shortest Path (APSP) in a given graph represented by its adjacency matrix.
all-pairs-shortest-path cuda nvidia
Last synced: 29 Apr 2025
https://github.com/radenmuaz/slope-ad
A small automatic differentiation engine, supporting higher-order derivatives
array autograd automatic-differentiation cuda gradient iree jvp machine-learning metal mlir onnx onnxruntime tensor vjp
Last synced: 26 Jun 2025
https://github.com/ashvardanian/scaling-democracy
GPU-accelerated Schulze voting method in Python, Numba, and CUDA, using ideas from Algebraic Graph Theory
cuda cuda-kernels dynamic-programming gpgpu graph-algorithms graph-theory pybind11 python voting
Last synced: 12 Apr 2025
https://github.com/bybatkhuu/wiki
Personal wiki for public.
cuda docker docker-compose linux manuals nvidia-docker nvidia-gpu wiki
Last synced: 24 Dec 2025
https://github.com/potato3d/grid
GPU-accelerated uniform grid construction for ray tracing
cuda glsl gpu grid ray-tracing
Last synced: 06 May 2025
https://github.com/neoheartbeats/neoheartbeats-kernel
An architecture for LLMs' continual-learning and long-term memories
cuda fine-tuning llama-factory llm
Last synced: 05 May 2025
https://github.com/rocm/hipmm
HIP Memory Manager (ROCm-DS)
amd cuda gpu hip memory-management radeon-instinct-mi-series rocm
Last synced: 12 Apr 2025
https://github.com/nguyenphuminh/planckgpt
Train a GPT from scratch on your laptop
ai attention cuda deep-learning dl gpt gpu language-model llm machine-learning ml nlp torch transformer
Last synced: 16 May 2026
https://github.com/almirneeto99/leetgpu-challenges
This repository contains the solution for LeetGPU Challenges
Last synced: 18 Apr 2026
https://github.com/sbaldu/neural_network_hep
Implementation of a neural network framework from scratch in C++ applied to particle physics
cpp cuda high-energy-physics neural-networks
Last synced: 20 Jul 2025
https://github.com/pyhf/cuda-images
pyhf Docker images built on Nvidia Container Toolkit enabled base images
cuda jax nvidia nvidia-cuda nvidia-docker pyhf
Last synced: 15 Jul 2025
https://github.com/neoblizz/cudagl
CUDA based Graphics Library for NVIDIA's GPUs.
cuda graphics-library graphics-programming opengl
Last synced: 18 Jun 2025
https://github.com/rapidsai/cugraph-docs
cuGraph Docs - RAPIDS Graph Analytics Documentation
cuda cugraph documentation graph rapids
Last synced: 12 Sep 2025
https://github.com/coderonion/cuda-beginner-course-rust-version
Companion code for the bilibili video course "CUDA 12.x Parallel Programming for Beginners (Rust edition)"
candle cpp cublas cuda cuda-programming cudarc cudnn gpu gpu-programming nvcc nvidia parellel-programming python rust
Last synced: 15 Jun 2025
https://github.com/taeguk/dist-prog-assignment
Sogang Univ. Distributed Programming (CSE5414) Assignments.
assignment cuda distributed mpi-library openmp parallel pthreads sogang
Last synced: 13 Jun 2025
https://github.com/belval/raytracing
Using CUDA to implement "Raytracing in one weekend" by Peter Shirley
cuda raytracing raytracing-in-one-weekend
Last synced: 12 Apr 2025
https://github.com/antoniopelusi/lu-solver
Assignments for the High Performance Computing exam at Unimore, Modena, IT.
Last synced: 27 Feb 2026
https://github.com/pratikvn/schwarz-lib
Repository for testing asynchronous schwarz methods.
asynchronous cuda domain-decomposition ginkgo schwarz
Last synced: 14 Apr 2025
https://github.com/eunomia-bpf/basic-cuda-tutorial
A collection of CUDA programming examples to learn GPU programming
Last synced: 15 Jun 2025
https://github.com/pedro-avalos/gpu-burn-snap
Unofficial snap for GPU Burn
cuda gpu gpu-burn linux package snap snapcraft stress-test stress-testing
Last synced: 23 Feb 2026
https://github.com/anantzoid/cuda-genetic-algorithm-travelling-salesman-problem
Implementation of Parallel Genetic Algorithm in CUDA to solve TSP (Berlin52)
c cuda genetic-algorithm tsp tsp-solver
Last synced: 25 Jul 2025
https://github.com/rogerallen/smandelbrotr
SDL2 CUDA OpenGL Mandelbrot explorer.
cuda mandelbrot-viewer opengl sdl2
Last synced: 08 Mar 2026
https://github.com/harrydobbs/torch_ransac3d
A high-performance implementation of 3D RANSAC (Random Sample Consensus) algorithm using PyTorch and CUDA.
3d cloud cubiod cuda cylinder plane plane-detection point point-cloud ransac segmentation
Last synced: 03 Oct 2025
https://github.com/skizzy-create/ayurvedic_his
🩺 A personalized app that serves as your personal Ayurvedic assistant, providing tailored advice and guidance based on Ayurvedic principles. 🩺
cuda gpt python pytorch transformers
Last synced: 04 Oct 2025
https://github.com/tudasc/cusan
A data race detector for CUDA C and C++ based on ThreadSanitizer
c cpp cuda datarace threadsanitizer
Last synced: 12 Aug 2025
https://github.com/nikhilrout/thegemmcoreproject
SystemVerilog Implementation of Nvidia's CUDA/Tensor Core GEMM Operations
cuda floating-point gemm gpgpu hybrid-precision-training sparse-matrix systolic-array tensorcore tpu
Last synced: 17 Aug 2025
https://github.com/pkestene/tsp
Traveling salesman problem solved with different programming models
cea cpp cuda kokkos nvidia-gpu openacc openmp performance-portability stdpar sycl
Last synced: 19 Aug 2025
https://github.com/coderonion/cuda-beginner-course-python-version
Companion code for the bilibili video course "CUDA 12.x Parallel Programming for Beginners (Python edition)"
cpp cublas cuda cuda-programming cudnn cupy gpu gpu-programming nvcc nvidia parallel-programming python rust
Last synced: 19 Oct 2025
https://github.com/jpuigcerver/nnutils
CPU & CUDA implementation of several neural network utils
cuda deep-learning neural-networks openmp pytorch
Last synced: 11 Apr 2025
https://github.com/lanzani/opencv-cuda-docker
Docker with opencv with cuda support.
cuda docker nvidia-docker nvidia-gpu opencv opencv-cuda opencv-dnn
Last synced: 12 Oct 2025
https://github.com/k-hengzhou/hphoto
An AI-based smart photo management tool supporting face recognition, automatic clustering of similar faces, and NSFW detection
cuda insightface nsfw nsfw-detection nudenet photos
Last synced: 26 Feb 2025
https://github.com/jmuwrobotics/libbicos
GPU-Accelerated Binary Correspondence Search for Multishot Stereo Vision
computer-vision cuda depth-map stereo-camera stereo-matching stereo-vision
Last synced: 14 Oct 2025
https://github.com/fatlipp/cuda-tree
CUDA-based Tree builder
algorithms cpp cuda octree quadtree tree
Last synced: 19 Jun 2025
https://github.com/ehwan/r-star-tree
HeaderOnly STL-like template N-dimensional R*Tree implementation on C++14
algorithm cplusplus-14 cuda eigen3 geometric-algorithms gpgpu header-only linear-algebra modern-cpp opencl rtree spatial spatial-index stl-like template traits tree-structure
Last synced: 09 Oct 2025
https://github.com/cascadingradium/cuda-hungarian-clustering
A GPU-Accelerated Clustering Algorithm that uses the Hungarian method
clustering cpp cuda gpu hungarian-algorithm parallel-computing
Last synced: 16 May 2025
https://github.com/neural-bits/ai-programming-hub
Learn and experiment with new techniques and programming languages with a focus on ML
cpp cuda cython openai-triton python rust
Last synced: 12 Apr 2025
https://github.com/gapi505/sparky-2
This is a Discord bot running on llama cpp with the llama 3 model and image generation
ai cuda llama3 llamacpp stable-diffusion torch transformers
Last synced: 07 Oct 2025
https://github.com/axnjr/snn_be_pro
A state of the art AI framework for no/low-code (visually - drag & drop) building, testing, deploying, integrating latest deep learning models with privacy & security compliance using ollama, as a final year project!
ai cplusplus cpp cuda deep-neural-networks kernel-driver ml mlops python
Last synced: 06 Oct 2025
https://github.com/pinto0309/realsense-cuda-opengl-docker
RealSense execution environment built on a Docker container on Ubuntu 20.04. NVIDIA GPU and OpenGL capable. CUDA 11.4.
cuda docker opengl realsense realsense2 ubuntu wsl2
Last synced: 24 Mar 2025
https://github.com/sammwyy/cuda.js
CUDA bindings for Node.js
bindings bun bunjs cuda cuda-kernels cuda-library javascript library nodejs nvidia typescript
Last synced: 06 Oct 2025
https://github.com/aryagxr/cuda
100 Days of CUDA!!!
cuda gpu-programming kernels parallel-programming
Last synced: 05 Oct 2025
https://github.com/andydevs/cudafractal
Fractal Generator using Nvidia's CUDA framework
Last synced: 23 Apr 2025
https://github.com/wzqvip/jetson-pytorch-builder
build PyTorch with CUDA for Jetson Orin and Thor.
Last synced: 01 Dec 2025
https://github.com/mr-technologies/crowsnest
MRTech IFF SDK web interface sample
camera cuda demosaicing dng frontend genicam gpu h264 image-processing jetson json low-latency machine-vision mipi rest-api rtsp tiff vulkan webrtc ximea
Last synced: 06 Sep 2025
https://github.com/jasmcaus/hazel
A Tensor Library written in C++.
artificial-intelligence autodiff autograd automatic-differentiation computing cpp cuda deep-learning differentiation gpu hazel-lang ml neural neural-network python pytorch scientific-computing tensor tensor-library
Last synced: 26 Apr 2025
https://github.com/abaksy/cuda-examples
A repository of examples coded in CUDA C/C++
Last synced: 31 May 2026
https://github.com/hbseong97/tf-c-api
Using tensorflow c api, c++ api, tf lite, tf js, model conversion in Windows
bazel checkpoint cuda cudnn tensorflow
Last synced: 09 Apr 2025
https://github.com/hope2333/tsac-ng
Multi-backend neural audio codec. CPU (AVX/AVX2/AVX-512, NEON/SVE, RVV), GPU (CUDA, HIP/ROCm, Vulkan), LLVM JIT. Clean-room implementation.
arm64 audio-codec avx c cuda dac hip llvm-jit neural-audio riscv simd vulkan
Last synced: 29 Jun 2026
https://github.com/donpablonows/coin
🪙 Crypto Optimization Interface Network (aka COIN) is a high-performance Bitcoin address generator using CUDA acceleration and multi-threading. It optimizes GPU and CPU resources for fast address generation, ensures secure private key creation, and includes real-time monitoring and automatic system optimizations.
bitcoin blockchain cryptography cuda gpu-acceleration
Last synced: 07 May 2026
https://github.com/fynv/curandrtc
CURandRTC is a GPU random number generation module based on ThrustRTC.
cuda nvrtc random-number-generators thrust
Last synced: 05 May 2025
https://github.com/zeloe/rtconvolver
A realtime convolution VST3
c convolution cplusplus cuda juce
Last synced: 22 Apr 2025
https://github.com/pnnl/cuvite
Multi-GPU Graph Community Detection using CUDA
community-detection cuda graph-clustering mpi
Last synced: 25 Jul 2025
https://github.com/official-imvoiid/portable-miniconda-setup-for-window
Portable Miniconda Setup for Windows. Easily create a portable Conda environment with automated scripts for flexible Python version management and CUDA support.
conda conda-environment cuda datascience machinelearning nvidia nvidia-cuda portable python
Last synced: 16 Apr 2026
https://github.com/rogerallen/qtmandelbrotr
Qt CUDA Mandelbrot explorer
cuda cuda-opengl mandelbrot-viewer qt5
Last synced: 02 Aug 2025
https://github.com/sean-bradley/cudalookupsha256
SHA256 lookup using parallel processing on an NVIDIA CUDA-compatible graphics card
cuda parallel-processing sha256
Last synced: 05 May 2025
https://github.com/648trindade/sbac-pad-marathon-problems
Repository containing problems of the SBAC-PAD Marathon of Parallel Programming and some parallel solutions to them.
cuda high-performance-computing mpi openmp parallel-computing
Last synced: 01 May 2025
https://github.com/mre/cudampi
Large hybrid CPU/GPU sorting network using CUDA and MPI
algorithms bucket bucketsort cuda filesystem gpu hybrid-cpu mpi parallel sorting-network
Last synced: 18 Apr 2026
https://github.com/rfsantacruz/mycudasamples
This is a series of CUDA C++ programming samples developed to study CUDA technology and its parallel programming model.
Last synced: 13 Apr 2025
https://github.com/sean-bradley/cudalookupripemd60
RipeMD160 Lookup using parallel processing on NVidia CUDA Graphics card
cuda parallel-processing ripemd160
Last synced: 05 May 2025
https://github.com/lucaangioloni/parallelcomputingexam
Parallel Computing Exam
c cuda histogram-equalization integral-image java java-thread openmp parallel-computing
Last synced: 20 Apr 2026
https://github.com/webis-de/pytorch-window-matmul
a custom CUDA kernel for windowed matrix multiplication
Last synced: 31 Oct 2025
https://github.com/luismisanve/gguf-to-pytorchtensor
Simple Python Script that converts the Weight of a GGUF Model to a PyTorch Tensor
cuda gguf-models huggingface llamacpp numpy python pytorch tensor
Last synced: 20 Apr 2026
https://github.com/jaxony/pynvidia
⚙️ NVIDIA GPU utilities for Python 🔧
cuda deep-learning nvidia-gpu pip python utility
Last synced: 07 May 2025
https://github.com/caps-umu/fideslib
A server-side CKKS GPU library fully interoperable with OpenFHE.
ckks cuda gpu homomorphic-encryption openfhe
Last synced: 08 Oct 2025
https://github.com/tyler-hilbert/cuda-linearregression
Linear Regression in CUDA
ai cublas cuda gpu linear-regression nsight
Last synced: 30 Mar 2025
https://github.com/evanmcclure/hello_gpu
Hello world example for Rust on GPU
apple apple-silicon cuda cuda-programming example-project gpu gpu-programming gpu-support metal rust rust-lang
Last synced: 12 Apr 2025
https://github.com/brosnanyuen/raybnn_raytrace
Ray tracing library using GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI
arrayfire cuda gpu gpu-computing opencl parallel parallel-computing ray ray-tracing raybnn raylib raytracer raytracing rust
Last synced: 26 Aug 2025
https://github.com/kareimgazer/mat-transpose-cuda
series of trials for optimizing matrix transpose with CUDA
cuda hpc matrix parallel-computing simd
Last synced: 29 Mar 2025
https://github.com/aespinosadev/opengl-renderer
OpenGL renderer showcasing all basic functionality to render 3D scenes.
computer-graphics cuda gpgpu graphics-engine graphics-programming opengl rendering rendering-3d-graphics shaders video-game
Last synced: 24 Jul 2025
https://github.com/rupeshs/anomalydetection
Anomaly Detection Using Anomalib and OpenVINO: Step-by-Step Guide
anomalib anomaly anomalydetection computer-vision cpu cuda gpu intel onnx opencv pytorch
Last synced: 13 Apr 2025
https://github.com/ellite/anchor-sub-sync
Anchor: A universal, hardware-accelerated CLI tool for subtitle synchronization (Whisper) and context-aware translation (NLLB)
ai audio-transcription automation cli cuda nllb python pytorch srt subtitle-sync subtitle-translation subtitles synchronization translation whisper
Last synced: 24 Feb 2026
https://github.com/cascadingradium/air-traffic-distribution
A GPU-Accelerated Multi-Objective Genetic Algorithm for Air Traffic Management
air-traffic-control air-traffic-management c cuda genetic-algorithm gpu-acceleration
Last synced: 16 May 2025
https://github.com/hyeonsangjeon/pdf2llm-tuning-studio
A solution that automatically generates high-quality question-answer (QA) data from PDF documents with GPU-accelerated processing and fine-tunes LLMs efficiently. It generates domain-specific QA pairs with the Unstructured library and AWS Bedrock Claude, and trains lightweight models with the LoRA technique.
aws bedrock claude cuda data-argumantation data-extraction distillation docker finetuning gpu llm pdf-generation pdf-text-extraction processing processing-job sagemaker text-disti unsloth unstructured
Last synced: 15 Jun 2025
https://github.com/ergus/gpukalmanfilter
Kalman Filter test code using C, C++, Cuda and OpenCL.
cpp cuda gpgpu kalman-filter makefile opencl performance vectorization
Last synced: 28 Oct 2025
https://github.com/mr-technologies/farsight
Basic MRTech IFF C SDK sample application
c camera cuda demosaicing dng genicam gpu h264 h265 image-processing jetson json low-latency machine-vision mipi rest-api rtsp sdk tiff vulkan
Last synced: 11 Apr 2025
https://github.com/skailasa/pyrsvd
Accelerated Randomised SVD in Python
cuda numba python3 randomised-algorithms svd
Last synced: 07 May 2025
https://github.com/valohai/dl4j-nlp-cuda-example
A git repository containing an NLP example using DL4J (cuda) in Java
cuda cuda-details cudnn deep-learning deeplearning4j dl4j docker-container java jvm machine-learning natural-language-processing nlp nvidia nvidia-drivers nvidia-gpu valohai-cli valohai-platform
Last synced: 02 Aug 2025
https://github.com/bokutotu/curs
cuda&cublas&cudnn wrapper for Rust
cuda deep-learning high-performance-computing hpc rust
Last synced: 20 May 2026
https://github.com/mr-technologies/imagebrokerpy
Example of image export from MRTech IFF Python SDK
camera cuda demosaicing dng genicam gpu h264 h265 image-processing jetson json low-latency machine-vision mipi opencv python rest-api rtsp tiff vulkan
Last synced: 12 Apr 2025
https://github.com/mr-technologies/iff
MRTech IFF SDK documentation
camera cuda demosaicing dng genicam gpu h264 h265 image-processing jetson json low-latency machine-vision mipi rest-api rtsp sdk tiff vulkan ximea
Last synced: 11 Apr 2025
https://github.com/amirhoseinmasoumi/onnx-cuda-inference
A C++ project for running CUDA-accelerated ONNX model inference, using ONNX Runtime and OpenCV for image segmentation tasks.
cpp cuda inference onnxruntime onnxruntime-gpu opencv segmentation
Last synced: 12 Apr 2025
https://github.com/garciparedes/parallel-scan-sky
Parallel Computing work
c cuda high-performance-computing hpc mpi openmp parallel parallel-algorithm parallel-computing parallel-processing parallel-programming parallelism parallelization university-of-valladolid
Last synced: 18 Apr 2026
https://github.com/arminms/p2rng
A modern header-only C++ library for parallel algorithmic (pseudo) random number generation supporting OpenMP, CUDA, ROCm and oneAPI
cpp cuda cxx gpu header-only library linux macos multiplatorm oneapi openmp parallel pcg-random prng pseudorandom-number-generator random-number-distributions random-number-generation rocm stl-algorithms windows
Last synced: 04 Apr 2025