CUDA
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphical processing units (GPUs). With CUDA, developers are able to dramatically speed up computing applications by harnessing the power of GPUs.
- GitHub: https://github.com/topics/cuda
- Wikipedia: https://en.wikipedia.org/wiki/CUDA
- Created by: Nvidia
- Released: June 23, 2007
- Related Topics: nvcc,
- Last updated: 2026-06-22 00:07:17 UTC
- JSON Representation
https://github.com/pjueon/cuda_intellisense
A simple python script to fix cuda C++ intellisense for visual studio.
Last synced: 09 Apr 2026
https://github.com/mre/talks
...mostly Computer Science related.
computer-science cuda talks tech-talks
Last synced: 28 Apr 2026
https://github.com/saiccoumar/cuda-programming-exercises
Brief collection of GPU exercises (my reimplementation). Comes with relevant resources.
cuda cuda-programming nvcc nvidia
Last synced: 25 May 2026
https://github.com/xavierjiezou/gpu-compute-capability
An application for querying the computing power of each gpu released by NVIDIA.
Last synced: 28 Apr 2026
https://github.com/jblaschke/pynvtx
Thin pybind11 wrapper for NVTX wrappers -- with some bells and whistles attached.
Last synced: 21 Aug 2025
https://github.com/raumberg/hypervision
Neural Network based real-time aimbot system, operating on TensorRT with custom CUDA kernel and C FFI extensions
ai aim cuda cython neural-networks python tensorrt yolo
Last synced: 20 May 2026
https://github.com/alpha74/hungarianalgocuda
Hungarian Algorithm for Linear Assignment Problem implemented using CUDA.
cuda nvcc parallel-computing parallel-programming
Last synced: 01 Jun 2026
https://github.com/leocelente/basic_cuda
My CUDA source files while learning
Last synced: 29 Apr 2026
https://github.com/asadiahmad/gesture-detection
Real-time Gesture Detection using CUDA-accelerated OpenCV in Python.
computer-vision cuda gesture-recognition gpu-acceleration open-pose opencv opencv-cuda pose-detection real-time
Last synced: 29 Apr 2026
https://github.com/nofaralfasi/parallel-sequence-alignment
A parallelized version of multiple DNA sequence alignment algorithm with MPI, OpenMP and CUDA
cuda mpi openmp sequence-alignment
Last synced: 29 Apr 2026
https://github.com/mortafix/quickshift
A working implementation of Quickshift algorithm in CUDA, GPU-compatible.
Last synced: 08 May 2026
https://github.com/emilienmendes/gpgpu
Parallélisation et optimisation de reconnaissance de point dans une image
cuda gpgpu parallel-programming
Last synced: 28 Oct 2025
https://github.com/mark0011astra/simplecuda
CUDAを使用したGPU演算をNumPyと同様のインターフェースで簡単行えるライブラリ。A library that allows users to easily perform GPU operations using CUDA with a NumPy-like interface.
cuda cupy gpu machine-learning numpy python vector
Last synced: 02 May 2026
https://github.com/xza85hrf/ml-framework_checker
ML Framework and CUDA Checker is a Python-based GUI application for checking PyTorch, TensorFlow, and CUDA installations. It provides detailed system specs, compatibility checks, advanced GPU management, and offers options to view instructions, export logs, and update machine learning frameworks.
compatibility cuda gpu-management gui-application machine-learning python pytorch system-checker system-specs tensorflow
Last synced: 28 Apr 2026
https://github.com/jonathanraiman/mini_cuda_rtc
Miniature CUDA Array library with Runtime Compilation
cpp11 cuda jit runtime-compilation
Last synced: 14 Apr 2026
https://github.com/liberxue/parallel_computing
CUDA Algorithm && Hacker's Delight
algorithms cuda cuda-kernels cuda-programming hacker-s-delight nvidia
Last synced: 24 Feb 2026
https://github.com/enp1s0/curand_fp16
FP16 pseudo random number generator on GPU
cuda gpu half-precision random-number-generators
Last synced: 20 Aug 2025
https://github.com/hdelan/msc-hpc-final-project
In this project I implement a CUDA Lanczos method to approximate the matrix exponential. The matrix exponential is an important centrality measure for large, sparse graphs.
cuda graph-algorithms krylov-methods
Last synced: 12 Apr 2025
https://github.com/ismailtekin05/caloriedetectingai
🍎🔍 Smart AI system that identifies food items in photos and calculates their calorie content automatically. Built with TensorFlow, YOLOv8, CUDA and computer vision for accurate nutrition tracking.
ai aimodel calorie-calculator computer-vision cuda data-analysis data-science data-segmentation data-visualization dataset dataset-generation image-processing image-recognition python segmentation-models tensorflow ultralytics yaml yolo yolov8
Last synced: 29 Apr 2026
https://github.com/tommaso-dognini/polimi_gpu101_courseproject
Polimi Passion In Action GPU101 course project. Implementation in CUDA of BFS algorithm
cpp cuda cuda-programming parallel-computing
Last synced: 10 Apr 2026
https://github.com/kartavyaantani/cuda_image_processing
A CUDA-accelerated image processing project featuring multiple GPU-based filters and enhancement techniques. Implements convolution, edge detection, Non-Local Means (NLM) denoising, K-Nearest Neighbors (KNN), and pixelization. Each operation is optimized using CUDA kernels for real-time performance on large images. The project supports command-line
cuda cuda-kernels cuda-programming cuda-toolkit gpu-programming high-performance-computing image-manipulation image-processing nvidia-cuda nvidia-gpu
Last synced: 30 Apr 2026
https://github.com/jtompuri/weighted-voronoi-stippling
High-performance weighted Voronoi stippling implementation. Exports PNG and TSP files. Visualizes TSP tours as continuous line drawings.
computer-graphics cuda gpu-acceleration lloyd-relaxation numba python stippling traveling-salesman tsp voronoi
Last synced: 18 May 2026
https://github.com/syncush/cifar-10-pytorch
cifar10 cuda deep-learning machine-learning python python3 pytroch
Last synced: 19 May 2026
https://github.com/gogolb/ee147
Intro to GPU Computing
c cuda cuda-kernels cuda-toolkit gpu-computing gpu-programming university-course
Last synced: 01 May 2026
https://github.com/eric900115/parallelprogramming
The repository contains the coursework for CS5422, NTHU's Parallel Programming Course.
Last synced: 26 May 2026
https://github.com/manishklach/thermal-observatory
A generic thermal observability framework for CPU, GPU, board, and platform telemetry across vendor APIs, kernel interfaces, and runtime correlation layers.
amd arm64 cuda linux nvidia nvml observability rocm telemetry thermal-framework thermal-monitoring x86-64
Last synced: 09 Jun 2026
https://github.com/dpetrosy/fractal
This project is a Fractal Visualizer developed in C++ with SFML and CUDA.
burning-ship cmake cmakelists cpp cpp-programming cpp-project cuda cuda-opengl cuda-programming fractal fractal-generation fractal-visualization julia mandelbox mandelbrot opengl opengl-project sfml sfml-library tricorn
Last synced: 21 Feb 2026
https://github.com/thisalmandula/gpu_accelerated_lpt_cfd_code
This repository contains GPU accelerated version of the particle tracking model developed by Merel Kooi for biofouled microplastic particles ( available at: https://pubs.acs.org/doi/10.1021/acs.est.6b04702) written in CUDA Fortran and CUDA Python. This repository is intended as a learning tool for GPU programming.
biofouling computational-fluid-dynamics cuda fortran lagrangian-particle-tracking microplastics python
Last synced: 02 May 2026
https://github.com/jessetg/cuda-practice
Working through the chapters of Cuda by Example
c cpp cuda cuda-by-example gpgpu
Last synced: 01 May 2026
https://github.com/nickolasrm/gpuvscpumatrixmultiplication
CPU and GPU optimized matrix multiplication (AVX, transposition, CUDA and other)
avx comparison cuda hpc matrix multiplication
Last synced: 06 Sep 2025
https://github.com/vishwamartur/btc_recovery
High-performance Bitcoin wallet password recovery system with GPU acceleration and integrated graphics support. Recover Bitcoin Core wallet.dat files without blockchain download using advanced algorithms and blockchain APIs.
bitcoin bitcoin-core blockchain blockchain-api cpp cryptocurrency cuda electrum gpu-acceleration integrated-graphics multithreading opencl password-recovery private-keys recovery-tools wallet-dat wallet-recovery
Last synced: 14 Apr 2026
https://github.com/antonioberna/nn-gpu-logic-gates
Neural Network implementation on GPU using CUDA C++ to learn logic gates operations
cpp cuda gpu logic-gates neural-networks nvidia
Last synced: 01 May 2026
https://github.com/gvvsnrnaveen/cuda
this repository contains the various programs that can written using CUDA Toolkit.
c cpp cuda nvcc nvidia-cuda nvidia-gpu
Last synced: 17 Jan 2026
https://github.com/alwaysai/jetpack-46-hacky-hour
NVIDIA’s Jetpack 4.6 capabilities and how to use them with EdgeIQ, alwaysAI Computer Vision framework.
alwaysai computer-vision cuda edge-computing jetpack tensorrt
Last synced: 01 May 2026
https://github.com/dhruvsrikanth/monte-carlo-ray-tracing
In this repository, you will find a serial and distributed GPU-based implementation of the ray tracing simulation.
c cpp cuda gpu-computing gpu-programming high-performance-computing parallel-programming raytracing unified-memory-parallelism
Last synced: 01 May 2026
https://github.com/pratikvn/nla4hpc-exercises-framework
The exercises framework for the Numerical Linear Algebra for HPC course at Karlsruhe Institute of Technology.
cuda ginkgo homeworks hpc-course teaching
Last synced: 19 May 2026
https://github.com/dansolombrino/gphungarian
A GPU-accelerated implementation of the Hungarian Algorithm, written in CUDA
Last synced: 31 Aug 2025
https://github.com/arakiss/hecate-os
Linux distro with automatic hardware detection and per-system optimization. Ubuntu 24.04 base. Alpha.
ai cuda docker gpu hardware-optimization kernel-tuning linux linux-distribution machine-learning nvidia operating-system performance sysctl ubuntu workstation zram
Last synced: 16 Feb 2026
https://github.com/ruturaj4/cuda_nvidia_tutorial
cuda projects
cuda cuda-vector-addition nvidia nvidia-cuda parallel
Last synced: 26 Oct 2025
https://github.com/dvhh/masscorrelation
An exercise in writing an efficient correlation calculator
calculations correlation-calculation cuda matrix multi-threading openmp
Last synced: 15 May 2026
https://github.com/anne-andresen/multi-modal-cuda-c-gan
Raw C/cuda implementation of 3d GAN
3d 3d-models attention-mechanism c cross-attention cross-attention-c cuda gan gan-models low-level-programming medical-imaging multimodal-deep-learning pytorch transformer-pytorch transformers transformers-c
Last synced: 06 Jan 2026
https://github.com/rnabla/cuda-des
Bruteforcing DES using CUDA
bruteforce cuda data des encryption gpu parallel standard
Last synced: 27 Oct 2025
https://github.com/a-nau/python-cuda-envs
Script to automatically map a specific CUDA version to a Conda Python environment.
anaconda anaconda-environment cuda installation installation-script python python-environment python3
Last synced: 18 Apr 2026
https://github.com/liuyuweitarek/pytorch-docker-builder
Automate PyTorch Docker image builds with compatible Python, CUDA, and Poetry versions, including CI/CD for testing.
cicd containerd cuda docker docker-image poetry-python python python3 pytorch pytorch-docker
Last synced: 06 Feb 2026
https://github.com/miniex/maidenx
Rust-based CUDA library designed for learning purposes and building my AI engines named Maiden Engine
Last synced: 20 Mar 2025
https://github.com/daelsepara/hipslm
CPU and GPU (using HIP) implementations of phase pattern generators for use with spatial light modulators
computer-generated-holography cuda gpu hip hologram holography phase phase-pattern slm spatial-light-modulator
Last synced: 09 Sep 2025
https://github.com/dafadey/GPGPU_OpenCL_vs_CUDA
This is a repository with sample codes for testing memory bandwidth, arithmetic latency hiding and shared/local memory performance on AMD and nVidia devices
cuda gpgpu gpgpu-computing opencl
Last synced: 16 May 2025
https://github.com/michaelfranzl/image_debian-gpgpu
Dockerfile for a Debian base image with AMD and Nvidia GPGPU support
amd container container-image cuda debian docker gpgpu nvidia opencl
Last synced: 10 May 2026
https://github.com/daelsepara/hipmandelbrot
GPU Implementation of Mandelbrot Fractal Generator with Benchmarking
amd cuda fractal gpu gpu-compute gpu-computing hip mandelbrot parallel-computing rocm sdk
Last synced: 20 Feb 2026
https://github.com/andih/cuda-fortran-stream
Variant of STREAM Benchmark in CUDA Fortran
cuda cuda-fortran gpu stream-benchmarks variants
Last synced: 02 Mar 2025
https://github.com/alekseyscorpi/vacancies_server
This is a server for vacancies generation using LLM (Saiga3)
code cuda cuda-toolkit docker dockerfile flask llama3 llamacpp llm ngrok pydantic saiga
Last synced: 06 Feb 2026
https://github.com/kichappa/spy-sim
Simulate a spying strategy on a topography
combat-modeling cuda differential-equations julia modeling-and-simulation topography-simulation
Last synced: 09 Mar 2026
https://github.com/jorgedavyd/nsight.nvim
A developer oriented Neovim framework for CUDA performance profiling and analysis.
cuda cuda-kernels cuda-profiler cuda-programming cuda-support cuda-toolkit deep-learning machine-learning neovim neovim-plugin performance-engineering
Last synced: 13 Apr 2026
https://github.com/renatomaynard/a-multiple-population-coarse-grained-genetic-algorithm-to-solve-the-quadratic-assignment-problem-
A Multiple-population coarse-grained Genetic Algorithm to solve the Quadratic Assignment Problem
c cuda genetic-algorithm quadratic-assignment-problem
Last synced: 09 May 2026
https://github.com/lhldev/rust-neural-network
neural network implementation in rust
cuda feedforward-neural-network
Last synced: 16 May 2026
https://github.com/enriquebdel/clases-cuda-programacion-paralela-en-c-
En este repositorio encontrarás varias lecciones creadas por mí sobre la librería CUDA en C. El programa que utilizo para programar es MobaXterm.
c cuda cuda-programming gnu-linux googlecolab mobaxterm nvidia parallel-programming ubuntu university
Last synced: 19 May 2026
https://github.com/SanaeProject/Matrix-for-Cpp
This repository has types that handle matrices.
cpp14 cpp14-library cuda matrix-library
Last synced: 15 May 2025
https://github.com/lcsb-biocore/cufluxsampler.jl
GPU-accelerated algorithms for flux sampling in CUDA.jl
cobra cuda gpu julia metabolic-network metabolism sampling
Last synced: 02 May 2026
https://github.com/deepankaracharyya/6th_sem_assignments
c cuda data-mining postgresql-database python
Last synced: 02 May 2026
https://github.com/bl33h/pythagoreantheorem
A program that calculates the Pythagorean theorem for a large number of elements using GPU parallel processing.
arrays cuda kernel parallel-programming pythagoras pythagorean-theorem
Last synced: 19 May 2026
https://github.com/microo8/micronn
Simple neural network library with backpropagation using CUDA
Last synced: 19 May 2026
https://github.com/lightshade12/kittlespt
A hobby CUDA pathtracing renderer.
3d-graphics computer-graphics cuda gpu path-tracing ray-tracing
Last synced: 18 Mar 2025
https://github.com/blazekill/hello-cuda
Cpp + Vcpkg + CUDA + VsCode starter project.
Last synced: 18 May 2026
https://github.com/pintamonas4575/tfg-classification-model-customdataset
Modelo de clasificación en Tensorflow y Keras sobre un Dataset propio.
cnn cnn-classification cuda deep-learning efficientnet gpu image-classification keras tensorflow transfer-learning
Last synced: 02 May 2026
https://github.com/graiphic/graiphic-documentation
Graiphic Toolkits for LabVIEW provide advanced AI, GPU, and graph-oriented computing capabilities directly inside LabVIEW. Built on ONNX Runtime, they enable seamless integration of SOTA, Accelerator, and Deep Learning Toolkit for high-performance execution across CPUs, GPUs, and edge devices.
accelerator-toolkit ai-orchestration computer-vision cuda deep-learning directml edge-ai graph-computing hardware-acceleration high-performance-computing inference labview neural-networks onednn onnx onnxruntime openvino sota tensorrt training
Last synced: 22 Nov 2025
https://github.com/shahed-chy-suzan/psd-to-html--cuda
Cuda is a single page creative portfolio psd to html template which is built with HTML5 & CSS3. The site can be customized easily to suit your needs.
Last synced: 18 Jan 2026
https://github.com/villekf/helmet
High-dimensional Kalman filter toolbox (HELMET)
arrayfire cuda gpgpu kalman-filter kalman-smoother matlab octave opencl reconstruction scientific-computing state-estimation
Last synced: 01 May 2026
https://github.com/m-torhan/cuda-stl-renderer
CUDA C++ implementation of STL file renderer using ray tracing method
Last synced: 25 Feb 2026
https://github.com/ophoperhpo/dcgan-lentach-logo-generator
The Lentach logo generator. #MachineLearningFun
cuda dcgan dcgan-tensorflow keras lentach machinelearning ml
Last synced: 23 Feb 2025
https://github.com/matx64/rs-netbot
Old School Runescape bot with CNN for object identification
Last synced: 04 May 2026
https://github.com/piyush26c/cuda-programming
c cuda ipynb-jupyter-notebook mathematics sppu-computer-engineering
Last synced: 03 Mar 2026
https://github.com/greg-tarr/fastsimplex
CUDA/MPS accelerated 2D & 3D simplex noise generation.
cuda mps noise-generator python simplex-noise
Last synced: 20 Apr 2026
https://github.com/xihuai18/image-processing-in-cuda
Implementation of Image Processing Method
Last synced: 04 Oct 2025
https://github.com/yaronkoresh/definers
A comprehensive Python toolkit for AI, data processing, media manipulation, and system utilities.
artificial-intelligence cuda data-science deep-learning diffusers feature-extraction generative-ai gpu gradio image-generation machine-learning multimedia music-generation python-library pytorch scikit-learn toolkit transformers video-generation web-scraping
Last synced: 08 Apr 2026
https://github.com/unvercan/ssd300-model-pytorch
SSD300 Model using PyTorch
cnn computer-vision convolutional-neural-networks cuda deep-learning image-processing neural-network object-detection opencv python pytorch single-shot-detection ssd ssd300
Last synced: 17 Mar 2025
https://github.com/croko22/vit-cpp
An implementation of the Transformer model architecture ("Attention Is All You Need") in pure C++17 from scratch
cpp cuda deep-learning machine-learning neural-network transformer
Last synced: 17 Jan 2026
https://github.com/sartajbhuvaji/cuda
Deloped CUDA kernel functions to load and train a Convolution Neural Network from scratch.
cuda cuda-programming gpu-programming neural-network nvidia-cuda
Last synced: 30 Mar 2025
https://github.com/amirbroker/cudadtw
Use CUDA with numba for Dynamic Time Warping
cuda dtw dynamic-time-warping gpu numba
Last synced: 16 Apr 2026
https://github.com/iag-geo/image-classification
Image classification scripts using YOLOv5 with aerial imagery
cuda image-classification python pytorch swimming-pools yolov5
Last synced: 22 Feb 2026
https://github.com/matteogianferrari/qr-decomposition
Tthis project implements different methods to exploit caches usage, the multicore CPU and the GPU architectures, on the Gram-Schmidt QR Decomposition algorithm and measure the performance of the different implementations.
cuda openmp parallel-computing
Last synced: 12 Apr 2026
https://github.com/hatamiarash7/cuda-python
GPU programming using CUDA & Python
cuda gpu gpu-computing gpu-programming python
Last synced: 29 Apr 2026
https://github.com/viktor-shcherb/triage
Script running tool for optimizing GPU memory utilization
automation cli cuda deep-learning devops-tools experiment-runner gpu-monitoring gpu-scheduler hyperparameter-sweep job-queue machine-learning nvidia-smi pypi-package python resource-management script-runner
Last synced: 12 Feb 2026
https://github.com/meirbek-dev/face-mask_detector
Обнаружие маски на лице в реальном времени
artificial-intelligence covid-19 cuda cudnn deep-learning face-mask graduation-project jupyter-notebook keras machine-learning mask-detection mobilnet-v2 object-detection object-recognition object-tracking opencv4-python python real-time supervised-learning tensorflow2-gpu
Last synced: 03 May 2026
https://github.com/subatomicplanets/simplebitcoinminer
A simple Bitcoin C++ and CUDA solo miner
bitcoin cpp cryptocurrency cuda miner
Last synced: 19 Apr 2026
https://github.com/aliyoussef97/triton-hub
A container of various PyTorch neural network modules written in Triton.
cuda deep-learning openai pytorch triton triton-lang
Last synced: 30 Mar 2025