CUDA
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general-purpose computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
- GitHub: https://github.com/topics/cuda
- Wikipedia: https://en.wikipedia.org/wiki/CUDA
- Created by: NVIDIA
- Released: June 23, 2007
- Related Topics: nvcc
- Last updated: 2026-06-30 00:07:24 UTC
https://github.com/flosmume/cpp-cuda-deepvision-rtx-starter
CUDA C++ practice project for RTX 4070 SUPER — explore GPU concurrency, pinned memory, and Nsight profiling. Includes SAXPY and 2D blur kernels to train optimization, stream overlap, and timing analysis for NVIDIA Developer Technology Engineering skillset.
cpp cuda cuda-kernels cuda-streams deep-learning-inference gpu gpu-optimization gpu-profiling high-performance-computing nsight nvidia parrallel-computing pinned-memory
Last synced: 16 May 2026
https://github.com/ahmadrafidev/learn-cuda
A place where I learn about CUDA
cuda cuda-programming gpu os parallel-programming
Last synced: 13 Apr 2025
https://github.com/naidezhujimo/cuda-learning-just-record-the-learning-process-
A record of my learning process, with notes. Everyone is welcome to learn from it.
Last synced: 26 Mar 2025
https://github.com/azdavis/parallel-portrait-mode
Parallel Portrait Mode
cuda image-processing ispc openmp
Last synced: 13 Apr 2026
https://github.com/sevilze/folderesque
Python Script to process and upscale images in specified folders using RRDB models.
Last synced: 02 Mar 2026
https://github.com/kenwuqianghao/c4ai-cuda-birds
Homework assignments for C4AI Beginners in Research-Driven Studies
Last synced: 18 Apr 2026
https://github.com/TheodoreAI/monte-carlo-simulator
A CUDA application for Monte Carlo simulation, which is used to determine the range of outcomes for a series of parameters, each of which has a probability distribution showing how likely each option is to happen.
cuda gpu-computing monte-carlo-simulation parallel-computing
Last synced: 06 Oct 2025
https://github.com/hshshshshsh12e/gpumkat
Gpumkat is a shader debugger for Metal, designed to do what Instruments can't do.
alternative api control cuda darwin debugger debugging gpumkat macos management profiler release shaders threads
Last synced: 14 Apr 2026
https://github.com/aeyage/intraday_prices
GPU-accelerated portfolio optimisation
Last synced: 05 Apr 2025
https://github.com/mmz33/practice-cuda
c cpp cuda cuda-programming gpu-programming parallel-programming
Last synced: 14 Apr 2026
https://github.com/kanttouchthis/cuda_schem
Script for voxelizing 3D models into Minecraft .schem schematics with texture support, powered by Numba CUDA.
cuda minecraft numba voxelization
Last synced: 07 Oct 2025
https://github.com/dreoporto/tensorflow-gpu-docker
An example project to run TensorFlow with CUDA-enabled GPU acceleration using Windows, Docker and WSL2.
artificial-intelligence cuda deep-learning docker docker-compose jupyter machine-learning nvidia-docker python windows wsl2
Last synced: 27 Jan 2026
https://github.com/jadc/cuda-raytracer
A simple path tracer written in CUDA.
cpp cuda gpu-programming graphics parallel-programming path-tracing raytracing
Last synced: 16 May 2026
https://github.com/toshikinakamura0412/dockerfiles
Development environment using Docker for some Linux distributions
alpine bash cuda debian devcontainer devcontainers docker docker-compose fedora opencv opensuse ros ros-humble ros-noetic ros2 ubuntu ubuntu2004 ubuntu2204 vscode zsh
Last synced: 10 Jul 2025
https://github.com/ne0nwinds/gpupuzzles
My solutions to srush/GPU-Puzzles using CUDA
Last synced: 16 May 2026
https://github.com/thanduriel/cuda_hip_comparison
performance study of atomics on GPUs
Last synced: 09 Oct 2025
https://github.com/enesdoruk/opencv-cpp
OpenCV C++ tutorials
computer-vision cpp cuda opencv
Last synced: 09 Oct 2025
https://github.com/mradovic38/pycuda-simulated-annealing
Simulated annealing process for finding the 'minimum energy' of an image.
cuda image-energy parallel-computing parallel-programming pycuda python simulated-annealing
Last synced: 09 Oct 2025
https://github.com/skyguy126/cuda-learnings
Collection of personal CUDA learnings.
Last synced: 10 Oct 2025
https://github.com/bhavinpatel4199/image-processing-with-opencv-and-cuda-on-google-colab
This repository demonstrates image processing using OpenCV with CUDA for GPU acceleration on Google Colab. It includes basics like displaying and manipulating images, alongside advanced techniques using CUDA to enhance performance. Ideal for learning GPU-accelerated image processing in Python.
computer-vision cuda google-colab gpu-acceleration high-performance-computing image-processing opencv pixel-manupulation
Last synced: 19 Jan 2026
https://github.com/adesoji1/visis_backend_assessment_submission-adesoji
Create a backend API to handle book information requests, and summary generation.
bart cache cuda data-extraction fastapi flask hugging-face hugging-face-hub llama postman-api python3 pytorch spacy sqlite3-database swagger-api tensorboard-visualizations transformer ubuntu2304
Last synced: 19 Jan 2026
https://github.com/zcemycl/distributecompute
Parallel and distributed computing with C++ threads, Python (threads, asyncio, and multiprocessing), Spark, and CUDA.
asyncio boost cpp cuda global-interpreter-lock jthread multiprocessing python spark thread
Last synced: 14 Apr 2026
https://github.com/ericrihm/yt-whisper
Fast, local YouTube transcription with speaker diarization and a keyboard-first Textual TUI. YouTube-subs fast path, faster-whisper on CUDA, opt-in pyannote diarization, prompt profile auto-detection.
cli cuda faster-whisper pyannote python speaker-diarization textual transcription tui whisper youtube
Last synced: 31 May 2026
https://github.com/bd2720/accesspatterns
Comparing chunked vs. striped memory access patterns for CPU and GPU code using the CUDA toolkit in C.
c cache cuda cuda-toolkit performance-analysis performance-testing profiling
Last synced: 16 May 2026
https://github.com/isaurabhmeshram28/cuda-examples
This repository contains examples and experiments with CUDA programming to explore GPU computing and parallel processing using NVIDIA's CUDA framework.
Last synced: 19 May 2026
https://github.com/yash-1335/qwen600
🚀 Build a fast inference engine for the QWEN3-0.6B model using CUDA, optimizing performance with minimal dependencies for efficient learning and practice.
cuda cuda-programming gpu llamacpp llm llm-inference qwen qwen3 transformer
Last synced: 16 May 2026
https://github.com/boostibot/bachelors
My bachelor's thesis at CTU in Prague, Faculty of Nuclear Sciences and Physical Engineering, supervised by Ing. Pavel Strachota, Ph.D.
crystal-growth cuda finite-volume-method parallel-programming phase-field-method
Last synced: 26 Oct 2025
https://github.com/centuriontheman/parallelsortingalgorithms
C++/CUDA application for benchmarking sorting algorithms
benchamark cpp cuda multithreading sorting-algorithms
Last synced: 18 Feb 2026
https://github.com/fabricsoul/pytorch-jupyter-cuda
A Docker image that can run PyTorch and JupyterLab
ai artificial-intelligence cuda deep-learning deep-neural-networks docker docker-container docker-image gpu-computing jupyterlab machine-learning machinelearning nvidia nvidia-cuda nvidia-docker nvidia-smi pytorch
Last synced: 13 Mar 2026
https://github.com/lanceberge/cuda-newton-fractals
Parallelize and visualize the Newton Iteration
cpp cuda mathematical-modelling visualization
Last synced: 16 May 2026
https://github.com/voschezang/holographic-projector-simulations
Optimizations of Simulations of Holographic Projectors using CUDA
cuda gpu holography parallel-computing photonics
Last synced: 16 May 2026
https://github.com/djenriquez/ccminer
Dockerized ccminer
cuda docker ethereum mining nvidia nvidia-docker
Last synced: 05 May 2026
https://github.com/anthongretter/spmv-cuda-analysis
An analysis of different approaches to Sparse Matrix-Vector Multiplication (SpMV) on GPUs using CUDA
cuda gpu matrix-computations spmv unitn
Last synced: 14 Oct 2025
https://github.com/riciokzz/computer-vision
Computer Vision project
cuda data-cleaning data-engineering data-science exploratory-data-analysis machine-learning neural-network
Last synced: 20 May 2026
https://github.com/shermanlo77/oxwasp_phd
Code for the PhD thesis. The topic was on defect detection of 3D printing using x-rays. The repository includes an implementation of the mode filter and empirical null filter.
3d-printing applied-statistics computational-statistics cuda empirical-null imagej mode-filter statistics xray-projection
Last synced: 27 Mar 2025
https://github.com/bergolho/sycl
Repository with simple programs to learn SYCL.
Last synced: 16 May 2026
https://github.com/sjmonson/tdr-inverse
A set of CUDA programs that invert matrices
cuda gpu matrix-inverse matrix-inversion tdr
Last synced: 14 Oct 2025
https://github.com/fanziyang-v/parallel-computing
Parallel Computing course materials from Harbin Institute of Technology (Shenzhen).
cuda openmp openmpi parallel-computing
Last synced: 27 Mar 2025
https://github.com/tzervas/unsloth-rs
Memory-optimized GPU kernels for LLM fine-tuning in Rust (2-5x speedup, 70-80% less VRAM)
cuda gpu machine-learning optimization rust
Last synced: 25 Jan 2026
https://github.com/obitech/tuc-ki-gpu-docker
cuda docker machine-learning nvidia-docker nvidia-gpu tensorflow tuc
Last synced: 14 Apr 2026
https://github.com/lu1smgb/ppr
Parallel Programming course. Academic year 2024/2025. Universidad de Granada
Last synced: 15 Oct 2025
https://github.com/illagrenan/cuda-80-cudnn6-runtime-1604-py36
Ubuntu 16.04 with Python 3.6 and CUDA Dockerfile
Last synced: 22 Jun 2025
https://github.com/danieljvickers/fluid_simulation
An educational example for learning the Navier-Stokes equations. Also included is a C++ and CUDA shared object library, buildable with CMake, for use in your personal projects.
cpp cuda differential-equations navier-stokes numpy physics python simulation
Last synced: 04 May 2026
https://github.com/shineiarakawa/particle-stabilizer
A C++ and CUDA-based program for simulating the motion of particles.
Last synced: 12 May 2026
https://github.com/masterskepticista/parallel_reductions_cuda
Iteratively optimizing parallel reductions in CUDA.
Last synced: 16 Oct 2025
https://github.com/oaslananka/cv_cuda_cpp_sample
This is a sample project demonstrating how to use OpenCV and CUDA in C++ for detecting people in drone footage with YOLO. The project aims to be simple and understandable for those who want to learn how to use OpenCV and CUDA in C++.
computervision cpp cuda opencv
Last synced: 01 May 2026
https://github.com/sergiomarquezdev/yt-transcriber
🛠️ CLI tool to transcribe YouTube videos using OpenAI Whisper with CUDA acceleration, generate AI summaries (EN/ES) with Gemini, and create LinkedIn/Twitter content. Supports YouTube, Google Drive, and local files.
ai cli cuda gemini python transcription whisper youtube
Last synced: 15 May 2026
https://github.com/puzzlef/vector-sum-cuda
Comparing performance of sequential vs CUDA-based vector element sum.
cuda element experiment gpu sum vector
Last synced: 14 Apr 2026
https://github.com/tornikeo/minimal-vscode-cuda-meson
Minimal sample of using VSCode and Meson to build CUDA applications
Last synced: 08 Sep 2025
https://github.com/sakurabtc888/1000_btc_bitcoin_challenge
🔥 A CPU+GPU collision tool targeting the 160 bitcoins of [privatekeys.pw]
Last synced: 21 Oct 2025
https://github.com/lablup/backend.ai-accelerator-cuda
The Backend.AI CUDA Accelerator Plugin
Last synced: 16 May 2026
https://github.com/paranoia55/env-setup
🚀 Set up a complete, production-ready JavaScript/TypeScript development environment on macOS with AI tools using a single command.
cuda deep-learning ethical-hacking-tools ios kali-linux kali-linux-tools make makefile next-14 next-appdir next-starter open-source-project prettier radix-ui shell tensorflow-gpu typescript ubuntu
Last synced: 29 Apr 2026
https://github.com/chibby0ne/cuda_by_example
Old notes (and new ones) on the book CUDA by Example
cuda cuda-programming gpgpu gpu-computing gpu-programming
Last synced: 15 Mar 2026
https://github.com/elcruzo/cuda-conv
Lightweight CUDA kernel for 2D image convolution achieving 20x+ speedup. Built with CuPy for the NVIDIA Hackathon.
computer-vision convolution cuda cupy gpu-computing hackathon high-performance-computing image-processing nvidia python
Last synced: 15 May 2026
https://github.com/dhakalnirajan/baghchal-rl
C/CUDA implementation of Baagh Chaal Game with Neural Network
bagh-chal baghchal c clang cuda cuda-kernels neural-network reinforcement-learning
Last synced: 14 Apr 2026
https://github.com/islam-hady9/deep-cuda
Image Classification with CNN in CUDA C++
artificial-intelligence cnn cpp cuda deep-learning gpu-programming image-classification machine-learning mnist-dataset neural-networks-from-scratch parallel-computing
Last synced: 02 May 2026
https://github.com/lehoangan2906/cuda_basics
A simple implementation of operations on vectors and matrices, optimized for running on NVIDIA GPUs with CUDA
Last synced: 16 Jun 2025
https://github.com/gaidardzhiev/haliotis
cuda emulation gpu-computing runtime scheduler
Last synced: 11 Jun 2026
https://github.com/soloangema/nvidia-8z6ms
🐱 Generate randomized README files to enhance DX farming, powered by NVIDIA technology for an engaging development experience.
artificial-intelligence computer-vision cuda data-science deep-learning gpu image-processing machine-learning neural-networks nvidia parallel-computing performance-optimization pytorch tensor video-processing
Last synced: 02 May 2026
https://github.com/chad24dev/gpu-agent-opt
🧠 Optimize GPU workflows with `gpu-agent-opt`, a Python package for profiling, scientific computing, and efficient CUDA exploration.
ai-agents autotuning cuda edge-ai geospatial gpu hpc nvidia optimization performance pytorch
Last synced: 07 May 2026
https://github.com/kichappa/videosift
CUDA based 3D Computer Vision for Exoskins
computer-vision convolution-filter cuda hpc julia sift-algorithm
Last synced: 15 May 2026
https://github.com/deltatecs/voses
Volatile Secret Searcher - massively parallel, brute force memory dump analysis for (D)TLS secret extraction
cuda memory-hacking reverse-engineering tls
Last synced: 15 Jun 2025
https://github.com/viktor-akusoff/chernabogpy
ChernabogPy is a Python package for visualizing gravitational distortions caused by black holes using nonlinear ray tracing.
cuda gpu physics-simulation python3 relativity-of-space-and-time torch
Last synced: 15 May 2026
https://github.com/thephiltacular/voice-ai-pipeline
A containerized AI pipeline for real-time speech-to-text and text-to-speech conversion, leveraging Whisper ASR and Coqui TTS models with Kubernetes orchestration. Features a Gradio web interface, GPU acceleration, and local processing for privacy-focused voice applications. 🚀🎤📝
ai asr containerization coqui-tts cuda docker fastapi gpu gradio kubernetes machine-learning nvidia orchestration python real-time-processing speech-recognition text-to-speech tts voice-assistant whisper
Last synced: 15 Apr 2026
https://github.com/sedflix/cuda_pattern_matching
Getting words frequency using the concepts of pattern matching in CUDA
Last synced: 17 Mar 2026
https://github.com/matiasvlevi/cuno
Provides CUDA bindings, kernel maps, and device memory management for Dannjs computations. [Experimental and not complete]
addon cuda dann dannjs machine-learning nodejs
Last synced: 15 Apr 2026
https://github.com/mohamedsamirx/yolov12-tensorrt-cpp
YOLOv12 inference using C++, TensorRT, and CUDA
cpp cuda tensorrt tensorrt-inference yolo yolov12
Last synced: 15 Apr 2026
https://github.com/manu-sh/cuda-mandelbrot
How to use CUDA acceleration to compute the Mandelbrot set
Last synced: 15 Apr 2026
https://github.com/ahmed5827/image_generation
This application provides a graphical user interface (GUI) for generating images using the Stable Diffusion model. The GUI allows users to input a text prompt, and the application generates an image based on the prompt.
ai cuda generative-ai image-generation
Last synced: 15 Apr 2026
https://github.com/lyynn777/cuda-bitonic-sort
Simple CUDA project to implement Bitonic Sort and compare it with normal CPU sorting.
bitonic-sort cuda gpu-computing gpu-vs-cpu parallel-computing performance-testing pycuda python
Last synced: 15 Apr 2026
https://github.com/tkemmer/cunessie.jl
CUDA-accelerated Nonlocal Electrostatics in Structured Solvents
bioinformatics boundary-element-method cuda electrostatics gpu-computing julia proteins
Last synced: 31 Jan 2026
https://github.com/snandasena/courseera_gpu_specilization
Example of CUDA streaming
Last synced: 15 Apr 2026
https://github.com/starlitdreams/pacman-convolutional-q-learning
This project implements a Deep Q-Network (DQN) using PyTorch to train an agent to play Atari's Ms. Pac-Man. It utilizes reinforcement learning with a convolutional neural network (CNN) for image processing. Features include experience replay, frame preprocessing, and CUDA support, with trained model saving and video rendering of gameplay.
artificial-intelligence artificial-neural-networks atari cuda deep-learning deep-learning-algorithms deep-q-learning deeplearning gymnasium gymnasium-environment python pytorch
Last synced: 15 Apr 2026
https://github.com/cscfi/csc-env-julia
Julia language environment including MPI.jl, CUDA.jl and AMDGPU.jl preferences for HPC clusters at CSC.
amdgpu ansible cuda hpc julia julia-language mpi
Last synced: 01 Feb 2026
https://github.com/zury7/parallel-programming
A collection of performance optimizations and comparisons between multiprocessing and multithreading using pthreads, OpenMP, and CUDA. The experiments analyze execution speed, resource usage, and parallelization efficiency across different computational models. (CS 4553: Scientific Computing)
Last synced: 08 May 2026
https://github.com/teambipartite/bipartite-gemm
High-throughput data-parallel GEMM implementations in CUDA using CUDA cores and Tensor Cores
Last synced: 17 Apr 2026
https://github.com/m-torhan/cuda-fractals
CUDA C++ implementation of fractal visualization
Last synced: 25 Feb 2026
https://github.com/xza85hrf/flag_prediction_project
This application predicts the name of a country (or countries) based on an input flag image. It uses advanced image processing techniques and deep learning models built with PyTorch to classify flags accurately.
cross-validation cuda data-augmentation docker efficientnetb0 flag-recognition image-classification machine-learning mixed-precision-training mobilenetv2 python pytorch resnet resnet-50 transfer-learning
Last synced: 15 Apr 2026
https://github.com/grizzz13/minimal-cuda
Minimal configuration for setting up CUDA C++ with CMake.
Last synced: 18 Apr 2026
https://github.com/fieldcure/fieldcure-whisper-runtimes
Pre-built Whisper.net native runtime binaries (CPU/CUDA/Vulkan) for the FieldCure software ecosystem.
cuda dotnet native-binaries nuget redistributable vulkan whisper whisper-net
Last synced: 01 Jun 2026
https://github.com/baremetalrt/baremetalrt
BareMetalRT — edge GPU compute mesh
cuda distributed-computing gpu inference llm nvidia tensorrt windows
Last synced: 18 Apr 2026
https://github.com/ironjr/minimal-cuda-pytorch
Repository-level snippet for minimal implementation of a PyTorch CUDA extension.
Last synced: 04 May 2026
https://github.com/dstrigl/cnnplus
Master thesis 2010: Fast Convolutional Neural Network Training and Classification on CUDA GPUs
cnn convolutional-neural-networks cpp cuda gpu neural-networks speedup thesis
Last synced: 30 Jun 2026
https://github.com/AMYPAD/miutil
Basic functionality needed for AMYPAD
cuda matlab medical-imaging python
Last synced: 10 Apr 2025