CUDA
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general-purpose computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
- GitHub: https://github.com/topics/cuda
- Wikipedia: https://en.wikipedia.org/wiki/CUDA
- Created by: Nvidia
- Released: June 23, 2007
- Related Topics: nvcc
- Last updated: 2026-06-23 00:07:15 UTC
https://github.com/elcruzo/cuda-conv
Lightweight CUDA kernel for 2D image convolution achieving 20x+ speedup. Built with CuPy for the NVIDIA Hackathon.
computer-vision convolution cuda cupy gpu-computing hackathon high-performance-computing image-processing nvidia python
Last synced: 15 May 2026
https://github.com/naidezhujimo/cuda-learning-just-record-the-learning-process-
Just a record of my learning process. There are notes; you are welcome to learn from them.
Last synced: 26 Mar 2025
https://github.com/nxoti1/points-reader-ocr
🖥️ Extract text from images easily with POINTS-Reader OCR, a high-accuracy application for seamless document conversion and processing.
cuda gradio huggingface-transformers ocr open-source points-reader reportlab spaces tencent vision-language-model vlm
Last synced: 20 May 2026
https://github.com/azdavis/parallel-portrait-mode
Parallel Portrait Mode
cuda image-processing ispc openmp
Last synced: 13 Apr 2026
https://github.com/lehoangan2906/cuda_basics
A simple implementation of operations on vectors and matrices, optimized for running on Nvidia GPU with CUDA
Last synced: 16 Jun 2025
https://github.com/sevilze/folderesque
Python Script to process and upscale images in specified folders using RRDB models.
Last synced: 02 Mar 2026
https://github.com/kenwuqianghao/c4ai-cuda-birds
Homework assignments for C4AI Beginners in Research-Driven Studies
Last synced: 18 Apr 2026
https://github.com/TheodoreAI/monte-carlo-simulator
CUDA application for Monte Carlo simulation, which determines the range of outcomes for a series of parameters, each of which has a probability distribution showing how likely each option is to occur.
cuda gpu-computing monte-carlo-simulation parallel-computing
Last synced: 06 Oct 2025
https://github.com/thesupercd/cuda_sort
A simple project implementing and measuring the runtime performance metrics related to massively parallel algorithms (radix sort) on an NVIDIA GPU device.
benchmarking c cpp cuda cuda-programming gpu-acceleration gpu-programming multithreading parallel-processing radix-sort sorting-algorithms
Last synced: 10 May 2026
https://github.com/hshshshshsh12e/gpumkat
Gpumkat is a shader debugger for Metal, designed to do what Instruments can't do.
alternative api control cuda darwin debugger debugging gpumkat macos management profiler release shaders threads
Last synced: 14 Apr 2026
https://github.com/deltatecs/voses
Volatile Secret Searcher - massively parallel, brute force memory dump analysis for (D)TLS secret extraction
cuda memory-hacking reverse-engineering tls
Last synced: 15 Jun 2025
https://github.com/mmz33/practice-cuda
c cpp cuda cuda-programming gpu-programming parallel-programming
Last synced: 14 Apr 2026
https://github.com/kanttouchthis/cuda_schem
Script for voxelizing 3D models into Minecraft .schem schematics with texture support, powered by Numba CUDA.
cuda minecraft numba voxelization
Last synced: 07 Oct 2025
https://github.com/grindelfp/cuda-texture-memory
Exercise on using texture memory in CUDA.
Last synced: 30 Mar 2025
https://github.com/andreasholt/cuda-matmul-benchmarking
Implementing and benchmarking various matmul implementations in CUDA
Last synced: 01 Nov 2025
https://github.com/dreoporto/tensorflow-gpu-docker
An example project to run TensorFlow with CUDA-enabled GPU acceleration using Windows, Docker and WSL2.
artificial-intelligence cuda deep-learning docker docker-compose jupyter machine-learning nvidia-docker python windows wsl2
Last synced: 27 Jan 2026
https://github.com/viktor-akusoff/chernabogpy
ChernabogPy is a Python package for visualizing gravitational distortions caused by black holes using nonlinear ray tracing.
cuda gpu physics-simulation python3 relativity-of-space-and-time torch
Last synced: 15 May 2026
https://github.com/thanduriel/cuda_hip_comparison
performance study of atomics on GPUs
Last synced: 09 Oct 2025
https://github.com/ironjr/minimal-cuda-pytorch
Repository-level snippet for minimal implementation of a PyTorch CUDA extension.
Last synced: 04 May 2026
https://github.com/enesdoruk/opencv-cpp
OpenCV C++ tutorials
computer-vision cpp cuda opencv
Last synced: 09 Oct 2025
https://github.com/mradovic38/pycuda-simulated-annealing
Simulated annealing process for finding the 'minimum energy' of an image.
cuda image-energy parallel-computing parallel-programming pycuda python simulated-annealing
Last synced: 09 Oct 2025
https://github.com/skyguy126/cuda-learnings
Collection of personal CUDA learnings.
Last synced: 10 Oct 2025
https://github.com/bhavinpatel4199/image-processing-with-opencv-and-cuda-on-google-colab
This repository demonstrates image processing using OpenCV with CUDA for GPU acceleration on Google Colab. It includes basics like displaying and manipulating images, alongside advanced techniques using CUDA to enhance performance. Ideal for learning GPU-accelerated image processing in Python.
computer-vision cuda google-colab gpu-acceleration high-performance-computing image-processing opencv pixel-manupulation
Last synced: 19 Jan 2026
https://github.com/maltsev-andrey/cuda-nn-inference
GPU-accelerated neural network inference using custom CUDA kernels. Achieves 97.82% accuracy on MNIST.
cuda deep-learning gpu-programming neural-networks numba nvidia parallel-computing parallel-programming performance-optimization python3 pytorch rhel9 tesla-p100
Last synced: 07 Mar 2026
https://github.com/adesoji1/visis_backend_assessment_submission-adesoji
Create a backend API to handle book information requests, and summary generation.
bart cache cuda data-extraction fastapi flask hugging-face hugging-face-hub llama postman-api python3 pytorch spacy sqlite3-database swagger-api tensorboard-visualizations transformer ubuntu2304
Last synced: 19 Jan 2026
https://github.com/zcemycl/distributecompute
Parallel Computing and Distributed Computing with C++ threads, Python threads+asyncio+multiprocessing and Spark, and Cuda.
asyncio boost cpp cuda global-interpreter-lock jthread multiprocessing python spark thread
Last synced: 14 Apr 2026
https://github.com/ericrihm/yt-whisper
Fast, local YouTube transcription with speaker diarization and a keyboard-first Textual TUI. YouTube-subs fast path, faster-whisper on CUDA, opt-in pyannote diarization, prompt profile auto-detection.
cli cuda faster-whisper pyannote python speaker-diarization textual transcription tui whisper youtube
Last synced: 31 May 2026
https://github.com/thesoenke/deeplearning-docker
Setup for Deep Learning experiments in Docker with Cuda
Last synced: 11 May 2026
https://github.com/isaurabhmeshram28/cuda-examples
This repository contains examples and experiments with CUDA programming to explore GPU computing and parallel processing using NVIDIA's CUDA framework.
Last synced: 19 May 2026
https://github.com/zury7/parallel-programming
A collection of performance optimizations and comparisons between multiprocessing and multithreading using pthreads, OpenMP, and CUDA. The experiments analyze execution speed, resource usage, and parallelization efficiency across different computational models. (CS 4553: Scientific Computing)
Last synced: 08 May 2026
https://github.com/boostibot/bachelors
My bachelor's thesis at CTU in Prague, Faculty of Nuclear Sciences and Physical Engineering, supervised by Ing. Pavel Strachota, Ph.D.
crystal-growth cuda finite-volume-method parallel-programming phase-field-method
Last synced: 26 Oct 2025
https://github.com/centuriontheman/parallelsortingalgorithms
Cpp/CUDA application for benchmarking sorting algorithms
benchamark cpp cuda multithreading sorting-algorithms
Last synced: 18 Feb 2026
https://github.com/fabricsoul/pytorch-jupyter-cuda
A Docker image that can run PyTorch and JupyterLab
ai artificial-intelligence cuda deep-learning deep-neural-networks docker docker-container docker-image gpu-computing jupyterlab machine-learning machinelearning nvidia nvidia-cuda nvidia-docker nvidia-smi pytorch
Last synced: 13 Mar 2026
https://github.com/anthongretter/spmv-cuda-analysis
An analysis of different approaches to Sparse Matrix-Vector Multiplication (SpMV) on GPU using CUDA
cuda gpu matrix-computations spmv unitn
Last synced: 14 Oct 2025
https://github.com/grizzz13/minimal-cuda
Minimal configuration to set up CUDA C++ in CMake.
Last synced: 18 Apr 2026
https://github.com/dragonscypher/prompty
Tool for generating smart and secure prompts for language models!
autotokenizer bert-model cuda google-t5 llm python3 tensorflow threading
Last synced: 02 Jan 2026
https://github.com/sjmonson/tdr-inverse
A set of CUDA programs that invert matrices
cuda gpu matrix-inverse matrix-inversion tdr
Last synced: 14 Oct 2025
https://github.com/voduchuy/cudafsp
CUDA-based implementation of the Finite State Projection (FSP) algorithm.
chemical-master-equation cuda stochastic-reaction-networks sundials
Last synced: 20 Jan 2026
https://github.com/branebb/nn-framework
Framework for creating neural networks using C++ and CUDA platform. This project is part of my final university assignment for bachelor's degree.
cmake cpp cuda cuda-programming
Last synced: 20 Jan 2026
https://github.com/obitech/tuc-ki-gpu-docker
cuda docker machine-learning nvidia-docker nvidia-gpu tensorflow tuc
Last synced: 14 Apr 2026
https://github.com/lu1smgb/ppr
Parallel Programming course. Academic year 2024/2025. University of Granada
Last synced: 15 Oct 2025
https://github.com/ergus/cuda-ts-mode
An emacs Cuda mode supported by tree-sitter
Last synced: 20 May 2026
https://github.com/masterskepticista/parallel_reductions_cuda
Iteratively optimizing parallel reductions in CUDA.
Last synced: 16 Oct 2025
https://github.com/airvzxf/c-plus-plus-understanding-cuda
Understanding CUDA with C++
cuda hacktoberfest hacktoberfest-accepted
Last synced: 22 Mar 2025
https://github.com/yangfengzzz/tardis
Travel space and time by using autodiff and codegen
Last synced: 03 May 2026
https://github.com/puzzlef/vector-sum-cuda
Comparing performance of sequential vs CUDA-based vector element sum.
cuda element experiment gpu sum vector
Last synced: 14 Apr 2026
https://github.com/ojaswithag/opencv-doc
A comprehensive Turkish-language guide to image and video processing, machine learning, and project applications with OpenCV. 🐙 Learn step by step with code examples and build projects.
arm-architecture cuda cuda-support deployment django docker-image docker-images heroku image-processing javascript nodejs nvidia opencv-contrib opencv3 production python scanner tutorial
Last synced: 08 Apr 2026
https://github.com/maltsev-andrey/julia_set_cuda
High-performance Julia set fractal computation in pure CUDA C, achieving 2.78 billion pixels/second on Tesla P100. Demonstrates GPU kernel programming, memory optimization, and massive parallelization (16M+ threads).
cuda fractals gpu-programming high-performance-computing nvidia parallel-computing science visualization
Last synced: 03 Nov 2025
https://github.com/rbuj-uoc/m1.209
PAC 1, PAC 2, PAC 3 and PAC 4 for the High-Performance Computing course of the MUEI
Last synced: 21 May 2026
https://github.com/AMYPAD/miutil
Basic functionality needed for AMYPAD
cuda matlab medical-imaging python
Last synced: 10 Apr 2025
https://github.com/sakurabtc888/1000_btc_bitcoin_challenge
🔥 CPU+GPU collision tool targeting the 160 bitcoins on [privatekeys.pw]
Last synced: 21 Oct 2025
https://github.com/paranoia55/env-setup
🚀 Set up a complete, production-ready JavaScript/TypeScript development environment on macOS with AI tools using a single command.
cuda deep-learning ethical-hacking-tools ios kali-linux kali-linux-tools make makefile next-14 next-appdir next-starter open-source-project prettier radix-ui shell tensorflow-gpu typescript ubuntu
Last synced: 29 Apr 2026
https://github.com/chibby0ne/cuda_by_example
Old notes (and new ones) from the CUDA by Example book
cuda cuda-programming gpgpu gpu-computing gpu-programming
Last synced: 15 Mar 2026
https://github.com/nmicic/k-tuplet-search
k-tuplet-search
computational-number-theory cuda experimental-mathematics gmp gpu-computing high-performance-computing hpc k-tuplets number-theory primality-testing prime-numbers prime-tuples sieve
Last synced: 21 May 2026
https://github.com/dhakalnirajan/baghchal-rl
C/CUDA implementation of Baagh Chaal Game with Neural Network
bagh-chal baghchal c clang cuda cuda-kernels neural-network reinforcement-learning
Last synced: 14 Apr 2026
https://github.com/islam-hady9/deep-cuda
Image Classification with CNN in CUDA C++
artificial-intelligence cnn cpp cuda deep-learning gpu-programming image-classification machine-learning mnist-dataset neural-networks-from-scratch parallel-computing
Last synced: 02 May 2026
https://github.com/phrutis/bip39scan
brute bip39 mnemonic GPU - $250
bip39 brute brute-force bruteforce cuda gpu mnemonic phrases seed
Last synced: 10 Apr 2025
https://github.com/soloangema/nvidia-8z6ms
🐱 Generate randomized README files to enhance DX farming, powered by NVIDIA technology for an engaging development experience.
artificial-intelligence computer-vision cuda data-science deep-learning gpu image-processing machine-learning neural-networks nvidia parallel-computing performance-optimization pytorch tensor video-processing
Last synced: 02 May 2026
https://github.com/chad24dev/gpu-agent-opt
🧠 Optimize GPU workflows with `gpu-agent-opt`, a Python package for profiling, scientific computing, and efficient CUDA exploration.
ai-agents autotuning cuda edge-ai geospatial gpu hpc nvidia optimization performance pytorch
Last synced: 07 May 2026
https://github.com/kichappa/videosift
CUDA based 3D Computer Vision for Exoskins
computer-vision convolution-filter cuda hpc julia sift-algorithm
Last synced: 15 May 2026
https://github.com/maxenceleguery/jare
3D Render engine accelerated with CUDA
Last synced: 21 May 2026
https://github.com/sbstndb/nbody_k
A simple 3D naïve NBody simulation using Kokkos enabling CUDA or OpenMP backend
cuda kokkos nbody openmp simulation
Last synced: 21 May 2026
https://github.com/shermanlo77/poisson_icing
Gibbs sampling on the Poisson-Ising model. The Poisson-Ising model is a 2D image of Poisson distributed random variables but has a dependency on their four neighbours. This causes the Poisson random variables to be similar (or dissimilar) to their neighbours.
cuda cupy gibbs-sampling gpu ising-model mcmc monte-carlo poisson poisson-ising
Last synced: 21 May 2026
https://github.com/bjornmelin/ml-algorithm-playground
🧪 Core ML algorithm implementations with GPU acceleration. Featuring optimized implementations across various libraries with comprehensive analysis. 📈
algorithms cuda gpu-computing lightgbm machine-learning python scikit-learn xgboost
Last synced: 13 May 2026
https://github.com/thephiltacular/voice-ai-pipeline
A containerized AI pipeline for real-time speech-to-text and text-to-speech conversion, leveraging Whisper ASR and Coqui TTS models with Kubernetes orchestration. Features a Gradio web interface, GPU acceleration, and local processing for privacy-focused voice applications. 🚀🎤📝
ai asr containerization coqui-tts cuda docker fastapi gpu gradio kubernetes machine-learning nvidia orchestration python real-time-processing speech-recognition text-to-speech tts voice-assistant whisper
Last synced: 15 Apr 2026
https://github.com/sedflix/cuda_pattern_matching
Computing word frequencies using pattern matching concepts in CUDA
Last synced: 17 Mar 2026
https://github.com/matiasvlevi/cuno
Provides CUDA bindings, kernel maps and device memory management for Dannjs computations. [Experimental and not complete]
addon cuda dann dannjs machine-learning nodejs
Last synced: 15 Apr 2026
https://github.com/mohamedsamirx/yolov12-tensorrt-cpp
YOLOv12 inference using C++, TensorRT, and CUDA
cpp cuda tensorrt tensorrt-inference yolo yolov12
Last synced: 15 Apr 2026
https://github.com/manu-sh/cuda-mandelbrot
How to use CUDA acceleration to compute the Mandelbrot set
Last synced: 15 Apr 2026
https://github.com/ahmed5827/image_generation
This application provides a graphical user interface (GUI) for generating images using the Stable Diffusion model. The GUI allows users to input a text prompt, and the application generates an image based on the prompt.
ai cuda generative-ai image-generation
Last synced: 15 Apr 2026
https://github.com/lyynn777/cuda-bitonic-sort
Simple CUDA project to implement Bitonic Sort and compare it with normal CPU sorting.
bitonic-sort cuda gpu-computing gpu-vs-cpu parallel-computing performance-testing pycuda python
Last synced: 15 Apr 2026
https://github.com/tkemmer/cunessie.jl
CUDA-accelerated Nonlocal Electrostatics in Structured Solvents
bioinformatics boundary-element-method cuda electrostatics gpu-computing julia proteins
Last synced: 31 Jan 2026
https://github.com/snandasena/courseera_gpu_specilization
Example for Cuda streaming
Last synced: 15 Apr 2026
https://github.com/starlitdreams/pacman-convolutional-q-learning
This project implements a Deep Q-Network (DQN) using PyTorch to train an agent to play Atari's Ms. Pac-Man. It utilizes reinforcement learning with a convolutional neural network (CNN) for image processing. Features include experience replay, frame preprocessing, and CUDA support, with trained model saving and video rendering of gameplay.
artificial-intelligence artificial-neural-networks atari cuda deep-learning deep-learning-algorithms deep-q-learning deeplearning gymnasium gymnasium-environment python pytorch
Last synced: 15 Apr 2026
https://github.com/cscfi/csc-env-julia
Julia language environment including MPI.jl, CUDA.jl and AMDGPU.jl preferences for HPC clusters at CSC.
amdgpu ansible cuda hpc julia julia-language mpi
Last synced: 01 Feb 2026
https://github.com/teambipartite/bipartite-gemm
High throughput data-parallel GEMM implementations in Cuda using Cuda cores and Tensor cores
Last synced: 17 Apr 2026
https://github.com/m-torhan/cuda-fractals
CUDA C++ implementation of Fractals visualization
Last synced: 25 Feb 2026
https://github.com/joe-mruz/hgvisualizer
An interactive simulation and visualization tool for evolving hypergraphs, inspired by the Wolfram Physics Project.
cpp cuda hypergraph physics simulator wolfram
Last synced: 02 May 2026
https://github.com/xza85hrf/flag_prediction_project
This application predicts the name of a country (or countries) based on an input flag image. It uses advanced image processing techniques and deep learning models built with PyTorch to classify flags accurately.
cross-validation cuda data-augmentation docker efficientnetb0 flag-recognition image-classification machine-learning mixed-precision-training mobilenetv2 python pytorch resnet resnet-50 transfer-learning
Last synced: 15 Apr 2026
https://github.com/dasbd72/nthu-ipc-2022
National Tsing Hua University - Introduction to Parallel Computing - 2022
cuda cuda-programming hpc mpi openmp pthreads
Last synced: 30 Mar 2025
https://github.com/fieldcure/fieldcure-whisper-runtimes
Pre-built Whisper.net native runtime binaries (CPU/CUDA/Vulkan) for the FieldCure software ecosystem.
cuda dotnet native-binaries nuget redistributable vulkan whisper whisper-net
Last synced: 01 Jun 2026
https://github.com/baremetalrt/baremetalrt
BareMetalRT — edge GPU compute mesh
cuda distributed-computing gpu inference llm nvidia tensorrt windows
Last synced: 18 Apr 2026
https://github.com/stephanmg/cuda-playground
CUDA playground
cpu cuda gp100 gpu gv100 openmp parallel-computing parallel-programming
Last synced: 30 Mar 2025
https://github.com/kentakoong/mtnlog
A simple multinode performance logger for Python
cuda lanta nvitop python slurm-cluster
Last synced: 11 Jan 2026
https://github.com/muppetsg2/cudaraytracer
A custom ray tracer originally developed during university studies to run on CPU, now ported to GPU using CUDA. This project was created to explore GPU rendering techniques and to gain hands-on experience with CUDA programming.
cuda mit-license nvidia-cuda nvidia-gpu raytracing sfml stb-image student-project study-project
Last synced: 16 Apr 2026
https://github.com/equiel-1703/cuhip
Wrapper tool to convert CUDA source code to HIP code and compile it with HIPCC. Useful for learning CUDA programming using AMD devices.
Last synced: 14 May 2026
https://github.com/yashpotdar-py/flood-vision
Flood Vision - A deep learning–based computer vision system for flood mapping and damage assessment using aerial imagery.
cuda deep-learning flood-detection iot python
Last synced: 16 Apr 2026
https://github.com/sferez/sspp_sparse_matrix_cuda
Small Scale Parallel Programming, Sparse Matrix multiplication with CUDA
cpp cuda omp omp-parallel parallel-computing small-scale-parallel-programming sparse-matrix
Last synced: 30 Apr 2026
https://github.com/daelsepara/hipnewton
GPU Implementation of Newton Fractal Generator with Benchmarking
amd cuda fractal gpu gpu-compute gpu-computing hip newton parallel-computing rocm sdk
Last synced: 03 May 2026