CUDA
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphical processing units (GPUs). With CUDA, developers are able to dramatically speed up computing applications by harnessing the power of GPUs.
- GitHub: https://github.com/topics/cuda
- Wikipedia: https://en.wikipedia.org/wiki/CUDA
- Created by: Nvidia
- Released: June 23, 2007
- Related Topics: nvcc,
- Last updated: 2026-07-03 00:07:23 UTC
- JSON Representation
https://github.com/isquicha/cuda-parallel-studies
Learning CUDA programming here =D
cuda cuda-programming cuda-toolkit
Last synced: 03 Jul 2025
https://github.com/yutakseo/docker_ubuntu-cuda_environment
🐳 A ready-to-use Docker environment for deep learning development with Ubuntu 22.04 and CUDA 11.8.
container cuda docker environment ubuntu
Last synced: 12 Apr 2026
https://github.com/matthewfeickert/report-urssi-fellowship-2025
Report on URSSI 2025 Early-Career Fellowship
Last synced: 17 Jan 2026
https://github.com/nikhilrout/thetensorcoreproject
Microarchitecture implementation of Nvidia's Tensor Cores
cuda floating-point gpgpu hybrid-precision-training tensorcore
Last synced: 01 Apr 2025
https://github.com/ray-chew/modified_ch
Density functional theory (DFT) and self-consistent field theory (SCFT) simulation of diblock copolymers
cuda density-functional-theory diblock-copolymer numerical-analysis numerical-methods self-consistent-field-theory
Last synced: 11 May 2026
https://github.com/hr-fahim/transformer-model-optimization
Sample GPT Transformer Model from Scratch.
cuda few-shot-learning transfomers
Last synced: 02 May 2026
https://github.com/rog0d/gpuss_watchers
"The GPU Watchers swore upon their shared memory hierarchy, from L1 to global memory, which also served as their mandate as lords of parallel computation."
cuda gpu-acceleration gpu-monitoring gpu-profiling
Last synced: 28 Apr 2026
https://github.com/aurelienperez/gpu-heston-monte-carlo
GPU-accelerated Monte Carlo simulation for option pricing under the Heston model using CUDA.
Last synced: 01 Apr 2025
https://github.com/drilonaliu/parallel-mandelbrot-set
GPU-accelerated Mandelbrot Set generation with CUDA and OpenGL interoperability.
cuda fractals gpu mandelbrot-fractal parallel-programming
Last synced: 12 Apr 2026
https://github.com/juliankarrer/reyn
CUDA-based Implementation of Smoothed Particle Hydrodynamics for Fluid Simulation
cuda fluid lagrangian simulation sph
Last synced: 02 Jul 2026
https://github.com/TeamBipartite/bipartite-gemm
High throughput data-parallel GEMM implementations in Cuda using Cuda cores and Tensor cores
Last synced: 14 Jan 2026
https://github.com/sephiroth7712/k-nearest-neigbours
Implementation of K-Nearest Neighbors algorithm using multiple parallel computing approaches: CUDA (GPU), Hadoop, Spark, MPI, OpenMP, and PThreads. Demonstrates scalable machine learning across different parallel computing paradigms from GPU to distributed frameworks.
cuda cuda-programming hadoop-mapreduce java mpi multiprocessing multithreading openmp pthreads scala spark
Last synced: 12 Apr 2026
https://github.com/farukalamai/100-days-of-cuda
100 days of writing CUDA kernels!
cuda cuda-kernels cuda-programming gpu gpu-programming pytroch
Last synced: 02 Jul 2026
https://github.com/tomtolleson/cuda-kernel-benchmarking-tool
A benchmarking tool in C++ that creates Cuda kernels and tests the overall system performance between CPU and GPU
cuda cuda-kernels cuda-support cuda-toolkit nvidia nvidia-cuda nvidia-gpu
Last synced: 30 Mar 2025
https://github.com/ne0nwinds/gpupuzzles
My solutions to srush/GPU-Puzzles using CUDA
Last synced: 16 May 2026
https://github.com/eminem5410/devmind-platform
Linux-first CLI for AI environment diagnostics, repair & automation
ai automation cli cuda developer-tools devops docker generative-ai linux local-llm observability ollama python self-hosted system-monitoring
Last synced: 30 May 2026
https://github.com/xza85hrf/flux_pipeline
FluxPipeline is a prototype experimental project that provides a framework for working with the FLUX.1-schnell image generation model. This project is intended for educational and experimental purposes only.
ai cuda docker educational experimental flux1 flux1-schnell flux1ai gradio image-generation model non-commercial python pytorch research transformer-model
Last synced: 05 Jul 2025
https://github.com/doxakis/cosinesimilaritydistancesongpu
Compute cosine similarity distances for all combinations of the dataset on the gpu with CUDA
Last synced: 13 Apr 2026
https://github.com/toshikinakamura0412/dockerfiles
Development environment using Docker for some Linux distributions
alpine bash cuda debian devcontainer devcontainers docker docker-compose fedora opencv opensuse ros ros-humble ros-noetic ros2 ubuntu ubuntu2004 ubuntu2204 vscode zsh
Last synced: 10 Jul 2025
https://github.com/jadc/cuda-raytracer
A simple path tracer written in CUDA.
cpp cuda gpu-programming graphics parallel-programming path-tracing raytracing
Last synced: 16 May 2026
https://github.com/ysl1016/cudadigitfilter
CUDA-based parallel image filtering system for MNIST dataset
computer-vision cuda deep-learning gpu-acceleration image-processing mnist parallel-computing
Last synced: 28 Mar 2025
https://github.com/abhiksark/gluon-by-example
Learn Triton's Gluon by example — the same GPU kernels written in Triton and Gluon, benchmarked
cuda deep-learning gluon gpu gpu-kernels triton tutorial
Last synced: 01 Jul 2026
https://github.com/eyelor/text-to-image-item-generator
A Python workflow for generating random item images using models from Hugging Face.
ai conda cuda flux-schnell generator huggingface item llama python pytorch text-to-image
Last synced: 13 Apr 2026
https://github.com/axeloooo/pytorch
Collection of deep learning workflows in PyTorch, from fundamentals and classification to transfer learning and experiment tracking.
Last synced: 28 Apr 2026
https://github.com/ltsyk/smart-snake-ai
Advanced Deep Q-Network AI for Snake Game with CUDA support and 700% performance boost
artificial-intelligence cuda deep-q-network dqn game-ai machine-learning pytorch reinforcement-learning snake-game
Last synced: 28 Apr 2026
https://github.com/asadiahmad/100_sports_image_classification
A deep learning project for sport image classification using a custom VGG19-based architecture with integrated Grad-CAM heatmap visualization for model interpretability.
computer-vision cuda data-augmentation deep-learning explainable-ai gpu-acceleration grad-cam heatmap-visualization image-classification mixed-precision-training pytorch pytorch-grad-cam sports-analytics sports-classification transfer-learning vgg19
Last synced: 11 Jun 2025
https://github.com/kronbii/thermal-super-resolution
State-of-the-art thermal super-resolution system (IMDN) with RGB→thermal adaptation, custom multi-component loss, 29.6 dB PSNR, 0.713 SSIM, 250+ FPS, production-ready PyTorch + CUDA implementation.
computer-vision cuda deep-learning image-enhancement imdn model-optimization production-machine-learning pytorch real-time real-time-processing research super-resolution thermal-imaging
Last synced: 18 Apr 2026
https://github.com/tornikeo/minimal-vscode-cuda-meson
Minimal sample of using VSCode and Meson to build CUDA applications
Last synced: 08 Sep 2025
https://github.com/aeyage/intraday_prices
GPU-accelerated portfolio optimisation
Last synced: 05 Apr 2025
https://github.com/voltr0x/raytracing-cuda
Raytracing in a weekend using CUDA
Last synced: 01 Apr 2026
https://github.com/codename-detective/cuda_gpgpus_shared_memory_systems_pdp
CUDA GPGPUs Shared Memory Systems Parallel & Distributed Programming
cuda cuda-programming numa parallel-programming
Last synced: 30 Mar 2025
https://github.com/tdavidcl/cu_intercept
cuda cuda-memory cuda-programming hook massif memory-tracking preload
Last synced: 03 May 2026
https://github.com/kataglyphis/machinelearningalgorithms
Basic Machine Learning Algorithms
cuda machine-learning python tensorflow
Last synced: 31 Mar 2025
https://github.com/Neuro-Mechatronics-Interfaces/python-intan
Tools and demos for working with EMG data from intan using python
circuitpython cuda emg pico python realtime tensorflow
Last synced: 13 Jan 2026
https://github.com/monajemi-arman/sparkling
Easy to use Spark cluster management panel with GPU support
apache-spark csharp cuda distributed-computing distributed-learning docker gpu javascript nextjs torch typescript
Last synced: 12 Apr 2026
https://github.com/eastonman/tensorrt-pytorch-wrapper
A wrapper makes TensorRT engine accept PyTorch Cuda Tensor.
Last synced: 06 May 2026
https://github.com/efecaliskannn/pneumonia-detection-with-cnn--vgg16--and-resnet50-deep-learning-models
In this project, pneumonia detection using deep learning, a subset of artificial intelligence, is aimed. The performance of deep learning algorithms, including CNN, VGG16, and ResNet50 models, in detecting pneumonia has been examined.(Bu projede yapay zekanın alt kümesi olan derin öğrenme ile zatürre tespiti amaçlanmaktadır.)
artificial-intelligence convolutional-neural-networks cuda deep-learning keras-tensorflow nvidia-cuda pyhton transfer-learning
Last synced: 13 Jun 2025
https://github.com/shtrophic/wicuvanity
Generate wireguard vanity keys on your Nvidia GPU
cuda gpu vanity-address vanity-addresses vanitygen wireguard
Last synced: 10 Mar 2025
https://github.com/lablup/backend.ai-accelerator-cuda
The Backend.AI CUDA Accelerator Plugin
Last synced: 16 May 2026
https://github.com/cuda8/brainwords2
GPU brainflayer for sale $250
brain brainflayer brainwords cuda gpu key pass passphrase private
Last synced: 10 Mar 2025
https://github.com/elcruzo/cuda-conv
Lightweight CUDA kernel for 2D image convolution achieving 20x+ speedup. Built with CuPy for the NVIDIA Hackathon.
computer-vision convolution cuda cupy gpu-computing hackathon high-performance-computing image-processing nvidia python
Last synced: 15 May 2026
https://github.com/ionmich/cs149-local-dev
Provides `conda` installation instructions for Stanford's CS149 (Parallel Computing) programming assignments
conda cs149 cuda ispc parallel-computing
Last synced: 31 Mar 2025
https://github.com/marnovo/cuda-projects
cuda cuda-kernels gpu gpu-programming nvidia-cuda parallel-computing
Last synced: 10 Jun 2025
https://github.com/ndgigliotti/torch-ipca
GPU-accelerated Incremental PCA for PyTorch
cuda dimensionality-reduction gpu incremental-pca machine-learning pca pytorch
Last synced: 26 Jan 2026
https://github.com/mrgkanev/tensorflow-gpu-docker-setup
A Docker environment for TensorFlow GPU development with optimized configurations for WSL2, troubleshooting guides, and common error fixes
cuda cuda-toolkit deep-learning dev-environment development-tools docker gpu-acceleration machine-learning nvidia-docker nvidia-docker-support python tensorflow
Last synced: 13 Apr 2026
https://github.com/moesio-f/cla
C Linear Algebra (CLA) library. A simple toy library for basic vector/matrix operations with CUDA support and Python bindings.
Last synced: 09 May 2026
https://github.com/minseoc03/cuda-100-days
A 100-day journey to master CUDA programming, inspired by the CUDA-120-DAYS--CHALLENGE project. This repo contains daily CUDA exercises and code folders, with learning notes hosted on Notion. Practicing on leetgpu.com due to lack of local NVIDIA GPU.
100daysofcode cuda deeplearning gpgpu gpu hpc nvidia parallel-computing
Last synced: 19 Apr 2025
https://github.com/hrshl212/custom-cuda-kernels-with-neural-network-implementation
The repository contains custom CUDA kernels for linear layer, softmax and relu which are integrated with python to develop a Neural Network
cuda neural-network python pytorch
Last synced: 08 May 2026
https://github.com/atelierarith/julia_gpu_playground
For those who want use Julia with GPU
cuda docker docker-compose julia
Last synced: 28 Apr 2026
https://github.com/bolu-atx/cuda-dojo
Level up your CUDA skills - RPG style. Do or do not, there is no try.
cuda examples learning tutorial
Last synced: 01 Jul 2026
https://github.com/sneha-at-hub/bruteforce_passwordcracking_in-milliseconds
Last synced: 28 Apr 2026
https://github.com/mattjesc/federated-learning-simulation-1gpu-mi-is
Federated Learning Simulation on a Single GPU with Model Interpretability and Interactive Visualization
ai cuda deep-learning distributed-systems federated-learning gpu hpc keras machine-learning ml model-interpretability python pytorch simulation streamlit tensorflow
Last synced: 05 Jan 2026
https://github.com/lord-turmoil/cudacmakedemo
A demo for building CUDA program with CMake
Last synced: 16 Mar 2025
https://github.com/delusionary/histoptimizer
Solves a minimum variance cost of the partition problem.
Last synced: 14 Jan 2026
https://github.com/dgcnz/nvtx-vscode
Create NVIDIA NVTX ranges directly in VS Code, then profile with Nsight Systems without modifying source code.
Last synced: 13 Apr 2026