CUDA
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
- GitHub: https://github.com/topics/cuda
- Wikipedia: https://en.wikipedia.org/wiki/CUDA
- Created by: NVIDIA
- Released: June 23, 2007
- Related Topics: nvcc
- Last updated: 2026-07-01 00:07:09 UTC
https://github.com/lablup/backend.ai-accelerator-cuda
The Backend.AI CUDA Accelerator Plugin
Last synced: 16 May 2026
https://github.com/prateekshukla1108/thunderkittens-docs
Documentation for ThunderKittens framework
Last synced: 18 Mar 2025
https://github.com/elcruzo/cuda-conv
Lightweight CUDA kernel for 2D image convolution achieving 20x+ speedup. Built with CuPy for the NVIDIA Hackathon.
computer-vision convolution cuda cupy gpu-computing hackathon high-performance-computing image-processing nvidia python
Last synced: 15 May 2026
https://github.com/nourmorsy/convolution-neural-network-cuda
Code for optimizing CNNs using CUDA
Last synced: 13 May 2026
https://github.com/kenmalik/cuda-dr-bcg
CUDA C++ implementation of the DR-BCG algorithm for numerically solving linear systems.
cpp cuda hpc numerical-methods
Last synced: 19 Apr 2026
https://github.com/wojcikmikolaj/particles-in-a-jar
Collisions between particles simulated on GPU.
algorithms-and-data-structures collision-detection collisions cuda gpu-programming
Last synced: 25 May 2026
https://github.com/bolu-atx/cuda-dojo
Level up your CUDA skills - RPG style. Do or do not, there is no try.
cuda examples learning tutorial
Last synced: 01 Jul 2026
https://github.com/ragu-manjegowda/parallel-programming
Assignments and Projects of Udacity's Introduction to Parallel Programming Course
cuda gpu-programming nvidia-cuda nvidia-gpu udacity-parallel-programming
Last synced: 25 May 2026
https://github.com/unknownnuts/meshsdk
Mesh processing library with extra `C/C#/JS/TS/PYTHON` bindings.
cuda dicom electron emscripten mesh modelling pybind11 stl stomatology threejs wasm
Last synced: 10 Apr 2026
https://github.com/jpodivin/gputomata
Cellular automata running on CUDA capable GPUs
cellular-automata cellular-automaton cuda
Last synced: 07 Nov 2025
https://github.com/lehoangan2906/cuda_basics
A simple implementation of operations on vectors and matrices, optimized for running on Nvidia GPU with CUDA
Last synced: 16 Jun 2025
https://github.com/GTruf/Driver-Drowsiness-Detector
Prototype of an intelligent safety system for detecting driver drowsiness
cpp cuda cudnn deep-learning driver-drowsiness-detection driver-drowsiness-detector drowsiness-detection face-recognition image-recognition machine-learning neural-network nvidia-cuda object-recognition opencv qt6 recognition-neural-network yolo yolov10 yolov5 yolov9
Last synced: 14 Mar 2025
https://github.com/ngoma1713/rushirb2001
🤖 Explore advanced AI and machine learning solutions for protein modeling and medical applications, developed by a dedicated data science graduate student.
computer-vision-opencv cuda data-science-portfolio deep-learning generative-ai machine-learning medical-ai protein-modeling published-researcher pytorch quantum-ml rag-chatbot tensorflow
Last synced: 02 May 2026
https://github.com/bjornmelin/llm-gpu-optimization
🚄 Advanced LLM optimization techniques using CUDA. Features efficient attention mechanisms, custom CUDA kernels for transformers, and memory-efficient training strategies. ⚡
cuda deep-learning gpu-acceleration llm-optimization machine-learning memory-optimization parallel-computing transformers
Last synced: 18 Mar 2025
https://github.com/deepschneider/tinygrad-universal
Universal version of Tinygrad with CUDA and OpenCL support
autograd automatic-differentiation cuda pycuda pyopencl tinygrad tinygrad-cuda
Last synced: 06 Mar 2025
https://github.com/mjun0812/setup-cuda
Set up a specific version of NVIDIA CUDA in GitHub Actions on Linux x86_64, arm64 (Debian and Fedora based distribution) and Windows
action cuda cuda-toolkit github-actions
Last synced: 13 Jan 2026
https://github.com/kmock930/texture-image-comparison
This project aims to build a model that classifies the type of an unseen image as accurately as possible by implementing, evaluating, and comparing two different multi-layer perceptron neural networks.
computer-vision conda confusion-matrix convolutional-neural-networks cuda image-preprocessing keras keras-tensorflow learning-curve-analysis matplotlib multi-layer-perceptron neural-network pickle-file python3 skimage
Last synced: 12 Apr 2026
https://github.com/flavienbwk/tensorflow2-cuda-10.2-docker
Tensorflow 2.3, CUDA 10.2, Docker compatible image
cuda docker python3 tensorflow ubuntu1804
Last synced: 11 Apr 2026
https://github.com/promptromp/aws-bootstrap-g4dn
Fast and easy bootstrapping of AWS EC2 instances for CUDA development. Use as a CLI, as a programmatic SDK, or as an Agent Skill!
aws cuda ec2 jupyter-notebook machine-learning mlops python
Last synced: 21 Feb 2026
https://github.com/deltatecs/voses
Volatile Secret Searcher - massively parallel, brute force memory dump analysis for (D)TLS secret extraction
cuda memory-hacking reverse-engineering tls
Last synced: 15 Jun 2025
https://github.com/sonhm3029/setup-experience
This project stores my setup experience and the errors I met and solved while developing end-to-end AI and software projects
ai computer-vision cuda deep-learning software
Last synced: 10 Jun 2026
https://github.com/jesuscopado/parallel-programming
My solutions for the course Programming Parallel Computers at Aalto University (http://ppc.cs.aalto.fi/). Grade: 5/5
cpp cuda image-segmentation median-filter sorting-algorithms
Last synced: 19 Apr 2026
https://github.com/alan-cooney/python-cuda-starter-template
Python CUDA Starter Template
Last synced: 30 Mar 2025
https://github.com/viktor-akusoff/chernabogpy
ChernabogPy is a Python package for visualizing gravitational distortions caused by black holes using nonlinear ray tracing.
cuda gpu physics-simulation python3 relativity-of-space-and-time torch
Last synced: 15 May 2026
https://github.com/gaaniruddha/mphil-gpu-imager
This repository contains code for project #1 of MPhil: test-version of GPU imager for a single time-step, single-channel and single time-step, multi-channel.
astronomy benchmarks cuda cufft google-sheets gpu-imager imaging-astronomy interferometry radio-astronomy
Last synced: 11 Jun 2026
https://github.com/hit07/ml-dl-torch
This repository covers machine learning and deep learning using PyTorch
computer-vision convolutional-neural-networks cuda neural-networks pytorch
Last synced: 28 Feb 2025
https://github.com/lordofhyphens/gpu-path-delay-coverage
CUDA-based Path Delay Fault Coverage
Last synced: 04 May 2026
https://github.com/usman619/pdc
Parallel and Distributed Computing
cuda distributed-computing distributed-systems nextcloud
Last synced: 11 Apr 2026
https://github.com/bfalls/img-compressor
GPU-accelerated JPEG compressor
cli-tool command-line compression cpp cpp-cuda-gpu-programming-parallel-computing cuda dct demo-project gpgpu gpu-programming high-performance-computing hpc image-compression image-processing jpeg parallel-computing
Last synced: 20 Apr 2026
https://github.com/zury7/parallel-programming
A collection of performance optimizations and comparisons between multiprocessing and multithreading using pthreads, OpenMP, and CUDA. The experiments analyze execution speed, resource usage, and parallelization efficiency across different computational models. ( CS 4553 : Scientific Computing )
Last synced: 08 May 2026
https://github.com/mvishiu11/kmeans-clustering
K-Means Clustering with both GPU (CUDA) and CPU implementations
Last synced: 15 Mar 2025
https://github.com/abhiram-kandiyana/cuda-blast-2024
Reimplementation of NCBI BLAST with CUDA backend for faster retrieval
blast cuda gpu-acceleration parallel-processing
Last synced: 15 Mar 2025
https://github.com/zhaocc1106/cuxx-programing
Examples for some CUDA libraries: cuda, cublas, cublaslt, cusparse...
Last synced: 23 Mar 2025
https://github.com/proafxin/cuda-docker
High performance computing Images with pycuda and tensorrt preinstalled
cuda docker dockerfile libcudnn nvidia-tensorrt pycuda python tensorrt
Last synced: 11 Apr 2026
https://github.com/BardiFarsi/ThreadPoolManager
ThreadPoolManager is a C++ project that implements an efficient multi-threading system using a thread pool for generic functions of the same type and different tasks. It includes task management, synchronization mechanisms, and thread-safe logging to demonstrate concurrent task execution.
cpp cpp17 cpp20 cuda cuda-programming memory-management multiprocessing multithreading parallel-computing parallel-processing parallel-programming thread thread-pool thread-safety threadpool threads threadsafe
Last synced: 15 May 2025
https://github.com/baudneo/zomi-server
FastAPI ML server designed for ZoneMinder (zomi-client)
alpr coral-tpu cuda face-detection face-recognition fastapi machine-learning object-detection onnxruntime opencv pydantic-v2 tensorrt torch zoneminder
Last synced: 18 Jan 2026
https://github.com/vladd12/libexecstd
Modern C++ library for using an execution context of computer devices
cpp cpp17 cuda gpu-acceleration gpu-computing
Last synced: 06 May 2026
https://github.com/jaidevd/ipec-fdp
cuda hpc keras mapreduce numba spark tensorflow
Last synced: 11 Apr 2026
https://github.com/grizzz13/minimal-cuda
Minimal configuration to set up CUDA C++ in CMake.
Last synced: 18 Apr 2026
https://github.com/gammahazard/locate-anything
Sleek, mobile-friendly web UI for NVIDIA LocateAnything-3B — open-vocabulary object detection & grounding on your own GPU, via one docker compose up.
bounding-boxes computer-vision cuda docker fastapi gpu grounding locate-anything machine-learning nvidia object-detection ocr open-vocabulary-detection react self-hosted tailwindcss typescript vision-language-model web-ui
Last synced: 28 May 2026
https://github.com/sahil-rajwar-2004/vector-cuda
vector calculation with GPU acceleration using CUDA
c cpp11 cuda cuda-kernels cuda-programming nvcc
Last synced: 15 May 2025
https://github.com/neel-dandiwala/cuda-programs
Miscellaneous programs that illustrate the concepts of parallel computing
cuda gpu-programming parallel-programming
Last synced: 16 May 2025
https://github.com/tchung1970/sd-cli-cuda
CUDA-accelerated Stable Diffusion plugin for wavespeed-desktop
cuda gpu linux nvidia stable-diffusion
Last synced: 09 May 2026
https://github.com/bikrammajhi/100-days-of-gpu
This is my 🔥 100 Days of GPU — a wild, hands-on journey through CUDA kernels, Triton spells, and PTX sorcery.
cuda nsight-compute ptx triton
Last synced: 18 Jun 2025
https://github.com/dstrigl/cnnplus
Master thesis 2010: Fast Convolutional Neural Network Training and Classification on CUDA GPUs
cnn convolutional-neural-networks cpp cuda gpu neural-networks speedup thesis
Last synced: 30 Jun 2026
https://github.com/h4ck3r-04/fpassword
Fpassword merges Hashcat's hash-cracking precision with Hydra's parallelized network login, offering penetration testers a powerful tool for swift hash deciphering and simultaneous login attempts across diverse protocols.
brute-force brute-force-attacks c cracking cuda gpgpu hashcat hashes hydra network-security opencl password penetration-testing
Last synced: 16 Jan 2026
https://github.com/lucatedeschini/feedforwardnn
This project is my submission for the exam "Project Work in Architecture and Platform for Artificial Intelligence"
c cuda neural-networks openmp scratch-implementation
Last synced: 20 Apr 2026
https://github.com/akshaysinhaaa/emova
A deep learning framework designed for emotion and sentiment recognition using text, audio, and video modalities. This project leverages the MELD (Multimodal EmotionLines Dataset) to train a robust and flexible model that reflects human communication more accurately than unimodal models.
bert cnn cuda deep-learning multimodal python pytorch resnet-18 tensorboard transformers
Last synced: 05 May 2026
https://github.com/cmazakas/cuda-stuff
A CUDA-based playground
cmake cuda delaunay-triangulation vscode
Last synced: 24 Mar 2025
https://github.com/Parxd/cuda-optim
various CUDA kernels optimized for specific ML algos
Last synced: 02 Sep 2025
https://github.com/mattjesc/federated-learning-simulation-1gpu-mi-is
Federated Learning Simulation on a Single GPU with Model Interpretability and Interactive Visualization
ai cuda deep-learning distributed-systems federated-learning gpu hpc keras machine-learning ml model-interpretability python pytorch simulation streamlit tensorflow
Last synced: 05 Jan 2026
https://github.com/minseoc03/cuda-100-days
A 100-day journey to master CUDA programming, inspired by the CUDA-120-DAYS--CHALLENGE project. This repo contains daily CUDA exercises and code folders, with learning notes hosted on Notion. Practicing on leetgpu.com due to lack of local NVIDIA GPU.
100daysofcode cuda deeplearning gpgpu gpu hpc nvidia parallel-computing
Last synced: 19 Apr 2025
https://github.com/moesio-f/cla
C Linear Algebra (CLA) library. A simple toy library for basic vector/matrix operations with CUDA support and Python bindings.
Last synced: 09 May 2026
https://github.com/ndgigliotti/torch-ipca
GPU-accelerated Incremental PCA for PyTorch
cuda dimensionality-reduction gpu incremental-pca machine-learning pca pytorch
Last synced: 26 Jan 2026
https://github.com/marnovo/cuda-projects
cuda cuda-kernels gpu gpu-programming nvidia-cuda parallel-computing
Last synced: 10 Jun 2025
https://github.com/ionmich/cs149-local-dev
Provides `conda` installation instructions for Stanford's CS149 (Parallel Computing) programming assignments
conda cs149 cuda ispc parallel-computing
Last synced: 31 Mar 2025
https://github.com/AMYPAD/miutil
Basic functionality needed for AMYPAD
cuda matlab medical-imaging python
Last synced: 10 Apr 2025
https://github.com/cuda8/brainwords2
GPU brainflayer for sale $250
brain brainflayer brainwords cuda gpu key pass passphrase private
Last synced: 10 Mar 2025
https://github.com/shtrophic/wicuvanity
Generate wireguard vanity keys on your Nvidia GPU
cuda gpu vanity-address vanity-addresses vanitygen wireguard
Last synced: 10 Mar 2025
https://github.com/monajemi-arman/sparkling
Easy to use Spark cluster management panel with GPU support
apache-spark csharp cuda distributed-computing distributed-learning docker gpu javascript nextjs torch typescript
Last synced: 12 Apr 2026
https://github.com/Neuro-Mechatronics-Interfaces/python-intan
Tools and demos for working with EMG data from intan using python
circuitpython cuda emg pico python realtime tensorflow
Last synced: 13 Jan 2026
https://github.com/kataglyphis/machinelearningalgorithms
Basic Machine Learning Algorithms
cuda machine-learning python tensorflow
Last synced: 31 Mar 2025
https://github.com/tdavidcl/cu_intercept
cuda cuda-memory cuda-programming hook massif memory-tracking preload
Last synced: 03 May 2026
https://github.com/codename-detective/cuda_gpgpus_shared_memory_systems_pdp
CUDA GPGPUs Shared Memory Systems Parallel & Distributed Programming
cuda cuda-programming numa parallel-programming
Last synced: 30 Mar 2025
https://github.com/voltr0x/raytracing-cuda
Raytracing in a weekend using CUDA
Last synced: 01 Apr 2026
https://github.com/phrutis/bip39scan
brute bip39 mnemonic GPU - $250
bip39 brute brute-force bruteforce cuda gpu mnemonic phrases seed
Last synced: 10 Apr 2025
https://github.com/kronbii/thermal-super-resolution
State-of-the-art thermal super-resolution system (IMDN) with RGB→thermal adaptation, custom multi-component loss, 29.6 dB PSNR, 0.713 SSIM, 250+ FPS, production-ready PyTorch + CUDA implementation.
computer-vision cuda deep-learning image-enhancement imdn model-optimization production-machine-learning pytorch real-time real-time-processing research super-resolution thermal-imaging
Last synced: 18 Apr 2026
https://github.com/asadiahmad/100_sports_image_classification
A deep learning project for sport image classification using a custom VGG19-based architecture with integrated Grad-CAM heatmap visualization for model interpretability.
computer-vision cuda data-augmentation deep-learning explainable-ai gpu-acceleration grad-cam heatmap-visualization image-classification mixed-precision-training pytorch pytorch-grad-cam sports-analytics sports-classification transfer-learning vgg19
Last synced: 11 Jun 2025
https://github.com/ysl1016/cudadigitfilter
CUDA-based parallel image filtering system for MNIST dataset
computer-vision cuda deep-learning gpu-acceleration image-processing mnist parallel-computing
Last synced: 28 Mar 2025
https://github.com/tomtolleson/cuda-kernel-benchmarking-tool
A benchmarking tool in C++ that creates CUDA kernels and compares overall system performance between CPU and GPU
cuda cuda-kernels cuda-support cuda-toolkit nvidia nvidia-cuda nvidia-gpu
Last synced: 30 Mar 2025
https://github.com/sephiroth7712/k-nearest-neigbours
Implementation of K-Nearest Neighbors algorithm using multiple parallel computing approaches: CUDA (GPU), Hadoop, Spark, MPI, OpenMP, and PThreads. Demonstrates scalable machine learning across different parallel computing paradigms from GPU to distributed frameworks.
cuda cuda-programming hadoop-mapreduce java mpi multiprocessing multithreading openmp pthreads scala spark
Last synced: 12 Apr 2026
https://github.com/TeamBipartite/bipartite-gemm
High-throughput data-parallel GEMM implementations in CUDA using CUDA cores and Tensor Cores
Last synced: 14 Jan 2026
https://github.com/drilonaliu/parallel-mandelbrot-set
GPU-accelerated Mandelbrot Set generation with CUDA and OpenGL interoperability.
cuda fractals gpu mandelbrot-fractal parallel-programming
Last synced: 12 Apr 2026
https://github.com/joe-mruz/hgvisualizer
An interactive simulation and visualization tool for evolving hypergraphs, inspired by the Wolfram Physics Project.
cpp cuda hypergraph physics simulator wolfram
Last synced: 02 May 2026
https://github.com/aurelienperez/gpu-heston-monte-carlo
GPU-accelerated Monte Carlo simulation for option pricing under the Heston model using CUDA.
Last synced: 01 Apr 2025
https://github.com/nikhilrout/thetensorcoreproject
Microarchitecture implementation of Nvidia's Tensor Cores
cuda floating-point gpgpu hybrid-precision-training tensorcore
Last synced: 01 Apr 2025
https://github.com/yutakseo/docker_ubuntu-cuda_environment
🐳 A ready-to-use Docker environment for deep learning development with Ubuntu 22.04 and CUDA 11.8.
container cuda docker environment ubuntu
Last synced: 12 Apr 2026
https://github.com/isquicha/cuda-parallel-studies
Learning CUDA programming here =D
cuda cuda-programming cuda-toolkit
Last synced: 03 Jul 2025
https://github.com/fikri-rouzan/cuda-c-program-part-1
CUDA C program from NVIDIA course.
Last synced: 12 Apr 2026
https://github.com/crazyguitar/libefaxx
aws benchmark cpp20-coroutine cuda efa gpu gpu-benchmarks hpc large-language-models llm rdma rdma-benchmarks
Last synced: 16 Jan 2026
https://github.com/alpinebuster/meshlib
Mesh processing library with extra `C/C#/JS/TS/PYTHON` bindings.
cuda dicom electron emscripten mesh mesh-modelling pybind11 stl stomatology threejs wasm
Last synced: 03 Jul 2025
https://github.com/pipecruz/cuda-flocking-sim
CPU and GPU (CUDA) implementations of naive/optimized flocking algorithms
Last synced: 07 May 2026