CUDA
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
- GitHub: https://github.com/topics/cuda
- Wikipedia: https://en.wikipedia.org/wiki/CUDA
- Created by: NVIDIA
- Released: June 23, 2007
- Related Topics: nvcc
- Last updated: 2026-06-30 00:07:24 UTC
https://github.com/debowin/gpu-parallel-recommender-system
GPGPU Parallel User-User Collaborative Filtering System in CUDA C
collaborative-filtering cuda gpu-programming movielens-dataset recommender-system
Last synced: 24 Apr 2026
https://github.com/cfries/javagpuexperiments
Repository used to demo OpenCL, JOCL, JCuda.
Last synced: 25 Apr 2026
https://github.com/kpetridis24/four-russians-algorithm
Boolean matrix multiplication accelerated by the four-Russians algorithm
c cuda gpu high-performance matrix-multiplication preprocess
Last synced: 29 May 2026
https://github.com/kagof/julia-image-processing
Image processing programs written in Julia
Last synced: 18 May 2026
https://github.com/juntyr/necsim-rust
Spatially explicit biodiversity simulations using a parallel library written in Rust
biodiversity cuda mpi necsim rust simulation
Last synced: 22 Mar 2025
https://github.com/stdogpkg/cukuramoto
A Python/CUDA package that numerically solves the Kuramoto model using Heun's method
complex-networks cuda kuramoto-model
Last synced: 28 Jan 2026
https://github.com/acrlakshman/gradient-augmented-levelset-cuda
Implementation of Gradient Augmented Levelset method for CPU and GPU
Last synced: 17 Feb 2026
https://github.com/markdtw/parallel-programming
Basic Pthread, OpenMP, CUDA examples
cuda openmp parallel-programming pthreads
Last synced: 20 Apr 2026
https://github.com/kilamper/matrix-multiplication
AC - Matrix multiplication using OpenMP, MPI and CUDA
Last synced: 16 May 2026
https://github.com/alpha74/cuda_basics
Nvidia NVCC CUDA programs for beginners.
c cpp cuda cuda-programs nvcc nvidia parallel-computing parallel-programming
Last synced: 08 May 2026
https://github.com/djenriquez/ewbf-cuda-miner
Run ewbf-miner for zcash
cuda docker mining nvidia nvidia-docker zcash zcl zclassic
Last synced: 17 May 2026
https://github.com/pothosware/pothosgpu
Pothos toolkit for ArrayFire API support
arrayfire cuda dataflow dataflow-programming gpu opencl pothos
Last synced: 19 Apr 2026
https://github.com/muhac/jupyter-pytorch-docker
JupyterLab for AI in Docker! Anaconda and PyTorch GPU supported.
conda-environment cuda docker jupyterlab pytorch
Last synced: 01 Oct 2025
https://github.com/andreasholt/cusmc
A CUDA-accelerated Statistical Model Checker for Stochastic Timed Automata
Last synced: 11 Feb 2026
https://github.com/alpinebuster/arkime-docker-compose
Deploy Arkime with GPU-accelerated Rust/Python parsers and custom plugins using Docker Compose.
arkime c cuda deep-neural-networks docker docker-compose llm machine-learning networking pcap pcapng python rust traffic-analysis
Last synced: 16 Apr 2026
https://github.com/tvanfossen/entropic
Local-first agentic inference engine in C/C++. Multi-tier model routing, grammar-constrained output, MCP tool servers. Embeddable via C ABI.
agentic-ai agentic-framework cpp cpp20 cuda edge-ai embedded-ai gbnf gguf grammar-constrained-decoding inference-engine llama-cpp llm local-llm mcp on-device-ai privacy-first tool-calling
Last synced: 30 May 2026
https://github.com/szymon423/tsp-cpu-vs-gpu
Simple brute force approach to solve travelling salesman problem with CPU and GPU
Last synced: 11 Mar 2025
https://github.com/kar-dim/watermarking-gpu
Code for my Diploma thesis at Information and Communication Systems Engineering (University of the Aegean, School of Engineering) with title "Efficient implementation of watermark and watermark detection algorithms for image and video using the graphics processing unit". Part 2 / GPU
arrayfire cpp cuda ffmpeg gpu image-processing opencl parallel-computing video-processing watermark-image watermarking
Last synced: 09 Apr 2025
https://github.com/podgorskiy/deeplearningserversetup
My notes on setting up a server for Deep-Learning
cuda deep-learning driver ethernet ipmi neural-network nfs notes nvidia nvidia-driver nvidia-gpu server sshfs ubuntu
Last synced: 22 Aug 2025
https://github.com/prithivsakthiur/vlm-parsing
VLM-Parsing is a Gradio-based web application for parsing documents and images into structured HTML and Markdown formats using advanced Vision Language Models (VLMs).
cuda gradio html huggingface-models huggingface-spaces huggingface-transformers logics markdown ocr-recognition pytorch qwen2-5-vl spaces vlm
Last synced: 05 Apr 2026
https://github.com/kishore-narendran/eecs221-highperformancecomputing
Assignments done during the graduate course EECS 221 - Introduction to HPC that I took in the Spring Quarter of 2016 at University of California, Irvine. Involves assignments that use OpenMP, MPI and CUDA.
Last synced: 17 May 2026
https://github.com/l30nardosv/reproduce-parcosi-moleculardocking
Reproducing paper: "Benchmarking the Performance of Irregular Computations in AutoDock-GPU Molecular Docking"
autodock-gpu cpu cuda gpu molecular-docking molecular-docking-scripts opencl paper reproducible-research
Last synced: 16 Feb 2026
https://github.com/tthebc01/cudaconda3
Lightweight container environment with Cuda, Miniconda3, and Jupyter Lab.
cuda docker gpu jupyterlab marimo-notebook miniconda3 reverse-proxy-application
Last synced: 11 Feb 2026
https://github.com/xiongsp/pytorch-docker
Pure PyTorch Docker images. Supports almost all combinations of PyTorch, Python, Ubuntu, CentOS, and CUDA versions.
centos cuda docker docker-image python3 pytorch ubuntu
Last synced: 17 Apr 2026
https://github.com/droduit/multiprocessor-architecture
Introduction to Multiprocessor Architecture @ EPFL
cuda multiprocessor multithreading openmp-parallelization
Last synced: 17 Apr 2026
https://github.com/amypad/numcu
Numerical CUDA-based Python library
array buffer c cpp cpython cpython-api cpython-extensions cuda cxx hacktoberfest numpy python vector
Last synced: 29 Jun 2025
https://github.com/galaxies99/inception-cuda
CUDA Implementation of Inception
Last synced: 12 Apr 2025
https://github.com/true-real-michael/python-plane-ransac
Parallel RANSAC for plane detection for multiple point clouds using Python and CUDA
cuda numba plane-detection python ransac
Last synced: 14 Mar 2025
https://github.com/lintenn/cudaaddvectors-explicit-vs-unified-memory
Performance comparison of two different forms of memory management in CUDA
c cuda explicit memory memory-management performance unified-memory
Last synced: 17 May 2026
https://github.com/mulx10/firefly
Enhancing Object Detection using Thermal Imaging for thin cross-section unidentifiable objects (e.g. cyclists, pedestrians).
autonomous-cars autonomous-navigation autonomous-vehicles c cuda object-detection thermal-camera yolov3
Last synced: 03 Sep 2025
https://github.com/programmer-rd-ai/detectx
A Pythonic approach to object detection using Detectron2, a clean, modular framework for training and deploying computer vision models. DetectX simplifies the complexity of object detection while maintaining high performance and extensibility.
coco-dataset computer-vision computer-vision-library cuda deep-learning detectron2 faster-rcnn gpu-accelerated machine-learning ml-framework object-detection object-recognition python3 pytorch retinanet
Last synced: 10 Jun 2025
https://github.com/tyler-hilbert/cuda-kmeans
K-Means in CUDA
cuda kmeans-clustering machine-learning nsight
Last synced: 30 Mar 2025
https://github.com/LKohlhepp/Ito-Monte-Carlo
MC-Simulation of the Ito-SDE (Krülls 1994)
astronomy astrophysics cuda gpu-acceleration monte-carlo physics-simulation simulation stochastic-differential-equations
Last synced: 10 Mar 2025
https://github.com/dzimiks/cuda-matrix-multiplication
CUDA Matrix Multiplication
cuda matrix matrix-multiplication python
Last synced: 16 Apr 2026
https://github.com/xlite-dev/HGEMM
⚡️Write HGEMM from scratch using Tensor Cores with WMMA, MMA PTX and CuTe API. 🎉🎉
Last synced: 30 Jul 2025
https://github.com/qin-yu/julia-svm-gpu-cuda
2019 [Julia] GPU CUDAnative SVM: a stochastic decomposition implementation of support-vector machine training
cpp cuda cuda-programming gpu gpu-computing gpu-programming julia julia-language julia-package machine-learning machine-learning-algorithms machine-learning-library online-learning supervised-learning svm svm-classifier svm-learning svm-library svm-model svm-training
Last synced: 12 Apr 2026
https://github.com/B1-663R/docker-mining
Dockerfiles to build docker images to start mining with an NVIDIA Docker architecture
cryptocurrency cuda docker-image docker-nvidia mining
Last synced: 28 Mar 2025
https://github.com/matthewfeickert/cuda-tf-torch
An Ubuntu 18.04 NVIDIA Docker image with CUDA 10.1 CuDNN 7 with TensorFlow and PyTorch
cuda cuda-101 cudnn cudnn-v7 docker docker-image gpu nvidia-docker nvidia-gpu pytorch tensorflow torch
Last synced: 07 Jan 2026
https://github.com/capelliexp/sc2-im-pf-pathfinding-thesis
Master of Science thesis project. Using CUDA to utilize a system's GPU to create pathfinding data (IM+PF), usable by multiple agents in the same environment.
ai cplusplus cuda gpgpu pathfinding starcraft2
Last synced: 15 May 2026
https://github.com/bdwhst/fluora
A CUDA PBR path tracer
cpp cuda pathtracing pbr rendering
Last synced: 13 Feb 2026
https://github.com/peri044/cuda
GPU implementations of algorithms
cuda gauss-jordan parallel-programming
Last synced: 14 Jul 2025
https://github.com/andreabak/whispersubs
Generate subtitles for your video or audio files using the power of AI
ai cuda deep-learning gpu-acceleration machine-learning srt subtitles transcribe transcription translate whisper
Last synced: 15 Feb 2026
https://github.com/projectcontinuum/continuum-feature-ai
AI and ML features for continuum
ai continuum continuum-feature cuda llm ml mlops pytourch unsloth
Last synced: 04 Apr 2026
https://github.com/terrylindev/image-to-ASCII
🖼️ A command-line tool for converting images to ASCII art
ascii ascii-art cli command-line cpp cuda docker image-processing image-to-ascii mpi opencv terminal
Last synced: 12 Jul 2025
https://github.com/szaghi/adam
Multi-physics AMR SDK and apps for High Performance Computing — from laptop to exascale device-accelerated superpc
amr cfd cuda fluid-dynamics fortran gas-dynamics hpc hydro-dynamics mpi openacc openmp plasma-dynamics
Last synced: 04 Apr 2026
https://github.com/nellogan/distributed_compy
Distributed_compy is a distributed computing library that offers multi-threading, heterogeneous (CPU + multi-GPU), and multi-node support
cluster cuda heterogeneous-parallel-programming multi-threading multigpu openmp openmpi
Last synced: 16 Aug 2025
https://github.com/navdeep-g/dimreduce4gpu
Dimensionality reduction ("dimreduce") on GPUs ("4gpu")
cplusplus cuda dimensionality-reduction gpu linear-algebra pca python svd unsupervised-learning
Last synced: 14 Apr 2025
https://github.com/tank3-tk3/pi-calculation-cpu-gpu
PI calculation with CPU and GPU
c cpp cuda parallel-computing pi
Last synced: 13 Apr 2026
https://github.com/copperfr/blendervxkex
Windows 7 CUDA & OptiX support for Blender 4.x
blender cuda cycles-renderer optix vxkex windows-7
Last synced: 20 Jan 2026
https://github.com/boltzmannentropy/vllm-5090
vLLM-5090: Docker Container for RTX 5090 on WSL2/Windows
Last synced: 08 Oct 2025
https://github.com/dujonwalker/nixos-config-x86_64-cuda
This repository contains my NixOS configuration optimized for 64-bit x86 systems with NVIDIA CUDA support, featuring a Plasma 6 desktop environment and a variety of essential applications for development, multimedia, and productivity. It serves as a backup for easy restoration and setup on new installations.
cuda flatpak nix nixos nixos-configuration ollama
Last synced: 17 Jan 2026
https://github.com/mu7annad0/100gpu
100 Days of CUDA: Optimizing My Life, One Kernel at a Time. 🔄🔥
Last synced: 08 Mar 2026
https://github.com/matthias-fauconneau/combustion
Reaction rates and transport properties
ast cantera chemistry code-generation combustion compute cranelift cuda cvode interpreter ir rates reaction spirv transport vulkan
Last synced: 04 Apr 2026
https://github.com/betarixm/cuecc
POSTECH: Heterogeneous Parallel Computing (Fall 2023)
cryptography ctypes cuda ecc postech secp256k1
Last synced: 12 May 2025
https://github.com/ginkgo-project/cudaarchitectureselector
A CMake module simplifying the specification of CUDA architectures
Last synced: 05 Nov 2025
https://github.com/kim-hwiwon/T-espresso
A CUDA Library for Low-overhead Host-to-Device Transmission of Patterned Profile Data
Last synced: 10 Apr 2025
https://github.com/rogerallen/jmandelbrotr
Java CUDA Mandelbrot explorer
cuda cuda-opengl java jcuda joml lwjgl3 mandelbrot-viewer opengl
Last synced: 18 Apr 2026
https://github.com/liberxue/parallel_computing
CUDA Algorithm && Hacker's Delight
algorithms cuda cuda-kernels cuda-programming hacker-s-delight nvidia
Last synced: 24 Feb 2026
https://github.com/miniex/maidenx
Rust-based CUDA library designed for learning purposes and building my AI engines named Maiden Engine
Last synced: 20 Mar 2025
https://github.com/stanczakdominik/cuda_poisson
A 2D poisson solver via CUDA
Last synced: 29 Jun 2025
https://github.com/saiccoumar/cuda-programming-exercises
Brief collection of GPU exercises (my reimplementation). Comes with relevant resources.
cuda cuda-programming nvcc nvidia
Last synced: 25 May 2026
https://github.com/dansolombrino/gphungarian
A GPU-accelerated implementation of the Hungarian Algorithm, written in CUDA
Last synced: 31 Aug 2025
https://github.com/SanaeProject/Matrix-for-Cpp
This repository has types that handle matrices.
cpp14 cpp14-library cuda matrix-library
Last synced: 15 May 2025
https://github.com/thunder-compute/thunder-compute-documentation
Documentation for Thunder Compute, a cloud platform creating technology to virtualize GPUs over TCP
ai artificial-intelligence cloud cloud-computing cuda gpu llm machine-learning nvidia pytorch tensorflow thunder-compute virtualization
Last synced: 15 Oct 2025
https://github.com/bl33h/productoftwovectors
This code uses CUDA for parallel vector multiplication on a GPU, demonstrating the GPU's acceleration capabilities.
cuda gpu kernel paralelism parallel-programming product vector
Last synced: 16 May 2026
https://github.com/thisalmandula/gpu_accelerated_lpt_cfd_code
This repository contains a GPU-accelerated version of the particle tracking model developed by Merel Kooi for biofouled microplastic particles (available at: https://pubs.acs.org/doi/10.1021/acs.est.6b04702), written in CUDA Fortran and CUDA Python. This repository is intended as a learning tool for GPU programming.
biofouling computational-fluid-dynamics cuda fortran lagrangian-particle-tracking microplastics python
Last synced: 02 May 2026
https://github.com/vishwamartur/btc_recovery
High-performance Bitcoin wallet password recovery system with GPU acceleration and integrated graphics support. Recover Bitcoin Core wallet.dat files without blockchain download using advanced algorithms and blockchain APIs.
bitcoin bitcoin-core blockchain blockchain-api cpp cryptocurrency cuda electrum gpu-acceleration integrated-graphics multithreading opencl password-recovery private-keys recovery-tools wallet-dat wallet-recovery
Last synced: 14 Apr 2026
https://github.com/satyajitghana/gpu-programming
Contains the contents of GPU Architecture and Programming course done on NPTEL
c cpp cuda cuda-programming gpu-programming nptel nvidia
Last synced: 09 Mar 2026
https://github.com/enp1s0/curand_fp16
FP16 pseudo random number generator on GPU
cuda gpu half-precision random-number-generators
Last synced: 20 Aug 2025
https://github.com/a-nau/python-cuda-envs
Script to automatically map a specific CUDA version to a Conda Python environment.
anaconda anaconda-environment cuda installation installation-script python python-environment python3
Last synced: 18 Apr 2026
https://github.com/renatomaynard/a-multiple-population-coarse-grained-genetic-algorithm-to-solve-the-quadratic-assignment-problem-
A Multiple-population coarse-grained Genetic Algorithm to solve the Quadratic Assignment Problem
c cuda genetic-algorithm quadratic-assignment-problem
Last synced: 09 May 2026
https://github.com/emmanuelmess/firstcollisiontimesteprarefiedgassimulator
This simulator computes all possible intersections for a very small timestep for a particle model
Last synced: 17 Apr 2026
https://github.com/daelsepara/hipmandelbrot
GPU Implementation of Mandelbrot Fractal Generator with Benchmarking
amd cuda fractal gpu gpu-compute gpu-computing hip mandelbrot parallel-computing rocm sdk
Last synced: 20 Feb 2026
https://github.com/kichappa/spy-sim
Simulate a spying strategy on a topography
combat-modeling cuda differential-equations julia modeling-and-simulation topography-simulation
Last synced: 09 Mar 2026
https://github.com/ruturaj4/cuda_nvidia_tutorial
cuda projects
cuda cuda-vector-addition nvidia nvidia-cuda parallel
Last synced: 26 Oct 2025
https://github.com/dvhh/masscorrelation
An exercise in writing an efficient correlation calculator
calculations correlation-calculation cuda matrix multi-threading openmp
Last synced: 15 May 2026
https://github.com/blazekill/hello-cuda
Cpp + Vcpkg + CUDA + VsCode starter project.
Last synced: 18 May 2026
https://github.com/rnabla/cuda-des
Bruteforcing DES using CUDA
bruteforce cuda data des encryption gpu parallel standard
Last synced: 27 Oct 2025
https://github.com/xihuai18/image-processing-in-cuda
Implementation of Image Processing Method
Last synced: 04 Oct 2025
https://github.com/andrewboessen/bitonic-merge-sort
Bitonic Merge Sort algorithm optimized for GPU execution
bitonic-merge-sort cuda sorting-network
Last synced: 16 May 2026
https://github.com/andih/cuda-fortran-stream
Variant of STREAM Benchmark in CUDA Fortran
cuda cuda-fortran gpu stream-benchmarks variants
Last synced: 02 Mar 2025
https://github.com/liuyuweitarek/pytorch-docker-builder
Automate PyTorch Docker image builds with compatible Python, CUDA, and Poetry versions, including CI/CD for testing.
cicd containerd cuda docker docker-image poetry-python python python3 pytorch pytorch-docker
Last synced: 06 Feb 2026
https://github.com/lhldev/rust-neural-network
neural network implementation in rust
cuda feedforward-neural-network
Last synced: 16 May 2026
https://github.com/bjornmelin/ml-vision-lab
👁️ Production-grade computer vision implementations. Real-world applications in image processing, object detection, and video analytics with GPU acceleration. 📸
computer-vision cuda deep-learning image-processing object-detection opencv pytorch video-analytics
Last synced: 04 Apr 2026
https://github.com/michaelfranzl/image_debian-gpgpu
Dockerfile for a Debian base image with AMD and Nvidia GPGPU support
amd container container-image cuda debian docker gpgpu nvidia opencl
Last synced: 10 May 2026
https://github.com/mre/talks
...mostly Computer Science related.
computer-science cuda talks tech-talks
Last synced: 28 Apr 2026
https://github.com/dafadey/GPGPU_OpenCL_vs_CUDA
This is a repository with sample codes for testing memory bandwidth, arithmetic latency hiding and shared/local memory performance on AMD and nVidia devices
cuda gpgpu gpgpu-computing opencl
Last synced: 16 May 2025
https://github.com/sarah627/horus_eye_fcih_graduation_project
An AI-powered tourism website using YOLOv7 for real-time landmark detection in images. Built with Flask, PyTorch, and Roboflow for seamless tourist interaction.
computer-vision cuda flask jupyter-notebook kaggle matplotlib object-detection opencv python pytorch roboflow
Last synced: 14 Apr 2026
https://github.com/alekseyscorpi/vacancies_server
This is a server for vacancies generation using LLM (Saiga3)
code cuda cuda-toolkit docker dockerfile flask llama3 llamacpp llm ngrok pydantic saiga
Last synced: 06 Feb 2026
https://github.com/malolm/jupyter-ml-with-gpu-support
Jupyter with GPU acceleration for Windows 10/11
cuda cudnn jupternotebook jupyter jupyterlab nvidia-gpu windows-10 windows-11
Last synced: 09 Apr 2026