CUDA
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
- GitHub: https://github.com/topics/cuda
- Wikipedia: https://en.wikipedia.org/wiki/CUDA
- Created by: Nvidia
- Released: June 23, 2007
- Related Topics: nvcc
- Last updated: 2026-07-01 00:07:09 UTC
https://github.com/maliknaik16/parallel-computing
CUDA programming in C++ for high-performance computing on Nvidia GPUs, optimized for tasks like machine learning or image processing
cores cpp cuda gpu makefile matrix nvcc optimization
Last synced: 10 Jun 2025
https://github.com/xiongsp/pytorch-docker
Pure PyTorch Docker images. Supports almost all combinations of PyTorch, Python, Ubuntu, CentOS, and CUDA versions.
centos cuda docker docker-image python3 pytorch ubuntu
Last synced: 17 Apr 2026
https://github.com/stdogpkg/cukuramoto
A Python/CUDA package that numerically solves the Kuramoto model using Heun's method
complex-networks cuda kuramoto-model
Last synced: 28 Jan 2026
https://github.com/babak2/optimizedsum
Optimized Parallel Sum program demonstrating CPU vs GPU performance
cuda cuda-programming gpu-acceleration gpu-computing gpu-parallelism visual-studio
Last synced: 27 Mar 2025
https://github.com/droduit/multiprocessor-architecture
Introduction to Multiprocessor Architecture @ EPFL
cuda multiprocessor multithreading openmp-parallelization
Last synced: 17 Apr 2026
https://github.com/l1cacheDell/CUDA_Code
Code for learning CUDA. Implementations of multiple kernels.
Last synced: 10 Mar 2025
https://github.com/cfries/javagpuexperiments
Repository used to demo OpenCL, JOCL, JCuda.
Last synced: 25 Apr 2026
https://github.com/alexjmercer/fractal-art
Generating Fractals in C++ using SFML. For the ultimate visual stimulation and in-depth code!
cmake cmakelists cpp20 cuda cuda-programming fractal-rendering graphics mandelbrot multithreading sfml2
Last synced: 05 Mar 2026
https://github.com/mulx10/firefly
Enhancing object detection using thermal imaging for hard-to-identify objects with thin cross-sections (e.g., cyclists, pedestrians).
autonomous-cars autonomous-navigation autonomous-vehicles c cuda object-detection thermal-camera yolov3
Last synced: 03 Sep 2025
https://github.com/programmer-rd-ai/detectx
A Pythonic approach to object detection using Detectron2, a clean, modular framework for training and deploying computer vision models. DetectX simplifies the complexity of object detection while maintaining high performance and extensibility.
coco-dataset computer-vision computer-vision-library cuda deep-learning detectron2 faster-rcnn gpu-accelerated machine-learning ml-framework object-detection object-recognition python3 pytorch retinanet
Last synced: 10 Jun 2025
https://github.com/tyler-hilbert/cuda-kmeans
K-Means in CUDA
cuda kmeans-clustering machine-learning nsight
Last synced: 30 Mar 2025
https://github.com/headless-start/data-augmentation-impact
This repository covers the effect of data augmentation of the training set during model training.
augmented-images cuda data gpu keras matplotlib mnist opencv-python python3 tensorflow training-data
Last synced: 05 Apr 2026
https://github.com/digimortl/libguess
Patches that give Bitcoin Core the ability to mine with CUDA
bitcoin c-plus-plus cryptocurrency cuda
Last synced: 16 Apr 2026
https://github.com/juntyr/necsim-rust
Spatially explicit biodiversity simulations using a parallel library written in Rust
biodiversity cuda mpi necsim rust simulation
Last synced: 22 Mar 2025
https://github.com/LKohlhepp/Ito-Monte-Carlo
MC-Simulation of the Ito-SDE (Krülls 1994)
astronomy astrophysics cuda gpu-acceleration monte-carlo physics-simulation simulation stochastic-differential-equations
Last synced: 10 Mar 2025
https://github.com/zeloe/juce_cuda_convolution
GPU acceleration for efficient, high-quality audio processing.
audio audio-processing convolution cuda dsp juce
Last synced: 03 Mar 2026
https://github.com/qin-yu/julia-svm-gpu-cuda
2019 [Julia] GPU CUDAnative SVM: a stochastic decomposition implementation of support-vector machine training
cpp cuda cuda-programming gpu gpu-computing gpu-programming julia julia-language julia-package machine-learning machine-learning-algorithms machine-learning-library online-learning supervised-learning svm svm-classifier svm-learning svm-library svm-model svm-training
Last synced: 12 Apr 2026
https://github.com/tank3-tk3/pi-calculation-cpu-gpu
PI calculation with CPU and GPU
c cpp cuda parallel-computing pi
Last synced: 13 Apr 2026
https://github.com/alpinebuster/arkime-docker-compose
Deploy Arkime with GPU-accelerated Rust/Python parsers and custom plugins using Docker Compose.
arkime c cuda deep-neural-networks docker docker-compose llm machine-learning networking pcap pcapng python rust traffic-analysis
Last synced: 16 Apr 2026
https://github.com/rogerallen/jmandelbrotr
Java CUDA Mandelbrot explorer
cuda cuda-opengl java jcuda joml lwjgl3 mandelbrot-viewer opengl
Last synced: 18 Apr 2026
https://github.com/andreabak/whispersubs
Generate subtitles for your video or audio files using the power of AI
ai cuda deep-learning gpu-acceleration machine-learning srt subtitles transcribe transcription translate whisper
Last synced: 15 Feb 2026
https://github.com/dzimiks/cuda-matrix-multiplication
CUDA Matrix Multiplication
cuda matrix matrix-multiplication python
Last synced: 16 Apr 2026
https://github.com/orlandopalmeira/trabalho-cp-2023-2024
Repository for the practical assignment of the Parallel Computing (Computação Paralela, CP) course unit - Master's in Informatics Engineering (MEI/MIEI) - Universidade do Minho (UMinho)
computacao-paralela cp cuda cuda-programming mei miei nvidia nvidia-cuda openmp optimization optimization-problem parallelism performance uminho uminho-mei uminho-miei
Last synced: 18 May 2026
https://github.com/mazharuddin-mohammed/semidgfem
High-performance TCAD Simulator Using Discontinuous Galerkin FEM
cuda discontinuous-galerkin-method tcad tcad-device-simulator
Last synced: 15 Jun 2025
https://github.com/tthebc01/cudaconda3
Lightweight container environment with Cuda, Miniconda3, and Jupyter Lab.
cuda docker gpu jupyterlab marimo-notebook miniconda3 reverse-proxy-application
Last synced: 11 Feb 2026
https://github.com/andreasholt/cusmc
A CUDA-accelerated Statistical Model Checker for Stochastic Timed Automata
Last synced: 11 Feb 2026
https://github.com/betarixm/cuecc
POSTECH: Heterogeneous Parallel Computing (Fall 2023)
cryptography ctypes cuda ecc postech secp256k1
Last synced: 12 May 2025
https://github.com/hanzhi713/bitonic-sort
In-place GPU sort with bitonic sort
bitonic-sort cuda gpu in-place sorting
Last synced: 09 Feb 2026
https://github.com/fattorib/thunderkittens-simple-gemm
Simple Tensorcore GEMM in ThunderKittens
Last synced: 09 Feb 2026
https://github.com/lukasboettcher/msc-code
This is the repo for my master's thesis on a GPU-accelerated Andersen analysis.
andersen-analysis clang cuda llvm static-analysis
Last synced: 16 Jan 2026
https://github.com/mr-technologies/imagefiltercpp
Example of custom image filter for MRTech IFF C++ SDK
camera cpp cuda demosaicing dng genicam gpu h264 h265 image-processing jetson json low-latency machine-vision mipi rest-api rtsp sdk tiff vulkan
Last synced: 26 Feb 2026
https://github.com/trilliwon/cuda-examples
CUDA examples
cuda gpu-computing nvidia-cuda parallel parallel-computing parallel-programming
Last synced: 25 Mar 2025
https://github.com/dpbm/qml-course
Mini-course on quantum machine learning
cuda cuda-q cuquantum docker ml python qml quantum quantum-computing tensorflow
Last synced: 31 Jan 2026
https://github.com/gjbex/gpu-programming
Material for a training on portable GPU programming
cuda gpu kokkos openmp openmp-off stl thrust
Last synced: 08 Feb 2026
https://github.com/elftausend/sliced
Array operations with automatic differentiation on CPU and GPU
autograd automatic-differentiation cuda custos matrix opencl
Last synced: 31 Jan 2026
https://github.com/seungjaelim/cuda.tutorial
References content from the OLCF CUDA Training Series. (https://github.com/olcf/cuda-training-series)
cuda gpu-programming nsight-compute nsight-systems
Last synced: 07 Feb 2026
https://github.com/copperfr/blendervxkex
Windows 7 CUDA & OptiX support for Blender 4.x
blender cuda cycles-renderer optix vxkex windows-7
Last synced: 20 Jan 2026
https://github.com/infotrend-inc/ctpo-demo_projects
Jupyter Notebook examples using CTPO as their source container.
cuda opencv pytroch tensorflow2
Last synced: 14 Apr 2026
https://github.com/trahay/mpi-wattmeter
MPI-Wattmeter measures the power consumption of MPI programs
carbon-emissions cuda energy-consumption energy-monitor gpu hpc mpi
Last synced: 17 May 2026
https://github.com/tawssie/zmpy3d_pt
Python implementation of 3D Zernike moments with PyTorch
3d-zernike cuda gpu protein-structure python pytorch structural-bioinformatics superposition zernike-moments
Last synced: 24 Oct 2025
https://github.com/openspeedshop/cbtf-argonavis-gui
Baseline for the next-generation Open|SpeedShop Graphical User Interface (GUI). The primary focus of this GUI will be the processing and display of CUDA collector performance data. However, there will be refactoring phases to adapt the GUI to support the processing and display of any collector performance data.
cuda performance profiler profiling
Last synced: 18 Apr 2026
https://github.com/hadv/vaneth
GPU-accelerated CREATE2 vanity address miner for Ethereum
create2-contract-deployment cuda ethereum gpu gpu-acceleration gpu-programming open-cl vanity-address
Last synced: 21 Jan 2026
https://github.com/kpetridis24/four-russians-algorithm
Boolean matrix multiplication accelerated by the four-Russians algorithm
c cuda gpu high-performance matrix-multiplication preprocess
Last synced: 29 May 2026
https://github.com/bdwhst/fluora
A CUDA PBR path tracer
cpp cuda pathtracing pbr rendering
Last synced: 13 Feb 2026
https://github.com/matthewfeickert/cuda-tf-torch
An Ubuntu 18.04 NVIDIA Docker image with CUDA 10.1 CuDNN 7 with TensorFlow and PyTorch
cuda cuda-101 cudnn cudnn-v7 docker docker-image gpu nvidia-docker nvidia-gpu pytorch tensorflow torch
Last synced: 07 Jan 2026
https://github.com/toxy4ny/artaxerxes
Artaxerxes - Adaptive High-Performance Stress Tester v.1.0. A rebuild of the old Xerxes DDoS tool. Supports GPU+io_uring, DPDK, eBPF/XDP with intelligent fallbacks. Educational tool for advanced cybersecurity labs.
cuda cuda-programming cybersecurity cybersecurity-education cybersecurity-tools dpdk ebpf educational high-performance network-security network-security-tool penetration-testing penetration-testing-framework penetration-testing-tools security-tools stress-testing
Last synced: 08 Oct 2025
https://github.com/alpha74/cuda_basics
Nvidia NVCC CUDA programs for beginners.
c cpp cuda cuda-programs nvcc nvidia parallel-computing parallel-programming
Last synced: 08 May 2026
https://github.com/tvanfossen/entropic
Local-first agentic inference engine in C/C++. Multi-tier model routing, grammar-constrained output, MCP tool servers. Embeddable via C ABI.
agentic-ai agentic-framework cpp cpp20 cuda edge-ai embedded-ai gbnf gguf grammar-constrained-decoding inference-engine llama-cpp llm local-llm mcp on-device-ai privacy-first tool-calling
Last synced: 30 May 2026
https://github.com/boltzmannentropy/vllm-5090
vLLM-5090: Docker Container for RTX 5090 on WSL2/Windows
Last synced: 08 Oct 2025
https://github.com/dujonwalker/nixos-config-x86_64-cuda
This repository contains my NixOS configuration optimized for 64-bit x86 systems with NVIDIA CUDA support, featuring a Plasma 6 desktop environment and a variety of essential applications for development, multimedia, and productivity. It serves as a backup for easy restoration and setup on new installations.
cuda flatpak nix nixos nixos-configuration ollama
Last synced: 17 Jan 2026
https://github.com/szymon423/tsp-cpu-vs-gpu
Simple brute force approach to solve travelling salesman problem with CPU and GPU
Last synced: 11 Mar 2025
https://github.com/kar-dim/watermarking-gpu
Code for my Diploma thesis at Information and Communication Systems Engineering (University of the Aegean, School of Engineering) with title "Efficient implementation of watermark and watermark detection algorithms for image and video using the graphics processing unit". Part 2 / GPU
arrayfire cpp cuda ffmpeg gpu image-processing opencl parallel-computing video-processing watermark-image watermarking
Last synced: 09 Apr 2025
https://github.com/yosh-matsuda/gpu-ptr
Cross-platform GPU smart pointer with C++20 range support
cpp cpp20 cuda gpu header-only hip
Last synced: 17 Jan 2026
https://github.com/nekon69/fastnoiselitecuda
A wrapper around C++ FastNoiseLite library for CUDA
cellular-noise computer-graphics cpp cuda fastnoiselite gamedev generative-art gpgpu gpu header-only noise opensimplex2-noise pcg perlin-noise procedural-generation simplex-noise terrain-generation texture-generation worley-noise
Last synced: 02 Oct 2025
https://github.com/brocbyte/realtime-deformations
Snow simulation (Material Point Method)
cuda glm material-point-method opengl
Last synced: 10 Aug 2025
https://github.com/kagof/julia-image-processing
Image processing programs written in Julia
Last synced: 18 May 2026
https://github.com/muhac/jupyter-pytorch-docker
JupyterLab for AI in Docker! Anaconda and PyTorch GPU supported.
conda-environment cuda docker jupyterlab pytorch
Last synced: 01 Oct 2025
https://github.com/eshibusawa/cupy-cuda
Learn CUDA programming essentials with CuPy, from basic kernels to advanced memory patterns
cooperative-thread-array cub cuda cupy gpu parallel-computing python
Last synced: 15 Jun 2025
https://github.com/ehsanmok/cs-521
UBC CS 521: Parallel Computing and Architectures
cuda erlang parallel-algorithm parallel-computing
Last synced: 16 May 2026
https://github.com/kchristin22/ising_model
Implementation of a cellular automaton on GPU using different features of CUDA
cellular-automaton cuda gpu-programming hpc ising-model parallel-computing
Last synced: 15 Mar 2025
https://github.com/emmanuelmess/firstcollisiontimesteprarefiedgassimulator
This simulator computes all possible intersections for a very small timestep for a particle model
Last synced: 17 Apr 2026
https://github.com/emilienmendes/gpgpu
Parallelization and optimization of point recognition in an image
cuda gpgpu parallel-programming
Last synced: 28 Oct 2025
https://github.com/yooodleee/hello-cuda
👽Nice to meet you, CUDA!👽
c cc cuda gpgpu multiprocessing
Last synced: 09 Apr 2026
https://github.com/amirbroker/cudadtw
Use CUDA with numba for Dynamic Time Warping
cuda dtw dynamic-time-warping gpu numba
Last synced: 16 Apr 2026
https://github.com/umer-farooq-cs/canny-edge-detector
High-performance Canny edge detector with CPU and CUDA implementations. Loads PGM images, performs Gaussian smoothing, gradients, non-max suppression, and hysteresis. Benchmarks both paths, outputs edge maps, and reports speedup. Simple Makefile, sample images included.
c canny-edge-detection computer-vision cpp cuda gpu high-performance-computing image-processing nvcc pgm
Last synced: 18 Apr 2026
https://github.com/mre/talks
...mostly Computer Science related.
computer-science cuda talks tech-talks
Last synced: 28 Apr 2026
https://github.com/piyush26c/cuda-programming
c cuda ipynb-jupyter-notebook mathematics sppu-computer-engineering
Last synced: 03 Mar 2026
https://github.com/sd7campeon/yelp-sentiment-analysis-with-python-bs4-and-llm
A scalable pipeline for automated extraction, preprocessing, and sentiment analysis of Yelp reviews. Uses advanced HTTP requests, HTML parsing, and text normalization (tokenization, stopword removal, lemmatization) to enable precise polarity and subjectivity analysis for consumer insights and business analytics.
beautifulsoup beautifulsoup4 business-analytics cuda data-analysis nlp-machine-learning nltk opinion-mining pandas python python3 requests-library-python sentiment-analysis text-preprocessing textblob torch web-scraping yelp-reviews
Last synced: 06 May 2026
https://github.com/hyunjinno/multicore_computing
A repository of multicore programming in Java and C.
c cpp cuda java multithreading openmp thread thrust
Last synced: 18 Apr 2026
https://github.com/wallneradam/docker-ccminer
CCMiner (tpruvot version) Docker Builder
ccminer cuda docker gpu litecoin miner monero nvidia nvidia-docker
Last synced: 18 Apr 2026
https://github.com/jtompuri/weighted-voronoi-stippling
High-performance weighted Voronoi stippling implementation. Exports PNG and TSP files. Visualizes TSP tours as continuous line drawings.
computer-graphics cuda gpu-acceleration lloyd-relaxation numba python stippling traveling-salesman tsp voronoi
Last synced: 18 May 2026
https://github.com/microo8/micronn
Simple neural network library with backpropagation using CUDA
Last synced: 19 May 2026
https://github.com/steleman/openai-triton
Fork of OpenAI's Triton compiler v3.4.0 using LLVM 21.1.0 / 21.1.1 on Fedora 41+
cuda fedora linux llvm mlir mlir-dialect openai rocm triton
Last synced: 08 Apr 2026
https://github.com/senli1073/docker-gpu-monitor
A lightweight GPU monitor designed for real-time web-based viewing of GPU server status.
container cuda docker flask gpu gpu-monitoring linux memory-usage nvidia-smi web
Last synced: 05 Apr 2026
https://github.com/mayukhdeb/patrick
Tiny neural net library written from scratch with CuPy (under construction)
cuda deep-learning gpu-computing machine-learning neural-network regression
Last synced: 08 May 2026
https://github.com/greg-tarr/fastsimplex
CUDA/MPS accelerated 2D & 3D simplex noise generation.
cuda mps noise-generator python simplex-noise
Last synced: 20 Apr 2026
https://github.com/malolm/jupyter-ml-with-gpu-support
Jupyter with GPU acceleration for Windows 10/11
cuda cudnn jupternotebook jupyter jupyterlab nvidia-gpu windows-10 windows-11
Last synced: 09 Apr 2026
https://github.com/shivendrra/axgrad
Lightweight tensor library that contains its own auto-diff engine, like PyTorch
autograd cuda pytorch scratch-implementation tinygrad
Last synced: 08 May 2026
https://github.com/rajarsheya/real-time-audio-feature-extraction-with-cuda-for-speech-recognition
This project accelerates MFCC extraction using CUDA for real-time speech recognition. Offloading the process to the GPU reduces latency and speeds up processing, enabling fast, local speech-to-text transcription for applications like virtual assistants, without cloud reliance.
audio-processing cpp cuda fourier-transform python
Last synced: 10 May 2026
https://github.com/5had3z/torch-discounted-cumsum-nd
PyTorch Discounted Cumsum with Autograd (CPU + CUDA)
Last synced: 18 Apr 2026
https://github.com/xihuai18/image-processing-in-cuda
Implementation of Image Processing Method
Last synced: 04 Oct 2025
https://github.com/sohhamseal/scalable-systems-programs
A little less effort to learn parallel programming...
Last synced: 18 Apr 2026
https://github.com/kichappa/spy-sim
Simulate a spying strategy on a topography
combat-modeling cuda differential-equations julia modeling-and-simulation topography-simulation
Last synced: 09 Mar 2026
https://github.com/rkv0id/automata-vtk
Multi-dimensional Cellular Automata visualization using Python's VTK bindings on top of CUDA-parallel grid updates.
cellular-automata cuda game-of-life python vtk
Last synced: 19 Apr 2026
https://github.com/bl33h/pythagoreantheorem
A program that calculates the Pythagorean theorem for a large number of elements using GPU parallel processing.
arrays cuda kernel parallel-programming pythagoras pythagorean-theorem
Last synced: 19 May 2026
https://github.com/m-torhan/cuda-stl-renderer
CUDA C++ implementation of STL file renderer using ray tracing method
Last synced: 25 Feb 2026
https://github.com/makischristou/mandelbrot
Mandelbrot set visualizer using CUDA.
cpp cuda gpu mandelbrot nvidia renderer rust
Last synced: 09 Apr 2026
https://github.com/franciscoda/psvm
R package and C++ library that allows training SVM models in a GPU using CUDA and predicting out-of-sample data. A support vector machine (SVM) is a type of machine learning model that is trained using supervised data to classify samples.
cpp cpp17 cuda machine-learning r svm-classifier svm-training
Last synced: 18 Apr 2026
https://github.com/bokutotu/cudnn_graph_api_example
cuDNN Graph API example
Last synced: 04 May 2026
https://github.com/hatamiarash7/cuda-python
GPU programming using CUDA & Python
cuda gpu gpu-computing gpu-programming python
Last synced: 29 Apr 2026
https://github.com/rajarsheya/real-time-traffic-analysis-with-cuda-object-detection
Implemented CUDA-accelerated object detection (YOLO) to analyze a sample image dataset. Performed vehicle counting and simulated speed estimation to demonstrate real-time traffic analysis capabilities.
Last synced: 12 Apr 2026
https://github.com/bl33h/productoftwovectors
This code uses CUDA for parallel vector multiplication on a GPU, demonstrating the GPU's acceleration capabilities.
cuda gpu kernel paralelism parallel-programming product vector
Last synced: 16 May 2026
https://github.com/jorgedavyd/nsight.nvim
A developer oriented Neovim framework for CUDA performance profiling and analysis.
cuda cuda-kernels cuda-profiler cuda-programming cuda-support cuda-toolkit deep-learning machine-learning neovim neovim-plugin performance-engineering
Last synced: 13 Apr 2026
https://github.com/subatomicplanets/simplebitcoinminer
A simple Bitcoin C++ and CUDA solo miner
bitcoin cpp cryptocurrency cuda miner
Last synced: 19 Apr 2026
https://github.com/manishklach/thermal-observatory
A generic thermal observability framework for CPU, GPU, board, and platform telemetry across vendor APIs, kernel interfaces, and runtime correlation layers.
amd arm64 cuda linux nvidia nvml observability rocm telemetry thermal-framework thermal-monitoring x86-64
Last synced: 09 Jun 2026