CUDA
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general-purpose computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
- GitHub: https://github.com/topics/cuda
- Wikipedia: https://en.wikipedia.org/wiki/CUDA
- Created by: NVIDIA
- Released: June 23, 2007
- Related Topics: nvcc
- Last updated: 2026-06-23 00:07:15 UTC
https://github.com/deepankaracharyya/6th_sem_assignments
c cuda data-mining postgresql-database python
Last synced: 02 May 2026
https://github.com/xavierjiezou/gpu-compute-capability
An application for querying the compute capability of each GPU released by NVIDIA.
Last synced: 28 Apr 2026
https://github.com/gunrock/template
Template repository for essentials applications to get you started asap!
cpp cuda essentials gpu graph-algorithms graph-analytics gunrock
Last synced: 15 May 2026
https://github.com/blazekill/hello-cuda
C++ + vcpkg + CUDA + VS Code starter project.
Last synced: 18 May 2026
https://github.com/mhaseeb123/gcb
GCB includes a suite of benchmarks and basic tests for CUDA-aware MPI and C++ compilers.
cpp cpp23 cuda mpi partitioned-communication st-mpi
Last synced: 17 May 2026
https://github.com/memergamer/cuda-fluid-simulation-with-interactive-visualization
A real-time fluid dynamics simulation implemented in Python using CUDA for GPU acceleration, featuring interactive ASCII visualization and automated movement patterns.
colab-notebook cuda liquid-simulations navier-stokes
Last synced: 18 May 2026
https://github.com/thomasonzhou/minitorch
Rebuilding PyTorch: from autograd to convolutions in CUDA
Last synced: 02 Feb 2026
https://github.com/ssoehdata/cuda_fortran_sci_eng
Working through examples from the book CUDA Fortran for Scientists and Engineers, 2nd Edition
cuda cuda-fortran fortran hpc nvfortran
Last synced: 21 Aug 2025
https://github.com/leocelente/basic_cuda
My CUDA source files while learning
Last synced: 29 Apr 2026
https://github.com/asadiahmad/gesture-detection
Real-time Gesture Detection using CUDA-accelerated OpenCV in Python.
computer-vision cuda gesture-recognition gpu-acceleration open-pose opencv opencv-cuda pose-detection real-time
Last synced: 29 Apr 2026
https://github.com/nofaralfasi/parallel-sequence-alignment
A parallelized version of a multiple DNA sequence alignment algorithm using MPI, OpenMP, and CUDA
cuda mpi openmp sequence-alignment
Last synced: 29 Apr 2026
https://github.com/bl33h/pythagoreantheorem
A program that calculates the Pythagorean theorem for a large number of elements using GPU parallel processing.
arrays cuda kernel parallel-programming pythagoras pythagorean-theorem
Last synced: 19 May 2026
https://github.com/nickolasrm/gpuvscpumatrixmultiplication
CPU- and GPU-optimized matrix multiplication (AVX, transposition, CUDA, and others)
avx comparison cuda hpc matrix multiplication
Last synced: 06 Sep 2025
https://github.com/brosnanyuen/raybnn_graph
Graph Manipulation Library For GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI
cuda gpu graph graph-algorithms neural-network neural-networks opencl raybnn rust
Last synced: 06 Feb 2026
https://github.com/tommaso-dognini/polimi_gpu101_courseproject
Polimi Passion In Action GPU101 course project. CUDA implementation of the BFS algorithm.
cpp cuda cuda-programming parallel-computing
Last synced: 10 Apr 2026
https://github.com/sarah627/horus_eye_fcih_graduation_project
An AI-powered tourism website using YOLOv7 for real-time landmark detection in images. Built with Flask, PyTorch, and Roboflow for seamless tourist interaction.
computer-vision cuda flask jupyter-notebook kaggle matplotlib object-detection opencv python pytorch roboflow
Last synced: 14 Apr 2026
https://github.com/duskvirkus/ofxarrayfire
An openFrameworks addon with pre-compiled binaries of ArrayFire.
arrayfire cuda ofxaddon openframeworks openframeworks-addon
Last synced: 09 May 2026
https://github.com/orgh0/highperformancecnn
Implementation of a High Performance CNN for MNIST dataset
Last synced: 18 May 2026
https://github.com/pintamonas4575/tfg-classification-model-customdataset
Classification model in TensorFlow and Keras trained on a custom dataset.
cnn cnn-classification cuda deep-learning efficientnet gpu image-classification keras tensorflow transfer-learning
Last synced: 02 May 2026
https://github.com/aliyoussef97/triton-hub
A container of various PyTorch neural network modules written in Triton.
cuda deep-learning openai pytorch triton triton-lang
Last synced: 30 Mar 2025
https://github.com/ismailtekin05/caloriedetectingai
🍎🔍 Smart AI system that identifies food items in photos and calculates their calorie content automatically. Built with TensorFlow, YOLOv8, CUDA and computer vision for accurate nutrition tracking.
ai aimodel calorie-calculator computer-vision cuda data-analysis data-science data-segmentation data-visualization dataset dataset-generation image-processing image-recognition python segmentation-models tensorflow ultralytics yaml yolo yolov8
Last synced: 29 Apr 2026
https://github.com/fynv/cudainline
A CUDA interface for Python. A distillation of the engine part of ThrustRTC.
Last synced: 18 May 2026
https://github.com/unvercan/ssd300-model-pytorch
SSD300 Model using PyTorch
cnn computer-vision convolutional-neural-networks cuda deep-learning image-processing neural-network object-detection opencv python pytorch single-shot-detection ssd ssd300
Last synced: 17 Mar 2025
https://github.com/rhysdg/whisper-onnx-python
A low-footprint GPU accelerated Speech to Text Python package for the Jetpack 5 era bolstered by an optimized graph
ai chatbot cuda machine-learning onnxruntime speech-to-text whisper
Last synced: 16 Feb 2026
https://github.com/kartavyaantani/cuda_image_processing
A CUDA-accelerated image processing project featuring multiple GPU-based filters and enhancement techniques. Implements convolution, edge detection, Non-Local Means (NLM) denoising, K-Nearest Neighbors (KNN), and pixelization. Each operation is optimized using CUDA kernels for real-time performance on large images. The project supports command-line
cuda cuda-kernels cuda-programming cuda-toolkit gpu-programming high-performance-computing image-manipulation image-processing nvidia-cuda nvidia-gpu
Last synced: 30 Apr 2026
https://github.com/hatamiarash7/cuda-python
GPU programming using CUDA & Python
cuda gpu gpu-computing gpu-programming python
Last synced: 29 Apr 2026
https://github.com/SanaeProject/Matrix-for-Cpp
This repository has types that handle matrices.
cpp14 cpp14-library cuda matrix-library
Last synced: 15 May 2025
https://github.com/xihuai18/image-processing-in-cuda
Implementation of Image Processing Method
Last synced: 04 Oct 2025
https://github.com/gvvsnrnaveen/cuda
This repository contains various programs that can be written using the CUDA Toolkit.
c cpp cuda nvcc nvidia-cuda nvidia-gpu
Last synced: 17 Jan 2026
https://github.com/fblupi/grado_informatica-ppr
Coursework for the Parallel Programming (Programación Paralela) course at the UGR
cuda mpi openmp parallel-computing
Last synced: 22 Apr 2026
https://github.com/subatomicplanets/simplebitcoinminer
A simple Bitcoin C++ and CUDA solo miner
bitcoin cpp cryptocurrency cuda miner
Last synced: 19 Apr 2026
https://github.com/makischristou/mandelbrot
Mandelbrot set visualizer using CUDA.
cpp cuda gpu mandelbrot nvidia renderer rust
Last synced: 09 Apr 2026
https://github.com/piyush26c/cuda-programming
c cuda ipynb-jupyter-notebook mathematics sppu-computer-engineering
Last synced: 03 Mar 2026
https://github.com/exprays/atlas
Atlas is a specialized convolutional neural network designed for satellite image change detection
alembic celery cnn-for-visual-recognition cuda geospatial-visualization python pytorch tensors
Last synced: 28 Feb 2026
https://github.com/chintak/theano-lasagne-docker
Dockerfile for Lasagne with Cuda support. Look at the branches for relevant Dockerfiles - ``cpu`` and ``gpu``.
caffe cuda docker dockerfile install-script lasagne machine-learning machine-learning-library theano
Last synced: 10 Apr 2025
https://github.com/arakiss/hecate-os
Linux distro with automatic hardware detection and per-system optimization. Ubuntu 24.04 base. Alpha.
ai cuda docker gpu hardware-optimization kernel-tuning linux linux-distribution machine-learning nvidia operating-system performance sysctl ubuntu workstation zram
Last synced: 16 Feb 2026
https://github.com/steleman/quadratic-assignment
Research on the Quadratic Assignment Problem with CUDA Acceleration
cuda cuda-kernels cuda-programming cuda-programming-project quadratic-assignment quadratic-assignment-problem
Last synced: 07 Apr 2026
https://github.com/viktor-shcherb/triage
Script running tool for optimizing GPU memory utilization
automation cli cuda deep-learning devops-tools experiment-runner gpu-monitoring gpu-scheduler hyperparameter-sweep job-queue machine-learning nvidia-smi pypi-package python resource-management script-runner
Last synced: 12 Feb 2026
https://github.com/syncush/cifar-10-pytorch
cifar10 cuda deep-learning machine-learning python python3 pytroch
Last synced: 19 May 2026
https://github.com/jorgedavyd/nsight.nvim
A developer oriented Neovim framework for CUDA performance profiling and analysis.
cuda cuda-kernels cuda-profiler cuda-programming cuda-support cuda-toolkit deep-learning machine-learning neovim neovim-plugin performance-engineering
Last synced: 13 Apr 2026
https://github.com/m15kh/cuda_programming
CUDA programming enables parallel computing on NVIDIA GPUs for high-performance tasks like deep learning and scientific computing
cuda cuda-programming gpu nvidia parallel-computing practice-programming
Last synced: 03 Apr 2025
https://github.com/andrewboessen/bitonic-merge-sort
Bitonic Merge Sort algorithm optimized for GPU execution
bitonic-merge-sort cuda sorting-network
Last synced: 16 May 2026
https://github.com/amirbroker/cudadtw
Use CUDA with numba for Dynamic Time Warping
cuda dtw dynamic-time-warping gpu numba
Last synced: 16 Apr 2026
https://github.com/kchristin22/ising_model
Implementation of a cellular automaton on GPU using different features of CUDA
cellular-automaton cuda gpu-programming hpc ising-model parallel-computing
Last synced: 15 Mar 2025
https://github.com/thisalmandula/gpu_accelerated_lpt_cfd_code
This repository contains a GPU-accelerated version of the particle tracking model developed by Merel Kooi for biofouled microplastic particles (available at: https://pubs.acs.org/doi/10.1021/acs.est.6b04702), written in CUDA Fortran and CUDA Python. This repository is intended as a learning tool for GPU programming.
biofouling computational-fluid-dynamics cuda fortran lagrangian-particle-tracking microplastics python
Last synced: 02 May 2026
https://github.com/naidezhujimo/cuda-rewrite-fast-matrix-multiplication
This repository contains an optimized implementation of matrix multiplication using CUDA. The goal of this project is to provide a high-performance solution for matrix multiplication operations on NVIDIA GPUs.
Last synced: 26 Mar 2025
https://github.com/brosnanyuen/raybnn_sparse
Sparse Matrix Library for GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI
arrayfire cpu cuda gpu gpu-computing opencl parallel parallel-computing parallel-programming raybnn rust sparse sparse-coding sparse-matrix sparse-neural-networks
Last synced: 19 Jan 2026
https://github.com/miniex/maidenx
Rust-based CUDA library designed for learning purposes and building my AI engines named Maiden Engine
Last synced: 20 Mar 2025
https://github.com/mayukhdeb/patrick
Tiny neural net library written from scratch with cupy :warning: under construction :warning:
cuda deep-learning gpu-computing machine-learning neural-network regression
Last synced: 08 May 2026
https://github.com/stanczakdominik/cuda_poisson
A 2D Poisson solver in CUDA
Last synced: 29 Jun 2025
https://github.com/steleman/openai-triton
Fork of OpenAI's Triton compiler v3.4.0 using LLVM 21.1.0 / 21.1.1 on Fedora 41+
cuda fedora linux llvm mlir mlir-dialect openai rocm triton
Last synced: 08 Apr 2026
https://github.com/gabrielmaialva33/enton
Autonomous AI Robot Assistant — Vision, Voice, and Soul
ai autonomous-agent computer-vision cuda llm python pytorch robot stt tts whisper yolo
Last synced: 01 Apr 2026
https://github.com/denyskryvytskyi/capgemini-cuda
CUDA implementation of vector addition, matrix multiplication, reduction, and sorting
bitonic-sort cpp cuda cuda-kernels gpgpu matrix matrix-multiplication matrix-multiplication-parallel matrix-transpose nvidia nvidia-cuda nvidia-gpu reduction-dimension sort sorting-algorithms-implemented vector vector-addition vectorization
Last synced: 14 May 2026
https://github.com/gogolb/ee147
Intro to GPU Computing
c cuda cuda-kernels cuda-toolkit gpu-computing gpu-programming university-course
Last synced: 01 May 2026
https://github.com/andih/cuda-fortran-stream
Variant of STREAM Benchmark in CUDA Fortran
cuda cuda-fortran gpu stream-benchmarks variants
Last synced: 02 Mar 2025
https://github.com/iag-geo/image-classification
Image classification scripts using YOLOv5 with aerial imagery
cuda image-classification python pytorch swimming-pools yolov5
Last synced: 22 Feb 2026
https://github.com/ezroot/gacc
GIACC - Generate Images, Art, Code and Conversations
ai codegen cuda huggingface image imagegeneration python rust stablediffusion
Last synced: 06 Apr 2026
https://github.com/raumberg/hypervision
Neural Network based real-time aimbot system, operating on TensorRT with custom CUDA kernel and C FFI extensions
ai aim cuda cython neural-networks python tensorrt yolo
Last synced: 20 May 2026
https://github.com/matteogianferrari/qr-decomposition
This project implements different methods of exploiting cache usage, multicore CPU, and GPU architectures for the Gram-Schmidt QR decomposition algorithm, and measures the performance of the different implementations.
cuda openmp parallel-computing
Last synced: 12 Apr 2026
https://github.com/greg-tarr/fastsimplex
CUDA/MPS accelerated 2D & 3D simplex noise generation.
cuda mps noise-generator python simplex-noise
Last synced: 20 Apr 2026
https://github.com/hartorn/docker-python
Repository to build a Python image based on Ubuntu and CUDA
cuda docker mkl-dnn onednn python3 ubuntu ubuntu1804
Last synced: 05 May 2026
https://github.com/enriquebdel/clases-cuda-programacion-paralela-en-c-
In this repository you will find several lessons I created on the CUDA library in C. The program I use for programming is MobaXterm.
c cuda cuda-programming gnu-linux googlecolab mobaxterm nvidia parallel-programming ubuntu university
Last synced: 19 May 2026
https://github.com/pratikvn/nla4hpc-exercises-framework
The exercises framework for the Numerical Linear Algebra for HPC course at Karlsruhe Institute of Technology.
cuda ginkgo homeworks hpc-course teaching
Last synced: 19 May 2026
https://github.com/applicative-systems/nixos-gpu-tests
GPU-enabled tests with CUDA in the NixOS integration test driver
amd cuda nix nixos nvidia nvidia-gpu radeon sandbox test test-automation test-automation-framework test-framework zluda
Last synced: 02 Apr 2026
https://github.com/dpetrosy/fractal
This project is a Fractal Visualizer developed in C++ with SFML and CUDA.
burning-ship cmake cmakelists cpp cpp-programming cpp-project cuda cuda-opengl cuda-programming fractal fractal-generation fractal-visualization julia mandelbox mandelbrot opengl opengl-project sfml sfml-library tricorn
Last synced: 21 Feb 2026
https://github.com/fardinsabid/aleam
Aleam: True randomness for AI. Non-recursive, stateless, cryptographically secure random number generator.
ai aleam cryptographic-random cuda cupy deep-learning distributions entropy gpu-acceleration jax machine-learning opensource probability pypi python pytorch random-number-generator statistics tensorflow true-randomness
Last synced: 06 Apr 2026
https://github.com/kichappa/spy-sim
Simulate a spying strategy on a topography
combat-modeling cuda differential-equations julia modeling-and-simulation topography-simulation
Last synced: 09 Mar 2026
https://github.com/yooodleee/hello-cuda
👽Nice to meet you, CUDA!👽
c cc cuda gpgpu multiprocessing
Last synced: 09 Apr 2026
https://github.com/manishklach/thermal-observatory
A generic thermal observability framework for CPU, GPU, board, and platform telemetry across vendor APIs, kernel interfaces, and runtime correlation layers.
amd arm64 cuda linux nvidia nvml observability rocm telemetry thermal-framework thermal-monitoring x86-64
Last synced: 09 Jun 2026
https://github.com/malolm/jupyter-ml-with-gpu-support
Jupyter with GPU acceleration for Windows 10/11
cuda cudnn jupternotebook jupyter jupyterlab nvidia-gpu windows-10 windows-11
Last synced: 09 Apr 2026
https://github.com/programmer-rd-ai/object-detection-framework
A Pythonic approach to object detection using Detectron2, a clean, modular framework for training and deploying computer vision models. DetectX simplifies the complexity of object detection while maintaining high performance and extensibility.
coco-dataset computer-vision computer-vision-library cuda deep-learning detectron2 faster-rcnn gpu-accelerated machine-learning ml-framework object-detection object-recognition python3 pytorch retinanet
Last synced: 24 Sep 2025
https://github.com/pintamonas4575/tfg-diffusion-model-customdataset
A diffusion model built in PyTorch for unconditional image generation with a custom dataset.
attention-mechanism cnn cosine-scheduler cuda custom-dataset ddim deep-learning diffusion-models gpu image-generation pytorch
Last synced: 17 Apr 2026
https://github.com/sbstndb/grayscott_k
A simple 3D GrayScott simulation using Kokkos enabling CUDA or OpenMP backend
cuda finite-difference grayscott grid kokkos laplacian openmp simulation visualisation
Last synced: 16 May 2026
https://github.com/xza85hrf/ml-framework_checker
ML Framework and CUDA Checker is a Python-based GUI application for checking PyTorch, TensorFlow, and CUDA installations. It provides detailed system specs, compatibility checks, advanced GPU management, and offers options to view instructions, export logs, and update machine learning frameworks.
compatibility cuda gpu-management gui-application machine-learning python pytorch system-checker system-specs tensorflow
Last synced: 28 Apr 2026
https://github.com/quantum-integrated-technologies/deepforge
DeepForge: a framework for working with machine learning.
ai artificial-intelligence cuda library machine-learning ml neural-network
Last synced: 31 Jul 2025
https://github.com/dhruvsrikanth/fastconv
Distributed and serial implementations of the 2D convolution operation in C++ and CUDA.
convolution-filters cpp cuda gpu-programming high-performance-computing hpc image-editor image-processing nvidia parallel-programming
Last synced: 04 May 2026
https://github.com/anne-andresen/multi-modal-cuda-c-gan
Raw C/CUDA implementation of a 3D GAN
3d 3d-models attention-mechanism c cross-attention cross-attention-c cuda gan gan-models low-level-programming medical-imaging multimodal-deep-learning pytorch transformer-pytorch transformers transformers-c
Last synced: 06 Jan 2026
https://github.com/bl33h/productoftwovectors
This code uses CUDA for parallel vector multiplication on a GPU, demonstrating the GPU's acceleration capabilities.
cuda gpu kernel paralelism parallel-programming product vector
Last synced: 16 May 2026
https://github.com/straightchlorine/quantum-pipeline
A Python module for executing and monitoring quantum algorithms across local simulators and IBM Quantum platforms. Seamlessly handles data collection, organization, and streaming to Apache Kafka
apache-kafka apache-spark aws-s3 cuda docker gpu-acceleration ibm-cloud ibm-quantum minio qiskit qiskit-aer qiskit-nature quantum-computing visualizations vqe
Last synced: 08 Oct 2025
https://github.com/bhattbhavesh91/rapids-cudf-cuml-example
Running KNN algorithm much faster on GPU for free using RAPIDS packages like cuML and cuDF
cuda cuml deep-learning nvidia-gpu rapids rapidsai
Last synced: 17 Apr 2026
https://github.com/eric900115/parallelprogramming
The repository contains the coursework for CS5422, NTHU's Parallel Programming Course.
Last synced: 26 May 2026
https://github.com/sd7campeon/yelp-sentiment-analysis-with-python-bs4-and-llm
A scalable pipeline for automated extraction, preprocessing, and sentiment analysis of Yelp reviews. Uses advanced HTTP requests, HTML parsing, and text normalization (tokenization, stopword removal, lemmatization) to enable precise polarity and subjectivity analysis for consumer insights and business analytics.
beautifulsoup beautifulsoup4 business-analytics cuda data-analysis nlp-machine-learning nltk opinion-mining pandas python python3 requests-library-python sentiment-analysis text-preprocessing textblob torch web-scraping yelp-reviews
Last synced: 06 May 2026
https://github.com/m-torhan/cuda-stl-renderer
CUDA C++ implementation of STL file renderer using ray tracing method
Last synced: 25 Feb 2026
https://github.com/jessetg/cuda-practice
Working through the chapters of CUDA by Example
c cpp cuda cuda-by-example gpgpu
Last synced: 01 May 2026
https://github.com/ehsanmok/cs-521
UBC CS 521: Parallel Computing and Architectures
cuda erlang parallel-algorithm parallel-computing
Last synced: 16 May 2026
https://github.com/hdelan/msc-hpc-final-project
In this project I implement a CUDA Lanczos method to approximate the matrix exponential. The matrix exponential is an important centrality measure for large, sparse graphs.
cuda graph-algorithms krylov-methods
Last synced: 12 Apr 2025