CUDA
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general-purpose computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
- GitHub: https://github.com/topics/cuda
- Wikipedia: https://en.wikipedia.org/wiki/CUDA
- Created by: Nvidia
- Released: June 23, 2007
- Related Topics: nvcc
- Last updated: 2026-07-01 00:07:09 UTC
https://github.com/maltsev-andrey/cuda-nn-inference
GPU-accelerated neural network inference using custom CUDA kernels. Achieves 97.82% accuracy on MNIST.
cuda deep-learning gpu-programming neural-networks numba nvidia parallel-computing parallel-programming performance-optimization python3 pytorch rhel9 tesla-p100
Last synced: 07 Mar 2026
https://github.com/voduchuy/cudafsp
CUDA-based implementation of the Finite State Projection (FSP) algorithm.
chemical-master-equation cuda stochastic-reaction-networks sundials
Last synced: 20 Jan 2026
https://github.com/branebb/nn-framework
Framework for creating neural networks using C++ and the CUDA platform. This project is part of my final university assignment for my bachelor's degree.
cmake cpp cuda cuda-programming
Last synced: 20 Jan 2026
https://github.com/ergus/cuda-ts-mode
An Emacs CUDA mode supported by tree-sitter
Last synced: 20 May 2026
https://github.com/airvzxf/c-plus-plus-understanding-cuda
Understanding CUDA with C++
cuda hacktoberfest hacktoberfest-accepted
Last synced: 22 Mar 2025
https://github.com/yangfengzzz/tardis
Travel space and time by using autodiff and codegen
Last synced: 03 May 2026
https://github.com/ojaswithag/opencv-doc
A comprehensive Turkish-language guide to image and video processing, machine learning, and project applications with OpenCV. 🐙 Learn with step-by-step code examples and build projects.
arm-architecture cuda cuda-support deployment django docker-image docker-images heroku image-processing javascript nodejs nvidia opencv-contrib opencv3 production python scanner tutorial
Last synced: 08 Apr 2026
https://github.com/lanceberge/cuda-newton-fractals
Parallelize and visualize the Newton Iteration
cpp cuda mathematical-modelling visualization
Last synced: 16 May 2026
https://github.com/tylerfaulkner/n-body_simulation
CUDA N-Body Gravitational Simulation with rendering in Python with Matplotlib
Last synced: 20 May 2026
https://github.com/larygwil/cuda-samples-old
Old NVIDIA CUDA samples (5.0 - 7.5)
Last synced: 03 May 2026
https://github.com/alkaifaftab000/autonomous-maze-solver
Building an Autonomous Maze Solver using reinforcement learning to train agents for decision-making in dynamic grid-based environments
agent criticism cuda gymnasium-environment maze-solving-bot pytorch reinforcement-learning reward-functions
Last synced: 12 Apr 2026
https://github.com/adesoji1/youtubesummaryai
Python script for YouTube summaries. The service should summarize a YouTube video given its URL. It should work for long videos and for different languages.
cuda googleapi python3 speech-recognition transformers youtube-api-v3 youtube-dl
Last synced: 04 Apr 2025
https://github.com/AndreasKaratzas/orin
Setting up the NVIDIA Jetson Orin Nano Developer Kit
cuda cudnn jetpack6 nvidia-jetson nvidia-sdkmanager orin-nano
Last synced: 25 Feb 2025
https://github.com/stephanmg/cuda-playground
CUDA playground
cpu cuda gp100 gpu gv100 openmp parallel-computing parallel-programming
Last synced: 30 Mar 2025
https://github.com/voschezang/holographic-projector-simulations
Optimizations of Simulations of Holographic Projectors using CUDA
cuda gpu holography parallel-computing photonics
Last synced: 16 May 2026
https://github.com/sergeipapina/color2graycuda
Color-to-gray image conversion implemented as an NVIDIA CUDA kernel, using make or CMake to compile and link
cmake cuda cuda-kernels cuda-programming link makefile nvidia
Last synced: 06 Apr 2025
https://github.com/uefi-code/bachelorgraduationdesign
I developed a PyTorch_For_PoorGuys framework and used it to train an LLM on an NVIDIA GeForce 2080Ti GPU as my bachelor's graduation design project
chatbot cuda gpu hacking large-language-models pytorch
Last synced: 03 May 2026
https://github.com/djenriquez/ccminer
Dockerized ccminer
cuda docker ethereum mining nvidia nvidia-docker
Last synced: 05 May 2026
https://github.com/belrbez/ship-graphic-qt-qml-cuda-c
Client-Server application for Rocket driving in QML graphics
c client-server cpp cuda qml qt5 rocket
Last synced: 08 Apr 2026
https://github.com/td99/ai-sandbox
A collection of AI tools and prototypes.
ai cuda docker image-generation-ai nvidia python
Last synced: 08 Apr 2026
https://github.com/cs550-epfl/review
Review of the paper A Formal Analysis of the NVIDIA PTX Memory Consistency Model
cuda formal-verification gpu memory-consistency ptx simt
Last synced: 30 Mar 2025
https://github.com/fedesky25/hpc-project-2024
Project for the 2024 course of HPC: generator of streamplot of complex-valued functions
Last synced: 30 Mar 2025
https://github.com/anne-andresen/autoencoder_3d_c_cuda
3D Autoencoder training in raw C/CUDA
Last synced: 28 Apr 2026
https://github.com/daelsepara/hipnewton
GPU Implementation of Newton Fractal Generator with Benchmarking
amd cuda fractal gpu gpu-compute gpu-computing hip newton parallel-computing rocm sdk
Last synced: 03 May 2026
https://github.com/dasbd72/nthu-ipc-2022
National Tsing Hua University - Introduction to Parallel Computing - 2022
cuda cuda-programming hpc mpi openmp pthreads
Last synced: 30 Mar 2025
https://github.com/bjornmelin/ml-algorithm-playground
🧪 Core ML algorithm implementations with GPU acceleration. Featuring optimized implementations across various libraries with comprehensive analysis. 📈
algorithms cuda gpu-computing lightgbm machine-learning python scikit-learn xgboost
Last synced: 13 May 2026
https://github.com/shermanlo77/poisson_icing
Gibbs sampling on the Poisson-Ising model. The Poisson-Ising model is a 2D image of Poisson-distributed random variables, each of which depends on its four neighbours. This dependency causes the Poisson random variables to be similar (or dissimilar) to their neighbours.
cuda cupy gibbs-sampling gpu ising-model mcmc monte-carlo poisson poisson-ising
Last synced: 21 May 2026
https://github.com/sbstndb/nbody_k
A simple 3D naïve NBody simulation using Kokkos enabling CUDA or OpenMP backend
cuda kokkos nbody openmp simulation
Last synced: 21 May 2026
https://github.com/maxenceleguery/jare
3D Render engine accelerated with CUDA
Last synced: 21 May 2026
https://github.com/nmicic/k-tuplet-search
k-tuplet-search
computational-number-theory cuda experimental-mathematics gmp gpu-computing high-performance-computing hpc k-tuplets number-theory primality-testing prime-numbers prime-tuples sieve
Last synced: 21 May 2026
https://github.com/riciokzz/computer-vision
Computer Vision project
cuda data-cleaning data-engineering data-science exploratory-data-analysis machine-learning neural-network
Last synced: 20 May 2026
https://github.com/rbuj-uoc/m1.209
PAC 1, PAC 2, PAC 3 and PAC 4 for the High-Performance Computing course of the MUEI
Last synced: 21 May 2026
https://github.com/maltsev-andrey/julia_set_cuda
High-performance Julia set fractal computation in pure CUDA C, achieving 2.78 billion pixels/second on Tesla P100. Demonstrates GPU kernel programming, memory optimization, and massive parallelization (16M+ threads).
cuda fractals gpu-programming high-performance-computing nvidia parallel-computing science visualization
Last synced: 03 Nov 2025
https://github.com/shermanlo77/oxwasp_phd
Code for the PhD thesis. The topic was on defect detection of 3D printing using x-rays. The repository includes an implementation of the mode filter and empirical null filter.
3d-printing applied-statistics computational-statistics cuda empirical-null imagej mode-filter statistics xray-projection
Last synced: 27 Mar 2025
https://github.com/bergolho/sycl
Repository with simple programs to learn SYCL.
Last synced: 16 May 2026
https://github.com/dragonscypher/prompty
Tool for generating smart and secure prompts for language models!
autotokenizer bert-model cuda google-t5 llm python3 tensorflow threading
Last synced: 02 Jan 2026
https://github.com/thesoenke/deeplearning-docker
Setup for Deep Learning experiments in Docker with CUDA
Last synced: 11 May 2026
https://github.com/grindelfp/cuda-texture-memory
Exercise on using texture memory in CUDA.
Last synced: 30 Mar 2025
https://github.com/thesupercd/cuda_sort
A simple project implementing and measuring the runtime performance metrics related to massively parallel algorithms (radix sort) on an NVIDIA GPU device.
benchmarking c cpp cuda cuda-programming gpu-acceleration gpu-programming multithreading parallel-processing radix-sort sorting-algorithms
Last synced: 10 May 2026
https://github.com/fanziyang-v/parallel-computing
Parallel Computing course materials from Harbin Institute of Technology (Shenzhen).
cuda openmp openmpi parallel-computing
Last synced: 27 Mar 2025
https://github.com/tzervas/unsloth-rs
Memory-optimized GPU kernels for LLM fine-tuning in Rust (2-5x speedup, 70-80% less VRAM)
cuda gpu machine-learning optimization rust
Last synced: 25 Jan 2026
https://github.com/awikramanayake/optimized-matrix-mult
Optimizing matrix multiplication using parallelism and SIMD (AVX2, CUDA)
avx2 cuda matrix-multiplication
Last synced: 22 May 2026
https://github.com/lttofu/cosmic
Fast, lightweight GUI-based C++ Ethereum ERC918 token miner for Win64 | CUDA GPUs | CPUs | Pool | Solo Mining
0xbitcoin 0xbtc cplusplus cplusplus-cli cpuminer cuda erc20 erc918 ethereum ethereum-token gpuminer gui pool-mining solo-mining windows windows-10 windows-7 windows-gui winforms
Last synced: 08 Apr 2026
https://github.com/illagrenan/cuda-80-cudnn6-runtime-1604-py36
Ubuntu 16.04 with Python 3.6 and CUDA Dockerfile
Last synced: 22 Jun 2025
https://github.com/danieljvickers/fluid_simulation
An educational example for learning the Navier-Stokes equations. Also included is a C++ and CUDA shared object library, buildable with CMake, for use in your personal projects.
cpp cuda differential-equations navier-stokes numpy physics python simulation
Last synced: 04 May 2026
https://github.com/shineiarakawa/particle-stabilizer
A C++ and CUDA-based program for simulating the motion of particles.
Last synced: 12 May 2026
https://github.com/faresargus/artaxerxes
Adaptive high-performance stress tester "artaxerxes" supports GPU, io_uring, DPDK, and eBPF/XDP for advanced cybersecurity labs. Ideal for network testing. 🚀🛠️
cuda cuda-programming cybersecurity cybersecurity-education cybersecurity-tools dpdk ebpf educational github-config high-performance network-security network-security-tool penetration-testing penetration-testing-framework penetration-testing-tools stress-testing
Last synced: 24 Jul 2025
https://github.com/malolm/football-player-detection-with-yolov8
Football player detection YOLOv8 fine-tuning
cuda jupyterlab python3 yolov8-detection
Last synced: 07 May 2026
https://github.com/macaycz/nn
A lightweight, GPU-accelerated machine learning library built with CUDA.
cuda deep-learning gpu machine-learning neural-network
Last synced: 25 Jul 2025
https://github.com/luis-kr/depthmap
Depth map estimation tool using Depth-Anything-V2. Generate accurate depth maps from images with support for both relative and metric depth measurements.
cuda depth-anything depth-estimation depth-map image-processing python pytorch
Last synced: 08 Feb 2026
https://github.com/shambac/shamboflow
Fierce tensorflow competitor
cuda cupy machine-learning numpy pypi-package
Last synced: 19 Feb 2026
https://github.com/jarmak-personal/vibespatial
GPU-first spatial analytics for Python. Drop-in GeoPandas replacement powered by runtime-compiled CUDA kernels
cccl cuda geodataframe geopandas geospatial gpu gpu-computing nvrtc python spatial-analytics
Last synced: 21 Apr 2026
https://github.com/9prady9/imageconvolve
Qt app for previewing Image convolution. Uses CUDA for convolution.
c-plus-plus convolution cuda desktop-app qt
Last synced: 03 May 2026
https://github.com/tornikeo/sample-openmp-in-cuda
Sample of using OpenMP and CUDA: single GPU, multiple CPU
Last synced: 01 Aug 2025
https://github.com/empenoso/doorcam-face-report
Example face recognition project with CUDA acceleration. Includes scripts for automatically building dlib and analyzing video on the GPU
Last synced: 19 May 2026
https://github.com/cerit-sc/scipion-docker
Scipion, a cryo-EM image processing framework (https://scipion.i2pc.es/), adapted to run in Kubernetes.
cryo-em cryoem cuda desktop kubernetes scipion vnc
Last synced: 02 Aug 2025
https://github.com/f-koehler/itesol
WIP: Iterative eigensolvers for C++20, Python and CUDA
cpp20 cuda eigenvalues linear-algebra python
Last synced: 08 Nov 2025
https://github.com/oaslananka/cv_cuda_cpp_sample
This is a sample project demonstrating how to use OpenCV and CUDA in C++ for detecting people in drone footage with YOLO. The project aims to be simple and understandable for those who want to learn how to use OpenCV and CUDA in C++.
computervision cpp cuda opencv
Last synced: 01 May 2026
https://github.com/sergiomarquezdev/yt-transcriber
🛠️ CLI tool to transcribe YouTube videos using OpenAI Whisper with CUDA acceleration, generate AI summaries (EN/ES) with Gemini, and create LinkedIn/Twitter content. Supports YouTube, Google Drive, and local files.
ai cli cuda gemini python transcription whisper youtube
Last synced: 15 May 2026
https://github.com/nvaranki/cmmx
CUDA matrix multiplication (official guide, modified)
Last synced: 08 Aug 2025
https://github.com/desmondjs/cuda_mceliece_kem
CUDA-Accelerated McEliece KEM 🔑 | Post-Quantum Cryptography on GPU. Implementation of Classic McEliece key encapsulation, encryption, decryption, and decapsulation on CPU & GPU with CUDA, including benchmarking scripts and the full FYP2 report
academic-project benchmarking classic-mceliece cuda fyp gpu-acceleration kem pqc
Last synced: 02 Oct 2025
https://github.com/sankeer28/pptx-text-audio-transcriber
Extract text and transcribe audio from PowerPoint presentations using OpenAI Whisper.
audio-transcription cuda openai-whisper powerpoint pptx-parser
Last synced: 02 Oct 2025
https://github.com/brave-tarnished/gpu-accelerated-opc
Optical Proximity Correction (OPC) is a photolithography technique that modifies photomask geometry to counteract diffraction and process effects, ensuring accurate printing of patterns on the wafer. This work demonstrates a proof of concept showing how using a GPU-based approach can significantly speed up these modifications compared to a CPU.
cpp cuda gpu-acceleration photolithography semiconductors
Last synced: 02 Oct 2025
https://github.com/conan-kiln/kiln
An actively maintained fork of ConanCenter with an emphasis on CV, ML and robotics capabilities on edge devices
computer-vision conan cuda machine-learning oneapi packaging robotics rust scientific-computing
Last synced: 02 Oct 2025
https://github.com/dmitryyurov/bitonic-cuda
An implementation of bitonic search on CUDA
cuda gpu-programming sorting-algorithms
Last synced: 02 Oct 2025
https://github.com/datasagess/fic
NLP Hackathon with NN + FastAPI + Docker
catboost cuda docker fastapi lstm python pytorch rapidfuzz tensorflow
Last synced: 08 Aug 2025
https://github.com/timdev-r/cv-ground-truth-extraction
(Dump) Helper for ground truth extraction, movement analytics and silhouette visual demonstration
computer-vision cuda ground-truth intel-realsense pandas python
Last synced: 18 Apr 2026
https://github.com/separatrixxx/pgp_labs_7_sem
👓 Laboratory work for the 7th semester of MAI on PGP and PDP
Last synced: 15 May 2026
https://github.com/ibrar-syed/complete_deep-learning-nvidia_gpu-setup-linux
Full setup for a deep learning environment on Ubuntu Linux with CUDA, cuDNN, TensorRT, and TensorFlow GPU. Includes scripts, test code, and environment configuration
ai bash conda cuda cudnn deep-learning environment-setup gcc gpu jupyter linux machine-learning nvidia-cuda nvidia-gpu pytorch setup-script tensorflow tensorrt
Last synced: 09 Apr 2026
https://github.com/nwpu66/cookiekiss-engine
CookieKiss Engine includes a renderer and other small techniques related to computer graphics.
compute-graphics cpp cuda opengl vulkan
Last synced: 09 Apr 2026
https://github.com/andreeo/parallel-computing-cuda
Terminal programs applying the parallel programming model with the CUDA architecture
c cpp cuda docker lineal-search parallel-computing parallel-reduction rank-sort-algorithm
Last synced: 09 Apr 2026
https://github.com/i-m-iron-man/abmax
Abmax is an agent-based modelling framework in Jax, focused on dynamic population size
abm agent agent-based agent-based-modeling agent-based-simulation agents cuda jax python
Last synced: 04 Oct 2025
https://github.com/alessiobugetti/integral-image-processing
Implements sequential and parallel integral image computation in C++ and Python, utilizing CUDA for parallel computation on GPU
cuda gpu-acceleration integral-image numba parallel-computing pycuda
Last synced: 24 May 2026
https://github.com/ojeda-e/fokker-planck
Numerical solution of the Fokker-Planck equation in large times using CUDA/C.
Last synced: 17 Aug 2025
https://github.com/dmalexx/cuda_check
How to check whether CUDA is available in TensorFlow
Last synced: 10 Apr 2026
https://github.com/pvgupta24/parallel-programming
Basic algorithms for parallel programming in CUDA C++, Java and OpenMP
cuda openmp parallel-programming
Last synced: 19 Aug 2025
https://github.com/sid911/neuralnetworkcpp
A small experiment to learn about neural networks and their runtimes in cpp
cpp cuda machine-learning neural-network
Last synced: 20 Aug 2025
https://github.com/camille-004/cusprec
🏁 Sparse signal recovery library written in PyCUDA.
cuda ml python signal-processing sparse-recovery
Last synced: 18 Jan 2026
https://github.com/sshoecraft/shepherd
An interactive multi-backend LLM runtime with intelligent cache eviction and persistent retrieval-augmented memory.
anthropic cli cpp cuda gemini grok inference kv-cache llama-cpp llm mcp ollama openai openai-server rag smart-evictions tensorrt tool-calling ulimited-context
Last synced: 10 Apr 2026
https://github.com/maneeshsit/pcie
Modify run:ai and other FOSS projects code for use with PCIe card-based AI accelerators for both inference and training
cuda cxl cxl-mem distro exo k3s k8s kestra llamacpp llm-d mpi4py mpio onnxoptimizer opentelemetry-ebpf-profiler paxos-cluster pcie photonics-computing runai visualize vllm
Last synced: 24 Aug 2025
https://github.com/quik-fe/node-nvidia-smi
Node wrapper around nvidia-smi.
cuda gpu nodejs nvidia nvidia-smi typescript
Last synced: 19 Feb 2026
https://github.com/elymsyr/auv_ws
An open-source simulation and control workspace for an Autonomous Underwater Vehicle (AUV) built on ROS 2 Humble and Gazebo. It features a high-fidelity dynamics model and an advanced AI-based motion controller (FossenNet) that uses a pre-trained LibTorch model to imitate a NL-MPC for real-time, high-performance manoeuvring.
autonomous-vehicles auv control-systems cpp cuda deep-learning gazebo imitation-learning libtorch mpc python robotics ros2 simulation
Last synced: 15 Apr 2026
https://github.com/derek-palmer/dvr-scan-file-organizer
DVR-Scan-Organizer is a Dockerized extension for DVR-Scan, designed to process multiple video files and organize output in a structured format.
cuda dvr dvr-scan multimedia opencv opencv-python python video video-processing
Last synced: 01 May 2026
https://github.com/bjornmelin/ai-system-design
🎨 Large-scale AI system architectures and implementations. Features distributed training systems, multi-GPU pipelines, and efficient resource management. 🏗️
architecture cuda distributed-systems engineering gpu-computing production scalability system-design
Last synced: 23 Jul 2025
https://github.com/kis-balazs/cuda-research
CUDA Research & Code. Course-style structured. Inspiration from @Infatoshi.
Last synced: 14 May 2025
https://github.com/kylesayrs/pttp
PyTorch Tensor Profiler with fully-supported memory timelines and events
Last synced: 07 May 2026
https://github.com/m-torhan/advent-of-code
🎄 Solutions for the Advent of Code
advent-of-code advent-of-code-2024 cuda
Last synced: 07 Apr 2025
https://github.com/followthesapper/atlas-q
GPU-accelerated quantum tensor network simulator with adaptive MPS
ai cuda gpu-acceleration high-performance-computing matrix-product-states nisq python pytorch qaoa quantum-algorithms quantum-computing quantum-simulator scientific-computing shors-algorithm tensor-networks triton vqe
Last synced: 20 Jan 2026
https://github.com/abhiksark/gluon-by-example
Learn Triton's Gluon by example — the same GPU kernels written in Triton and Gluon, benchmarked
cuda deep-learning gluon gpu gpu-kernels triton tutorial
Last synced: 01 Jul 2026
https://github.com/tornikeo/minimal-vscode-cuda-meson
Minimal sample of using VSCode and Meson to build CUDA applications
Last synced: 08 Sep 2025
https://github.com/moshidev/acap
Coursework for the course Arquitectura y Computación de Altas Prestaciones (High-Performance Architecture and Computing)
cuda homework-assignments mpi pthreads
Last synced: 30 Mar 2025