CUDA
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
- GitHub: https://github.com/topics/cuda
- Wikipedia: https://en.wikipedia.org/wiki/CUDA
- Created by: NVIDIA
- Released: June 23, 2007
- Related Topics: nvcc
- Last updated: 2026-07-01 00:07:09 UTC
https://github.com/phrutis/bip39scan.com
Collective search for old coins
bip39 brute-force client-server cuda gpu mnemonic pass passphrase passphrase-generator passwords
Last synced: 04 Sep 2025
https://github.com/pintamonas4575/rlgan-project-maadm-upm
Neuroevolution to learn the Lunar Lander from Gymnasium and a GAN to learn to color images. Subject from the ML and BD master's degree of UPM.
cifar10 cuda dcgan deep-learning flappy-bird gan genetic-algorithm lunar-lander machine-learning mlp python3 pytorch reinforcement-learning tensorflow wgan-gp
Last synced: 12 Apr 2026
https://github.com/marcorentap/kokkos-docker-cluster
Deploy Docker containers with Kokkos, OpenMP, OpenMPI and CUDA as a Docker swarm.
Last synced: 10 Mar 2025
https://github.com/boohohoo/shamining
Shamining is a cloud mining service that allows users to mine cryptocurrencies without the need for personal hardware. By renting computing power from eco-friendly data centers, users can mine efficiently. The platform offers an easy-to-use interface, flexible contracts, and daily payouts.
cryptocurrency cryptomining cuda gpu-mining mining mining-software open-source opencl
Last synced: 04 Jul 2025
https://github.com/prdai/mnist-digit-recognition
A PyTorch-based deep learning implementation for MNIST digit recognition featuring CNNs, GPU acceleration, experiment tracking, and comprehensive testing capabilities.
cnn computer-vision cuda data-science deep-learning digit-recognition image-classification machine-learning mnist neural-networks python pytorch wandb
Last synced: 12 Apr 2026
https://github.com/occisor2/fluidsimulation
Second project of my parallel algorithms course
cuda high-performance-computing
Last synced: 28 Feb 2025
https://github.com/kentakoong/mtnlog
A simple multinode performance logger for Python
cuda lanta nvitop python slurm-cluster
Last synced: 11 Jan 2026
https://github.com/boned-fruitwood759/whisperx-asr-with-fastapi
🎤 Enable real-time speech recognition with WhisperX using FastAPI for efficient, scalable audio processing.
asr ctranslate2 cuda fastapi openai python speech-recognition torch transformers whisper whisperx
Last synced: 12 Apr 2026
https://github.com/fmigneault/dockers
Collection of Docker setups with common libraries for image processing and machine learning.
boost cuda docker image-processing opencv python
Last synced: 12 Apr 2026
https://github.com/emanuelemessina/gigacheck
ABFT Matrix Multiplication of any size in CUDA
abft cuda matrix-multiplication
Last synced: 28 Feb 2025
https://github.com/karusb/2dca-cuda
2 Dimensional Cellular Automata Visualisation (Game of Life)
algorithm-flowchart cellular-automata cuda game game-of-life glut visual-studio
Last synced: 12 Apr 2026
https://github.com/redhat-et/triton-cache-performance-comparison
amd-gpu cache cuda gpu nvidia-gpu performance rocm triton
Last synced: 12 Apr 2026
https://github.com/bjornmelin/cuda-core-projects
🎯 Essential CUDA programming patterns and optimizations. Showcasing parallel computing expertise through matrix operations, memory management, and advanced kernel implementations. 💻
cpp cuda cuda-kernels gpu-computing high-performance-computing nvidia optimization parallel-computing
Last synced: 12 Apr 2026
https://github.com/lionpsiuc/cflow
A computational model for heat propagation in a cylindrical radiator using both CPU and GPU parallel processing. The simulation uses finite difference methods to model the directional flow of heat through a cylindrical pipe system with specific boundary conditions and cyclic connections between pipe segments.
Last synced: 29 May 2026
https://github.com/hrolive/data-analytics-in-the-era-of-large-scale-machine-learning
Slides and other material for the Cyprus NCC training event about "Data analytics in the era of large-scale machine learning".
cuda deep-learning gpu-acceleration gradient-boosting large-language-models machine-learning preprocessing python pytorch
Last synced: 13 Apr 2026
https://github.com/matthewfeickert/report-urssi-fellowship-2025
Report on URSSI 2025 Early-Career Fellowship
Last synced: 17 Jan 2026
https://github.com/equiel-1703/cuhip
Wrapper tool to convert CUDA source code to HIP code and compile it with HIPCC. Useful for learning CUDA programming using AMD devices.
Last synced: 14 May 2026
https://github.com/ray-chew/modified_ch
Density functional theory (DFT) and self-consistent field theory (SCFT) simulation of diblock copolymers
cuda density-functional-theory diblock-copolymer numerical-analysis numerical-methods self-consistent-field-theory
Last synced: 11 May 2026
https://github.com/hr-fahim/transformer-model-optimization
Sample GPT Transformer Model from Scratch.
cuda few-shot-learning transfomers
Last synced: 02 May 2026
https://github.com/eminem5410/devmind-platform
Linux-first CLI for AI environment diagnostics, repair & automation
ai automation cli cuda developer-tools devops docker generative-ai linux local-llm observability ollama python self-hosted system-monitoring
Last synced: 30 May 2026
https://github.com/xza85hrf/flux_pipeline
FluxPipeline is a prototype experimental project that provides a framework for working with the FLUX.1-schnell image generation model. This project is intended for educational and experimental purposes only.
ai cuda docker educational experimental flux1 flux1-schnell flux1ai gradio image-generation model non-commercial python pytorch research transformer-model
Last synced: 05 Jul 2025
https://github.com/doxakis/cosinesimilaritydistancesongpu
Compute cosine similarity distances for all combinations of the dataset on the GPU with CUDA
Last synced: 13 Apr 2026
https://github.com/eyelor/text-to-image-item-generator
A Python workflow for generating random item images using models from Hugging Face.
ai conda cuda flux-schnell generator huggingface item llama python pytorch text-to-image
Last synced: 13 Apr 2026
https://github.com/aledinola/ifp_cuda_mex
Solve the income fluctuation problem on the GPU
Last synced: 14 May 2026
https://github.com/efecaliskannn/pneumonia-detection-with-cnn--vgg16--and-resnet50-deep-learning-models
This project aims to detect pneumonia using deep learning, a subset of artificial intelligence. It examines the performance of deep learning models, including CNN, VGG16, and ResNet50, in detecting pneumonia.
artificial-intelligence convolutional-neural-networks cuda deep-learning keras-tensorflow nvidia-cuda pyhton transfer-learning
Last synced: 13 Jun 2025
https://github.com/ronaldsg20/compu-paralela
Example code for parallel and distributed computing
cuda opencv openmp posix-threads
Last synced: 14 May 2026
https://github.com/mrgkanev/tensorflow-gpu-docker-setup
A Docker environment for TensorFlow GPU development with optimized configurations for WSL2, troubleshooting guides, and common error fixes
cuda cuda-toolkit deep-learning dev-environment development-tools docker gpu-acceleration machine-learning nvidia-docker nvidia-docker-support python tensorflow
Last synced: 13 Apr 2026
https://github.com/hrshl212/custom-cuda-kernels-with-neural-network-implementation
The repository contains custom CUDA kernels for linear layer, softmax and relu which are integrated with python to develop a Neural Network
cuda neural-network python pytorch
Last synced: 08 May 2026
https://github.com/lord-turmoil/cudacmakedemo
A demo for building a CUDA program with CMake
Last synced: 16 Mar 2025
https://github.com/delusionary/histoptimizer
Solves a minimum variance cost of the partition problem.
Last synced: 14 Jan 2026
https://github.com/dgcnz/nvtx-vscode
Create NVIDIA NVTX ranges directly in VS Code, then profile with Nsight Systems without modifying source code.
Last synced: 13 Apr 2026
https://github.com/ran-2012/cuda-practice
CUDA practice code for the NVIDIA programming guide
Last synced: 27 Feb 2025
https://github.com/avicted/hip_fm_synthesis
This project demonstrates FM (Frequency Modulation) synthesis using HIP (Heterogeneous-computing Interface for Portability), enabling high-performance sound generation on both AMD and NVIDIA GPUs.
amd audio-processing cuda fm-synthesis hip nvidia rocm
Last synced: 16 Mar 2025
https://github.com/nel-s/vein-cracker
Recovers which internal generator states could have generated a provided set of Minecraft Java b1.6-1.12.2 veins. Those can then be used to recover 3/4ths of any worldseeds that could have generated them.
cuda minecraft seedcracking veins
Last synced: 16 Mar 2025
https://github.com/arya2004/parallel-computing
Parallel Computing Uni Course
Last synced: 18 May 2026
https://github.com/cripterhack/business-address-scrapper
Python+Scrapy - Distributed scraping system with cache for business information extraction.
cuda ollama postgresql python redis scraper scraping scrapy tesseract
Last synced: 14 Jun 2025
https://github.com/rugleb/cuda
A simple example of a program that uses parallel GPU computing on an NVIDIA graphics card using CUDA technology
Last synced: 10 Apr 2025
https://github.com/naidezhujimo/cuda-learning-just-record-the-learning-process-
A record of my CUDA learning process. Notes are included; feel free to learn from them.
Last synced: 26 Mar 2025
https://github.com/azdavis/parallel-portrait-mode
Parallel Portrait Mode
cuda image-processing ispc openmp
Last synced: 13 Apr 2026
https://github.com/sevilze/folderesque
Python Script to process and upscale images in specified folders using RRDB models.
Last synced: 02 Mar 2026
https://github.com/kenwuqianghao/c4ai-cuda-birds
Homework assignments for C4AI Beginners in Research-Driven Studies
Last synced: 18 Apr 2026
https://github.com/TheodoreAI/monte-carlo-simulator
A CUDA application for Monte Carlo simulation, which determines the range of outcomes for a series of parameters, each of which has a probability distribution showing how likely each option is to happen.
cuda gpu-computing monte-carlo-simulation parallel-computing
Last synced: 06 Oct 2025
https://github.com/juntyr/necsim-rust-docs
Documentation of the spatially explicit biodiversity simulation necsim-rust
biodiversity cuda docs mpi necsim rust simulation
Last synced: 14 May 2026
https://github.com/hshshshshsh12e/gpumkat
Gpumkat is a shader debugger for Metal, designed to do what Instruments can't do
alternative api control cuda darwin debugger debugging gpumkat macos management profiler release shaders threads
Last synced: 14 Apr 2026
https://github.com/mmz33/practice-cuda
c cpp cuda cuda-programming gpu-programming parallel-programming
Last synced: 14 Apr 2026
https://github.com/kanttouchthis/cuda_schem
Script for voxelization of 3D models to Minecraft .schem schematics with texture support, powered by Numba CUDA.
cuda minecraft numba voxelization
Last synced: 07 Oct 2025
https://github.com/dreoporto/tensorflow-gpu-docker
An example project to run TensorFlow with CUDA-enabled GPU acceleration using Windows, Docker and WSL2.
artificial-intelligence cuda deep-learning docker docker-compose jupyter machine-learning nvidia-docker python windows wsl2
Last synced: 27 Jan 2026
https://github.com/nguyenpanda/gemm
Parallel Computing Assignment - K251 - HCMUT - VNU
cpp23 cuda forkjoin matrix-multiplication mpi openmp openmpi parallel-computing simd simd-instructions strassen-multiplication
Last synced: 14 May 2026
https://github.com/brainlesslabs/jalebi
C++ String algorithms for maximum performance
c-plus-plus cplusplus cpp cpp-library cpu cuda library parallel performance simd sse string string-matching vectorization
Last synced: 14 May 2026
https://github.com/thanduriel/cuda_hip_comparison
performance study of atomics on GPUs
Last synced: 09 Oct 2025
https://github.com/enesdoruk/opencv-cpp
OpenCV C++ tutorials
computer-vision cpp cuda opencv
Last synced: 09 Oct 2025
https://github.com/mradovic38/pycuda-simulated-annealing
Simulated annealing process for finding the 'minimum energy' of an image.
cuda image-energy parallel-computing parallel-programming pycuda python simulated-annealing
Last synced: 09 Oct 2025
https://github.com/skyguy126/cuda-learnings
Collection of personal CUDA learnings.
Last synced: 10 Oct 2025
https://github.com/bhavinpatel4199/image-processing-with-opencv-and-cuda-on-google-colab
This repository demonstrates image processing using OpenCV with CUDA for GPU acceleration on Google Colab. It includes basics like displaying and manipulating images, alongside advanced techniques using CUDA to enhance performance. Ideal for learning GPU-accelerated image processing in Python.
computer-vision cuda google-colab gpu-acceleration high-performance-computing image-processing opencv pixel-manupulation
Last synced: 19 Jan 2026
https://github.com/adesoji1/visis_backend_assessment_submission-adesoji
Create a backend API to handle book information requests, and summary generation.
bart cache cuda data-extraction fastapi flask hugging-face hugging-face-hub llama postman-api python3 pytorch spacy sqlite3-database swagger-api tensorboard-visualizations transformer ubuntu2304
Last synced: 19 Jan 2026
https://github.com/zcemycl/distributecompute
Parallel computing and distributed computing with C++ threads, Python threads + asyncio + multiprocessing, Spark, and CUDA.
asyncio boost cpp cuda global-interpreter-lock jthread multiprocessing python spark thread
Last synced: 14 Apr 2026
https://github.com/ericrihm/yt-whisper
Fast, local YouTube transcription with speaker diarization and a keyboard-first Textual TUI. YouTube-subs fast path, faster-whisper on CUDA, opt-in pyannote diarization, prompt profile auto-detection.
cli cuda faster-whisper pyannote python speaker-diarization textual transcription tui whisper youtube
Last synced: 31 May 2026
https://github.com/isaurabhmeshram28/cuda-examples
This repository contains examples and experiments with CUDA programming to explore GPU computing and parallel processing using NVIDIA's CUDA framework.
Last synced: 19 May 2026
https://github.com/boostibot/bachelors
My bachelor's thesis at CTU in Prague, Faculty of Nuclear Sciences and Physical Engineering, supervised by Ing. Pavel Strachota, Ph.D.
crystal-growth cuda finite-volume-method parallel-programming phase-field-method
Last synced: 26 Oct 2025
https://github.com/centuriontheman/parallelsortingalgorithms
C++/CUDA application for benchmarking sorting algorithms
benchamark cpp cuda multithreading sorting-algorithms
Last synced: 18 Feb 2026
https://github.com/fabricsoul/pytorch-jupyter-cuda
A Docker image that can run PyTorch and JupyterLab
ai artificial-intelligence cuda deep-learning deep-neural-networks docker docker-container docker-image gpu-computing jupyterlab machine-learning machinelearning nvidia nvidia-cuda nvidia-docker nvidia-smi pytorch
Last synced: 13 Mar 2026
https://github.com/anthongretter/spmv-cuda-analysis
An analysis of different approaches to Sparse Matrix-Vector Multiplication (SpMV) on GPUs using CUDA
cuda gpu matrix-computations spmv unitn
Last synced: 14 Oct 2025
https://github.com/sjmonson/tdr-inverse
A set of CUDA programs that invert matrices
cuda gpu matrix-inverse matrix-inversion tdr
Last synced: 14 Oct 2025
https://github.com/seanwevans/damnati
A CUDA-accelerated iterated prisoner's dilemma arena
arena cuda iterated-prisoners-dilemma prisoners-dilemma tournament
Last synced: 14 May 2026
https://github.com/obitech/tuc-ki-gpu-docker
cuda docker machine-learning nvidia-docker nvidia-gpu tensorflow tuc
Last synced: 14 Apr 2026
https://github.com/lu1smgb/ppr
Parallel Programming course. Academic year 2024/2025. University of Granada
Last synced: 15 Oct 2025
https://github.com/masterskepticista/parallel_reductions_cuda
Iteratively optimizing parallel reductions in CUDA.
Last synced: 16 Oct 2025
https://github.com/puzzlef/vector-sum-cuda
Comparing performance of sequential vs CUDA-based vector element sum.
cuda element experiment gpu sum vector
Last synced: 14 Apr 2026
https://github.com/dwain-barnes/llm-gguf-auto-converter
Automated Jupyter notebook solution for batch converting Large Language Models to GGUF format with multiple quantization options. Built on llama.cpp with HuggingFace integration.
auto-converter batch-processing cuda gguf huggingface jupyter-notebook llama-cpp llm model-quantization
Last synced: 17 Jun 2025
https://github.com/sakurabtc888/1000_btc_bitcoin_challenge
🔥 CPU+GPU collision tool targeting the 160 bitcoins on [privatekeys.pw]
Last synced: 21 Oct 2025
https://github.com/paranoia55/env-setup
🚀 Set up a complete, production-ready JavaScript/TypeScript development environment on macOS with AI tools using a single command.
cuda deep-learning ethical-hacking-tools ios kali-linux kali-linux-tools make makefile next-14 next-appdir next-starter open-source-project prettier radix-ui shell tensorflow-gpu typescript ubuntu
Last synced: 29 Apr 2026
https://github.com/chibby0ne/cuda_by_example
Old notes (and new ones) on the book CUDA by Example
cuda cuda-programming gpgpu gpu-computing gpu-programming
Last synced: 15 Mar 2026
https://github.com/dhakalnirajan/baghchal-rl
C/CUDA implementation of Baagh Chaal Game with Neural Network
bagh-chal baghchal c clang cuda cuda-kernels neural-network reinforcement-learning
Last synced: 14 Apr 2026
https://github.com/islam-hady9/deep-cuda
Image Classification with CNN in CUDA C++
artificial-intelligence cnn cpp cuda deep-learning gpu-programming image-classification machine-learning mnist-dataset neural-networks-from-scratch parallel-computing
Last synced: 02 May 2026
https://github.com/soloangema/nvidia-8z6ms
🐱 Generate randomized README files to enhance DX farming, powered by NVIDIA technology for an engaging development experience.
artificial-intelligence computer-vision cuda data-science deep-learning gpu image-processing machine-learning neural-networks nvidia parallel-computing performance-optimization pytorch tensor video-processing
Last synced: 02 May 2026
https://github.com/chad24dev/gpu-agent-opt
🧠 Optimize GPU workflows with `gpu-agent-opt`, a Python package for profiling, scientific computing, and efficient CUDA exploration.
ai-agents autotuning cuda edge-ai geospatial gpu hpc nvidia optimization performance pytorch
Last synced: 07 May 2026
https://github.com/kichappa/videosift
CUDA based 3D Computer Vision for Exoskins
computer-vision convolution-filter cuda hpc julia sift-algorithm
Last synced: 15 May 2026
https://github.com/thephiltacular/voice-ai-pipeline
A containerized AI pipeline for real-time speech-to-text and text-to-speech conversion, leveraging Whisper ASR and Coqui TTS models with Kubernetes orchestration. Features a Gradio web interface, GPU acceleration, and local processing for privacy-focused voice applications. 🚀🎤📝
ai asr containerization coqui-tts cuda docker fastapi gpu gradio kubernetes machine-learning nvidia orchestration python real-time-processing speech-recognition text-to-speech tts voice-assistant whisper
Last synced: 15 Apr 2026
https://github.com/sedflix/cuda_pattern_matching
Getting words frequency using the concepts of pattern matching in CUDA
Last synced: 17 Mar 2026
https://github.com/matiasvlevi/cuno
Provides CUDA bindings, kernel maps, and device memory management for Dannjs computations. [Experimental and not complete]
addon cuda dann dannjs machine-learning nodejs
Last synced: 15 Apr 2026
https://github.com/mohamedsamirx/yolov12-tensorrt-cpp
YOLOv12 inference using C++, TensorRT, and CUDA
cpp cuda tensorrt tensorrt-inference yolo yolov12
Last synced: 15 Apr 2026
https://github.com/manu-sh/cuda-mandelbrot
How to use CUDA acceleration to compute the Mandelbrot set
Last synced: 15 Apr 2026
https://github.com/ahmed5827/image_generation
This application provides a graphical user interface (GUI) for generating images using the Stable Diffusion model. The GUI allows users to input a text prompt, and the application generates an image based on the prompt.
ai cuda generative-ai image-generation
Last synced: 15 Apr 2026
https://github.com/lyynn777/cuda-bitonic-sort
Simple CUDA project to implement Bitonic Sort and compare it with normal CPU sorting.
bitonic-sort cuda gpu-computing gpu-vs-cpu parallel-computing performance-testing pycuda python
Last synced: 15 Apr 2026
https://github.com/tkemmer/cunessie.jl
CUDA-accelerated Nonlocal Electrostatics in Structured Solvents
bioinformatics boundary-element-method cuda electrostatics gpu-computing julia proteins
Last synced: 31 Jan 2026
https://github.com/snandasena/courseera_gpu_specilization
Example for CUDA streaming
Last synced: 15 Apr 2026
https://github.com/starlitdreams/pacman-convolutional-q-learning
This project implements a Deep Q-Network (DQN) using PyTorch to train an agent to play Atari's Ms. Pac-Man. It utilizes reinforcement learning with a convolutional neural network (CNN) for image processing. Features include experience replay, frame preprocessing, and CUDA support, with trained model saving and video rendering of gameplay.
artificial-intelligence artificial-neural-networks atari cuda deep-learning deep-learning-algorithms deep-q-learning deeplearning gymnasium gymnasium-environment python pytorch
Last synced: 15 Apr 2026