An open API service indexing awesome lists of open source software.

CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphical processing units (GPUs). With CUDA, developers are able to dramatically speed up computing applications by harnessing the power of GPUs.

https://github.com/isquicha/cuda-parallel-studies

Learning CUDA programming here =D

cuda cuda-programming cuda-toolkit

Last synced: 03 Jul 2025

https://github.com/yutakseo/docker_ubuntu-cuda_environment

🐳 A ready-to-use Docker environment for deep learning development with Ubuntu 22.04 and CUDA 11.8.

container cuda docker environment ubuntu

Last synced: 12 Apr 2026

https://github.com/jaderock/cuda-by-example

Sample CUDA projects for the CUDA by Example book

bazel c cpp cuda gpu

Last synced: 05 May 2026

https://github.com/akira4o4/cuda-program

CUDA YOLO Processing

cuda yolo

Last synced: 22 Jul 2025

https://github.com/matthewfeickert/report-urssi-fellowship-2025

Report on URSSI 2025 Early-Career Fellowship

cuda pixi urssi

Last synced: 17 Jan 2026

https://github.com/nikhilrout/thetensorcoreproject

Microarchitecture implementation of Nvidia's Tensor Cores

cuda floating-point gpgpu hybrid-precision-training tensorcore

Last synced: 01 Apr 2025

https://github.com/dlzou/rt-weekend

Ray Tracing in One Weekend, using CUDA

cuda ray-tracing

Last synced: 28 Apr 2026

https://github.com/ray-chew/modified_ch

Density functional theory (DFT) and self-consistent field theory (SCFT) simulation of diblock copolymers

cuda density-functional-theory diblock-copolymer numerical-analysis numerical-methods self-consistent-field-theory

Last synced: 11 May 2026

https://github.com/hr-fahim/transformer-model-optimization

Sample GPT Transformer Model from Scratch.

cuda few-shot-learning transfomers

Last synced: 02 May 2026

https://github.com/rog0d/gpuss_watchers

"The GPU Watchers swore upon their shared memory hierarchy, from L1 to global memory, which also served as their mandate as lords of parallel computation."

cuda gpu-acceleration gpu-monitoring gpu-profiling

Last synced: 28 Apr 2026

https://github.com/aurelienperez/gpu-heston-monte-carlo

GPU-accelerated Monte Carlo simulation for option pricing under the Heston model using CUDA.

cuda gpu heston-model

Last synced: 01 Apr 2025

https://github.com/naetherm/derelictcurand

Dynamic bindings to the CuRAND library for the D Programming Language.

cuda curand d derelict dlang

Last synced: 27 Mar 2025

https://github.com/drilonaliu/parallel-mandelbrot-set

GPU-accelerated Mandelbrot Set generation with CUDA and OpenGL interoperability.

cuda fractals gpu mandelbrot-fractal parallel-programming

Last synced: 12 Apr 2026

https://github.com/juliankarrer/reyn

CUDA-based Implementation of Smoothed Particle Hydrodynamics for Fluid Simulation

cuda fluid lagrangian simulation sph

Last synced: 02 Jul 2026

https://github.com/TeamBipartite/bipartite-gemm

High throughput data-parallel GEMM implementations in Cuda using Cuda cores and Tensor cores

cuda data-parallelism gemm

Last synced: 14 Jan 2026

https://github.com/sephiroth7712/k-nearest-neigbours

Implementation of K-Nearest Neighbors algorithm using multiple parallel computing approaches: CUDA (GPU), Hadoop, Spark, MPI, OpenMP, and PThreads. Demonstrates scalable machine learning across different parallel computing paradigms from GPU to distributed frameworks.

cuda cuda-programming hadoop-mapreduce java mpi multiprocessing multithreading openmp pthreads scala spark

Last synced: 12 Apr 2026

https://github.com/tomtolleson/cuda-kernel-benchmarking-tool

A benchmarking tool in C++ that creates Cuda kernels and tests the overall system performance between CPU and GPU

cuda cuda-kernels cuda-support cuda-toolkit nvidia nvidia-cuda nvidia-gpu

Last synced: 30 Mar 2025

https://github.com/ne0nwinds/gpupuzzles

My solutions to srush/GPU-Puzzles using CUDA

cpp cuda gpgpu

Last synced: 16 May 2026

https://github.com/xza85hrf/flux_pipeline

FluxPipeline is a prototype experimental project that provides a framework for working with the FLUX.1-schnell image generation model. This project is intended for educational and experimental purposes only.

ai cuda docker educational experimental flux1 flux1-schnell flux1ai gradio image-generation model non-commercial python pytorch research transformer-model

Last synced: 05 Jul 2025

https://github.com/doxakis/cosinesimilaritydistancesongpu

Compute cosine similarity distances for all combinations of the dataset on the gpu with CUDA

cuda

Last synced: 13 Apr 2026

https://github.com/deep-1704/coa_lab_repo_grp01

COA Lab assignments

cuda gpgpu-sim

Last synced: 24 Dec 2025

https://github.com/githubfoam/cuda-travisci

cuda miniconda pytorch

cuda miniconda pytroch

Last synced: 02 Jul 2026

https://github.com/ysl1016/cudadigitfilter

CUDA-based parallel image filtering system for MNIST dataset

computer-vision cuda deep-learning gpu-acceleration image-processing mnist parallel-computing

Last synced: 28 Mar 2025

https://github.com/abhiksark/gluon-by-example

Learn Triton's Gluon by example — the same GPU kernels written in Triton and Gluon, benchmarked

cuda deep-learning gluon gpu gpu-kernels triton tutorial

Last synced: 01 Jul 2026

https://github.com/eyelor/text-to-image-item-generator

A Python workflow for generating random item images using models from Hugging Face.

ai conda cuda flux-schnell generator huggingface item llama python pytorch text-to-image

Last synced: 13 Apr 2026

https://github.com/axeloooo/pytorch

Collection of deep learning workflows in PyTorch, from fundamentals and classification to transfer learning and experiment tracking.

cuda python pytorch

Last synced: 28 Apr 2026

https://github.com/ltsyk/smart-snake-ai

Advanced Deep Q-Network AI for Snake Game with CUDA support and 700% performance boost

artificial-intelligence cuda deep-q-network dqn game-ai machine-learning pytorch reinforcement-learning snake-game

Last synced: 28 Apr 2026

https://github.com/asadiahmad/100_sports_image_classification

A deep learning project for sport image classification using a custom VGG19-based architecture with integrated Grad-CAM heatmap visualization for model interpretability.

computer-vision cuda data-augmentation deep-learning explainable-ai gpu-acceleration grad-cam heatmap-visualization image-classification mixed-precision-training pytorch pytorch-grad-cam sports-analytics sports-classification transfer-learning vgg19

Last synced: 11 Jun 2025

https://github.com/kronbii/thermal-super-resolution

State-of-the-art thermal super-resolution system (IMDN) with RGB→thermal adaptation, custom multi-component loss, 29.6 dB PSNR, 0.713 SSIM, 250+ FPS, production-ready PyTorch + CUDA implementation.

computer-vision cuda deep-learning image-enhancement imdn model-optimization production-machine-learning pytorch real-time real-time-processing research super-resolution thermal-imaging

Last synced: 18 Apr 2026

https://github.com/tornikeo/minimal-vscode-cuda-meson

Minimal sample of using VSCode and Meson to build CUDA applications

cuda meson template vscode

Last synced: 08 Sep 2025

https://github.com/morristai/kvik-rs

KvikIO Rust implementation

cuda cufile gds kvikio nvidia rust

Last synced: 02 Apr 2026

https://github.com/aeyage/intraday_prices

GPU-accelerated portfolio optimisation

cuda cupy nvidia-gpu

Last synced: 05 Apr 2025

https://github.com/voltr0x/raytracing-cuda

Raytracing in a weekend using CUDA

cpp11 cuda raytracing sdl2

Last synced: 01 Apr 2026

https://github.com/codename-detective/cuda_gpgpus_shared_memory_systems_pdp

CUDA GPGPUs Shared Memory Systems Parallel & Distributed Programming

cuda cuda-programming numa parallel-programming

Last synced: 30 Mar 2025

https://github.com/kataglyphis/machinelearningalgorithms

Basic Machine Learning Algorithms

cuda machine-learning python tensorflow

Last synced: 31 Mar 2025

https://github.com/Neuro-Mechatronics-Interfaces/python-intan

Tools and demos for working with EMG data from intan using python

circuitpython cuda emg pico python realtime tensorflow

Last synced: 13 Jan 2026

https://github.com/eastonman/tensorrt-pytorch-wrapper

A wrapper makes TensorRT engine accept PyTorch Cuda Tensor.

cuda pytorch tensorrt

Last synced: 06 May 2026

https://github.com/efecaliskannn/pneumonia-detection-with-cnn--vgg16--and-resnet50-deep-learning-models

In this project, pneumonia detection using deep learning, a subset of artificial intelligence, is aimed. The performance of deep learning algorithms, including CNN, VGG16, and ResNet50 models, in detecting pneumonia has been examined.(Bu projede yapay zekanın alt kümesi olan derin öğrenme ile zatürre tespiti amaçlanmaktadır.)

artificial-intelligence convolutional-neural-networks cuda deep-learning keras-tensorflow nvidia-cuda pyhton transfer-learning

Last synced: 13 Jun 2025

https://github.com/shtrophic/wicuvanity

Generate wireguard vanity keys on your Nvidia GPU

cuda gpu vanity-address vanity-addresses vanitygen wireguard

Last synced: 10 Mar 2025

https://github.com/lablup/backend.ai-accelerator-cuda

The Backend.AI CUDA Accelerator Plugin

backendai cuda

Last synced: 16 May 2026

https://github.com/cuda8/brainwords2

GPU brainflayer for sale $250

brain brainflayer brainwords cuda gpu key pass passphrase private

Last synced: 10 Mar 2025

https://github.com/elcruzo/cuda-conv

Lightweight CUDA kernel for 2D image convolution achieving 20x+ speedup. Built with CuPy for the NVIDIA Hackathon.

computer-vision convolution cuda cupy gpu-computing hackathon high-performance-computing image-processing nvidia python

Last synced: 15 May 2026

https://github.com/drbh/quemer

GPU accelerated k-mer counter

biology cuda gpu

Last synced: 07 May 2025

https://github.com/ionmich/cs149-local-dev

Provides `conda` installation instructions for Stanford's CS149 (Parallel Computing) programming assignments

conda cs149 cuda ispc parallel-computing

Last synced: 31 Mar 2025

https://github.com/ndgigliotti/torch-ipca

GPU-accelerated Incremental PCA for PyTorch

cuda dimensionality-reduction gpu incremental-pca machine-learning pca pytorch

Last synced: 26 Jan 2026

https://github.com/mrgkanev/tensorflow-gpu-docker-setup

A Docker environment for TensorFlow GPU development with optimized configurations for WSL2, troubleshooting guides, and common error fixes

cuda cuda-toolkit deep-learning dev-environment development-tools docker gpu-acceleration machine-learning nvidia-docker nvidia-docker-support python tensorflow

Last synced: 13 Apr 2026

https://github.com/moesio-f/cla

C Linear Algebra (CLA) library. A simple toy library for basic vector/matrix operations with CUDA support and Python bindings.

c cuda linear-algebra python

Last synced: 09 May 2026

https://github.com/minseoc03/cuda-100-days

A 100-day journey to master CUDA programming, inspired by the CUDA-120-DAYS--CHALLENGE project. This repo contains daily CUDA exercises and code folders, with learning notes hosted on Notion. Practicing on leetgpu.com due to lack of local NVIDIA GPU.

100daysofcode cuda deeplearning gpgpu gpu hpc nvidia parallel-computing

Last synced: 19 Apr 2025

https://github.com/hrshl212/custom-cuda-kernels-with-neural-network-implementation

The repository contains custom CUDA kernels for linear layer, softmax and relu which are integrated with python to develop a Neural Network

cuda neural-network python pytorch

Last synced: 08 May 2026

https://github.com/parxd/cuda-optim

optimizing CUDA kernels

cuda machine-learning

Last synced: 26 Mar 2025

https://github.com/atelierarith/julia_gpu_playground

For those who want use Julia with GPU

cuda docker docker-compose julia

Last synced: 28 Apr 2026

https://github.com/bolu-atx/cuda-dojo

Level up your CUDA skills - RPG style. Do or do not, there is no try.

cuda examples learning tutorial

Last synced: 01 Jul 2026

https://github.com/himeyama/cuda-convolve

convolve + cuda + ruby (1次元のみ対応)

cuda filter gem ruby

Last synced: 19 Apr 2026

https://github.com/mattjesc/federated-learning-simulation-1gpu-mi-is

Federated Learning Simulation on a Single GPU with Model Interpretability and Interactive Visualization

ai cuda deep-learning distributed-systems federated-learning gpu hpc keras machine-learning ml model-interpretability python pytorch simulation streamlit tensorflow

Last synced: 05 Jan 2026

https://github.com/lord-turmoil/cudacmakedemo

A demo for building CUDA program with CMake

cuda tutorial

Last synced: 16 Mar 2025

https://github.com/delusionary/histoptimizer

Solves a minimum variance cost of the partition problem.

cuda numba python

Last synced: 14 Jan 2026

https://github.com/dgcnz/nvtx-vscode

Create NVIDIA NVTX ranges directly in VS Code, then profile with Nsight Systems without modifying source code.

cuda nvtx pytorch vscode

Last synced: 13 Apr 2026