Projects in Awesome Lists tagged with cuda-programming
A curated list of projects in awesome lists tagged with cuda-programming.
https://github.com/taskflow/taskflow
A General-purpose Task-parallel Programming System using Modern C++
concurrent-programming cuda-programming gpu-programming heterogeneous-parallel-programming high-performance-computing multi-threading multicore-programming multithreading parallel parallel-computing parallel-programming taskflow taskparallelism threadpool work-stealing
Last synced: 12 Jan 2026
https://github.com/rust-gpu/rust-cuda
Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.
cuda cuda-kernels cuda-programming gpgpu gpu gpu-programming rust rust-lang
Last synced: 14 May 2025
https://github.com/Rust-GPU/Rust-CUDA
Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.
cuda cuda-kernels cuda-programming gpgpu gpu gpu-programming rust rust-lang
Last synced: 27 Mar 2025
https://github.com/xlite-dev/cuda-learn-notes
📚Modern CUDA Learn Notes: 200+ Tensor/CUDA Cores Kernels🎉, HGEMM, FA2 via MMA and CuTe, 98~100% TFLOPS of cuBLAS/FA2.
cuda cuda-kernels cuda-programming cuda-toolkit cudnn cutlass flash-attention flash-mla gemm gemv hgemm
Last synced: 15 Apr 2025
https://github.com/xlite-dev/CUDA-Learn-Notes
📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).
cuda cuda-kernels cuda-programming cuda-toolkit cudnn cutlass flash-attention flash-mla gemm gemv hgemm
Last synced: 26 Mar 2025
https://github.com/DefTruth/CUDA-Learn-Notes
📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).
cuda cuda-kernels cuda-programming cuda-toolkit cudnn cutlass flash-attention flash-mla gemm gemv hgemm
Last synced: 20 Mar 2025
https://github.com/nvidia/cccl
CUDA Core Compute Libraries
accelerated-computing cpp cpp-programming cuda cuda-cpp cuda-kernels cuda-library cuda-programming gpu gpu-acceleration gpu-computing gpu-programming hpc modern-cpp nvidia nvidia-gpu parallel-algorithm parallel-computing parallel-programming
Last synced: 05 Feb 2026
https://github.com/NVIDIA/cccl
CUDA Core Compute Libraries
accelerated-computing cpp cpp-programming cuda cuda-cpp cuda-kernels cuda-library cuda-programming gpu gpu-acceleration gpu-computing gpu-programming hpc modern-cpp nvidia nvidia-gpu parallel-algorithm parallel-computing parallel-programming
Last synced: 14 May 2025
https://github.com/brucefan1983/CUDA-Programming
Sample codes for my CUDA programming book
cuda-programming gpu-programming molecular-dynamics-simulation
Last synced: 14 May 2025
https://github.com/chelsea0x3b/cudarc
Safe Rust wrapper around the CUDA toolkit
cublas cuda cuda-kernels cuda-programming cuda-toolkit cudnn curand gpu gpu-acceleration nccl nvrtc rust
Last synced: 09 Feb 2026
https://github.com/eyalroz/cuda-api-wrappers
Thin, unified, C++-flavored wrappers for the CUDA APIs
api-wrapper cuda cuda-api-wrappers cuda-device cuda-driver cuda-driver-api cuda-programming cuda-runtime-api cuda-toolkit gpgpu gpgpu-computing gpu gpu-computing gpu-memory modern-cpp
Last synced: 22 Jan 2026
https://github.com/mit-han-lab/tinychatengine
TinyChatEngine: On-Device LLM Inference Library
arm c cpp cuda-programming deep-learning edge-computing large-language-models on-device-ai quantization x86-64
Last synced: 13 May 2025
https://github.com/harleyszhang/llm_note
LLM notes, including model inference, transformer model structure, and llm framework code analysis notes.
cuda-programming kv-cache llm llm-inference transformer-models triton-kernels vllm
Last synced: 23 Aug 2025
https://github.com/sail-sg/adan
Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
adan artificial-intelligence bert-model convnext cuda-programming deep-learning diffusion dreamfusion fairseq gpt2 llm-training llms mae moe optimizer pytorch resnet timm transformer-xl vit
Last synced: 07 Jul 2025
https://github.com/sail-sg/Adan
Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
adan artificial-intelligence bert-model convnext cuda-programming deep-learning diffusion dreamfusion fairseq gpt2 llm-training llms mae moe optimizer pytorch resnet timm transformer-xl vit
Last synced: 05 Apr 2025
https://github.com/mit-han-lab/TinyChatEngine
TinyChatEngine: On-Device LLM Inference Library
arm c cpp cuda-programming deep-learning edge-computing large-language-models on-device-ai quantization x86-64
Last synced: 07 May 2025
https://github.com/laugh12321/tensorrt-yolo
🚀 Your YOLO Deployment Powerhouse. With the synergy of TensorRT Plugins, CUDA Kernels, and CUDA Graphs, experience lightning-fast inference speeds.
cuda cuda-graph cuda-kernels cuda-programming detection onnx ppyoloe tensorrt yolov10 yolov3 yolov5 yolov6 yolov7 yolov8 yolov9
Last synced: 14 May 2025
https://github.com/laugh12321/TensorRT-YOLO
🚀 Your YOLO Deployment Powerhouse. With the synergy of TensorRT Plugins, CUDA Kernels, and CUDA Graphs, experience lightning-fast inference speeds.
cuda cuda-graph cuda-kernels cuda-programming detection onnx ppyoloe tensorrt yolov10 yolov3 yolov5 yolov6 yolov7 yolov8 yolov9
Last synced: 18 Mar 2025
https://github.com/PaddleJitLab/CUDATutorial
A self-study tutorial for CUDA high-performance programming.
cuda-programming deep-learning
Last synced: 14 May 2025
https://github.com/nosferalatu/simplegpuhashtable
A simple GPU hash table implemented in CUDA using lock free techniques
cuda cuda-programming data-structures gpu gpu-cuda-programs
Last synced: 27 Dec 2025
https://github.com/nosferalatu/SimpleGPUHashTable
A simple GPU hash table implemented in CUDA using lock free techniques
cuda cuda-programming data-structures gpu gpu-cuda-programs
Last synced: 06 May 2025
https://github.com/HMUNACHI/CUDATutorials
Zero to Hero GPU and CUDA for Maths & ML tutorials with examples.
cuda cuda-kernels cuda-programming machine-learning maths
Last synced: 24 Apr 2025
https://github.com/HMUNACHI/henry-vjp
From zero to hero CUDA for accelerating maths and machine learning on GPU.
cuda cuda-kernels cuda-programming machine-learning maths
Last synced: 05 Apr 2025
https://github.com/hmunachi/henry-vjp
From zero to hero CUDA for accelerating maths and machine learning on GPU.
cuda cuda-kernels cuda-programming machine-learning maths
Last synced: 08 Apr 2025
https://github.com/HMUNACHI/cuda-tutorials
CUDA tutorials for Maths & ML with examples, covering multi-GPU, fused attention, Winograd convolution, and reinforcement learning.
cuda cuda-kernels cuda-programming machine-learning maths
Last synced: 13 May 2025
https://github.com/MuGdxy/muda
μ-Cuda, COVER THE LAST MILE OF CUDA. With features: intellisense-friendly, structured launch, automatic cuda graph generation and updating.
cuda cuda-cpp cuda-programming
Last synced: 09 Jul 2025
https://github.com/rocm/hip-cpu
An implementation of HIP that works on CPUs, across OSes.
cpp17 cuda cuda-programming hip hip-kernel-language hip-portability hip-runtime parallel-algorithms spmd stl-algorithms
Last synced: 12 Apr 2025
https://github.com/tgautam03/xgemm
Accelerated General (FP32) Matrix Multiplication from scratch in CUDA
cuda-programming gpu-programming matrix-multiplication sgemm
Last synced: 06 Apr 2025
https://github.com/sunsetquest/cudapad
CudaPAD is a PTX/SASS viewer for NVIDIA CUDA kernels that provides an on-the-fly view of the assembly.
cuda cuda-programming gpu nvidia ptx ptx-utils windows
Last synced: 25 Jul 2025
https://github.com/fahimfba/cuda-wsl2-ubuntu
Install CUDA on Windows 11 using WSL2
cuda cuda-programming cuda-support cuda-toolkit cuda-wsl deep-learning deep-reinforcement-learning deeplearning deeplearning-ai machine-learning machinelearning machinelearning-python wsl wsl-environment wsl-ubuntu wsl2
Last synced: 14 Apr 2025
https://github.com/emptysoal/cuda-image-preprocess
Speed up image preprocessing with CUDA when handling images or running TensorRT inference
cnn cuda cuda-demo cuda-kernels cuda-programming deep-learning image-processing tensorrt
Last synced: 01 Aug 2025
https://github.com/huangcongqing/cuda-learning
An introduction to learning CUDA programming
cuda cuda-kernels cuda-programming
Last synced: 15 Apr 2025
https://github.com/LinhanDai/yolov9-tensorrt
YOLOv9 TensorRT deployment acceleration, providing two implementation methods: C++ and Python 🔥🔥🔥
cpp cuda-programming python tensorrt yolov9
Last synced: 18 Mar 2025
https://github.com/coderonion/cuda-beginner-course-cpp-version
Companion code for the bilibili video "Introduction to CUDA 12.x Parallel Programming (C++ Edition)"
cpp cublas cuda cuda-programming cudnn gpu gpu-programming nvcc nvidia parallel-programming python rust
Last synced: 15 Jun 2025
https://github.com/ashvardanian/cuda-python-starter-kit
Parallel Computing starter project to build GPU & CPU kernels in CUDA & C++ and call them from Python without a single line of CMake using PyBind11
cmake cuda cuda-programming hip hpc matrix-multiplication openmp parallel-computing parallel-programming pybind pybind11 python starter-kit starter-template tutorial
Last synced: 13 Jul 2025
https://github.com/Lin-Mao/DrGPUM
A memory profiler for NVIDIA GPUs to explore memory inefficiencies in GPU-accelerated applications.
cuda-programming gpu-memory gpu-memory-profiler gpu-profiler memory-management
Last synced: 22 Jul 2025
https://github.com/koushikphy/intro-to-cuda-fortran
A complete beginner's introduction to programming with CUDA Fortran
cuda cuda-fortran cuda-kernels cuda-programming fortran fortran90 gpgpu gpu gpu-computing high-performance-computing hpc nvidia nvidia-cuda parallel-computing parallel-programming
Last synced: 28 Oct 2025
https://github.com/ahmetfurkandemir/nvidia-gpu-benchmark
NVIDIA GPU benchmark
aws c colab-notebook cpp cuda cuda-programming gpu gpu-computing gpu-programming linux nvidia nvidia-gpu tesla
Last synced: 15 Apr 2025
https://github.com/yichengdwu/flashattention.jl
Julia implementation of the Flash Attention algorithm
cuda-programming deeplearning transfomers
Last synced: 17 Jun 2025
https://github.com/imsanjoykb/cuda-bootcamp
CUDA Programming Practices
computer-vision crypto-mining crypto-mining-program cuda cuda-api cuda-development cuda-device cuda-driver cuda-kernels cuda-library cuda-opengl cuda-programming cuda-resource cuda-support cuda-toolkit jetson jetson-inference jetson-xavier nvidia-cuda nvidia-jetson-nano
Last synced: 05 Jul 2025
https://github.com/rrze-hpc/md-bench
A performance-oriented prototyping harness for state of the art Molecular Dynamics algorithms
benchmark cuda-programming hpc molecular-dynamics scientific-computing
Last synced: 24 Apr 2025
https://github.com/tgautam03/tgemm
General Matrix Multiplication using NVIDIA Tensor Cores
cuda-kernels cuda-programming gpu-computing gpu-programming matrix-multiplication nvidia-cuda nvidia-gpu nvidia-tensor-cores sgemm tensor-cores
Last synced: 15 Apr 2025
https://github.com/littlebearsama/xxCu3Dlibrary
CUDA-accelerated 3D point cloud algorithm library, continuously updated (includes cudaicp, GLFW point cloud visualization, etc.)
cuda-programming glfw3 pointcloud
Last synced: 18 Mar 2025
https://github.com/l3lackcurtains/dbscan-kdtree-cuda
🍟 Massively parallel DBSCAN algorithm implemented in CUDA along with a KD-Tree for searching neighbors.
cuda-programming dbscan kd-tree
Last synced: 18 Aug 2025
https://github.com/minnukota381/cuda-parallel-c-programming
This repository contains various CUDA C programs demonstrating parallel computing techniques using NVIDIA's CUDA platform.
cuda cuda-programming hpc nvcc nvidia
Last synced: 30 Jun 2025
https://github.com/professorcode1/event-analysis
Library for Event Synchronization and Event Coincidence Analysis
cuda cuda-kernels cuda-library cuda-programming event-analysis event-coincidence event-coincidence-analysis event-series event-series-analysis event-synchronization time-series-analysis
Last synced: 24 Oct 2025
https://github.com/nssharmaofficial/kmeans-in-cuda
K-Means algorithm parallelized in CUDA
cpp cuda cuda-programming high-performance high-performance-computing k-means k-means-algorithm k-means-clustering parallel parallel-computing
Last synced: 27 Apr 2025
https://github.com/tgautam03/xfilters
GPU (CUDA) accelerated filters using 2D convolution for high resolution images.
2d-convolution c cpp cuda cuda-programming gpu-acceleration gpu-computing gpu-programming image-filters image-processing
Last synced: 10 Oct 2025
https://github.com/neoblizz/hip_template
🖤 Template for starting HIP/C++ project using CMake with Github Action for CI.
cpp cuda cuda-programming gpgpu gpu hip rocm template-project template-repository
Last synced: 26 Mar 2025
https://github.com/seieric/gst-dsobjectsmosaic
📀NVIDIA DeepStream integrated GStreamer Plugin. It can blur objects with cuda cores on Jetson boards. Fast and smooth since everything is done on NVMM.🏎
cuda-programming deepstream gstreamer gstreamer-plugins jetson-agx-orin jetson-agx-xavier jetson-tx1 jetson-tx2 jetson-xavier jetson-xavier-nx nvidia-jetson nvidia-jetson-nano opencv opencv4
Last synced: 15 Jul 2025
https://github.com/coderonion/cuda-beginner-course-python-version
Companion code for the bilibili video "Introduction to CUDA 12.x Parallel Programming (Python Edition)"
cpp cublas cuda cuda-programming cudnn cupy gpu gpu-programming nvcc nvidia parallel-programming python rust
Last synced: 19 Oct 2025
https://github.com/coderonion/cuda-beginner-course-rust-version
Companion code for the bilibili video "Introduction to CUDA 12.x Parallel Programming (Rust Edition)"
candle cpp cublas cuda cuda-programming cudarc cudnn gpu gpu-programming nvcc nvidia parellel-programming python rust
Last synced: 15 Jun 2025
https://github.com/evanmcclure/hello_gpu
Hello world example for Rust on GPU
apple apple-silicon cuda cuda-programming example-project gpu gpu-programming gpu-support metal rust rust-lang
Last synced: 12 Apr 2025
https://github.com/gpuengineering/gputils
A C++ header-only library for parallel linear algebra on GPUs (CUDA/cuBLAS under the hood)
cplusplus-17 cplusplus-20 cpp cuda cuda-c cuda-cpp cuda-programming header-only linear-algebra
Last synced: 13 Aug 2025
https://github.com/yashkathe/image-noise-reduction-with-cuda
This project analyzes the median blur image denoising technique, comparing GPU-accelerated (Numba) and CPU-based (OpenCV) processing speeds.
cuda cuda-programming gpu-programming hardware-speed-analysis image-analysis image-processing numba nvidia nvidia-cuda nvidia-gpu opencv parallel-programming
Last synced: 14 May 2025
https://github.com/lawmurray/gpu-gemm
CUDA kernel for matrix-matrix multiplication on Nvidia GPUs, using a Hilbert curve to improve L2 cache utilization.
cplusplus cuda cuda-kernels cuda-programming gpu gpu-computing gpu-programming matrix-multiplication numerical-methods scientific-computing
Last synced: 14 Apr 2025
https://github.com/hrolive/fundamentals-of-accelerated-computing-with-cuda-c-cpp
Accelerate and optimize existing C/C++ CPU-only applications using the most essential CUDA tools and techniques.
cpp cuda cuda-kernels cuda-programming nsight nvidia profilling
Last synced: 10 Apr 2025
https://github.com/jaredhoberock/ubu
circlelang cuda cuda-programming gpu-computing gpu-programming
Last synced: 19 Jan 2026
https://github.com/marcoplaitano/counting-sort-cuda
Parallelized version of Counting Sort using CUDA
counting-sort cuda cuda-kernels cuda-programming gpu gpu-programming sort sorting sorting-algorithms
Last synced: 20 Jun 2025
https://github.com/toxy4ny/artaxerxes
Artaxerxes - Adaptive High-Performance Stress Tester v.1.0. A rebuild of the older Xerxes DDoS tool. Supports GPU+io_uring, DPDK, and eBPF/XDP with intelligent fallbacks. Educational tool for advanced cybersecurity labs.
cuda cuda-programming cybersecurity cybersecurity-education cybersecurity-tools dpdk ebpf educational high-performance network-security network-security-tool penetration-testing penetration-testing-framework penetration-testing-tools security-tools stress-testing
Last synced: 08 Oct 2025
https://github.com/shikha-code36/cuda-programming-beginner-guide
A beginner's guide to CUDA programming
cuda cuda-basic cuda-basics cuda-cpp cuda-demo cuda-kernel cuda-kernels cuda-library cuda-programming cuda-support cuda-toolkit
Last synced: 05 Jan 2026
https://github.com/qin-yu/julia-svm-gpu-cuda
2019 [Julia] GPU CUDAnative SVM: a stochastic decomposition implementation of support-vector machine training
cpp cuda cuda-programming gpu gpu-computing gpu-programming julia julia-language julia-package machine-learning machine-learning-algorithms machine-learning-library online-learning supervised-learning svm svm-classifier svm-learning svm-library svm-model svm-training
Last synced: 18 Oct 2025
https://github.com/l1cacheDell/CUDA_Code
Codes for learning cuda. Implementation of multiple kernels.
Last synced: 10 Mar 2025
https://github.com/orlandopalmeira/trabalho-cp-2023-2024
Repository for the practical assignment of the Parallel Computing (CP) course unit - Master's in Informatics Engineering (MEI/MIEI) - Universidade do Minho (UMinho)
computacao-paralela cp cuda cuda-programming mei miei nvidia nvidia-cuda openmp optimization optimization-problem parallelism performance uminho uminho-mei uminho-miei
Last synced: 20 Mar 2025
https://github.com/alexjmercer/fractal-art
Generating Fractals in C++ using SFML. For the ultimate visual stimulation and in-depth code!
cmake cmakelists cpp20 cuda cuda-programming fractal-rendering graphics mandelbrot multithreading sfml2
Last synced: 13 Sep 2025
https://github.com/babak2/optimizedsum
Optimized Parallel Sum program demonstrating CPU vs GPU performance
cuda cuda-programming gpu-acceleration gpu-computing gpu-parallelism visual-studio
Last synced: 27 Mar 2025
https://github.com/Awrsha/Advanced-CUDA-Programming-GPU-Architecture
This repository provides a comprehensive guide to optimizing GPU kernels for performance, with a focus on NVIDIA GPUs. It covers key tools and techniques such as CUDA, PyTorch, and Triton, aimed at improving computational efficiency for deep learning and scientific computing tasks.
cuda-programming gpu-programming jit kernels matmul mojo-language multiprocessing multithreading torchquantum triton
Last synced: 19 Sep 2025
https://github.com/pastekaztekastor/crowd-simulation
The project is a crowd simulation on a grid, with parallelized versions running on the graphics card. The goal is to model the movement of individuals in an environment using parameters such as the grid dimensions and the number of individuals, and to export the result of each frame to a binary file for analysis.
c cmake cpp crowdsimulation cuda-programming graphicscard grid-layout ipynb make nvidia-gpu parallelization
Last synced: 02 Mar 2025
https://github.com/pei-mao/cuda-erodeanddilate
This is a small implementation example of image processing using CUDA technology, demonstrating basic operation methods.
cuda-programming image-processing
Last synced: 25 Jul 2025
https://github.com/cat-gawr/ai-python
A small AI whose best response time was 0.02 seconds | Konata ~ 2025
cpp cuda-programming golang java python3 tex vhdl-modules
Last synced: 16 Jun 2025
https://github.com/saiccoumar/cuda-programming-exercises
Brief collection of GPU exercises (my reimplementation). Comes with relevant resources.
cuda cuda-programming nvcc nvidia
Last synced: 25 Dec 2025
https://github.com/loukmane-lok/cuda-prgoramming-basics
This repository contains CUDA-based implementations of several parallel computing algorithms and operations, focusing on high-performance GPU computations using NVIDIA's CUDA framework.
cuda-programming gpu-computing kernel-computation parallel-programming
Last synced: 23 Aug 2025
https://github.com/alextmjugador/rust-cuda-quickstart
Bring the Rust-CUDA project back to life under modern Linux environments.
cuda cuda-programming cuda-rust cuda-support docker rust
Last synced: 06 Jul 2025
https://github.com/satyajitghana/gpu-programming
Contains the contents of GPU Architecture and Programming course done on NPTEL
c cpp cuda cuda-programming gpu-programming nptel nvidia
Last synced: 05 May 2025
https://github.com/liberxue/parallel_computing
CUDA Algorithm && Hacker's Delight
algorithms cuda cuda-kernels cuda-programming hacker-s-delight nvidia
Last synced: 20 Feb 2025
https://github.com/GCaptainNemo/Cuda-Image-Processing
Using CUDA GPU Programming to speed up image processing.
cuda-programming image-processing
Last synced: 20 Mar 2025
https://github.com/vietdoo/seam-carving-cuda
CUDA Seam Carving: Accelerating Image Resizing with GPU Computing
cc cuda cuda-programming gpu-computing parrallel-computing seam-carving
Last synced: 31 Mar 2025
https://github.com/jadenmeyer/fourier-fft-project
Documentation of final project for Fourier Analysis
cuda-programming fft heat-equ matlab
Last synced: 22 Mar 2025
https://github.com/dpetrosy/fractal
This project is a Fractal Visualizer developed in C++ with SFML and CUDA.
burning-ship cmake cmakelists cpp cpp-programming cpp-project cuda cuda-opengl cuda-programming fractal fractal-generation fractal-visualization julia mandelbox mandelbrot opengl opengl-project sfml sfml-library tricorn
Last synced: 15 Oct 2025
https://github.com/enriquebdel/clases-cuda-programacion-paralela-en-c-
In this repository you will find several lessons I created on the CUDA library in C. The program I use for programming is MobaXterm.
c cuda cuda-programming gnu-linux googlecolab mobaxterm nvidia parallel-programming ubuntu university
Last synced: 21 Mar 2025
https://github.com/kartavyaantani/cuda_image_processing
A CUDA-accelerated image processing project featuring multiple GPU-based filters and enhancement techniques. Implements convolution, edge detection, Non-Local Means (NLM) denoising, K-Nearest Neighbors (KNN), and pixelization. Each operation is optimized using CUDA kernels for real-time performance on large images. The project supports command-line
cuda cuda-kernels cuda-programming cuda-toolkit gpu-programming high-performance-computing image-manipulation image-processing nvidia-cuda nvidia-gpu
Last synced: 19 Apr 2025
https://github.com/giorgiogamba/parallel_programming
Experimenting with parallel programming
cuda cuda-kernels cuda-programming cuda-toolkit parallel parallel-computing parallel-processing parallel-programming visual-studio
Last synced: 10 Oct 2025
https://github.com/sartajbhuvaji/cuda
Developed CUDA kernel functions to load and train a Convolutional Neural Network from scratch.
cuda cuda-programming gpu-programming neural-network nvidia-cuda
Last synced: 30 Mar 2025
https://github.com/tommaso-dognini/polimi_gpu101_courseproject
Polimi Passion In Action GPU101 course project. Implementation in CUDA of BFS algorithm
cpp cuda cuda-programming parallel-computing
Last synced: 01 Nov 2025
https://github.com/headless-start/fashion-mnist-classifier
This repository contains Fashion MNIST Image Classification.
cuda-programming gpu keras mnist-dataset object-detection opencv-python python3 tensorflow tensorflow-models
Last synced: 13 Oct 2025
https://github.com/jorgedavyd/nsight.nvim
A developer oriented Neovim framework for CUDA performance profiling and analysis.
cuda cuda-kernels cuda-profiler cuda-programming cuda-support cuda-toolkit deep-learning machine-learning neovim neovim-plugin performance-engineering
Last synced: 21 Mar 2025
https://github.com/djdhairya/nut-bolt-classification
The "NutBoltClassifier" system represents a significant leap forward in automated fastener classification, harnessing deep learning and computer vision techniques.
aritificial-intelligence cnn cuda-programming deep-learning machine-learning nvidia-gpu rnn tensorflow
Last synced: 13 Sep 2025
https://github.com/gravitytwog/electromagneticfield
Electro-magnetic field simulation made with CUDA
c cuda cuda-kernels cuda-programming
Last synced: 14 Apr 2025
https://github.com/nrmancuso/big-bang
CUDA and OpenMP N-body simulation based on data from the Milky Way and Andromeda galaxies
c cuda-kernels cuda-programming nbody-simulation openmp-parallelization parallel-computing space
Last synced: 13 Jun 2025
https://github.com/m15kh/cuda_programming
CUDA programming enables parallel computing on NVIDIA GPUs for high-performance tasks like deep learning and scientific computing
cuda cuda-programming gpu nvidia parallel-computing practice-programming
Last synced: 03 Apr 2025
https://github.com/inventwithdean/cuda_mlp
Implementation of a simple Multilayer Perceptron in pure CUDA
cuda cuda-programming deep-learning neural-networks
Last synced: 30 Mar 2025
https://github.com/binarybrainiacs/nexly
Deep Tech R&D Research
artificial-intelligence cpp20 cuda cuda-kernels cuda-programming deep-learning deep-neural-networks experimental hpc-systems infrastructure java maven natural-language-processing reccomendation-system research-and-development visual-studio
Last synced: 08 Apr 2025
https://github.com/0x778/gaussian_filter_using_cuda
Implementation of a Gaussian filter using CUDA
cuda cuda-kernels cuda-programming image-processing
Last synced: 09 Apr 2025
https://github.com/chibby0ne/cuda_by_example
Old notes (and new ones) on the CUDA by Example book
cuda cuda-programming gpgpu gpu-computing gpu-programming
Last synced: 20 Feb 2025
https://github.com/ibra-kdbra/parallel-programming
My research, playground, techniques with Parallel Programming
access-control cpp cpu-scheduling cuda-programming exception-handling multithreading parallel-computing parallel-programming task-scheduler
Last synced: 02 Feb 2026
https://github.com/aarid/cuda_operations
This project compares CPU and GPU performance on operations implemented with CUDA. Two simple cases are used: matrix multiplication and 2D convolution.
conv2d cuda cuda-programming gpu gpu-computing matrix-multiplication
Last synced: 20 Feb 2025
https://github.com/sueszli/julia-gone-wild
parallel rendering of julia sets with CUDA and OpenMP
cuda-programming fractal-rendering julia-set openmp-parallelization
Last synced: 05 Oct 2025
https://github.com/yash-1335/qwen600
🚀 Build a fast inference engine for the QWEN3-0.6B model using CUDA, optimizing performance with minimal dependencies for efficient learning and practice.
cuda cuda-programming gpu llamacpp llm llm-inference qwen qwen3 transformer
Last synced: 07 Oct 2025
https://github.com/shivamiitk21/k-medoids-parallel
Algorithms for K-Medoids Clustering
cuda-programming kmedoids-clustering parallel
Last synced: 27 Jan 2026