An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with gpu-programming

A curated list of projects in awesome lists tagged with gpu-programming.

https://github.com/taichi-dev/taichi

Productive, portable, and performant GPU programming in Python.

computer-graphics differentiable-programming gpu gpu-programming sparse-computation taichi

Last synced: 12 May 2025

https://github.com/exaloop/codon

A high-performance, zero-overhead, extensible Python compiler with built-in NumPy support

compiler gpu-programming high-performance llvm numpy parallel-programming python

Last synced: 13 May 2025

https://github.com/plasma-umass/scalene

Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals

cpu cpu-profiling gpu gpu-programming memory-allocation memory-consumption performance-analysis performance-cpu profiler profiles-memory profiling python python-profilers scalene

Last synced: 31 Jan 2026

https://github.com/qianmo/game-programmer-study-notes

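:anchor: A collection of reading notes from my career as a game programmer. You can think of it as an enhanced blog. Covers computer graphics, real-time rendering, programming practice, GPU programming, design patterns, software engineering, and more. Keep Reading, Keep Writing, Keep Coding.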

blog book books cg ebook ebooks game-developing-notes game-development game-programmer game-programming gpu-programming graphics notes real-time-rendering realtime-rendering rendering shader study-notes

Last synced: 28 Jan 2026

https://github.com/QianMo/Game-Programmer-Study-Notes

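:anchor: A collection of reading notes from my career as a game programmer. You can think of it as an enhanced blog. Covers computer graphics, real-time rendering, programming practice, GPU programming, design patterns, software engineering, and more. Keep Reading, Keep Writing, Keep Coding.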

blog book books cg ebook ebooks game-developing-notes game-development game-programmer game-programming gpu-programming graphics notes real-time-rendering realtime-rendering rendering shader study-notes

Last synced: 19 Mar 2025

https://github.com/embarkstudios/rust-gpu

🐉 Making Rust a first-class language and ecosystem for GPU shaders 🚧

gpu-programming rust shaders

Last synced: 12 May 2025

https://github.com/EmbarkStudios/rust-gpu

🐉 Making Rust a first-class language and ecosystem for GPU shaders 🚧

gpu-programming rust shaders

Last synced: 27 Mar 2025

https://github.com/rust-gpu/rust-cuda

Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.

cuda cuda-kernels cuda-programming gpgpu gpu gpu-programming rust rust-lang

Last synced: 14 May 2025

https://github.com/Rust-GPU/Rust-CUDA

Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.

cuda cuda-kernels cuda-programming gpgpu gpu gpu-programming rust rust-lang

Last synced: 27 Mar 2025

https://github.com/uber/aresdb

A GPU-powered real-time analytics storage and query engine.

analytics cgo cuda data database golang gpu-programming query real-time storage

Last synced: 14 May 2025

https://github.com/rust-gpu/rust-gpu

🐉 Making Rust a first-class language and ecosystem for GPU shaders 🚧

compiler gpu-programming graphics-programing rust shaders spirv vulkan

Last synced: 14 May 2025

https://github.com/calebwin/emu

The write-once-run-anywhere GPGPU library for Rust

emu gpgpu gpu gpu-acceleration gpu-computing gpu-programming rust

Last synced: 14 May 2025

https://calebwin.github.io/emu/

The write-once-run-anywhere GPGPU library for Rust

emu gpgpu gpu gpu-acceleration gpu-computing gpu-programming rust

Last synced: 30 Apr 2025

https://github.com/Rust-GPU/rust-gpu

🐉 Making Rust a first-class language and ecosystem for GPU shaders 🚧

compiler gpu-programming graphics-programing rust shaders spirv vulkan

Last synced: 26 Sep 2025

https://github.com/QianMo/GPU-Gems-Book-Source-Code

:cd: Collection of the CD content (source code) accompanying the books <GPU Gems> 1~3

book-source-code glsl gpu gpu-gems gpu-programming graphics hlsl rendering shader

Last synced: 08 May 2025

https://github.com/qianmo/gpu-gems-book-source-code

:cd: Collection of the CD content (source code) accompanying the books <GPU Gems> 1~3

book-source-code glsl gpu gpu-gems gpu-programming graphics hlsl rendering shader

Last synced: 12 Apr 2025

https://github.com/qianmo/gpu-pro-books-source-code

:cd: Source code collection for the books <GPU Pro> 1~7

book-source-code game-development gpu-pro gpu-programming graphics-programming rendering shader

Last synced: 26 Jan 2026

https://github.com/QianMo/GPU-Pro-Books-Source-Code

:cd: Source code collection for the books <GPU Pro> 1~7

book-source-code game-development gpu-pro gpu-programming graphics-programming rendering shader

Last synced: 01 May 2025

https://github.com/AmesingFlank/taichi.js

Modern GPU Compute and Rendering in JavaScript

gpu gpu-computing gpu-programming javascript webgpu webgpu-api webgpu-shaders

Last synced: 24 Mar 2025

https://github.com/software-mansion/typegpu

TypeScript library that enhances the WebGPU API, allowing resource management in a type-safe, declarative way.

gpgpu gpu gpu-computing gpu-programming graphics javascript typesafe typescript webgpu webgpu-api wgsl wgsl-shader

Last synced: 14 Jun 2025

https://github.com/projectphysx/opencl-wrapper

OpenCL is the most powerful programming language ever created. Yet the OpenCL C++ bindings are cumbersome and the code overhead prevents many people from getting started. I created this lightweight OpenCL-Wrapper to greatly simplify OpenCL software development with C++ while keeping functionality and performance.

gpgpu gpgpu-computing gpu gpu-acceleration gpu-computing gpu-programming opencl vector-processor vectorization

Last synced: 16 May 2025

https://github.com/juliagpu/amdgpu.jl

AMD GPU (ROCm) programming in Julia

amdgpu gpu gpu-programming julia rocm

Last synced: 12 Jan 2026

https://github.com/fastflow/fastflow

FastFlow pattern-based parallel programming framework (formerly on sourceforge)

gpu-computing gpu-programming multicore parallel-algorithm parallel-programming parallelization patterns skeleton-framework

Last synced: 01 Apr 2025

https://github.com/JuliaGPU/CuArrays.jl

A Curious Cumulation of CUDA Cuisine

cuda gpu-programming julia

Last synced: 22 Jul 2025

https://github.com/lucidrains/triton-transformer

Implementation of a Transformer, but completely in Triton

artificial-intelligence attention-mechanism deep-learning gpu-programming transformers

Last synced: 06 Apr 2025

https://github.com/stetre/moonlibs

Lua libraries for graphics and audio programming

audio fsm gpu-programming graphics lua lua-bindings

Last synced: 11 Feb 2026

https://github.com/johannesugb/volumetriclinesunity

Source of the Volumetric Lines Asset from Unity's Asset Store

csharp gpu-programming shader unity-asset unity3d

Last synced: 09 Oct 2025

https://github.com/johannesugb/VolumetricLinesUnity

Source of the Volumetric Lines Asset from Unity's Asset Store

csharp gpu-programming shader unity-asset unity3d

Last synced: 11 May 2025

https://github.com/abeleinin/Metal-Puzzles

Solve Puzzles. Learn Metal 🤘

gpu-programming metal mlx puzzles

Last synced: 18 Oct 2025

https://github.com/alan-rock-gs/gpuscript

GpuScript allows you to write C# programs that run at supercomputer speeds on a single GPU. Learn it in 30 minutes. Write & debug large and complex projects specifically designed to run on the GPU.

artificial-intelligence csharp functional-programming gpgpu gpu-programming machine-learning neural-networks object-oriented-programming unity unity3d

Last synced: 05 Apr 2025

https://github.com/eedalong/ece408

Code base and slides for ECE408: Applied Parallel Programming on GPU.

ece408 gpu-programming machine-learning parallel-programming

Last synced: 13 Apr 2025

https://github.com/tgautam03/xgemm

Accelerated General (FP32) Matrix Multiplication from scratch in CUDA

cuda-programming gpu-programming matrix-multiplication sgemm

Last synced: 06 Apr 2025

https://github.com/hollance/metal-gpgpu

Collection of notes on how to use Apple’s Metal API for compute tasks

deep-learning gpgpu gpu gpu-programming ios macos metal objective-c swift

Last synced: 24 Feb 2026

https://github.com/eomii/rules_ll

An Upstream Clang/LLVM-based toolchain for contemporary C++ and heterogeneous programming

bazel bleeding-edge build-system clang clang-tidy cpp cuda gpu-programming hermetic hip llvm nix openmp remote-caching remote-execution reproducible sanitizers

Last synced: 06 Apr 2025

https://github.com/michel-meneses/great-opencl-examples

Collection of easy, well-documented and useful OpenCL examples in C++.

c-plus-plus gpu-programming image-processing opencl parallel-programming

Last synced: 19 Mar 2025

https://github.com/ProjectPhysX/PTXprofiler

A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.

cuda gpu gpu-acceleration gpu-computing gpu-programming hpc nvidia nvidia-cuda nvidia-gpu opencl profiler ptx ptx-utils roofline-model sycl

Last synced: 04 Apr 2025

https://github.com/projectphysx/ptxprofiler

A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.

cuda gpu gpu-acceleration gpu-computing gpu-programming hpc nvidia nvidia-cuda nvidia-gpu opencl profiler ptx ptx-utils roofline-model sycl

Last synced: 10 Sep 2025

https://github.com/andi611/apriori-and-eclat-frequent-itemset-mining

Python implementation of the Apriori and Eclat algorithms, two of the best-known basic algorithms for mining frequent item sets in a set of transactions.

apriori apriori-algorithm cuda data-mining data-mining-algorithms eclat eclat-algorithm frequent-itemset-mining frequent-itemsets frequent-pattern-mining gcc gpu gpu-acceleration gpu-programming plot pycuda python transaction transactions

Last synced: 13 Apr 2025

https://github.com/yichengdwu/moye.jl

Programming Gemm Kernels on NVIDIA GPUs with Tensor Cores in Julia

gpu-programming layout-algorithm parallel-programming

Last synced: 07 Apr 2025

https://github.com/coderonion/cuda-beginner-course-cpp-version

Companion code for the bilibili video series "Introduction to CUDA 12.x Parallel Programming (C++ Edition)"

cpp cublas cuda cuda-programming cudnn gpu gpu-programming nvcc nvidia parallel-programming python rust

Last synced: 15 Jun 2025

https://github.com/llnl/care

CHAI and RAJA provide an excellent base on which to build portable codes. CARE expands that functionality, adding new features such as loop fusion capability and a portable interface for many numerical algorithms. It provides all the basics for anyone wanting to write portable code.

gpu gpu-acceleration gpu-computing gpu-programming hpc hpc-applications portability portable portable-apps portable-class-library portable-executable portable-library portable-object portableapps radiuss

Last synced: 29 Apr 2025

https://github.com/nvidia/optix-dev

OptiX SDK headers, everything needed to build & run OptiX applications. SDK samples not included.

cuda gpu gpu-acceleration gpu-programming nvidia optix ray-tracing raytracing

Last synced: 14 Apr 2025

https://github.com/i-Taylo/iUnlockerGL

iUnlocker GLTool is a Magisk module designed to spoof GPU information, allowing users to modify GPU information to unlock graphics options in games and for testing.

android android-app android-development android-gpu device-spoofing game-unlock games games-unlocker gpu gpu-programming gpu-spoofing gpu-testing gpu-tool magisk magisk-module spoof spoofer-for-games-github spoofing

Last synced: 20 Sep 2025

https://github.com/shadyboukhary/gpu-research-fft-openacc-cuda

Case studies constitute a modern, interdisciplinary, and valuable teaching practice which plays a critical and fundamental role in the development of new skills and the formation of new knowledge. This research studies the behavior and performance of two interdisciplinary and widely adopted scientific kernels, a Fast Fourier Transform and Matrix Multiplication. Both routines are implemented in the two currently most popular many-core programming models, CUDA and OpenACC. A Fast Fourier Transform (FFT) samples a signal over a period of time and divides it into its frequency components, computing the Discrete Fourier Transform (DFT) of a sequence. Unlike the traditional approach to computing a DFT, FFT algorithms reduce the complexity of the problem from O(n^2) to O(n log2 n). Matrix multiplication is a cornerstone routine in Mathematics, Artificial Intelligence and Machine Learning. This research also shows that the nature of the problem plays a crucial role in determining which many-core model will provide the highest benefit in performance.

acceleration cuda fast-fourier-transform fft gpu-acceleration gpu-computing gpu-programming nvcc openacc parallel-computing pgi pgi-compiler radix-2

Last synced: 07 Aug 2025
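
The complexity claim in the description above (a direct DFT versus an FFT) can be illustrated with a minimal CPU-only sketch. This is not code from the listed repository, which targets CUDA and OpenACC; the function names `naive_dft` and `fft_radix2` are hypothetical, and the radix-2 variant is assumed only because the entry carries the radix-2 tag.

```python
import cmath


def naive_dft(x):
    """Direct DFT: every output bin sums over all n inputs, so O(n^2) work."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def fft_radix2(x):
    """Recursive radix-2 Cooley-Tukey FFT: O(n log2 n) work.

    Assumes len(x) is a power of two (hypothetical simplification).
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    # Split into even- and odd-indexed halves and transform each recursively.
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor combines the two half-size transforms.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```

Both functions use the same sign convention, so for a power-of-two input they should agree up to floating-point rounding; the difference is only in how much work each performs.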

https://github.com/xframes-project/xframes

GPU-accelerated GUI development for the desktop and the browser

c cpp dear-imgui desktop glfw gpu-accelerated-library gpu-acceleration gpu-programming opengl ui wasm webgpu

Last synced: 12 Apr 2025

https://github.com/i-taylo/iunlockergl

iUnlocker GLTool is a Magisk module designed to spoof GPU information, allowing users to modify GPU information to unlock graphics options in games and for testing.

android android-app android-development android-gpu device-spoofing game-unlock games games-unlocker gpu gpu-programming gpu-spoofing gpu-testing gpu-tool magisk magisk-module spoof spoofer-for-games-github spoofing

Last synced: 28 Dec 2025

https://github.com/tgautam03/xfilters

GPU (CUDA) accelerated filters using 2D convolution for high resolution images.

2d-convolution c cpp cuda cuda-programming gpu-acceleration gpu-computing gpu-programming image-filters image-processing

Last synced: 10 Oct 2025

https://github.com/vanities/polarisbioseditor-1.6.7

AMD GPU Polaris Bios Editor

gpu-programming polaris

Last synced: 11 Apr 2025

https://github.com/thomasp85/shady

Compile and Execute Shaders from R

glsl gpu-programming opengl shader

Last synced: 11 Apr 2025

https://github.com/coderonion/cuda-beginner-course-python-version

Companion code for the bilibili video series "Introduction to CUDA 12.x Parallel Programming (Python Edition)"

cpp cublas cuda cuda-programming cudnn cupy gpu gpu-programming nvcc nvidia parallel-programming python rust

Last synced: 19 Oct 2025

https://github.com/kai-kj/microcompute

A small library for GPU computing

c glsl gpgpu gpu gpu-computing gpu-programming lua luajit opengl

Last synced: 12 Apr 2025

https://github.com/juliawgpu/wgpucompute.jl

Compute shaders interface for WGPU from Julia

compute gpu gpu-computing gpu-programming julia-lang machine-learning shader wgpu

Last synced: 26 Jan 2026

https://github.com/lawmurray/gpu-gemm

CUDA kernel for matrix-matrix multiplication on Nvidia GPUs, using a Hilbert curve to improve L2 cache utilization.

cplusplus cuda cuda-kernels cuda-programming gpu gpu-computing gpu-programming matrix-multiplication numerical-methods scientific-computing

Last synced: 01 Mar 2026

https://github.com/gurbaaz27/cs433a-design-exercises

Solutions of design exercises in CS433A: Parallel Programming, Spring Semester 2021-22

barriers cuda gpu-programming locks openmp parallel-programming posix-threads semaphores

Last synced: 29 Jan 2026

https://github.com/yashkathe/image-noise-reduction-with-cuda

This project analyzes an image denoising technique, median blur, comparing GPU-accelerated (Numba) and CPU-based (OpenCV) processing speeds.

cuda cuda-programming gpu-programming hardware-speed-analysis image-analysis image-processing numba nvidia nvidia-cuda nvidia-gpu opencv parallel-programming

Last synced: 14 May 2025

https://github.com/dominiklindorfer/sycl-intelgpu-quickstart

Lightweight & simplified approach to SYCL development

c cpp dpp gpu gpu-programming intel oneapi oneapi-dpc sycl vscode

Last synced: 20 Oct 2025

https://github.com/shreyansh26/mlsys-experiments

A collection of scripts on experimenting and implementing MLSys-related stuff

cuda cuda-kernel gpu gpu-programming llm-inference profiling pytorch triton

Last synced: 30 Aug 2025

https://github.com/dhruvsrikanth/cudann

A distributed implementation of a deep learning framework in CUDA.

cpp cuda deep-learning deep-learning-framework gpu-programming high-performance-computing hpc parallel-programming

Last synced: 23 Apr 2025

https://github.com/Awrsha/Advanced-CUDA-Programming-GPU-Architecture

This repository provides a comprehensive guide to optimizing GPU kernels for performance, with a focus on NVIDIA GPUs. It covers key tools and techniques such as CUDA, PyTorch, and Triton, aimed at improving computational efficiency for deep learning and scientific computing tasks.

cuda-programming gpu-programming jit kernels matmul mojo-language multiprocessing multithreading torchquantum triton

Last synced: 19 Sep 2025

https://github.com/seungjaelim/cuda.tutorial

References content from the OLCF CUDA Training Series. (https://github.com/olcf/cuda-training-series)

cuda gpu-programming nsight-compute nsight-systems

Last synced: 07 Feb 2026

https://github.com/hadv/vaneth

GPU-accelerated CREATE2 vanity address miner for Ethereum

create2-contract-deployment cuda ethereum gpu gpu-acceleration gpu-programming open-cl vanity-address

Last synced: 21 Jan 2026

https://github.com/debowin/gpu-parallel-recommender-system

GPGPU Parallel User-User Collaborative Filtering System in CUDA C

collaborative-filtering cuda gpu-programming movielens-dataset recommender-system

Last synced: 24 Apr 2026

https://github.com/david-palma/cuda-programming

Educational CUDA C/C++ programming repository with commented examples on GPU parallel computing, matrix operations, and performance profiling. Requires a CUDA-enabled NVIDIA GPU.

c-cpp cpp cuda cuda-toolkit education gpu gpu-programming kernel matrix-operations nvcc nvidia parallel-computing parallel-programming practice profiling threads

Last synced: 25 Apr 2026

https://github.com/hatamiarash7/cuda-python

GPU programming using CUDA & Python

cuda gpu gpu-computing gpu-programming python

Last synced: 12 Jun 2025

https://github.com/sartajbhuvaji/cuda

Developed CUDA kernel functions to load and train a Convolutional Neural Network from scratch.

cuda cuda-programming gpu-programming neural-network nvidia-cuda

Last synced: 30 Mar 2025

https://github.com/engineersbox/clir

OpenCL interop rendering abstractions that simulate the OpenGL pipeline

gpu gpu-programming interoperability opencl opengl

Last synced: 22 Mar 2025

https://github.com/satyajitghana/gpu-programming

Contains the contents of GPU Architecture and Programming course done on NPTEL

c cpp cuda cuda-programming gpu-programming nptel nvidia

Last synced: 09 Mar 2026

https://github.com/morgwai/gpu-samples

Some GPU processing using JOCL (OpenCL) and Aparapi

aparapi concurrency concurrent-programming gpu gpu-programming java multithreading pram

Last synced: 29 Jun 2025