An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with gpu-programming

A curated list of projects in awesome lists tagged with gpu-programming.

https://github.com/taichi-dev/taichi

Productive, portable, and performant GPU programming in Python.

computer-graphics differentiable-programming gpu gpu-programming sparse-computation taichi

Last synced: 12 May 2025

https://github.com/exaloop/codon

A high-performance, zero-overhead, extensible Python compiler with built-in NumPy support

compiler gpu-programming high-performance llvm numpy parallel-programming python

Last synced: 13 May 2025

https://github.com/plasma-umass/scalene

Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals

cpu cpu-profiling gpu gpu-programming memory-allocation memory-consumption performance-analysis performance-cpu profiler profiles-memory profiling python python-profilers scalene

Last synced: 10 Jun 2025

https://github.com/QianMo/Game-Programmer-Study-Notes

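:anchor: A collection of reading notes from my career as a game programmer. You can think of it as an enhanced blog. It covers computer graphics, real-time rendering, programming practice, GPU programming, design patterns, software engineering, and more. Keep Reading, Keep Writing, Keep Coding.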

blog book books cg ebook ebooks game-developing-notes game-development game-programmer game-programming gpu-programming graphics notes real-time-rendering realtime-rendering rendering shader study-notes

Last synced: 19 Mar 2025

https://github.com/qianmo/game-programmer-study-notes

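:anchor: A collection of reading notes from my career as a game programmer. You can think of it as an enhanced blog. It covers computer graphics, real-time rendering, programming practice, GPU programming, design patterns, software engineering, and more. Keep Reading, Keep Writing, Keep Coding.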

blog book books cg ebook ebooks game-developing-notes game-development game-programmer game-programming gpu-programming graphics notes real-time-rendering realtime-rendering rendering shader study-notes

Last synced: 24 Feb 2025

https://github.com/embarkstudios/rust-gpu

🐉 Making Rust a first-class language and ecosystem for GPU shaders 🚧

gpu-programming rust shaders

Last synced: 12 May 2025

https://github.com/EmbarkStudios/rust-gpu

🐉 Making Rust a first-class language and ecosystem for GPU shaders 🚧

gpu-programming rust shaders

Last synced: 27 Mar 2025

https://github.com/rust-gpu/rust-cuda

Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.

cuda cuda-kernels cuda-programming gpgpu gpu gpu-programming rust rust-lang

Last synced: 14 May 2025

https://github.com/Rust-GPU/Rust-CUDA

Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.

cuda cuda-kernels cuda-programming gpgpu gpu gpu-programming rust rust-lang

Last synced: 27 Mar 2025

https://github.com/uber/aresdb

A GPU-powered real-time analytics storage and query engine.

analytics cgo cuda data database golang gpu-programming query real-time storage

Last synced: 14 May 2025

https://github.com/rust-gpu/rust-gpu

🐉 Making Rust a first-class language and ecosystem for GPU shaders 🚧

compiler gpu-programming graphics-programing rust shaders spirv vulkan

Last synced: 14 May 2025

https://github.com/calebwin/emu

The write-once-run-anywhere GPGPU library for Rust

emu gpgpu gpu gpu-acceleration gpu-computing gpu-programming rust

Last synced: 14 May 2025

https://calebwin.github.io/emu/

The write-once-run-anywhere GPGPU library for Rust

emu gpgpu gpu gpu-acceleration gpu-computing gpu-programming rust

Last synced: 30 Apr 2025

https://github.com/Rust-GPU/rust-gpu

🐉 Making Rust a first-class language and ecosystem for GPU shaders 🚧

compiler gpu-programming graphics-programing rust shaders spirv vulkan

Last synced: 17 Jan 2025

https://github.com/QianMo/GPU-Gems-Book-Source-Code

:cd: Collection of the CD content (source code) accompanying the books GPU Gems 1 to 3 (《GPU精粹》 1 to 3)

book-source-code glsl gpu gpu-gems gpu-programming graphics hlsl rendering shader

Last synced: 08 May 2025

https://github.com/qianmo/gpu-gems-book-source-code

:cd: Collection of the CD content (source code) accompanying the books GPU Gems 1 to 3 (《GPU精粹》 1 to 3)

book-source-code glsl gpu gpu-gems gpu-programming graphics hlsl rendering shader

Last synced: 12 Apr 2025

https://github.com/QianMo/GPU-Pro-Books-Source-Code

:cd: Source code collection for the books GPU Pro 1 to 7

book-source-code game-development gpu-pro gpu-programming graphics-programming rendering shader

Last synced: 01 May 2025

https://github.com/qianmo/gpu-pro-books-source-code

:cd: Source code collection for the books GPU Pro 1 to 7

book-source-code game-development gpu-pro gpu-programming graphics-programming rendering shader

Last synced: 24 Feb 2025

https://github.com/AmesingFlank/taichi.js

Modern GPU Compute and Rendering in JavaScript

gpu gpu-computing gpu-programming javascript webgpu webgpu-api webgpu-shaders

Last synced: 24 Mar 2025

https://github.com/software-mansion/typegpu

TypeScript library that enhances the WebGPU API, allowing resource management in a type-safe, declarative way.

gpgpu gpu gpu-computing gpu-programming graphics javascript typesafe typescript webgpu webgpu-api wgsl wgsl-shader

Last synced: 14 Jun 2025

https://github.com/projectphysx/opencl-wrapper

OpenCL is the most powerful programming language ever created. Yet the OpenCL C++ bindings are cumbersome and the code overhead prevents many people from getting started. I created this lightweight OpenCL-Wrapper to greatly simplify OpenCL software development with C++ while keeping functionality and performance.

gpgpu gpgpu-computing gpu gpu-acceleration gpu-computing gpu-programming opencl vector-processor vectorization

Last synced: 16 May 2025

https://github.com/juliagpu/amdgpu.jl

AMD GPU (ROCm) programming in Julia

amdgpu gpu gpu-programming julia rocm

Last synced: 15 May 2025

https://github.com/fastflow/fastflow

FastFlow pattern-based parallel programming framework (formerly on sourceforge)

gpu-computing gpu-programming multicore parallel-algorithm parallel-programming parallelization patterns skeleton-framework

Last synced: 01 Apr 2025

https://github.com/JuliaGPU/CuArrays.jl

A Curious Cumulation of CUDA Cuisine

cuda gpu-programming julia

Last synced: 29 Nov 2024

https://github.com/lucidrains/triton-transformer

Implementation of a Transformer, but completely in Triton

artificial-intelligence attention-mechanism deep-learning gpu-programming transformers

Last synced: 06 Apr 2025

https://github.com/stetre/moonlibs

Lua libraries for graphics and audio programming

audio fsm gpu-programming graphics lua lua-bindings

Last synced: 25 Nov 2024

https://github.com/johannesugb/VolumetricLinesUnity

Source of the Volumetric Lines Asset from Unity's Asset Store

csharp gpu-programming shader unity-asset unity3d

Last synced: 11 May 2025

https://github.com/johannesugb/volumetriclinesunity

Source of the Volumetric Lines Asset from Unity's Asset Store

csharp gpu-programming shader unity-asset unity3d

Last synced: 19 Dec 2024

https://github.com/abeleinin/Metal-Puzzles

Solve Puzzles. Learn Metal 🤘

gpu-programming metal mlx puzzles

Last synced: 03 Feb 2025

https://github.com/alan-rock-gs/gpuscript

GpuScript allows you to write C# programs that run at supercomputer speeds on a single GPU. Learn it in 30 minutes. Write & debug large and complex projects specifically designed to run on the GPU.

artificial-intelligence csharp functional-programming gpgpu gpu-programming machine-learning neural-networks object-oriented-programming unity unity3d

Last synced: 05 Apr 2025

https://github.com/eedalong/ece408

Code base and slides for ECE408: Applied Parallel Programming on GPU.

ece408 gpu-programming machine-learning parallel-programming

Last synced: 13 Apr 2025

https://github.com/tgautam03/xgemm

Accelerated General (FP32) Matrix Multiplication from scratch in CUDA

cuda-programming gpu-programming matrix-multiplication sgemm

Last synced: 06 Apr 2025

https://github.com/eomii/rules_ll

An Upstream Clang/LLVM-based toolchain for contemporary C++ and heterogeneous programming

bazel bleeding-edge build-system clang clang-tidy cpp cuda gpu-programming hermetic hip llvm nix openmp remote-caching remote-execution reproducible sanitizers

Last synced: 06 Apr 2025

https://github.com/hollance/metal-gpgpu

Collection of notes on how to use Apple’s Metal API for compute tasks

deep-learning gpgpu gpu gpu-programming ios macos metal objective-c swift

Last synced: 25 Mar 2025

https://github.com/michel-meneses/great-opencl-examples

Collection of easy, well-documented and useful OpenCL examples in C++.

c-plus-plus gpu-programming image-processing opencl parallel-programming

Last synced: 19 Mar 2025

https://github.com/projectphysx/ptxprofiler

A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.

cuda gpu gpu-acceleration gpu-computing gpu-programming hpc nvidia nvidia-cuda nvidia-gpu opencl profiler ptx ptx-utils roofline-model sycl

Last synced: 14 Apr 2025

https://github.com/ProjectPhysX/PTXprofiler

A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.

cuda gpu gpu-acceleration gpu-computing gpu-programming hpc nvidia nvidia-cuda nvidia-gpu opencl profiler ptx ptx-utils roofline-model sycl

Last synced: 04 Apr 2025

https://github.com/andi611/apriori-and-eclat-frequent-itemset-mining

Python implementation of the Apriori and Eclat algorithms, two of the best-known basic algorithms for mining frequent item sets in a set of transactions.

apriori apriori-algorithm cuda data-mining data-mining-algorithms eclat eclat-algorithm frequent-itemset-mining frequent-itemsets frequent-pattern-mining gcc gpu gpu-acceleration gpu-programming plot pycuda python transaction transactions

Last synced: 13 Apr 2025

https://github.com/yichengdwu/moye.jl

Programming Gemm Kernels on NVIDIA GPUs with Tensor Cores in Julia

gpu-programming layout-algorithm parallel-programming

Last synced: 07 Apr 2025

https://github.com/llnl/care

CHAI and RAJA provide an excellent base on which to build portable codes. CARE expands that functionality, adding new features such as loop fusion capability and a portable interface for many numerical algorithms. It provides all the basics for anyone wanting to write portable code.

gpu gpu-acceleration gpu-computing gpu-programming hpc hpc-applications portability portable portable-apps portable-class-library portable-executable portable-library portable-object portableapps radiuss

Last synced: 29 Apr 2025

https://github.com/coderonion/cuda-beginner-course-cpp-version

Companion code for the bilibili video course "Introduction to CUDA 12.x Parallel Programming (C++ Edition)"

cpp cublas cuda cuda-programming cudnn gpu gpu-programming nvcc nvidia parallel-programming python rust

Last synced: 15 Jun 2025

https://github.com/nvidia/optix-dev

OptiX SDK headers, everything needed to build & run OptiX applications. SDK samples not included.

cuda gpu gpu-acceleration gpu-programming nvidia optix ray-tracing raytracing

Last synced: 14 Apr 2025

https://github.com/xframes-project/xframes

GPU-accelerated GUI development for the desktop and the browser

c cpp dear-imgui desktop glfw gpu-accelerated-library gpu-acceleration gpu-programming opengl ui wasm webgpu

Last synced: 12 Apr 2025

https://github.com/shadyboukhary/gpu-research-fft-openacc-cuda

Case studies constitute a modern interdisciplinary and valuable teaching practice which plays a critical and fundamental role in the development of new skills and the formation of new knowledge. This research studies the behavior and performance of two interdisciplinary and widely adopted scientific kernels, a Fast Fourier Transform and Matrix Multiplication. Both routines are implemented in the two currently most popular many-core programming models, CUDA and OpenACC. A Fast Fourier Transform (FFT) samples a signal over a period of time and divides it into its frequency components, computing the Discrete Fourier Transform (DFT) of a sequence. Unlike the traditional approach to computing a DFT, FFT algorithms reduce the complexity of the problem from O(n^2) to O(n log2 n). Matrix multiplication is a cornerstone routine in Mathematics, Artificial Intelligence and Machine Learning. This research also shows that the nature of the problem plays a crucial role in determining which many-core model will provide the highest benefit in performance.

acceleration cuda fast-fourier-transform fft gpu-acceleration gpu-computing gpu-programming nvcc openacc parallel-computing pgi pgi-compiler radix-2

Last synced: 21 Apr 2025

https://github.com/i-taylo/iunlockergl

iUnlocker GLTool is a Magisk module designed to spoof GPU information, allowing users to modify the reported GPU information to unlock graphics settings in games and for testing.

android android-app android-development android-gpu device-spoofing game-unlock games games-unlocker gpu gpu-programming gpu-spoofing gpu-testing gpu-tool magisk magisk-module spoof spoofer-for-games-github spoofing

Last synced: 12 Apr 2025

https://github.com/vanities/polarisbioseditor-1.6.7

AMD GPU Polaris Bios Editor

gpu-programming polaris

Last synced: 11 Apr 2025

https://github.com/thomasp85/shady

Compile and Execute Shaders from R

glsl gpu-programming opengl shader

Last synced: 11 Apr 2025

https://github.com/coderonion/cuda-beginner-course-python-version

Companion code for the bilibili video course "Introduction to CUDA 12.x Parallel Programming (Python Edition)"

cpp cublas cuda cuda-programming cudnn cupy gpu gpu-programming nvcc nvidia parallel-programming python rust

Last synced: 15 Jun 2025

https://github.com/kai-kj/microcompute

A small library for gpu computing

c glsl gpgpu gpu gpu-computing gpu-programming lua luajit opengl

Last synced: 12 Apr 2025

https://github.com/juliawgpu/wgpucompute.jl

Compute shaders interface for WGPU from Julia

compute gpu gpu-computing gpu-programming julia-lang machine-learning shader wgpu

Last synced: 19 Feb 2025

https://github.com/yashkathe/image-noise-reduction-with-cuda

This project analyzes the median blur image denoising technique, comparing GPU-accelerated (Numba) and CPU-based (OpenCV) processing speeds.

cuda cuda-programming gpu-programming hardware-speed-analysis image-analysis image-processing numba nvidia nvidia-cuda nvidia-gpu opencv parallel-programming

Last synced: 14 May 2025

https://github.com/gurbaaz27/cs433a-design-exercises

Solutions of design exercises in CS433A: Parallel Programming, Spring Semester 2021-22

barriers cuda gpu-programming locks openmp parallel-programming posix-threads semaphores

Last synced: 07 May 2025

https://github.com/lawmurray/gpu-gemm

CUDA kernel for matrix-matrix multiplication on Nvidia GPUs, using a Hilbert curve to improve L2 cache utilization.

cplusplus cuda cuda-kernels cuda-programming gpu gpu-computing gpu-programming matrix-multiplication numerical-methods scientific-computing

Last synced: 14 Apr 2025

https://github.com/awrsha/cuda-gpus-and-triton-adcanced-review

This repository provides a comprehensive guide to optimizing GPU kernels for performance, with a focus on NVIDIA GPUs. It covers key tools and techniques such as CUDA, PyTorch, and Triton, aimed at improving computational efficiency for deep learning and scientific computing tasks.

cuda-programming gpu-programming jit kernels matmul mojo-language multiprocessing multithreading torchquantum triton

Last synced: 12 Jan 2025

https://github.com/dominiklindorfer/sycl-intelgpu-quickstart

Lightweight & simplified approach to SYCL development

c cpp dpp gpu gpu-programming intel oneapi oneapi-dpc sycl vscode

Last synced: 16 Mar 2025

https://github.com/dhruvsrikanth/cudann

A distributed implementation of a deep learning framework in CUDA.

cpp cuda deep-learning deep-learning-framework gpu-programming high-performance-computing hpc parallel-programming

Last synced: 23 Apr 2025

https://github.com/engineersbox/clir

OpenCL interop rendering abstractions that simulate the OpenGL pipeline

gpu gpu-programming interoperability opencl opengl

Last synced: 22 Mar 2025

https://github.com/hatamiarash7/cuda-python

GPU programming using CUDA & Python

cuda gpu gpu-computing gpu-programming python

Last synced: 12 Jun 2025

https://github.com/sartajbhuvaji/cuda

Developed CUDA kernel functions to load and train a Convolutional Neural Network from scratch.

cuda cuda-programming gpu-programming neural-network nvidia-cuda

Last synced: 30 Mar 2025

https://github.com/shreyansh26/mlsys-experiments

A collection of scripts on experimenting and implementing MLSys-related stuff

cuda cuda-kernel gpu gpu-programming llm-inference profiling pytorch triton

Last synced: 14 Jun 2025

https://github.com/kchristin22/ising_model

Implementation of a cellular automaton on GPU using different features of CUDA

cellular-automaton cuda gpu-programming hpc ising-model parallel-computing

Last synced: 15 Mar 2025

https://github.com/kartavyaantani/cuda_image_processing

A CUDA-accelerated image processing project featuring multiple GPU-based filters and enhancement techniques. Implements convolution, edge detection, Non-Local Means (NLM) denoising, K-Nearest Neighbors (KNN), and pixelization. Each operation is optimized using CUDA kernels for real-time performance on large images. The project supports command-line

cuda cuda-kernels cuda-programming cuda-toolkit gpu-programming high-performance-computing image-manipulation image-processing nvidia-cuda nvidia-gpu

Last synced: 19 Apr 2025

https://github.com/dhruvsrikanth/fastconv

Distributed and serial implementations of the 2D Convolution operation in C++ and CUDA.

convolution-filters cpp cuda gpu-programming high-performance-computing hpc image-editor image-processing nvidia parallel-programming

Last synced: 23 Apr 2025

https://github.com/dhruvsrikanth/monte-carlo-ray-tracing

In this repository, you will find a serial and distributed GPU-based implementation of the ray tracing simulation.

c cpp cuda gpu-computing gpu-programming high-performance-computing parallel-programming raytracing unified-memory-parallelism

Last synced: 23 Apr 2025

https://github.com/u-c-s/gpu-experiments

GPU and stuff. I want to go somewhere with this.

gpu gpu-programming vulkan

Last synced: 01 Mar 2025

https://github.com/david-palma/cuda-programming

Educational CUDA C/C++ programming repository with commented examples on GPU parallel computing, matrix operations, and performance profiling. Requires a CUDA-enabled NVIDIA GPU.

c-cpp cpp cuda cuda-toolkit education gpu gpu-programming kernel matrix-operations nvcc nvidia parallel-computing parallel-programming practice profiling threads

Last synced: 26 Mar 2025

https://github.com/nifets/meandering-depths

Developing and optimizing a 3D game engine from scratch using OpenGL.

game-engine gpu-programming graphics-programming marching-cubes opengl

Last synced: 03 Apr 2025

https://github.com/morgwai/gpu-samples

Some GPU processing using JOCL (OpenCL) and Aparapi

aparapi concurrency concurrent-programming gpu gpu-programming java multithreading pram

Last synced: 12 Mar 2025

https://github.com/satyajitghana/gpu-programming

Contains the contents of GPU Architecture and Programming course done on NPTEL

c cpp cuda cuda-programming gpu-programming nptel nvidia

Last synced: 05 May 2025

https://github.com/alexkranias/triton_vs_cuda

Building Triton and CUDA kernels side-by-side to create a cuBLAS-performant GEMM kernel.

cuda cuda-kernels gpu gpu-programming parallel-programming python triton

Last synced: 30 Mar 2025