An open API service indexing awesome lists of open source software.

CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphical processing units (GPUs). With CUDA, developers are able to dramatically speed up computing applications by harnessing the power of GPUs.

https://github.com/pjueon/cuda_intellisense

A simple Python script to fix CUDA C++ IntelliSense for Visual Studio.

cuda visual-studio

Last synced: 09 Apr 2026

https://github.com/r3tr056/loc-ai-ly

Locaily - Making Large Language Model Inference Accessible on Consumer Hardware

cuda deepseek inference llama3 llamacpp llm

Last synced: 13 Apr 2026

https://github.com/mre/talks

...mostly Computer Science related.

computer-science cuda talks tech-talks

Last synced: 28 Apr 2026

https://github.com/saiccoumar/cuda-programming-exercises

Brief collection of GPU exercises (my reimplementation). Comes with relevant resources.

cuda cuda-programming nvcc nvidia

Last synced: 25 May 2026

https://github.com/xavierjiezou/gpu-compute-capability

An application for querying the compute capability of each GPU released by NVIDIA.

cuda gpu nvidia

Last synced: 28 Apr 2026

https://github.com/jblaschke/pynvtx

Thin pybind11 wrapper for NVTX wrappers -- with some bells and whistles attached.

cuda nvtx nvtx-markers

Last synced: 21 Aug 2025

https://github.com/raumberg/hypervision

Neural Network based real-time aimbot system, operating on TensorRT with custom CUDA kernel and C FFI extensions

ai aim cuda cython neural-networks python tensorrt yolo

Last synced: 20 May 2026

https://github.com/0xhilsa/pynum

A small Python library for 1D and 2D arrays with GPU support

array c cuda nvcc python3

Last synced: 18 Apr 2026

https://github.com/perl-openmp/p5-openmp-environment

Perl interface for manipulating OpenMP's environmental runtime execution variables

compiler cuda gcc gpu hpc openmp perl pthreads

Last synced: 19 Feb 2026

https://github.com/alpha74/hungarianalgocuda

Hungarian Algorithm for Linear Assignment Problem implemented using CUDA.

cuda nvcc parallel-computing parallel-programming

Last synced: 01 Jun 2026

https://github.com/leocelente/basic_cuda

My CUDA source files while learning

cpp cuda gpgpu

Last synced: 29 Apr 2026

https://github.com/asadiahmad/gesture-detection

Real-time Gesture Detection using CUDA-accelerated OpenCV in Python.

computer-vision cuda gesture-recognition gpu-acceleration open-pose opencv opencv-cuda pose-detection real-time

Last synced: 29 Apr 2026

https://github.com/anras5/parallel-computing

Comparing CPU and GPU

cuda gpu openmp

Last synced: 29 Apr 2026

https://github.com/nofaralfasi/parallel-sequence-alignment

A parallelized version of a multiple DNA sequence alignment algorithm with MPI, OpenMP and CUDA

cuda mpi openmp sequence-alignment

Last synced: 29 Apr 2026

https://github.com/mortafix/quickshift

A working implementation of the Quickshift algorithm in CUDA, GPU-compatible.

cuda gpu-computing quickshift

Last synced: 08 May 2026

https://github.com/emilienmendes/gpgpu

Parallelization and optimization of point recognition in an image

cuda gpgpu parallel-programming

Last synced: 28 Oct 2025

https://github.com/mark0011astra/simplecuda

A library that allows users to easily perform GPU operations using CUDA with a NumPy-like interface.

cuda cupy gpu machine-learning numpy python vector

Last synced: 02 May 2026

https://github.com/xza85hrf/ml-framework_checker

ML Framework and CUDA Checker is a Python-based GUI application for checking PyTorch, TensorFlow, and CUDA installations. It provides detailed system specs, compatibility checks, advanced GPU management, and offers options to view instructions, export logs, and update machine learning frameworks.

compatibility cuda gpu-management gui-application machine-learning python pytorch system-checker system-specs tensorflow

Last synced: 28 Apr 2026

https://github.com/jonathanraiman/mini_cuda_rtc

Miniature CUDA Array library with Runtime Compilation

cpp11 cuda jit runtime-compilation

Last synced: 14 Apr 2026

https://github.com/enp1s0/curand_fp16

FP16 pseudo random number generator on GPU

cuda gpu half-precision random-number-generators

Last synced: 20 Aug 2025

https://github.com/hdelan/msc-hpc-final-project

In this project I implement a CUDA Lanczos method to approximate the matrix exponential. The matrix exponential is an important centrality measure for large, sparse graphs.

cuda graph-algorithms krylov-methods

Last synced: 12 Apr 2025

https://github.com/ismailtekin05/caloriedetectingai

🍎🔍 Smart AI system that identifies food items in photos and calculates their calorie content automatically. Built with TensorFlow, YOLOv8, CUDA and computer vision for accurate nutrition tracking.

ai aimodel calorie-calculator computer-vision cuda data-analysis data-science data-segmentation data-visualization dataset dataset-generation image-processing image-recognition python segmentation-models tensorflow ultralytics yaml yolo yolov8

Last synced: 29 Apr 2026

https://github.com/tommaso-dognini/polimi_gpu101_courseproject

Polimi Passion In Action GPU101 course project. CUDA implementation of the BFS algorithm

cpp cuda cuda-programming parallel-computing

Last synced: 10 Apr 2026

https://github.com/kartavyaantani/cuda_image_processing

A CUDA-accelerated image processing project featuring multiple GPU-based filters and enhancement techniques. Implements convolution, edge detection, Non-Local Means (NLM) denoising, K-Nearest Neighbors (KNN), and pixelization. Each operation is optimized using CUDA kernels for real-time performance on large images. The project supports command-line

cuda cuda-kernels cuda-programming cuda-toolkit gpu-programming high-performance-computing image-manipulation image-processing nvidia-cuda nvidia-gpu

Last synced: 30 Apr 2026

https://github.com/jtompuri/weighted-voronoi-stippling

High-performance weighted Voronoi stippling implementation. Exports PNG and TSP files. Visualizes TSP tours as continuous line drawings.

computer-graphics cuda gpu-acceleration lloyd-relaxation numba python stippling traveling-salesman tsp voronoi

Last synced: 18 May 2026

https://github.com/romaingrx/ml-nix-flake

A simple nix flake to start ML env with uv and cuda out of the box

cuda ml nix nix-flake uv

Last synced: 30 Apr 2026

https://github.com/eric900115/parallelprogramming

The repository contains the coursework for CS5422, NTHU's Parallel Programming Course.

cuda mpi openmp ucx

Last synced: 26 May 2026

https://github.com/manishklach/thermal-observatory

A generic thermal observability framework for CPU, GPU, board, and platform telemetry across vendor APIs, kernel interfaces, and runtime correlation layers.

amd arm64 cuda linux nvidia nvml observability rocm telemetry thermal-framework thermal-monitoring x86-64

Last synced: 09 Jun 2026

https://github.com/thisalmandula/gpu_accelerated_lpt_cfd_code

This repository contains a GPU-accelerated version of the particle tracking model developed by Merel Kooi for biofouled microplastic particles (available at: https://pubs.acs.org/doi/10.1021/acs.est.6b04702), written in CUDA Fortran and CUDA Python. This repository is intended as a learning tool for GPU programming.

biofouling computational-fluid-dynamics cuda fortran lagrangian-particle-tracking microplastics python

Last synced: 02 May 2026

https://github.com/xlisp/learn-vllm

vllm learning

cuda nvidia pytorch vllm

Last synced: 10 May 2026

https://github.com/jessetg/cuda-practice

Working through the chapters of CUDA by Example

c cpp cuda cuda-by-example gpgpu

Last synced: 01 May 2026

https://github.com/maawad/ptx_bcht

Bucketed Cuckoo hash set written in PTX and JIT-compiled.

cuckoo cuda gpu hash hashset ptx

Last synced: 01 May 2026

https://github.com/nickolasrm/gpuvscpumatrixmultiplication

CPU and GPU optimized matrix multiplication (AVX, transposition, CUDA and other)

avx comparison cuda hpc matrix multiplication

Last synced: 06 Sep 2025

https://github.com/vishwamartur/btc_recovery

High-performance Bitcoin wallet password recovery system with GPU acceleration and integrated graphics support. Recover Bitcoin Core wallet.dat files without blockchain download using advanced algorithms and blockchain APIs.

bitcoin bitcoin-core blockchain blockchain-api cpp cryptocurrency cuda electrum gpu-acceleration integrated-graphics multithreading opencl password-recovery private-keys recovery-tools wallet-dat wallet-recovery

Last synced: 14 Apr 2026

https://github.com/antonioberna/nn-gpu-logic-gates

Neural Network implementation on GPU using CUDA C++ to learn logic gates operations

cpp cuda gpu logic-gates neural-networks nvidia

Last synced: 01 May 2026

https://github.com/gvvsnrnaveen/cuda

This repository contains various programs that can be written using the CUDA Toolkit.

c cpp cuda nvcc nvidia-cuda nvidia-gpu

Last synced: 17 Jan 2026

https://github.com/alwaysai/jetpack-46-hacky-hour

NVIDIA's JetPack 4.6 capabilities and how to use them with EdgeIQ, the alwaysAI computer vision framework.

alwaysai computer-vision cuda edge-computing jetpack tensorrt

Last synced: 01 May 2026

https://github.com/dhruvsrikanth/monte-carlo-ray-tracing

In this repository, you will find a serial and distributed GPU-based implementation of the ray tracing simulation.

c cpp cuda gpu-computing gpu-programming high-performance-computing parallel-programming raytracing unified-memory-parallelism

Last synced: 01 May 2026

https://github.com/pratikvn/nla4hpc-exercises-framework

The exercises framework for the Numerical Linear Algebra for HPC course at Karlsruhe Institute of Technology.

cuda ginkgo homeworks hpc-course teaching

Last synced: 19 May 2026

https://github.com/dansolombrino/gphungarian

A GPU-accelerated implementation of the Hungarian Algorithm, written in CUDA

cuda gpu hpc opencl

Last synced: 31 Aug 2025

https://github.com/arakiss/hecate-os

Linux distro with automatic hardware detection and per-system optimization. Ubuntu 24.04 base. Alpha.

ai cuda docker gpu hardware-optimization kernel-tuning linux linux-distribution machine-learning nvidia operating-system performance sysctl ubuntu workstation zram

Last synced: 16 Feb 2026

https://github.com/dvhh/masscorrelation

An exercise in writing an efficient correlation calculator

calculations correlation-calculation cuda matrix multi-threading openmp

Last synced: 15 May 2026

https://github.com/rnabla/cuda-des

Bruteforcing DES using CUDA

bruteforce cuda data des encryption gpu parallel standard

Last synced: 27 Oct 2025

https://github.com/a-nau/python-cuda-envs

Script to automatically map a specific CUDA version to a Conda Python environment.

anaconda anaconda-environment cuda installation installation-script python python-environment python3

Last synced: 18 Apr 2026

https://github.com/liuyuweitarek/pytorch-docker-builder

Automate PyTorch Docker image builds with compatible Python, CUDA, and Poetry versions, including CI/CD for testing.

cicd containerd cuda docker docker-image poetry-python python python3 pytorch pytorch-docker

Last synced: 06 Feb 2026

https://github.com/hubenchang0515/fft-benchmark

Performance benchmarks of several FFT libraries

cuda fft

Last synced: 27 Oct 2025

https://github.com/miniex/maidenx

Rust-based CUDA library designed for learning purposes and building my AI engines named Maiden Engine

ai cuda rust

Last synced: 20 Mar 2025

https://github.com/daelsepara/hipslm

CPU and GPU (using HIP) implementations of phase pattern generators for use with spatial light modulators

computer-generated-holography cuda gpu hip hologram holography phase phase-pattern slm spatial-light-modulator

Last synced: 09 Sep 2025

https://github.com/dafadey/GPGPU_OpenCL_vs_CUDA

This is a repository with sample code for testing memory bandwidth, arithmetic latency hiding and shared/local memory performance on AMD and NVIDIA devices

cuda gpgpu gpgpu-computing opencl

Last synced: 16 May 2025

https://github.com/michaelfranzl/image_debian-gpgpu

Dockerfile for a Debian base image with AMD and Nvidia GPGPU support

amd container container-image cuda debian docker gpgpu nvidia opencl

Last synced: 10 May 2026

https://github.com/daelsepara/hipmandelbrot

GPU Implementation of Mandelbrot Fractal Generator with Benchmarking

amd cuda fractal gpu gpu-compute gpu-computing hip mandelbrot parallel-computing rocm sdk

Last synced: 20 Feb 2026

https://github.com/andih/cuda-fortran-stream

Variant of STREAM Benchmark in CUDA Fortran

cuda cuda-fortran gpu stream-benchmarks variants

Last synced: 02 Mar 2025

https://github.com/alekseyscorpi/vacancies_server

This is a server for vacancies generation using LLM (Saiga3)

code cuda cuda-toolkit docker dockerfile flask llama3 llamacpp llm ngrok pydantic saiga

Last synced: 06 Feb 2026

https://github.com/renatomaynard/a-multiple-population-coarse-grained-genetic-algorithm-to-solve-the-quadratic-assignment-problem-

A Multiple-population coarse-grained Genetic Algorithm to solve the Quadratic Assignment Problem

c cuda genetic-algorithm quadratic-assignment-problem

Last synced: 09 May 2026

https://github.com/rjected/cuda-timelock

Solving a large number of timelock puzzles in parallel using GPU acceleration

c cgbn concurrent cpp cuda gmp graphics nvidia parallel puzzle timelock

Last synced: 14 Apr 2026

https://github.com/lhldev/rust-neural-network

Neural network implementation in Rust

cuda feedforward-neural-network

Last synced: 16 May 2026

https://github.com/enriquebdel/clases-cuda-programacion-paralela-en-c-

In this repository you will find several lessons I created on the CUDA library in C. The program I use for programming is MobaXterm.

c cuda cuda-programming gnu-linux googlecolab mobaxterm nvidia parallel-programming ubuntu university

Last synced: 19 May 2026

https://github.com/SanaeProject/Matrix-for-Cpp

This repository has types that handle matrices.

cpp14 cpp14-library cuda matrix-library

Last synced: 15 May 2025

https://github.com/vipaka2/sdforge-docker

Latest SD Forge Docker image.

cuda docker nvidia python

Last synced: 24 Jul 2025

https://github.com/lcsb-biocore/cufluxsampler.jl

GPU-accelerated algorithms for flux sampling in CUDA.jl

cobra cuda gpu julia metabolic-network metabolism sampling

Last synced: 02 May 2026

https://github.com/bl33h/pythagoreantheorem

A program that calculates the Pythagorean theorem for a large number of elements using GPU parallel processing.

arrays cuda kernel parallel-programming pythagoras pythagorean-theorem

Last synced: 19 May 2026

https://github.com/microo8/micronn

Simple neural network library with backpropagation using CUDA

c cuda neural-network

Last synced: 19 May 2026

https://github.com/lightshade12/kittlespt

A hobby CUDA pathtracing renderer.

3d-graphics computer-graphics cuda gpu path-tracing ray-tracing

Last synced: 18 Mar 2025

https://github.com/blazekill/hello-cuda

Cpp + Vcpkg + CUDA + VsCode starter project.

cpp cuda vcpkg vscode

Last synced: 18 May 2026

https://github.com/graiphic/graiphic-documentation

Graiphic Toolkits for LabVIEW provide advanced AI, GPU, and graph-oriented computing capabilities directly inside LabVIEW. Built on ONNX Runtime, they enable seamless integration of SOTA, Accelerator, and Deep Learning Toolkit for high-performance execution across CPUs, GPUs, and edge devices.

accelerator-toolkit ai-orchestration computer-vision cuda deep-learning directml edge-ai graph-computing hardware-acceleration high-performance-computing inference labview neural-networks onednn onnx onnxruntime openvino sota tensorrt training

Last synced: 22 Nov 2025

https://github.com/shahed-chy-suzan/psd-to-html--cuda

Cuda is a single-page creative portfolio PSD-to-HTML template built with HTML5 & CSS3. The site can be customized easily to suit your needs.

cuda portfolio psd-to-html

Last synced: 18 Jan 2026

https://github.com/m-torhan/cuda-stl-renderer

CUDA C++ implementation of STL file renderer using ray tracing method

cuda

Last synced: 25 Feb 2026

https://github.com/ophoperhpo/dcgan-lentach-logo-generator

The Lentach logo generator. #MachineLearningFun

cuda dcgan dcgan-tensorflow keras lentach machinelearning ml

Last synced: 23 Feb 2025

https://github.com/matx64/rs-netbot

Old School Runescape bot with CNN for object identification

cuda numpy python pytorch

Last synced: 04 May 2026

https://github.com/greg-tarr/fastsimplex

CUDA/MPS accelerated 2D & 3D simplex noise generation.

cuda mps noise-generator python simplex-noise

Last synced: 20 Apr 2026

https://github.com/xihuai18/image-processing-in-cuda

Implementation of Image Processing Method

cuda imageprocessing

Last synced: 04 Oct 2025

https://github.com/nikolaydubina/basic-openai-pytorch-server

Minimal HTTP inference server with an OpenAI-style API, using PyTorch and CUDA

cuda docker llm openai pytorch server

Last synced: 12 Apr 2026

https://github.com/croko22/vit-cpp

An implementation of the Transformer model architecture ("Attention Is All You Need") in pure C++17 from scratch

cpp cuda deep-learning machine-learning neural-network transformer

Last synced: 17 Jan 2026

https://github.com/sartajbhuvaji/cuda

Developed CUDA kernel functions to load and train a Convolutional Neural Network from scratch.

cuda cuda-programming gpu-programming neural-network nvidia-cuda

Last synced: 30 Mar 2025

https://github.com/amirbroker/cudadtw

Use CUDA with numba for Dynamic Time Warping

cuda dtw dynamic-time-warping gpu numba

Last synced: 16 Apr 2026

https://github.com/iag-geo/image-classification

Image classification scripts using YOLOv5 with aerial imagery

cuda image-classification python pytorch swimming-pools yolov5

Last synced: 22 Feb 2026

https://github.com/matteogianferrari/qr-decomposition

This project implements different methods to exploit cache usage, multicore CPU, and GPU architectures in the Gram-Schmidt QR decomposition algorithm, and measures the performance of the different implementations.

cuda openmp parallel-computing

Last synced: 12 Apr 2026

https://github.com/willigarneau/sobel-filter-cuda

🖼️ Assignment 1 in Intelligent Industrial System at Cégep Lévis-Lauzon. Learning CUDA and OpenCV by creating a Sobel filter. 💻

cplusplus cuda filter opencv sobel

Last synced: 16 Apr 2026

https://github.com/hatamiarash7/cuda-python

GPU programming using CUDA & Python

cuda gpu gpu-computing gpu-programming python

Last synced: 29 Apr 2026

https://github.com/subatomicplanets/simplebitcoinminer

A simple Bitcoin C++ and CUDA solo miner

bitcoin cpp cryptocurrency cuda miner

Last synced: 19 Apr 2026

https://github.com/aliyoussef97/triton-hub

A container of various PyTorch neural network modules written in Triton.

cuda deep-learning openai pytorch triton triton-lang

Last synced: 30 Mar 2025