An open API service indexing awesome lists of open source software.

CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers are able to dramatically speed up computing applications by harnessing the power of GPUs.

https://github.com/miniex/maidenx

Rust-based CUDA library designed for learning purposes and building my AI engines named Maiden Engine

ai cuda rust

Last synced: 20 Mar 2025

https://github.com/ayoussf/triton-hub

A container of various PyTorch neural network modules written in Triton.

cuda deep-learning openai pytorch triton triton-lang

Last synced: 14 Apr 2025

https://github.com/gunrock/template

Template repository for essentials applications to get you started asap!

cpp cuda essentials gpu graph-algorithms graph-analytics gunrock

Last synced: 15 May 2026

https://github.com/satyajitghana/gpu-programming

Contains the materials from the GPU Architecture and Programming course on NPTEL

c cpp cuda cuda-programming gpu-programming nptel nvidia

Last synced: 09 Mar 2026

https://github.com/blazekill/hello-cuda

Cpp + Vcpkg + CUDA + VsCode starter project.

cpp cuda vcpkg vscode

Last synced: 18 May 2026

https://github.com/adamczykpiotr/cudamatrixlibrary

Matrix operation library that runs on a single thread, n threads, or a CUDA-capable GPU

agh agh-ust cpp cuda cuda-library matrix matrix-computations matrix-functions matrix-multiplication

Last synced: 19 Apr 2026

https://github.com/sbstndb/grayscott_k

A simple 3D Gray-Scott simulation using Kokkos, with a CUDA or OpenMP backend

cuda finite-difference grayscott grid kokkos laplacian openmp simulation visualisation

Last synced: 16 May 2026

https://github.com/xza85hrf/ml-framework_checker

ML Framework and CUDA Checker is a Python-based GUI application for checking PyTorch, TensorFlow, and CUDA installations. It provides detailed system specs, compatibility checks, advanced GPU management, and offers options to view instructions, export logs, and update machine learning frameworks.

compatibility cuda gpu-management gui-application machine-learning python pytorch system-checker system-specs tensorflow

Last synced: 28 Apr 2026

https://github.com/enkerewpo/talaria

AI Voice Assistant for Dialogue and IoT Control Powered by GPT-4o

cuda gpt-4 python3 pytorch stt tts

Last synced: 16 Apr 2026

https://github.com/fandreuz/parallel-programming-for-hpc

Scientific codes in C/C++ with CUDA, OpenACC, FFTW, (cu)BLAS

cpp cuda hpc mpi

Last synced: 20 Apr 2026

https://github.com/sartajbhuvaji/cuda

Developed CUDA kernel functions to load and train a Convolutional Neural Network from scratch.

cuda cuda-programming gpu-programming neural-network nvidia-cuda

Last synced: 30 Mar 2025

https://github.com/ssoehdata/cuda_fortran_sci_eng

Working through examples from the book CUDA Fortran for Scientists and Engineers, 2nd Edition

cuda cuda-fortran fortran hpc nvfortran

Last synced: 21 Aug 2025

https://github.com/croko22/vit-cpp

An implementation of the Transformer model architecture ("Attention Is All You Need") in pure C++17 from scratch

cpp cuda deep-learning machine-learning neural-network transformer

Last synced: 17 Jan 2026

https://github.com/raumberg/hypervision

Neural Network based real-time aimbot system, operating on TensorRT with custom CUDA kernel and C FFI extensions

ai aim cuda cython neural-networks python tensorrt yolo

Last synced: 20 May 2026

https://github.com/dhruvsrikanth/fastconv

Distributed and serial implementations of the 2D convolution operation in C++ and CUDA.

convolution-filters cpp cuda gpu-programming high-performance-computing hpc image-editor image-processing nvidia parallel-programming

Last synced: 04 May 2026

https://github.com/mhaseeb123/gcb

GCB includes a suite of benchmarks and basic tests for CUDA-aware MPI and C++ compilers.

cpp cpp23 cuda mpi partitioned-communication st-mpi

Last synced: 17 May 2026

https://github.com/arakiss/hecate-os

Linux distro with automatic hardware detection and per-system optimization. Ubuntu 24.04 base. Alpha.

ai cuda docker gpu hardware-optimization kernel-tuning linux linux-distribution machine-learning nvidia operating-system performance sysctl ubuntu workstation zram

Last synced: 16 Feb 2026

https://github.com/alekseyscorpi/vacancies_server

A server for generating job vacancies using an LLM (Saiga3)

code cuda cuda-toolkit docker dockerfile flask llama3 llamacpp llm ngrok pydantic saiga

Last synced: 06 Feb 2026

https://github.com/jakubriegel/game_of_life_3d

3D game of life implemented in CUDA

concurency cuda gameoflife nvidia put-poznan

Last synced: 21 Apr 2026

https://github.com/brosnanyuen/raybnn_graph

Graph Manipulation Library For GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI

cuda gpu graph graph-algorithms neural-network neural-networks opencl raybnn rust

Last synced: 06 Feb 2026

https://github.com/patrickm663/localglmnet.jl

This is a WIP implementation of Richman & Wüthrich (2022) using Julia's Flux.jl + CUDA.jl

cuda deep-learning flux julia neural-networks symbolic-regression xai

Last synced: 22 Apr 2026

https://github.com/orgh0/highperformancecnn

Implementation of a High Performance CNN for MNIST dataset

cnn cpp cuda

Last synced: 18 May 2026

https://github.com/dafadey/GPGPU_OpenCL_vs_CUDA

A repository of sample code for testing memory bandwidth, arithmetic latency hiding, and shared/local memory performance on AMD and NVIDIA devices

cuda gpgpu gpgpu-computing opencl

Last synced: 16 May 2025

https://github.com/hubenchang0515/fft-benchmark

Performance benchmarks for several FFT libraries

cuda fft

Last synced: 27 Oct 2025

https://github.com/liuyuweitarek/pytorch-docker-builder

Automate PyTorch Docker image builds with compatible Python, CUDA, and Poetry versions, including CI/CD for testing.

cicd containerd cuda docker docker-image poetry-python python python3 pytorch pytorch-docker

Last synced: 06 Feb 2026

https://github.com/rnabla/cuda-des

Brute-forcing DES using CUDA

bruteforce cuda data des encryption gpu parallel standard

Last synced: 27 Oct 2025

https://github.com/rjected/cuda-timelock

Solving a large number of timelock puzzles in parallel using GPU acceleration

c cgbn concurrent cpp cuda gmp graphics nvidia parallel puzzle timelock

Last synced: 14 Apr 2026

https://github.com/michaelfranzl/image_debian-gpgpu

Dockerfile for a Debian base image with AMD and Nvidia GPGPU support

amd container container-image cuda debian docker gpgpu nvidia opencl

Last synced: 10 May 2026

https://github.com/dvhh/masscorrelation

An exercise in writing an efficient correlation calculator

calculations correlation-calculation cuda matrix multi-threading openmp

Last synced: 15 May 2026

https://github.com/hariprashad-ravikumar/accelerated-computing-in-cuda-c

This repo contains my code for the problem sets in NVIDIA's Getting Started with Accelerated Computing in CUDA C/C++

c cuda cuda-kernels cuda-toolkit

Last synced: 24 Apr 2026

https://github.com/alegau03/parallel-k-means

Implementation of C programs for the K-Means algorithm for parallel computing.

c c-programming cuda parallel parallel-programming

Last synced: 24 Apr 2026

https://github.com/andih/cuda-fortran-stream

Variant of STREAM Benchmark in CUDA Fortran

cuda cuda-fortran gpu stream-benchmarks variants

Last synced: 02 Mar 2025

https://github.com/david-palma/cuda-programming

Educational CUDA C/C++ programming repository with commented examples on GPU parallel computing, matrix operations, and performance profiling. Requires a CUDA-enabled NVIDIA GPU.

c-cpp cpp cuda cuda-toolkit education gpu gpu-programming kernel matrix-operations nvcc nvidia parallel-computing parallel-programming practice profiling threads

Last synced: 25 Apr 2026

https://github.com/crcrpar/dev-chainer

Dockerfile for Chainer Development in VSCode

chainer cuda docker nvidia-docker vscode

Last synced: 26 Apr 2026

https://github.com/vishwamartur/btc_recovery

High-performance Bitcoin wallet password recovery system with GPU acceleration and integrated graphics support. Recover Bitcoin Core wallet.dat files without blockchain download using advanced algorithms and blockchain APIs.

bitcoin bitcoin-core blockchain blockchain-api cpp cryptocurrency cuda electrum gpu-acceleration integrated-graphics multithreading opencl password-recovery private-keys recovery-tools wallet-dat wallet-recovery

Last synced: 14 Apr 2026

https://github.com/lhldev/rust-neural-network

Neural network implementation in Rust

cuda feedforward-neural-network

Last synced: 16 May 2026

https://github.com/0xhilsa/pynum

A small Python library for 1D and 2D arrays with GPU support

array c cuda nvcc python3

Last synced: 18 Apr 2026

https://github.com/gravitytwog/electromagneticfield

Electromagnetic field simulation made with CUDA

c cuda cuda-kernels cuda-programming

Last synced: 26 Apr 2026

https://github.com/pharmcat/metidacu.jl

CUDA solver for Metida.jl

cuda julia-language metida mixed-models

Last synced: 27 Apr 2026

https://github.com/enp1s0/curand_fp16

FP16 pseudo random number generator on GPU

cuda gpu half-precision random-number-generators

Last synced: 20 Aug 2025

https://github.com/codingrule/cuda-mbrot

Just another Mandelbrot with CUDA

cuda cuda-toolkit cupy fractal mandelbrot mathematics nvidia

Last synced: 27 Apr 2026

https://github.com/katpercent/raytracing

A foundation for ray tracing using CUDA and parallel computing techniques.

3d cuda engine game parrallel-computing ray raytracing

Last synced: 01 Nov 2025

https://github.com/iag-geo/image-classification

Image classification scripts using YOLOv5 with aerial imagery

cuda image-classification python pytorch swimming-pools yolov5

Last synced: 22 Feb 2026

https://github.com/pjueon/cuda_intellisense

A simple Python script to fix CUDA C++ IntelliSense for Visual Studio.

cuda visual-studio

Last synced: 09 Apr 2026

https://github.com/matteogianferrari/qr-decomposition

This project implements several versions of the Gram-Schmidt QR decomposition algorithm that exploit cache usage, multicore CPUs, and GPU architectures, and measures the performance of each implementation.

cuda openmp parallel-computing

Last synced: 12 Apr 2026

https://github.com/axel-ex/seame-ads-autonomous-lane-detection-24-25

🚗 Real-time lane detection and autonomous steering for JetRacer, powered by ROS2 and GPU-accelerated CV on Jetson Nano.

cuda jetson-nano ros2 tensorrt

Last synced: 27 Apr 2026

https://github.com/enriquebdel/clases-cuda-programacion-paralela-en-c-

In this repository you will find several lessons I created on the CUDA library in C. The program I use for programming is MobaXterm.

c cuda cuda-programming gnu-linux googlecolab mobaxterm nvidia parallel-programming ubuntu university

Last synced: 19 May 2026

https://github.com/linux-alex/geep

GEEP (Genetic Evolutionary Engineering Platform) - a C++/Qt framework for genetic programming, optimized with CUDA acceleration. GEEP enables large-scale population-based optimization, ideal for solving high-dimensional problems using evolutionary algorithms and GPU computing.

cpp cuda framework genetic-programming

Last synced: 18 May 2026

https://github.com/renatomaynard/a-multiple-population-coarse-grained-genetic-algorithm-to-solve-the-quadratic-assignment-problem-

A Multiple-population coarse-grained Genetic Algorithm to solve the Quadratic Assignment Problem

c cuda genetic-algorithm quadratic-assignment-problem

Last synced: 09 May 2026

https://github.com/maelstrom6/mandelpy

A Mandelbrot and Buddhabrot viewer with GPU acceleration

buddhabrot cuda gpu mandelbrot python3

Last synced: 27 Apr 2026

https://github.com/xusworld/tars

Tars is a cool deep learning framework.

avx2 avx512 cuda deep-learning

Last synced: 27 Apr 2026

https://github.com/sakurabtc888/btc-eth-evm-ltc-trx-collision

CPU+GPU private-key and public-key collision tool for the BTC, ETH (EVM), LTC, and TRX chains

btc cuda eth evm ltc trx

Last synced: 04 Jul 2025

https://github.com/ezroot/gacc

GIACC - Generate Images, Art, Code and Conversations

ai codegen cuda huggingface image imagegeneration python rust stablediffusion

Last synced: 06 Apr 2026

https://github.com/jonathanraiman/mini_cuda_rtc

Miniature CUDA Array library with Runtime Compilation

cpp11 cuda jit runtime-compilation

Last synced: 14 Apr 2026

https://github.com/mark0011astra/simplecuda

A library that allows users to easily perform GPU operations using CUDA with a NumPy-like interface.

cuda cupy gpu machine-learning numpy python vector

Last synced: 02 May 2026

https://github.com/alpha74/hungarianalgocuda

Hungarian Algorithm for Linear Assignment Problem implemented using CUDA.

cuda nvcc parallel-computing parallel-programming

Last synced: 01 Jun 2026

https://github.com/perl-openmp/p5-openmp-environment

Perl interface for manipulating OpenMP's environmental runtime execution variables

compiler cuda gcc gpu hpc openmp perl pthreads

Last synced: 19 Feb 2026

https://github.com/dolongbien/cuda

CUDA and Caffe/Caffe2 installation on Ubuntu 16.04

c3d-intel-caffe caffe caffe2 cuda cudnn deep-learning ubuntu

Last synced: 28 Apr 2026

https://github.com/fblupi/grado_informatica-ppr

Lab assignments for the Parallel Programming (Programación Paralela) course at UGR

cuda mpi openmp parallel-computing

Last synced: 22 Apr 2026

https://github.com/xavierjiezou/gpu-compute-capability

An application for querying the compute capability of each GPU released by NVIDIA.

cuda gpu nvidia

Last synced: 28 Apr 2026

https://github.com/SanaeProject/Matrix-for-Cpp

This repository has types that handle matrices.

cpp14 cpp14-library cuda matrix-library

Last synced: 15 May 2025

https://github.com/hartorn/docker-python

Repository for building a Python image based on Ubuntu and CUDA

cuda docker mkl-dnn onednn python3 ubuntu ubuntu1804

Last synced: 05 May 2026

https://github.com/quantum-integrated-technologies/deepforge

DeepForge: a framework for working with machine learning.

ai artificial-intelligence cuda library machine-learning ml neural-network

Last synced: 31 Jul 2025

https://github.com/leocelente/basic_cuda

My CUDA source files while learning

cpp cuda gpgpu

Last synced: 29 Apr 2026

https://github.com/thunder-compute/thunder-compute-documentation

Documentation for Thunder Compute, a cloud platform creating technology to virtualize GPUs over TCP

ai artificial-intelligence cloud cloud-computing cuda gpu llm machine-learning nvidia pytorch tensorflow thunder-compute virtualization

Last synced: 15 Oct 2025

https://github.com/andrewboessen/bitonic-merge-sort

Bitonic Merge Sort algorithm optimized for GPU execution

bitonic-merge-sort cuda sorting-network

Last synced: 16 May 2026

https://github.com/Programmer-RD-AI/DetectX

A Pythonic approach to object detection using Detectron2, a clean, modular framework for training and deploying computer vision models. DetectX simplifies the complexity of object detection while maintaining high performance and extensibility.

coco-dataset computer-vision computer-vision-library cuda deep-learning detectron2 faster-rcnn gpu-accelerated machine-learning ml-framework object-detection object-recognition python3 pytorch retinanet

Last synced: 04 May 2025

https://github.com/asadiahmad/gesture-detection

Real-time Gesture Detection using CUDA-accelerated OpenCV in Python.

computer-vision cuda gesture-recognition gpu-acceleration open-pose opencv opencv-cuda pose-detection real-time

Last synced: 29 Apr 2026

https://github.com/anras5/parallel-computing

Comparing CPU and GPU

cuda gpu openmp

Last synced: 29 Apr 2026

https://github.com/nofaralfasi/parallel-sequence-alignment

A parallelized version of a multiple DNA sequence alignment algorithm with MPI, OpenMP and CUDA

cuda mpi openmp sequence-alignment

Last synced: 29 Apr 2026

https://github.com/erosiv/silt

simple immediate lightweight tensors

cmake cuda simulation tensor

Last synced: 31 Oct 2025

https://github.com/m15kh/cuda_programming

CUDA programming enables parallel computing on NVIDIA GPUs for high-performance tasks like deep learning and scientific computing

cuda cuda-programming gpu nvidia parallel-computing practice-programming

Last synced: 03 Apr 2025

https://github.com/ismailtekin05/caloriedetectingai

🍎🔍 Smart AI system that identifies food items in photos and calculates their calorie content automatically. Built with TensorFlow, YOLOv8, CUDA and computer vision for accurate nutrition tracking.

ai aimodel calorie-calculator computer-vision cuda data-analysis data-science data-segmentation data-visualization dataset dataset-generation image-processing image-recognition python segmentation-models tensorflow ultralytics yaml yolo yolov8

Last synced: 29 Apr 2026

https://github.com/lightshade12/kittlespt

A hobby CUDA pathtracing renderer.

3d-graphics computer-graphics cuda gpu path-tracing ray-tracing

Last synced: 18 Mar 2025

https://github.com/kartavyaantani/cuda_image_processing

A CUDA-accelerated image processing project featuring multiple GPU-based filters and enhancement techniques. Implements convolution, edge detection, Non-Local Means (NLM) denoising, K-Nearest Neighbors (KNN), and pixelization. Each operation is optimized using CUDA kernels for real-time performance on large images. The project supports command-line

cuda cuda-kernels cuda-programming cuda-toolkit gpu-programming high-performance-computing image-manipulation image-processing nvidia-cuda nvidia-gpu

Last synced: 30 Apr 2026

https://github.com/eric900115/parallelprogramming

The repository contains the coursework for CS5422, NTHU's Parallel Programming Course.

cuda mpi openmp ucx

Last synced: 26 May 2026

https://github.com/himeyama/cuda-nmf

NMF calculations are performed on NVIDIA GPUs using the CUDA API. (Gem released)

cublas cuda gem nmf ruby

Last synced: 13 Apr 2026

https://github.com/bolner/totally-diffused

Debian/NVIDIA Docker image for AUTOMATIC1111's Stable Diffusion application.

automatic1111 cuda debian docker-image nvidia stable-diffusion xformers

Last synced: 11 Apr 2026

https://github.com/jxlarrea/homeassistant-voice-recipes

GPU/CUDA-accelerated voice control stack for Home Assistant. Runs on x86/x64 and ARM64 (including the NVIDIA DGX Spark). 100% Local - No Cloud, No Subscriptions.

arm64 cuda dgx-spark gb10 gpu-acceleration home-assistant local-llm qwen3 speech-to-text text-to-speech voice-assistant x86-64

Last synced: 26 May 2026

https://github.com/sarah627/horus_eye_fcih_graduation_project

An AI-powered tourism website using YOLOv7 for real-time landmark detection in images. Built with Flask, PyTorch, and Roboflow for seamless tourist interaction.

computer-vision cuda flask jupyter-notebook kaggle matplotlib object-detection opencv python pytorch roboflow

Last synced: 14 Apr 2026

https://github.com/fynv/cudainline

A CUDA interface for Python. A distillation of the engine part of ThrustRTC.

cuda gpu nvrtc pyhton

Last synced: 18 May 2026

https://github.com/dotblueshoes/robertscross

The Roberts cross operator is used in image processing and computer vision for edge detection.

cuda edge-detection image-processing

Last synced: 30 Mar 2025

https://github.com/inventwithdean/cuda_mlp

Implementation of a simple Multilayer Perceptron in pure CUDA

cuda cuda-programming deep-learning neural-networks

Last synced: 30 Mar 2025

https://github.com/tudasc/cusan-tests

A test suite for CUDA-aware MPI race detection

cuda dataracebench-cuda mpi

Last synced: 03 May 2026

https://github.com/duskvirkus/ofxarrayfire

An openFrameworks addon with pre-compiled binaries of ArrayFire.

arrayfire cuda ofxaddon openframeworks openframeworks-addon

Last synced: 09 May 2026

https://github.com/varun-1703/eu-act-navigator-rag-qabot

An interactive, privacy-first application for querying the European Union’s AI Act using a local Retrieval-Augmented Generation (RAG) pipeline. Combines semantic search (FAISS) and a quantized TinyLlama LLM for fast, accurate, and context-aware answers—all running on your own hardware.

cuda faiss hugging-face-transformers langchain legal-tech local-slm machine-learning nlp open-source privacy rag-chatbot sentence-transformers streamlit tinyllama

Last synced: 03 May 2026

https://github.com/thomasonzhou/minitorch

Rebuilding PyTorch: from autograd to convolutions in CUDA

cuda numba numpy

Last synced: 02 Feb 2026