An open API service indexing awesome lists of open source software.

CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphical processing units (GPUs). With CUDA, developers are able to dramatically speed up computing applications by harnessing the power of GPUs.

https://github.com/umer-farooq-cs/canny-edge-detector

High-performance Canny edge detector with CPU and CUDA implementations. Loads PGM images, performs Gaussian smoothing, gradients, non-max suppression, and hysteresis. Benchmarks both paths, outputs edge maps, and reports speedup. Simple Makefile, sample images included.

c canny-edge-detection computer-vision cpp cuda gpu high-performance-computing image-processing nvcc pgm

Last synced: 18 Apr 2026

https://github.com/sakurabtc888/btc-eth-evm-ltc-trx-collision

CPU+GPU private-key and public-key collision tool for the BTC, ETH (EVM), LTC, and TRX chains

btc cuda eth evm ltc trx

Last synced: 04 Jul 2025

https://github.com/shahed-chy-suzan/psd-to-html--cuda

Cuda is a single-page creative portfolio PSD-to-HTML template built with HTML5 & CSS3. The site can be customized easily to suit your needs.

cuda portfolio psd-to-html

Last synced: 18 Jan 2026

https://github.com/himeyama/cuda-nmf

NMF calculations are performed on NVIDIA GPUs using the CUDA API. (Gem released)

cublas cuda gem nmf ruby

Last synced: 13 Apr 2026

https://github.com/sarah627/horus_eye_fcih_graduation_project

An AI-powered tourism website using YOLOv7 for real-time landmark detection in images. Built with Flask, PyTorch, and Roboflow for seamless tourist interaction.

computer-vision cuda flask jupyter-notebook kaggle matplotlib object-detection opencv python pytorch roboflow

Last synced: 14 Apr 2026

https://github.com/jessetg/cuda-practice

Working through the chapters of CUDA by Example

c cpp cuda cuda-by-example gpgpu

Last synced: 01 May 2026

https://github.com/alekseyscorpi/vacancies_server

This is a server for generating vacancies using an LLM (Saiga3)

code cuda cuda-toolkit docker dockerfile flask llama3 llamacpp llm ngrok pydantic saiga

Last synced: 06 Feb 2026

https://github.com/alwaysai/jetpack-46-hacky-hour

NVIDIA’s JetPack 4.6 capabilities and how to use them with EdgeIQ, the alwaysAI computer vision framework.

alwaysai computer-vision cuda edge-computing jetpack tensorrt

Last synced: 01 May 2026

https://github.com/m-torhan/cuda-stl-renderer

CUDA C++ implementation of an STL file renderer using ray tracing

cuda

Last synced: 25 Feb 2026

https://github.com/straightchlorine/quantum-pipeline

A Python module for executing and monitoring quantum algorithms across local simulators and IBM Quantum platforms. Seamlessly handles data collection, organization, and streaming to Apache Kafka

apache-kafka apache-spark aws-s3 cuda docker gpu-acceleration ibm-cloud ibm-quantum minio qiskit qiskit-aer qiskit-nature quantum-computing visualizations vqe

Last synced: 08 Oct 2025

https://github.com/gravitytwog/electromagneticfield

Electro-magnetic field simulation made with CUDA

c cuda cuda-kernels cuda-programming

Last synced: 26 Apr 2026

https://github.com/codingrule/cuda-mbrot

Just another Mandelbrot with CUDA

cuda cuda-toolkit cupy fractal mandelbrot mathematics nvidia

Last synced: 27 Apr 2026

https://github.com/davidalgis/godot_cuda

Demonstration that it is possible to use CUDA directly from the Godot engine.

cuda godot modules

Last synced: 03 May 2026

https://github.com/pintamonas4575/tfg-diffusion-model-customdataset

A diffusion model built in PyTorch for unconditional image generation with a custom dataset.

attention-mechanism cnn cosine-scheduler cuda custom-dataset ddim deep-learning diffusion-models gpu image-generation pytorch

Last synced: 17 Apr 2026

https://github.com/ophoperhpo/dcgan-lentach-logo-generator

The Lentach logo generator. #MachineLearningFun

cuda dcgan dcgan-tensorflow keras lentach machinelearning ml

Last synced: 26 Jun 2026

https://github.com/poyea/lollipop

🍭 Sweet GPU compute kernels in CUDA, wrapped via CuPy

cuda cuda-kernel cuda-kernels cuda-programming gpu-kernels gpu-programming python

Last synced: 17 Jun 2026

https://github.com/naidezhujimo/cuda-rewrite-fast-matrix-multiplication

This repository contains an optimized implementation of matrix multiplication using CUDA. The goal of this project is to provide a high-performance solution for matrix multiplication operations on NVIDIA GPUs.

cuda

Last synced: 26 Mar 2025

https://github.com/bhattbhavesh91/rapids-cudf-cuml-example

Running the KNN algorithm much faster on a GPU for free using RAPIDS packages like cuML and cuDF

cuda cuml deep-learning nvidia-gpu rapids rapidsai

Last synced: 17 Apr 2026

https://github.com/kartavyaantani/cuda_image_processing

A CUDA-accelerated image processing project featuring multiple GPU-based filters and enhancement techniques. Implements convolution, edge detection, Non-Local Means (NLM) denoising, K-Nearest Neighbors (KNN), and pixelization. Each operation is optimized using CUDA kernels for real-time performance on large images. The project supports command-line

cuda cuda-kernels cuda-programming cuda-toolkit gpu-programming high-performance-computing image-manipulation image-processing nvidia-cuda nvidia-gpu

Last synced: 30 Apr 2026

https://github.com/ismailtekin05/caloriedetectingai

🍎🔍 Smart AI system that identifies food items in photos and calculates their calorie content automatically. Built with TensorFlow, YOLOv8, CUDA and computer vision for accurate nutrition tracking.

ai aimodel calorie-calculator computer-vision cuda data-analysis data-science data-segmentation data-visualization dataset dataset-generation image-processing image-recognition python segmentation-models tensorflow ultralytics yaml yolo yolov8

Last synced: 29 Apr 2026

https://github.com/jblaschke/pynvtx

Thin pybind11 wrapper for NVTX wrappers -- with some bells and whistles attached.

cuda nvtx nvtx-markers

Last synced: 23 Jun 2026

https://github.com/timothystewart6/ubuntu-gb10

Ubuntu 24.04 + NVIDIA stack setup guide for GB10 / DGX Spark systems

ansible ansible-playbook arm64 blackwell cuda dgx gpu grace-blackwell homelab nvidia nvidia-driver ubuntu

Last synced: 26 Jun 2026

https://github.com/nofaralfasi/parallel-sequence-alignment

A parallelized version of a multiple DNA sequence alignment algorithm with MPI, OpenMP, and CUDA

cuda mpi openmp sequence-alignment

Last synced: 29 Apr 2026

https://github.com/asadiahmad/gesture-detection

Real-time Gesture Detection using CUDA-accelerated OpenCV in Python.

computer-vision cuda gesture-recognition gpu-acceleration open-pose opencv opencv-cuda pose-detection real-time

Last synced: 29 Apr 2026

https://github.com/enkerewpo/talaria

AI Voice Assistant for Dialogue and IoT Control Powered by GPT-4o

cuda gpt-4 python3 pytorch stt tts

Last synced: 16 Apr 2026

https://github.com/sartajbhuvaji/cuda

Developed CUDA kernel functions to load and train a Convolutional Neural Network from scratch.

cuda cuda-programming gpu-programming neural-network nvidia-cuda

Last synced: 30 Mar 2025

https://github.com/torotoki/simple-paged-attention

A simple implementation of PagedAttention purely written in CUDA and C++.

attention cpp cuda llm transformer

Last synced: 18 May 2026

https://github.com/croko22/vit-cpp

An implementation of the Transformer model architecture ("Attention Is All You Need") in pure C++17 from scratch

cpp cuda deep-learning machine-learning neural-network transformer

Last synced: 17 Jan 2026

https://github.com/thisalmandula/gpu_accelerated_lpt_cfd_code

This repository contains a GPU-accelerated version of the particle tracking model developed by Merel Kooi for biofouled microplastic particles (available at: https://pubs.acs.org/doi/10.1021/acs.est.6b04702), written in CUDA Fortran and CUDA Python. This repository is intended as a learning tool for GPU programming.

biofouling computational-fluid-dynamics cuda fortran lagrangian-particle-tracking microplastics python

Last synced: 02 May 2026

https://github.com/a-nau/python-cuda-envs

Script to automatically map a specific CUDA version to a Conda Python environment.

anaconda anaconda-environment cuda installation installation-script python python-environment python3

Last synced: 18 Apr 2026

https://github.com/SanaeProject/Matrix-for-Cpp

This repository has types that handle matrices.

cpp14 cpp14-library cuda matrix-library

Last synced: 15 May 2025

https://github.com/rajarsheya/real-time-audio-feature-extraction-with-cuda-for-speech-recognition

This project accelerates MFCC extraction using CUDA for real-time speech recognition. Offloading the process to the GPU reduces latency and speeds up processing, enabling fast, local speech-to-text transcription for applications like virtual assistants, without cloud reliance.

audio-processing cpp cuda fourier-transform python

Last synced: 10 May 2026

https://github.com/xlisp/learn-vllm

vLLM learning

cuda nvidia pytorch vllm

Last synced: 10 May 2026

https://github.com/tensorbfs/cutropicalgemm.jl

The fastest Tropical number matrix multiplication on GPU

cuda gemm tropical-algebra

Last synced: 20 Jan 2026

https://github.com/daelsepara/hipmandelbrot

GPU Implementation of Mandelbrot Fractal Generator with Benchmarking

amd cuda fractal gpu gpu-compute gpu-computing hip mandelbrot parallel-computing rocm sdk

Last synced: 20 Feb 2026

https://github.com/xusworld/tars

Tars is a cool deep learning framework.

avx2 avx512 cuda deep-learning

Last synced: 27 Apr 2026

https://github.com/shivendrra/axgrad

Lightweight tensor library that contains its own auto-diff engine, like PyTorch

autograd cuda pytorch scratch-implementation tinygrad

Last synced: 08 May 2026

https://github.com/satyajitghana/gpu-programming

Contains the contents of the GPU Architecture and Programming course taken on NPTEL

c cpp cuda cuda-programming gpu-programming nptel nvidia

Last synced: 09 Mar 2026

https://github.com/axel-ex/seame-ads-autonomous-lane-detection-24-25

🚗 Real-time lane detection and autonomous steering for JetRacer, powered by ROS2 and GPU-accelerated CV on Jetson Nano.

cuda jetson-nano ros2 tensorrt

Last synced: 27 Apr 2026

https://github.com/pharmcat/metidacu.jl

CUDA solver for Metida.jl

cuda julia-language metida mixed-models

Last synced: 27 Apr 2026

https://github.com/rkv0id/automata-vtk

Multi-dimensional Cellular Automata visualization using Python's VTK bindings on top of CUDA-parallel grid updates.

cellular-automata cuda game-of-life python vtk

Last synced: 19 Apr 2026

https://github.com/lhldev/rust-neural-network

Neural network implementation in Rust

cuda feedforward-neural-network

Last synced: 16 May 2026

https://github.com/david-palma/cuda-programming

Educational CUDA C/C++ programming repository with commented examples on GPU parallel computing, matrix operations, and performance profiling. Requires a CUDA-enabled NVIDIA GPU.

c-cpp cpp cuda cuda-toolkit education gpu gpu-programming kernel matrix-operations nvcc nvidia parallel-computing parallel-programming practice profiling threads

Last synced: 25 Apr 2026

https://github.com/fynv/cudainline

A CUDA interface for Python. A distillation of the engine part of ThrustRTC.

cuda gpu nvrtc pyhton

Last synced: 18 May 2026

https://github.com/rajarsheya/real-time-traffic-analysis-with-cuda-object-detection

Implemented CUDA-accelerated object detection (YOLO) to analyze a sample image dataset. Performed vehicle counting and simulated speed estimation to demonstrate real-time traffic analysis capabilities.

cpp cuda opencv python yolo

Last synced: 12 Apr 2026

https://github.com/alegau03/parallel-k-means

Implementation of C programs for the K-Means algorithm for parallel computing.

c c-programming cuda parallel parallel-programming

Last synced: 24 Apr 2026

https://github.com/bolner/totally-diffused

Debian/NVIDIA Docker image for AUTOMATIC1111's Stable Diffusion application.

automatic1111 cuda debian docker-image nvidia stable-diffusion xformers

Last synced: 11 Apr 2026

https://github.com/gunrock/template

Template repository for essentials applications to get you started asap!

cpp cuda essentials gpu graph-algorithms graph-analytics gunrock

Last synced: 15 May 2026

https://github.com/emilienmendes/gpgpu

Parallelization and optimization of point recognition in an image

cuda gpgpu parallel-programming

Last synced: 28 Oct 2025

https://github.com/hariprashad-ravikumar/accelerated-computing-in-cuda-c

This repo contains my code for the problem sets in NVIDIA's Getting Started with Accelerated Computing in CUDA C/C++

c cuda cuda-kernels cuda-toolkit

Last synced: 24 Apr 2026

https://github.com/orgh0/highperformancecnn

Implementation of a High Performance CNN for the MNIST dataset

cnn cpp cuda

Last synced: 18 May 2026

https://github.com/patrickm663/localglmnet.jl

This is a WIP implementation of Richman & Wüthrich (2022) using Julia's Flux.jl + CUDA.jl

cuda deep-learning flux julia neural-networks symbolic-regression xai

Last synced: 22 Apr 2026

https://github.com/jakubriegel/game_of_life_3d

3D game of life implemented in CUDA

concurency cuda gameoflife nvidia put-poznan

Last synced: 21 Apr 2026

https://github.com/kchristin22/ising_model

Implementation of a cellular automaton on GPU using different features of CUDA

cellular-automaton cuda gpu-programming hpc ising-model parallel-computing

Last synced: 15 Mar 2025

https://github.com/xihuai18/image-processing-in-cuda

Implementation of Image Processing Method

cuda imageprocessing

Last synced: 04 Oct 2025

https://github.com/subatomicplanets/simplebitcoinminer

A simple Bitcoin C++ and CUDA solo miner

bitcoin cpp cryptocurrency cuda miner

Last synced: 19 Apr 2026

https://github.com/lightshade12/kittlespt

A hobby CUDA pathtracing renderer.

3d-graphics computer-graphics cuda gpu path-tracing ray-tracing

Last synced: 18 Mar 2025

https://github.com/hatamiarash7/cuda-python

GPU programming using CUDA & Python

cuda gpu gpu-computing gpu-programming python

Last synced: 29 Apr 2026

https://github.com/haleelrah/Vision-pro-MAX

A Raspberry Pi-based object detection system for assisting visually impaired individuals. This project utilizes YOLO object detection and a Hailo 8L TPU to identify obstacles like manholes, potholes, and bumps, providing real-time audio feedback to aid navigation.

bash computer-vision cuda fine-tuning jupyter-notebook object-detection opencv python pytorch raspberry-pi rpi-camera ssh text-to-speech ultralytics yolo yolov8

Last synced: 30 Dec 2025

https://github.com/raumberg/hypervision

Neural Network based real-time aimbot system, operating on TensorRT with custom CUDA kernel and C FFI extensions

ai aim cuda cython neural-networks python tensorrt yolo

Last synced: 20 May 2026

https://github.com/adamczykpiotr/cudamatrixlibrary

Matrix operation library using a single thread, n threads, or a CUDA-supported GPU

agh agh-ust cpp cuda cuda-library matrix matrix-computations matrix-functions matrix-multiplication

Last synced: 19 Apr 2026

https://github.com/eric900115/parallelprogramming

The repository contains the coursework for CS5422, NTHU's Parallel Programming Course.

cuda mpi openmp ucx

Last synced: 26 May 2026

https://github.com/eshibusawa/cupy-cuda

Learn CUDA programming essentials with CuPy, from basic kernels to advanced memory patterns

cooperative-thread-array cub cuda cupy gpu parallel-computing python

Last synced: 15 Jun 2025

https://github.com/sohhamseal/scalable-systems-programs

A little less effort to learn parallel programming...

cuda mpi openmp

Last synced: 18 Apr 2026

https://github.com/5had3z/torch-discounted-cumsum-nd

PyTorch Discounted Cumsum with Autograd (CPU + CUDA)

cuda machine-learning pytorch

Last synced: 18 Apr 2026

https://github.com/senli1073/docker-gpu-monitor

A lightweight GPU monitor designed for real-time web-based viewing of GPU server status.

container cuda docker flask gpu gpu-monitoring linux memory-usage nvidia-smi web

Last synced: 05 Apr 2026

https://github.com/inventwithdean/cuda_mlp

Implementation of a simple Multilayer Perceptron in pure CUDA

cuda cuda-programming deep-learning neural-networks

Last synced: 30 Mar 2025

https://github.com/matx64/rs-netbot

Old School RuneScape bot with CNN for object identification

cuda numpy python pytorch

Last synced: 04 May 2026

https://github.com/gvvsnrnaveen/cuda

This repository contains various programs that can be written using the CUDA Toolkit.

c cpp cuda nvcc nvidia-cuda nvidia-gpu

Last synced: 17 Jan 2026

https://github.com/wallneradam/docker-ccminer

CCMiner (tpruvot version) Docker Builder

ccminer cuda docker gpu litecoin miner monero nvidia nvidia-docker

Last synced: 18 Apr 2026

https://github.com/sd7campeon/yelp-sentiment-analysis-with-python-bs4-and-llm

A scalable pipeline for automated extraction, preprocessing, and sentiment analysis of Yelp reviews. Uses advanced HTTP requests, HTML parsing, and text normalization (tokenization, stopword removal, lemmatization) to enable precise polarity and subjectivity analysis for consumer insights and business analytics.

beautifulsoup beautifulsoup4 business-analytics cuda data-analysis nlp-machine-learning nltk opinion-mining pandas python python3 requests-library-python sentiment-analysis text-preprocessing textblob torch web-scraping yelp-reviews

Last synced: 06 May 2026

https://github.com/andrewboessen/bitonic-merge-sort

Bitonic Merge Sort algorithm optimized for GPU execution

bitonic-merge-sort cuda sorting-network

Last synced: 16 May 2026

https://github.com/emmanuelmess/firstcollisiontimesteprarefiedgassimulator

This simulator computes all possible intersections for a very small timestep for a particle model

cpp20 cuda simulator

Last synced: 17 Apr 2026

https://github.com/le-ander/msc_bioinfo-experimental_design

Using information theory to inform experimental design with GPU acceleration. Computing group project as part of the MSc in Bioinformatics and Theoretical Systems Biology at Imperial College London 2016/2017.

cuda experimental-design gpu-computing information-theory pycuda systems-biology

Last synced: 26 Apr 2026

https://github.com/tortillazhawaii/fishes_cuda

3D boid simulation with GPU.

cuda opengl

Last synced: 04 May 2026

https://github.com/ergonomech/comfyui-windows-installer

Automated setup for ComfyUI on Windows with CUDA, custom plugins, and optimized PyTorch settings. Made to run as a server and correct errors. Easy installation and launch using Miniconda.

automation comfy conda conda-environment cuda hosting-deployment setup windows

Last synced: 31 Mar 2025

https://github.com/jtompuri/weighted-voronoi-stippling

High-performance weighted Voronoi stippling implementation. Exports PNG and TSP files. Visualizes TSP tours as continuous line drawings.

computer-graphics cuda gpu-acceleration lloyd-relaxation numba python stippling traveling-salesman tsp voronoi

Last synced: 18 May 2026

https://github.com/bl33h/productoftwovectors

This code uses CUDA for parallel vector multiplication on a GPU, demonstrating the GPU's acceleration capabilities.

cuda gpu kernel paralelism parallel-programming product vector

Last synced: 16 May 2026

https://github.com/enp1s0/curand_fp16

FP16 pseudo random number generator on GPU

cuda gpu half-precision random-number-generators

Last synced: 20 Aug 2025

https://github.com/jxlarrea/homeassistant-voice-recipes

GPU/CUDA-accelerated voice control stack for Home Assistant. Runs on x86/x64 and ARM64 (including the NVIDIA DGX Spark). 100% Local - No Cloud, No Subscriptions.

arm64 cuda dgx-spark gb10 gpu-acceleration home-assistant local-llm qwen3 speech-to-text text-to-speech voice-assistant x86-64

Last synced: 26 May 2026

https://github.com/ehsanmok/cs-521

UBC CS 521: Parallel Computing and Architectures

cuda erlang parallel-algorithm parallel-computing

Last synced: 16 May 2026

https://github.com/tudasc/cusan-tests

A test suite for CUDA-aware MPI race detection

cuda dataracebench-cuda mpi

Last synced: 03 May 2026

https://github.com/matteogianferrari/qr-decomposition

This project implements different methods to exploit cache usage, multicore CPU, and GPU architectures for the Gram-Schmidt QR decomposition algorithm, and measures the performance of the different implementations.

cuda openmp parallel-computing

Last synced: 12 Apr 2026

https://github.com/microo8/micronn

Simple neural network library with backpropagation using CUDA

c cuda neural-network

Last synced: 19 May 2026

https://github.com/programmer-rd-ai/digivis

A PyTorch-based deep learning implementation for MNIST digit recognition featuring CNNs, GPU acceleration, experiment tracking, and comprehensive testing capabilities.

cnn computer-vision cuda data-science deep-learning digit-recognition image-classification machine-learning mnist neural-networks python pytorch wandb

Last synced: 10 Jun 2025

https://github.com/ashwani-rathee/imagesgpu.jl

Image Processing on GPU in Julia

cuda gpu image image-processing julia

Last synced: 11 Jul 2025