An open API service indexing awesome lists of open source software.

CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general-purpose computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.

https://github.com/elcruzo/cuda-conv

Lightweight CUDA kernel for 2D image convolution achieving 20x+ speedup. Built with CuPy for the NVIDIA Hackathon.

computer-vision convolution cuda cupy gpu-computing hackathon high-performance-computing image-processing nvidia python

Last synced: 15 May 2026

https://github.com/naidezhujimo/cuda-learning-just-record-the-learning-process-

A record of my CUDA learning process, with notes. Learners are welcome.

cuda

Last synced: 26 Mar 2025

https://github.com/nxoti1/points-reader-ocr

🖥️ Extract text from images easily with POINTS-Reader OCR, a high-accuracy application for seamless document conversion and processing.

cuda gradio huggingface-transformers ocr open-source points-reader reportlab spaces tencent vision-language-model vlm

Last synced: 20 May 2026

https://github.com/lehoangan2906/cuda_basics

A simple implementation of vector and matrix operations, optimized to run on NVIDIA GPUs with CUDA

cpp cuda cuda-programming

Last synced: 16 Jun 2025

https://github.com/sevilze/folderesque

Python Script to process and upscale images in specified folders using RRDB models.

cuda esrgan scripts upscaler

Last synced: 02 Mar 2026

https://github.com/kenwuqianghao/c4ai-cuda-birds

Homework assignments for C4AI Beginners in Research-Driven Studies

cuda machine-learning pytorch

Last synced: 18 Apr 2026

https://github.com/TheodoreAI/monte-carlo-simulator

CUDA application for Monte Carlo simulation, which determines the range of outcomes for a series of parameters, each of which has a probability distribution showing how likely each option is to happen.

cuda gpu-computing monte-carlo-simulation parallel-computing

Last synced: 06 Oct 2025

https://github.com/thesupercd/cuda_sort

A simple project implementing and measuring the runtime performance metrics related to massively parallel algorithms (radix sort) on an NVIDIA GPU device.

benchmarking c cpp cuda cuda-programming gpu-acceleration gpu-programming multithreading parallel-processing radix-sort sorting-algorithms

Last synced: 10 May 2026

https://github.com/hshshshshsh12e/gpumkat

Gpumkat is a shader debugger for Metal, designed to do what Instruments can't.

alternative api control cuda darwin debugger debugging gpumkat macos management profiler release shaders threads

Last synced: 14 Apr 2026

https://github.com/deltatecs/voses

Volatile Secret Searcher - massively parallel, brute force memory dump analysis for (D)TLS secret extraction

cuda memory-hacking reverse-engineering tls

Last synced: 15 Jun 2025

https://github.com/kanttouchthis/cuda_schem

Script for voxelizing 3D models into Minecraft .schem schematics with texture support, powered by Numba CUDA.

cuda minecraft numba voxelization

Last synced: 07 Oct 2025

https://github.com/grindelfp/cuda-texture-memory

Exercise on using texture memory in CUDA.

cuda texture-memory

Last synced: 30 Mar 2025

https://github.com/andreasholt/cuda-matmul-benchmarking

Implementing and benchmarking various matmul implementations in CUDA

cuda matrix-multiplication

Last synced: 01 Nov 2025

https://github.com/dreoporto/tensorflow-gpu-docker

An example project to run TensorFlow with CUDA-enabled GPU acceleration using Windows, Docker and WSL2.

artificial-intelligence cuda deep-learning docker docker-compose jupyter machine-learning nvidia-docker python windows wsl2

Last synced: 27 Jan 2026

https://github.com/artheioupfat/mini-gpt-wiki

A project to build a mini LLM trained on Wikipedia data and to interact with it through a Streamlit interface.

cuda gpt language llm model mps nlp pytorch scraping streamlit transformer wikipedia

Last synced: 08 Oct 2025

https://github.com/viktor-akusoff/chernabogpy

ChernabogPy is a Python package for visualizing gravitational distortions caused by black holes using nonlinear ray tracing.

cuda gpu physics-simulation python3 relativity-of-space-and-time torch

Last synced: 15 May 2026

https://github.com/thanduriel/cuda_hip_comparison

performance study of atomics on GPUs

atomics cuda hip

Last synced: 09 Oct 2025

https://github.com/ironjr/minimal-cuda-pytorch

Repository-level snippet for minimal implementation of a PyTorch CUDA extension.

cuda minimal pytorch

Last synced: 04 May 2026

https://github.com/enesdoruk/opencv-cpp

OpenCV C++ tutorials

computer-vision cpp cuda opencv

Last synced: 09 Oct 2025

https://github.com/gama1903/cuda_programming

CUDA programming practice

cuda parallel-computing

Last synced: 01 Nov 2025

https://github.com/mradovic38/pycuda-simulated-annealing

Simulated annealing process for finding the 'minimum energy' of an image.

cuda image-energy parallel-computing parallel-programming pycuda python simulated-annealing

Last synced: 09 Oct 2025

https://github.com/skyguy126/cuda-learnings

Collection of personal CUDA learnings.

cuda

Last synced: 10 Oct 2025

https://github.com/bhavinpatel4199/image-processing-with-opencv-and-cuda-on-google-colab

This repository demonstrates image processing using OpenCV with CUDA for GPU acceleration on Google Colab. It includes basics like displaying and manipulating images, alongside advanced techniques using CUDA to enhance performance. Ideal for learning GPU-accelerated image processing in Python.

computer-vision cuda google-colab gpu-acceleration high-performance-computing image-processing opencv pixel-manupulation

Last synced: 19 Jan 2026

https://github.com/maltsev-andrey/cuda-nn-inference

GPU-accelerated neural network inference using custom CUDA kernels. Achieves 97.82% accuracy on MNIST.

cuda deep-learning gpu-programming neural-networks numba nvidia parallel-computing parallel-programming performance-optimization python3 pytorch rhel9 tesla-p100

Last synced: 07 Mar 2026

https://github.com/zcemycl/distributecompute

Parallel Computing and Distributed Computing with C++ threads, Python threads+asyncio+multiprocessing and Spark, and Cuda.

asyncio boost cpp cuda global-interpreter-lock jthread multiprocessing python spark thread

Last synced: 14 Apr 2026

https://github.com/1180779/spheresraycasting

Raycasting of spheres

cuda opengl

Last synced: 02 Mar 2025

https://github.com/ericrihm/yt-whisper

Fast, local YouTube transcription with speaker diarization and a keyboard-first Textual TUI. YouTube-subs fast path, faster-whisper on CUDA, opt-in pyannote diarization, prompt profile auto-detection.

cli cuda faster-whisper pyannote python speaker-diarization textual transcription tui whisper youtube

Last synced: 31 May 2026

https://github.com/thesoenke/deeplearning-docker

Setup for Deep Learning experiments in Docker with Cuda

cuda docker fastai jupyter

Last synced: 11 May 2026

https://github.com/isaurabhmeshram28/cuda-examples

This repository contains examples and experiments with CUDA programming to explore GPU computing and parallel processing using NVIDIA's CUDA framework.

cpp cuda

Last synced: 19 May 2026

https://github.com/zury7/parallel-programming

A collection of performance optimizations and comparisons between multiprocessing and multithreading using pthreads, OpenMP, and CUDA. The experiments analyze execution speed, resource usage, and parallelization efficiency across different computational models. (CS 4553: Scientific Computing)

cuda openmp pthreads

Last synced: 08 May 2026

https://github.com/boostibot/bachelors

My bachelor's thesis at CTU in Prague, Faculty of Nuclear Sciences and Physical Engineering, supervised by Ing. Pavel Strachota, Ph.D.

crystal-growth cuda finite-volume-method parallel-programming phase-field-method

Last synced: 26 Oct 2025

https://github.com/centuriontheman/parallelsortingalgorithms

C++/CUDA application for benchmarking sorting algorithms

benchamark cpp cuda multithreading sorting-algorithms

Last synced: 18 Feb 2026

https://github.com/anthongretter/spmv-cuda-analysis

An analysis of different approaches to Sparse Matrix-Vector Multiplication (SpMV) on GPUs using CUDA

cuda gpu matrix-computations spmv unitn

Last synced: 14 Oct 2025

https://github.com/grizzz13/minimal-cuda

Minimal configuration to set up CUDA C++ in CMake.

cmake cpp cuda

Last synced: 18 Apr 2026

https://github.com/dragonscypher/prompty

Tool for generating smart and secure prompts for language models!

autotokenizer bert-model cuda google-t5 llm python3 tensorflow threading

Last synced: 02 Jan 2026

https://github.com/sjmonson/tdr-inverse

A set of CUDA programs that invert matrices

cuda gpu matrix-inverse matrix-inversion tdr

Last synced: 14 Oct 2025

https://github.com/voduchuy/cudafsp

CUDA-based implementation of the Finite State Projection (FSP) algorithm.

chemical-master-equation cuda stochastic-reaction-networks sundials

Last synced: 20 Jan 2026

https://github.com/branebb/nn-framework

Framework for creating neural networks using C++ and CUDA platform. This project is part of my final university assignment for bachelor's degree.

cmake cpp cuda cuda-programming

Last synced: 20 Jan 2026

https://github.com/lu1smgb/ppr

Parallel Programming course, academic year 2024/2025. Universidad de Granada

cuda openmpi

Last synced: 15 Oct 2025

https://github.com/ergus/cuda-ts-mode

An Emacs CUDA mode backed by tree-sitter

cuda emacs treesitter

Last synced: 20 May 2026

https://github.com/rgryta/jetsonnano-pytorch

Repository containing built PyTorch wheels for Jetson Nano

cuda jetson python pytorch tegra wheel

Last synced: 04 May 2026

https://github.com/flolu/hardware-praktikum

Summer semester 2021 hardware lab course (Hardware Praktikum)

college cuda hardware

Last synced: 15 Oct 2025

https://github.com/matrixji/annb

Approximate Nearest Neighbor Benchmark

anns benchmarks cuda gpu

Last synced: 29 Apr 2026

https://github.com/masterskepticista/parallel_reductions_cuda

Iteratively optimizing parallel reductions in CUDA.

cuda reduce-sum reductions

Last synced: 16 Oct 2025

https://github.com/yangfengzzz/tardis

Travel space and time by using autodiff and codegen

autodiff codegen cuda

Last synced: 03 May 2026

https://github.com/puzzlef/vector-sum-cuda

Comparing performance of sequential vs CUDA-based vector element sum.

cuda element experiment gpu sum vector

Last synced: 14 Apr 2026

https://github.com/ojaswithag/opencv-doc

A comprehensive Turkish-language guide to image and video processing, machine learning, and project applications with OpenCV. 🐙 Learn with step-by-step code examples and build projects.

arm-architecture cuda cuda-support deployment django docker-image docker-images heroku image-processing javascript nodejs nvidia opencv-contrib opencv3 production python scanner tutorial

Last synced: 08 Apr 2026

https://github.com/maltsev-andrey/julia_set_cuda

High-performance Julia set fractal computation in pure CUDA C, achieving 2.78 billion pixels/second on Tesla P100. Demonstrates GPU kernel programming, memory optimization, and massive parallelization (16M+ threads).

cuda fractals gpu-programming high-performance-computing nvidia parallel-computing science visualization

Last synced: 03 Nov 2025

https://github.com/rbuj-uoc/m1.209

PAC 1, PAC 2, PAC 3, and PAC 4 for the High Performance Computing course (Computació d'altes prestacions) of the MUEI

cuda mpi openmp sge

Last synced: 21 May 2026

https://github.com/AMYPAD/miutil

Basic functionality needed for AMYPAD

cuda matlab medical-imaging python

Last synced: 10 Apr 2025

https://github.com/sakurabtc888/1000_btc_bitcoin_challenge

🔥 CPU+GPU collision tool targeting the 160 bitcoins of [privatekeys.pw]

btc cpu cuda gpu

Last synced: 21 Oct 2025

https://github.com/paranoia55/env-setup

🚀 Set up a complete, production-ready JavaScript/TypeScript development environment on macOS with AI tools using a single command.

cuda deep-learning ethical-hacking-tools ios kali-linux kali-linux-tools make makefile next-14 next-appdir next-starter open-source-project prettier radix-ui shell tensorflow-gpu typescript ubuntu

Last synced: 29 Apr 2026

https://github.com/chibby0ne/cuda_by_example

Old notes (and new ones) on the CUDA by Example book

cuda cuda-programming gpgpu gpu-computing gpu-programming

Last synced: 15 Mar 2026

https://github.com/dhakalnirajan/baghchal-rl

C/CUDA implementation of Baagh Chaal Game with Neural Network

bagh-chal baghchal c clang cuda cuda-kernels neural-network reinforcement-learning

Last synced: 14 Apr 2026

https://github.com/phrutis/bip39scan

brute bip39 mnemonic GPU - $250

bip39 brute brute-force bruteforce cuda gpu mnemonic phrases seed

Last synced: 10 Apr 2025

https://github.com/soloangema/nvidia-8z6ms

🐱 Generate randomized README files to enhance DX farming, powered by NVIDIA technology for an engaging development experience.

artificial-intelligence computer-vision cuda data-science deep-learning gpu image-processing machine-learning neural-networks nvidia parallel-computing performance-optimization pytorch tensor video-processing

Last synced: 02 May 2026

https://github.com/chad24dev/gpu-agent-opt

🧠 Optimize GPU workflows with `gpu-agent-opt`, a Python package for profiling, scientific computing, and efficient CUDA exploration.

ai-agents autotuning cuda edge-ai geospatial gpu hpc nvidia optimization performance pytorch

Last synced: 07 May 2026

https://github.com/kichappa/videosift

CUDA based 3D Computer Vision for Exoskins

computer-vision convolution-filter cuda hpc julia sift-algorithm

Last synced: 15 May 2026

https://github.com/maxenceleguery/jare

3D Render engine accelerated with CUDA

3d cuda engine raytracing

Last synced: 21 May 2026

https://github.com/sbstndb/nbody_k

A simple 3D naïve NBody simulation using Kokkos enabling CUDA or OpenMP backend

cuda kokkos nbody openmp simulation

Last synced: 21 May 2026

https://github.com/roryclear/warp-shuffle-demo

warp reduce example

cuda warp

Last synced: 17 Apr 2026

https://github.com/shermanlo77/poisson_icing

Gibbs sampling on the Poisson-Ising model. The Poisson-Ising model is a 2D image of Poisson distributed random variables but has a dependency on their four neighbours. This causes the Poisson random variables to be similar (or dissimilar) to their neighbours.

cuda cupy gibbs-sampling gpu ising-model mcmc monte-carlo poisson poisson-ising

Last synced: 21 May 2026

https://github.com/bjornmelin/ml-algorithm-playground

🧪 Core ML algorithm implementations with GPU acceleration. Featuring optimized implementations across various libraries with comprehensive analysis. 📈

algorithms cuda gpu-computing lightgbm machine-learning python scikit-learn xgboost

Last synced: 13 May 2026

https://github.com/thephiltacular/voice-ai-pipeline

A containerized AI pipeline for real-time speech-to-text and text-to-speech conversion, leveraging Whisper ASR and Coqui TTS models with Kubernetes orchestration. Features a Gradio web interface, GPU acceleration, and local processing for privacy-focused voice applications. 🚀🎤📝

ai asr containerization coqui-tts cuda docker fastapi gpu gradio kubernetes machine-learning nvidia orchestration python real-time-processing speech-recognition text-to-speech tts voice-assistant whisper

Last synced: 15 Apr 2026

https://github.com/sedflix/cuda_pattern_matching

Getting words frequency using the concepts of pattern matching in CUDA

cuda word-frequency

Last synced: 17 Mar 2026

https://github.com/matiasvlevi/cuno

Provides CUDA bindings, kernel maps, and device memory management for Dannjs computations. [Experimental and not complete]

addon cuda dann dannjs machine-learning nodejs

Last synced: 15 Apr 2026

https://github.com/mohamedsamirx/yolov12-tensorrt-cpp

YOLOv12 inference using C++, TensorRT, and CUDA

cpp cuda tensorrt tensorrt-inference yolo yolov12

Last synced: 15 Apr 2026

https://github.com/manu-sh/cuda-mandelbrot

How to use CUDA acceleration to compute the Mandelbrot set

cuda mandelbrot ppm-image

Last synced: 15 Apr 2026

https://github.com/ahmed5827/image_generation

This application provides a graphical user interface (GUI) for generating images using the Stable Diffusion model. The GUI allows users to input a text prompt, and the application generates an image based on the prompt.

ai cuda generative-ai image-generation

Last synced: 15 Apr 2026

https://github.com/lyynn777/cuda-bitonic-sort

Simple CUDA project to implement Bitonic Sort and compare it with normal CPU sorting.

bitonic-sort cuda gpu-computing gpu-vs-cpu parallel-computing performance-testing pycuda python

Last synced: 15 Apr 2026

https://github.com/tkemmer/cunessie.jl

CUDA-accelerated Nonlocal Electrostatics in Structured Solvents

bioinformatics boundary-element-method cuda electrostatics gpu-computing julia proteins

Last synced: 31 Jan 2026

https://github.com/snandasena/courseera_gpu_specilization

Example of CUDA streaming

c cpp cuda

Last synced: 15 Apr 2026

https://github.com/starlitdreams/pacman-convolutional-q-learning

This project implements a Deep Q-Network (DQN) using PyTorch to train an agent to play Atari's Ms. Pac-Man. It utilizes reinforcement learning with a convolutional neural network (CNN) for image processing. Features include experience replay, frame preprocessing, and CUDA support, with trained model saving and video rendering of gameplay.

artificial-intelligence artificial-neural-networks atari cuda deep-learning deep-learning-algorithms deep-q-learning deeplearning gymnasium gymnasium-environment python pytorch

Last synced: 15 Apr 2026

https://github.com/materight/pyav-cuda

Extension of PyAV with hardware encoding and decoding support. Compatible with PyTorch and Nvidia codecs.

cuda cuvid ffmpeg libav pytorch

Last synced: 01 Feb 2026

https://github.com/cscfi/csc-env-julia

Julia language environment including MPI.jl, CUDA.jl and AMDGPU.jl preferences for HPC clusters at CSC.

amdgpu ansible cuda hpc julia julia-language mpi

Last synced: 01 Feb 2026

https://github.com/teambipartite/bipartite-gemm

High-throughput data-parallel GEMM implementations in CUDA using CUDA cores and Tensor Cores

cuda data-parallelism gemm

Last synced: 17 Apr 2026

https://github.com/m-torhan/cuda-fractals

CUDA C++ implementation of fractal visualization

cuda

Last synced: 25 Feb 2026

https://github.com/joe-mruz/hgvisualizer

An interactive simulation and visualization tool for evolving hypergraphs, inspired by the Wolfram Physics Project.

cpp cuda hypergraph physics simulator wolfram

Last synced: 02 May 2026

https://github.com/xza85hrf/flag_prediction_project

This application predicts the name of a country (or countries) based on an input flag image. It uses advanced image processing techniques and deep learning models built with PyTorch to classify flags accurately.

cross-validation cuda data-augmentation docker efficientnetb0 flag-recognition image-classification machine-learning mixed-precision-training mobilenetv2 python pytorch resnet resnet-50 transfer-learning

Last synced: 15 Apr 2026

https://github.com/dasbd72/nthu-ipc-2022

National Tsing Hua University - Introduction to Parallel Computing - 2022

cuda cuda-programming hpc mpi openmp pthreads

Last synced: 30 Mar 2025

https://github.com/fieldcure/fieldcure-whisper-runtimes

Pre-built Whisper.net native runtime binaries (CPU/CUDA/Vulkan) for the FieldCure software ecosystem.

cuda dotnet native-binaries nuget redistributable vulkan whisper whisper-net

Last synced: 01 Jun 2026

https://github.com/baremetalrt/baremetalrt

BareMetalRT — edge GPU compute mesh

cuda distributed-computing gpu inference llm nvidia tensorrt windows

Last synced: 18 Apr 2026

https://github.com/kentakoong/mtnlog

A simple multinode performance logger for Python

cuda lanta nvitop python slurm-cluster

Last synced: 11 Jan 2026

https://github.com/muppetsg2/cudaraytracer

A custom ray tracer originally developed during university studies to run on CPU, now ported to GPU using CUDA. This project was created to explore GPU rendering techniques and to gain hands-on experience with CUDA programming.

cuda mit-license nvidia-cuda nvidia-gpu raytracing sfml stb-image student-project study-project

Last synced: 16 Apr 2026

https://github.com/equiel-1703/cuhip

Wrapper tool to convert CUDA source code to HIP code and compile it with HIPCC. Useful for learning CUDA programming using AMD devices.

cuda hip

Last synced: 14 May 2026

https://github.com/yashpotdar-py/flood-vision

Flood Vision - A deep learning–based computer vision system for flood mapping and damage assessment using aerial imagery.

cuda deep-learning flood-detection iot python

Last synced: 16 Apr 2026

https://github.com/sferez/sspp_sparse_matrix_cuda

Small Scale Parallel Programming, Sparse Matrix multiplication with CUDA

cpp cuda omp omp-parallel parallel-computing small-scale-parallel-programming sparse-matrix

Last synced: 30 Apr 2026

https://github.com/daelsepara/hipnewton

GPU Implementation of Newton Fractal Generator with Benchmarking

amd cuda fractal gpu gpu-compute gpu-computing hip newton parallel-computing rocm sdk

Last synced: 03 May 2026