An open API service indexing awesome lists of open source software.

CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.

https://github.com/ergus/cuda-ts-mode

An Emacs CUDA mode backed by tree-sitter

cuda emacs treesitter

Last synced: 20 May 2026

https://github.com/marcorentap/kokkos-docker-cluster

Deploy Docker containers with Kokkos, OpenMP, OpenMPI and CUDA as a Docker swarm.

cuda docker hpc kokkos

Last synced: 10 Mar 2025

https://github.com/jeong-j/multicore

Multithreading in Java / C / C++ / Pthreads / CUDA

c cpp cuda java multicore pthread thread

Last synced: 29 Apr 2026

https://github.com/boohohoo/shamining

Shamining is a cloud mining service that allows users to mine cryptocurrencies without the need for personal hardware. By renting computing power from eco-friendly data centers, users can mine efficiently. The platform offers an easy-to-use interface, flexible contracts, and daily payouts.

cryptocurrency cryptomining cuda gpu-mining mining mining-software open-source opencl

Last synced: 04 Jul 2025

https://github.com/neugence/acehub

AI Champions for Excellence: Fresh, informative courses and content designed to help developers, researchers, and leaders advance in the field of AI.

ai cuda cv ml mlops nlp pytorch rl rlhf tensorflow

Last synced: 05 Jan 2026

https://github.com/prdai/mnist-digit-recognition

A PyTorch-based deep learning implementation for MNIST digit recognition featuring CNNs, GPU acceleration, experiment tracking, and comprehensive testing capabilities.

cnn computer-vision cuda data-science deep-learning digit-recognition image-classification machine-learning mnist neural-networks python pytorch wandb

Last synced: 12 Apr 2026

https://github.com/yangfengzzz/tardis

Travel space and time by using autodiff and codegen

autodiff codegen cuda

Last synced: 03 May 2026

https://github.com/occisor2/fluidsimulation

Second project of my parallel algorithms course

cuda high-performance-computing

Last synced: 28 Feb 2025

https://github.com/dmmutua/cuda_projects

Implementations of a variety of algorithms and technical papers, mostly related to machine learning and deep learning, in CUDA C

c cuda cuda-programming deep-learning machine-learning machine-learning-algorithms

Last synced: 18 Apr 2026

https://github.com/enapiuz/logic-circuit-simulator

Logic circuit (based on NAND gates) simulator using OpenCL

c circuit-simulator cuda digital-logic gpgpu logic-gates opencl simulator

Last synced: 03 May 2026

https://github.com/ivanbgd/cuda_quad_c

Calculates a definite integral by using three different rules. Compares sequential to parallel implementations.

cuda integrals parallel-implementations

Last synced: 28 Mar 2025

https://github.com/boned-fruitwood759/whisperx-asr-with-fastapi

🎤 Enable real-time speech recognition with WhisperX using FastAPI for efficient, scalable audio processing.

asr ctranslate2 cuda fastapi openai python speech-recognition torch transformers whisper whisperx

Last synced: 12 Apr 2026

https://github.com/rushirg/cuda-matrix-multiplication

Matrix Multiplication on GPGPU in CUDA

cpu cuda gpu parallel-processing

Last synced: 17 May 2026

https://github.com/tomosatop/docker-lammps

I wanted an easy way to use LAMMPS, so I built a service.

cuda lammps wsl-ubuntu

Last synced: 28 Mar 2025

https://github.com/puzzlef/vector-max-cuda

Performance of sequential vs CUDA-based vector element max.

basics cuda element experiment max vector

Last synced: 17 May 2026

https://github.com/ojaswithag/opencv-doc

A comprehensive Turkish-language guide to image and video processing, machine learning, and project applications with OpenCV. 🐙 Learn with step-by-step code examples and build projects.

arm-architecture cuda cuda-support deployment django docker-image docker-images heroku image-processing javascript nodejs nvidia opencv-contrib opencv3 production python scanner tutorial

Last synced: 08 Apr 2026

https://github.com/reuben-sun/pybind-cuda-demo

An example of calling CUDA C++ interfaces from Python using pybind11

cpp cuda pybind11 python pytorch

Last synced: 07 Apr 2026

https://github.com/bikrammajhi/100-days-of-gpu

This is my 🔥 100 Days of GPU — a wild, hands-on journey through CUDA kernels, Triton spells, and PTX sorcery.

cuda nsight-compute ptx triton

Last synced: 18 Jun 2025

https://github.com/genpat-it/ohe-rs

Ultra-fast one-hot encoding for bioinformatics and ML, powered by Rust + CUDA. Built for cgMLST allele profiles and large-scale categorical data.

bioinformatics cuda machine-learning one-hot-encoding performance pyo3 python rust

Last synced: 04 Jun 2026

https://github.com/amypad/miutil

Basic functionality needed for AMYPAD

cuda matlab medical-imaging python

Last synced: 13 May 2025

https://github.com/camille-004/cusprec

🏁 Sparse signal recovery library written in PyCUDA.

cuda ml python signal-processing sparse-recovery

Last synced: 18 Jan 2026

https://github.com/liebemama/repo-fastapi

GPU-ready FastAPI AI inference server with plugin system, supporting CUDA, ROCm, CPU, and macOS MPS.

ai-server cuda fastapi gpu inference mps plugins pytorch rocm

Last synced: 05 Apr 2026

https://github.com/tchung1970/sd-cli-cuda

CUDA-accelerated Stable Diffusion plugin for wavespeed-desktop

cuda gpu linux nvidia stable-diffusion

Last synced: 09 May 2026

https://github.com/fmigneault/dockers

Collection of Docker setups with common libraries for image processing and machine learning.

boost cuda docker image-processing opencv python

Last synced: 12 Apr 2026

https://github.com/ex539/docker-dev-env

A collection of ready-to-use Docker development environments for multiple Linux distributions (Ubuntu, Debian, Alpine, Arch, Kali). Includes shared configurations, utility scripts, and comprehensive documentation for reproducible development setups across teams and CI/CD pipelines.

big-data cpp cuda docker docker-image docker-php docker-setup environment hadoop jenkins kubernetes qtcreator reproducibility x11

Last synced: 05 Apr 2026

https://github.com/miferreiro/cdap-cuda

CUDA exercises for the subject of "Computación Distribuída e de Altas Prestacións" in the Master Degree of Computer Engineering of the University of Vigo in 2020

c cuda scan

Last synced: 17 May 2026

https://github.com/neel-dandiwala/cuda-programs

Miscellaneous programs that explore the concepts of parallel computing

cuda gpu-programming parallel-programming

Last synced: 16 May 2025

https://github.com/ivanfioravanti/tflops_mps

TFLOPs testing on MPS and CUDA

cuda mps tflops

Last synced: 19 May 2026

https://github.com/nabilshadman/cuda-4-dummies

Lecture slides and exercise files of the CUDA 4 Dummies course (2025)

cuda gpu-computing high-performance-computing nsight-systems nvidia-gpu parallel-computing

Last synced: 31 Oct 2025

https://github.com/emanuelemessina/gigacheck

ABFT Matrix Multiplication of any size in CUDA

abft cuda matrix-multiplication

Last synced: 28 Feb 2025

https://github.com/sahil-rajwar-2004/vector-cuda

vector calculation with GPU acceleration using CUDA

c cpp11 cuda cuda-kernels cuda-programming nvcc

Last synced: 15 May 2025

https://github.com/karusb/2dca-cuda

2 Dimensional Cellular Automata Visualisation (Game of Life)

algorithm-flowchart cellular-automata cuda game game-of-life glut visual-studio

Last synced: 12 Apr 2026

https://github.com/himeyama/cuda-convolve

convolve + cuda + ruby (1-D only)

cuda filter gem ruby

Last synced: 19 Apr 2026

https://github.com/sagar-brahaman/imagefilterpy

Example of custom image filter for MRTech IFF Python SDK

camera cuda dng genicam gpu h264 h265 image-processing jetson json mipi rest-api rtsp tiff

Last synced: 18 Apr 2026

https://github.com/gammahazard/locate-anything

Sleek, mobile-friendly web UI for NVIDIA LocateAnything-3B — open-vocabulary object detection & grounding on your own GPU, via one docker compose up.

bounding-boxes computer-vision cuda docker fastapi gpu grounding locate-anything machine-learning nvidia object-detection ocr open-vocabulary-detection react self-hosted tailwindcss typescript vision-language-model web-ui

Last synced: 28 May 2026

https://github.com/tylerfaulkner/n-body_simulation

CUDA N-Body Gravitational Simulation with rendering in Python with MatPlotLib

cuda simulation

Last synced: 20 May 2026

https://github.com/aditiisaxena/cuda-accelerated-box-filter-for-texture-image-enhancement

Enhances grayscale texture images using a CUDA-based box filter. Built with CUDA, C++14, and OpenCV for high-performance image processing.

cpp cuda gpu-programming linux nvidia opencv

Last synced: 18 Apr 2026

https://github.com/ousscher/esi_2cs_hpc_tp

A collection of High-Performance Computing (HPC) codes showcasing parallel computing techniques. This repository includes implementations in CUDA, MPI, OpenMP, and threading ...

c cuda mpi openmp pthreads

Last synced: 18 Mar 2025

https://github.com/bjornmelin/cuda-core-projects

🎯 Essential CUDA programming patterns and optimizations. Showcasing parallel computing expertise through matrix operations, memory management, and advanced kernel implementations. 💻

cpp cuda cuda-kernels gpu-computing high-performance-computing nvidia optimization parallel-computing

Last synced: 12 Apr 2026

https://github.com/larygwil/cuda-samples-old

Old NVIDIA CUDA samples (5.0 - 7.5)

cuda nvidia

Last synced: 03 May 2026

https://github.com/tiktokfnf33/rayleigh-taylor-instability-simulation

CUDA Rayleigh-Taylor Instability Simulation. This repository features a high-performance simulation of the Rayleigh-Taylor instability using CUDA, Python, and C. Explore the implementation and results to understand fluid dynamics in a parallel computing context. 🖥️🚀

c computational-fluid-dynamics cuda euler-method finite-difference gpu-computing hpc numerical-simulation parallel-computing physics-simulation python rayleigh-taylor-instability runge-kutta

Last synced: 04 May 2026

https://github.com/alkaifaftab000/autonomous-maze-solver

Building an Autonomous Maze Solver using reinforcement learning to train agents for decision-making in dynamic grid-based environments

agent criticism cuda gymnasium-environment maze-solving-bot pytorch reinforcement-learning reward-functions

Last synced: 12 Apr 2026

https://github.com/tianzonglin/cloud-control-gui

A tool to compute, visualize, analyse and drag points (high-dimensional data)

cuda interaction-design visualization

Last synced: 25 Apr 2026

https://github.com/versi379/optimized-matrix-multiplication

This project utilizes CUDA and cuBLAS to optimize matrix multiplication, achieving up to a 5x speedup on large matrices by leveraging GPU acceleration. It also improves memory efficiency and reduces data transfer times between CPU and GPU.

cublas cuda cuda-programming hpc matrix-multiplication parallel-computing parallel-programming

Last synced: 17 May 2026

https://github.com/adesoji1/youtubesummaryai

Python script for YouTube summaries. The service should summarize a YouTube video given its URL. It should work for long videos and for different languages.

cuda googleapi python3 speech-recognition transformers youtube-api-v3 youtube-dl

Last synced: 04 Apr 2025

https://github.com/lionpsiuc/cflow

A computational model for heat propagation in a cylindrical radiator using both CPU and GPU parallel processing. The simulation uses finite difference methods to model the directional flow of heat through a cylindrical pipe system with specific boundary conditions and cyclic connections between pipe segments.

c cuda parallel-programming

Last synced: 29 May 2026

https://github.com/AndreasKaratzas/orin

Setting up the NVIDIA Jetson Orin Nano Developer Kit

cuda cudnn jetpack6 nvidia-jetson nvidia-sdkmanager orin-nano

Last synced: 25 Feb 2025

https://github.com/fikri-rouzan/cuda-c-program-part-2

CUDA C program from NVIDIA course.

c cuda

Last synced: 30 Apr 2026

https://github.com/rgryta/jetsonnano-pytorch

Repository containing built PyTorch wheels for Jetson Nano

cuda jetson python pytorch tegra wheel

Last synced: 04 May 2026

https://github.com/dougeeai/llama-cpp-python-wheels

Pre-built wheels for llama-cpp-python across platforms and CUDA versions

ampere cuda cuda13 gguf llama-cpp-python llm machine-learning prebuilt python313 rtx3060 rtx3070 rtx3080 rtx3090 wheels windows

Last synced: 18 Apr 2026

https://github.com/santiagoenriquega/gpu_projects

Various Python GPU accelerated computations and simulations.

cuda cupy numba opencl pyopencl python

Last synced: 17 May 2026

https://github.com/ergus/algorithms

Set of multiple algorithms implemented in multiple paradigms

algorithms cmake concurrency cpp cuda gpgpu inter-language metaprogramming multithreading pthreads stl testing

Last synced: 17 May 2026

https://github.com/0xhilsa/tenop

A lightweight & minimalist tensor computation library with CUDA backend

bash c cuda python3 tensor

Last synced: 13 Apr 2026

https://github.com/timvgl/cuxrft

Performs FFTs on xarray data using CUDA

cuda cupy fft python xarray

Last synced: 07 Jan 2026

https://github.com/ubermorgott/morgottalk

Cross-platform desktop push-to-talk voice transcription. Single binary. GPU accelerated (CUDA/Vulkan/Metal/ROCm/OpenCL). Powered by whisper.cpp.

cuda desktop go gpu speech-to-text svelte transcription voice wails whisper

Last synced: 07 Apr 2026

https://github.com/fulvius31/triton-cache-tracker

A lightweight utility for monitoring and analyzing Triton kernel compilation cache behavior.

cache cuda gpu gpu-kernels triton triton-openai

Last synced: 30 Apr 2026

https://github.com/intelav/gpu-agent-opt

AI Agent Framework for GPU Kernel Autotuning & Optimization. Automate CUDA kernel exploration, profiling, and tuning with AI-driven agents for deep learning, geospatial AI, and HPC workloads.

ai-agents autotuning cuda deep-l edge-ai geospatial gpu hpc nvidia optimization performance pytorch

Last synced: 19 Apr 2026

https://github.com/vicen-te/tiny-nn

A tiny neural network framework for fully-connected layers with CPU and CUDA support

backpropagation cplusplus-20 cpu cuda cuda-12-8 kernel multi-threaded neural-network nn

Last synced: 19 Apr 2026

https://github.com/mohammadshabazuddin/text_to_speech_generation_with_llm_with_hugging_face

Build a text-to-speech generation system using LLMs and Hugging Face to convert text into natural audio speech.

cuda huggingface-transformers llms nlp

Last synced: 03 May 2026

https://github.com/programmergnome/kutyai

This is a python dog breed recognizer graphical application with 420 breeds and 42000 images.

cuda deep-learning image-classification python3 qt5-gui tensorflow transfer-learning

Last synced: 11 May 2026

https://github.com/timanema/msc-thesis-public

Repository containing a GPU-accelerated compressor based on FSST

compression cpp cuda gpu thesis

Last synced: 19 Apr 2026

https://github.com/zjeffer/docker-arch-cuda

Arch Linux base image with the latest CUDA, CUDNN and LibTorch preinstalled.

archlinux cuda docker libtorch pytorch

Last synced: 19 Apr 2026

https://github.com/vladd12/libexecstd

Modern C++ library for using an execution context of computer devices

cpp cpp17 cuda gpu-acceleration gpu-computing

Last synced: 06 May 2026

https://github.com/TeamBipartite/bipartite-gemm

High-throughput data-parallel GEMM implementations in CUDA using CUDA cores and Tensor Cores

cuda data-parallelism gemm

Last synced: 14 Jan 2026

https://github.com/drilonaliu/bachelor-thesis

Parallel Programming Fractals

cuda fractals gpu parallel-programming

Last synced: 15 May 2026

https://github.com/sergeipapina/color2graycuda

NVIDIA CUDA kernel implementation of color-to-gray image conversion, compiled and linked with make or CMake

cmake cuda cuda-kernels cuda-programming link makefile nvidia

Last synced: 06 Apr 2025

https://github.com/fatlipp/toyslam

SLAM implementation from scratch w/o external graph optimization libs

cuda gpu lidar-slam mapping odometry robotics slam

Last synced: 20 Apr 2026

https://github.com/1ytic/cuda-gpu-zoo

Properties of the CUDA devices

cuda gpu

Last synced: 20 Aug 2025

https://github.com/phantom7knight/cuda-fusion

This project is for learning CUDA to understand the GPU work better.

cuda cuda-programming gpgpu gpu

Last synced: 17 May 2026

https://github.com/hrolive/data-analytics-in-the-era-of-large-scale-machine-learning

Slides and other material for the Cyprus NCC training event about "Data analytics in the era of large-scale machine learning".

cuda deep-learning gpu-acceleration gradient-boosting large-language-models machine-learning preprocessing python pytorch

Last synced: 13 Apr 2026

https://github.com/ydkn/htw-progko-cuda

Parallel processing of image transformations. Part of the "Programmierkonzepte und Algorithmen" course at HTW-Berlin.

cuda image-transformations opencv

Last synced: 20 Apr 2026

https://github.com/tameronline/repo-fastapi

GPU-Ready FastAPI AI Inference Server with plugin system (CUDA/CPU/MPS/ROCm)

ai-server cuda deep-learning fastapi inference mps nlp plugins pytorch rocm

Last synced: 20 Apr 2026

https://github.com/gaurisharan/cuda-ml-kernels

Repo for CUDA C++ GPU kernels for ML and HPC.

cpp cuda gpu hpc kernels ml parallel-computing systems-ml

Last synced: 30 Apr 2026

https://github.com/BardiFarsi/ThreadPoolManager

ThreadPoolManager is a C++ project that implements an efficient multi-threading system using a thread pool for generic functions of the same type and different tasks. It includes task management, synchronization mechanisms, and thread-safe logging to demonstrate concurrent task execution.

cpp cpp17 cpp20 cuda cuda-programming memory-management multiprocessing multithreading parallel-computing parallel-processing parallel-programming thread thread-pool thread-safety threadpool threads threadsafe

Last synced: 15 May 2025

https://github.com/uefi-code/bachelorgraduationdesign

I developed a PyTorch_For_PoorGuys framework and used it to train an LLM on an NVIDIA GeForce 2080Ti GPU as my bachelor's graduation design project

chatbot cuda gpu hacking large-language-models pytorch

Last synced: 03 May 2026

https://github.com/proafxin/cuda-docker

High-performance computing images with PyCUDA and TensorRT preinstalled

cuda docker dockerfile libcudnn nvidia-tensorrt pycuda python tensorrt

Last synced: 11 Apr 2026

https://github.com/matthewfeickert/report-urssi-fellowship-2025

Report on URSSI 2025 Early-Career Fellowship

cuda pixi urssi

Last synced: 17 Jan 2026

https://github.com/rtfirst/voice-to-text

Cross-platform Push-to-Talk speech-to-text — local Whisper transcription (CUDA/MPS) with optional Anthropic API correction and live VU meter overlay. Windows 11 + macOS.

cuda macos push-to-talk python speech-to-text voice-input whisper windows

Last synced: 04 Jun 2026

https://github.com/amirbroker/cupydtw

Use CUDA for dynamic time warping

cuda dtw dynamic-time-warping python

Last synced: 20 Apr 2026

https://github.com/ray-chew/modified_ch

Density functional theory (DFT) and self-consistent field theory (SCFT) simulation of diblock copolymers

cuda density-functional-theory diblock-copolymer numerical-analysis numerical-methods self-consistent-field-theory

Last synced: 11 May 2026

https://github.com/hr-fahim/transformer-model-optimization

Sample GPT Transformer Model from Scratch.

cuda few-shot-learning transfomers

Last synced: 02 May 2026