An open API service indexing awesome lists of open source software.

CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.

https://github.com/1ytic/cuda-gpu-zoo

Properties of the CUDA devices

cuda gpu

Last synced: 20 Aug 2025

https://github.com/sahil-rajwar-2004/vector-cuda

vector calculation with GPU acceleration using CUDA

c cpp11 cuda cuda-kernels cuda-programming nvcc

Last synced: 15 May 2025

https://github.com/neel-dandiwala/cuda-programs

Miscellaneous programs for grasping the concepts of parallel computing

cuda gpu-programming parallel-programming

Last synced: 16 May 2025

https://github.com/tchung1970/sd-cli-cuda

CUDA-accelerated Stable Diffusion plugin for wavespeed-desktop

cuda gpu linux nvidia stable-diffusion

Last synced: 09 May 2026

https://github.com/bikrammajhi/100-days-of-gpu

This is my 🔥 100 Days of GPU — a wild, hands-on journey through CUDA kernels, Triton spells, and PTX sorcery.

cuda nsight-compute ptx triton

Last synced: 18 Jun 2025

https://github.com/enapiuz/logic-circuit-simulator

Logic circuit (based on NAND gates) simulator using OpenCL

c circuit-simulator cuda digital-logic gpgpu logic-gates opencl simulator

Last synced: 03 May 2026

https://github.com/kar-dim/CAS-2D

Implementation of the AMD FidelityFX CAS (Contrast Adaptive Sharpening) algorithm on CUDA, for sharpening static images.

cpp cuda dll fidelityfx gpu image-processing parallel-computing sharpen

Last synced: 01 Nov 2025

https://github.com/mohammadshabazuddin/text_to_speech_generation_with_llm_with_hugging_face

Build a text-to-speech generation system using LLMs and Hugging Face to convert text into natural audio speech.

cuda huggingface-transformers llms nlp

Last synced: 03 May 2026

https://github.com/sid911/neuralnetworkcpp

A small experiment to learn about neural networks and their runtimes in cpp

cpp cuda machine-learning neural-network

Last synced: 20 Aug 2025

https://github.com/lk/gpu-nbody

GPU-accelerated n-body engine for t-SNE and physics simulation

cuda gpu n-body n-body-simulator

Last synced: 02 Sep 2025

https://github.com/neugence/acehub

AI Champions for Excellence: Fresh, informative courses and content designed to help developers, researchers, and leaders advance in the field of AI.

ai cuda cv ml mlops nlp pytorch rl rlhf tensorflow

Last synced: 05 Jan 2026

https://github.com/phantom7knight/cuda-fusion

This project is for learning CUDA to understand the GPU work better.

cuda cuda-programming gpgpu gpu

Last synced: 17 May 2026

https://github.com/9prady9/archdock

Arch Linux Docker image for app development

arch-linux arrayfire cuda docker-image forge opencl

Last synced: 03 May 2026

https://github.com/h4ck3r-04/fpassword

Fpassword merges Hashcat's hash-cracking precision with Hydra's parallelized network login, offering penetration testers a powerful tool for swift hash deciphering and simultaneous login attempts across diverse protocols.

brute-force brute-force-attacks c cracking cuda gpgpu hashcat hashes hydra network-security opencl password penetration-testing

Last synced: 16 Jan 2026

https://github.com/chensongpoixs/cmedia_transcode

GPU (CUDA) transcoding version of a media service; supports H.264 and H.265 transcoding

cuda gpu h264 h265 media transcode-media

Last synced: 19 May 2026

https://github.com/chiragajain/gpu-optimization-roadmap

This repository is part of a structured curriculum designed to master GPU optimization, Triton, Deep Learning, and LLMs. This section focuses on GPU fundamentals, CUDA programming, and PyTorch optimizations.

cuda deeplearning gpu-acceleration learning python pytorch triton

Last synced: 18 Feb 2026

https://github.com/pvgupta24/parallel-programming

Basic algorithms for parallel programming in CUDA C++, Java and OpenMP

cuda openmp parallel-programming

Last synced: 19 Aug 2025

https://github.com/lucatedeschini/feedforwardnn

This project is my submission for the exam "Project Work in Architecture and Platform for Artificial Intelligence"

c cuda neural-networks openmp scratch-implementation

Last synced: 20 Apr 2026

https://github.com/akshaysinhaaa/emova

A deep learning framework designed for emotion and sentiment recognition using text, audio, and video modalities. This project leverages the MELD (Multimodal EmotionLines Dataset) to train a robust and flexible model that reflects human communication more accurately than unimodal models.

bert cnn cuda deep-learning multimodal python pytorch resnet-18 tensorboard transformers

Last synced: 05 May 2026

https://github.com/kratugautam99/logiclink-project

LogicLink is a conversational AI chatbot developed by Kratu Gautam (AIML Engineer). Powered by the TinyLlama-1.1B-Chat-v1.0 model, it provides an interactive interface for engaging conversations, query resolution, and task assistance. Version 5 features streaming responses, conversation management, and a sleek GUI.

antd-design chatbot-application conversational-ai cuda gradio graphical-user-interface huggingface-spaces huggingface-transformers jupyter-notebooks keras large-language-models mlops model-service-controller modelscope-studio natural-language-generation natural-language-processing pytorch reasoning-agent tensorflow

Last synced: 07 Apr 2026

https://github.com/cmazakas/cuda-stuff

A CUDA-based playground

cmake cuda delaunay-triangulation vscode

Last synced: 24 Mar 2025

https://github.com/dmalexx/cuda_check

How to check whether CUDA is available in TensorFlow

cuda python tensorflow

Last synced: 10 Apr 2026

https://github.com/Parxd/cuda-optim

various CUDA kernels optimized for specific ML algos

cuda machine-learning

Last synced: 02 Sep 2025

https://github.com/mattjesc/federated-learning-simulation-1gpu-mi-is

Federated Learning Simulation on a Single GPU with Model Interpretability and Interactive Visualization

ai cuda deep-learning distributed-systems federated-learning gpu hpc keras machine-learning ml model-interpretability python pytorch simulation streamlit tensorflow

Last synced: 05 Jan 2026

https://github.com/rmeli/cuda-pg

CUDA C++ Playground

cpp cuda gpu

Last synced: 16 Apr 2026

https://github.com/himeyama/cuda-convolve

convolve + cuda + ruby (1-D only)

cuda filter gem ruby

Last synced: 19 Apr 2026

https://github.com/minseoc03/cuda-100-days

A 100-day journey to master CUDA programming, inspired by the CUDA-120-DAYS--CHALLENGE project. This repo contains daily CUDA exercises and code folders, with learning notes hosted on Notion. Practicing on leetgpu.com due to lack of local NVIDIA GPU.

100daysofcode cuda deeplearning gpgpu gpu hpc nvidia parallel-computing

Last synced: 19 Apr 2025

https://github.com/moesio-f/cla

C Linear Algebra (CLA) library. A simple toy library for basic vector/matrix operations with CUDA support and Python bindings.

c cuda linear-algebra python

Last synced: 09 May 2026

https://github.com/ndgigliotti/torch-ipca

GPU-accelerated Incremental PCA for PyTorch

cuda dimensionality-reduction gpu incremental-pca machine-learning pca pytorch

Last synced: 26 Jan 2026

https://github.com/ionmich/cs149-local-dev

Provides `conda` installation instructions for Stanford's CS149 (Parallel Computing) programming assignments

conda cs149 cuda ispc parallel-computing

Last synced: 31 Mar 2025

https://github.com/zyn10/cuda_code

CUDA practice

cuda cuda-programming

Last synced: 22 Jun 2025

https://github.com/kar-dim/cas-2d

Implementation of the AMD FidelityFX CAS (Contrast Adaptive Sharpening) algorithm on CUDA/OpenCL, for sharpening static images.

cpp cuda dll fidelityfx gpu image-processing parallel-computing sharpen

Last synced: 22 Jun 2025

https://github.com/kanchishimono/python-images

Ubuntu based Python container images, including CUDA images

container-image cuda docker dockerfile machine-learning python python3

Last synced: 30 Apr 2026

https://github.com/rkarahul/person-detector-faceverifier

Person-Detector-FaceVerifier is a sophisticated system for detecting and verifying faces in images. Ideal for applications like passport control and security, it combines advanced face detection with precise verification techniques.

bootstrap5 css3 cuda django html5 javascipt opencv-python os python pytorch yolov8

Last synced: 07 Apr 2026

https://github.com/muneeb706/cuda

sample programs implemented using cuda (gpu)

cplusplus cuda gpu-programming

Last synced: 19 May 2026

https://github.com/cuda8/brainwords2

GPU brainflayer for sale $250

brain brainflayer brainwords cuda gpu key pass passphrase private

Last synced: 10 Mar 2025

https://github.com/shtrophic/wicuvanity

Generate wireguard vanity keys on your Nvidia GPU

cuda gpu vanity-address vanity-addresses vanitygen wireguard

Last synced: 10 Mar 2025

https://github.com/Neuro-Mechatronics-Interfaces/python-intan

Tools and demos for working with EMG data from intan using python

circuitpython cuda emg pico python realtime tensorflow

Last synced: 13 Jan 2026

https://github.com/ojeda-e/fokker-planck

Numerical solution of the Fokker-Planck equation at large times using CUDA/C.

cuda fokker-planck-equations

Last synced: 17 Aug 2025

https://github.com/kataglyphis/machinelearningalgorithms

Basic Machine Learning Algorithms

cuda machine-learning python tensorflow

Last synced: 31 Mar 2025

https://github.com/codename-detective/cuda_gpgpus_shared_memory_systems_pdp

CUDA GPGPUs Shared Memory Systems Parallel & Distributed Programming

cuda cuda-programming numa parallel-programming

Last synced: 30 Mar 2025

https://github.com/voltr0x/raytracing-cuda

Raytracing in a weekend using CUDA

cpp11 cuda raytracing sdl2

Last synced: 01 Apr 2026

https://github.com/alessiobugetti/integral-image-processing

Implements sequential and parallel integral image computation in C++ and Python, utilizing CUDA for parallel computation on GPU

cuda gpu-acceleration integral-image numba parallel-computing pycuda

Last synced: 24 May 2026

https://github.com/i-m-iron-man/abmax

Abmax is an agent-based modelling framework in Jax, focused on dynamic population size

abm agent agent-based agent-based-modeling agent-based-simulation agents cuda jax python

Last synced: 04 Oct 2025

https://github.com/ribin-baby/cuda_cudnn_installation_on_ubuntu20.04

Installation of CUDA 11.8 with cuDNN 8.7 on an Ubuntu 20.04 server with an A30 GPU, plus an ONNX GPU installation guide

cuda gpu linux onnxruntime server

Last synced: 16 May 2026

https://github.com/patriciobcs/mini-aevol

Parallel implementation of a reduced version of the Aevol simulator

aevol cuda simulation

Last synced: 19 May 2026

https://github.com/kronbii/thermal-super-resolution

State-of-the-art thermal super-resolution system (IMDN) with RGB→thermal adaptation, custom multi-component loss, 29.6 dB PSNR, 0.713 SSIM, 250+ FPS, production-ready PyTorch + CUDA implementation.

computer-vision cuda deep-learning image-enhancement imdn model-optimization production-machine-learning pytorch real-time real-time-processing research super-resolution thermal-imaging

Last synced: 18 Apr 2026

https://github.com/asadiahmad/100_sports_image_classification

A deep learning project for sport image classification using a custom VGG19-based architecture with integrated Grad-CAM heatmap visualization for model interpretability.

computer-vision cuda data-augmentation deep-learning explainable-ai gpu-acceleration grad-cam heatmap-visualization image-classification mixed-precision-training pytorch pytorch-grad-cam sports-analytics sports-classification transfer-learning vgg19

Last synced: 11 Jun 2025

https://github.com/andreeo/parallel-computing-cuda

Terminal programs applying the parallel programming model with the CUDA architecture

c cpp cuda docker lineal-search parallel-computing parallel-reduction rank-sort-algorithm

Last synced: 09 Apr 2026

https://github.com/ysl1016/cudadigitfilter

CUDA-based parallel image filtering system for MNIST dataset

computer-vision cuda deep-learning gpu-acceleration image-processing mnist parallel-computing

Last synced: 28 Mar 2025

https://github.com/githubfoam/cuda-travisci

cuda miniconda pytorch

cuda miniconda pytroch

Last synced: 30 Mar 2025

https://github.com/hshindo/libcuda.jl

CUDA GPU array for Julia

cuda gpu julia

Last synced: 16 May 2026

https://github.com/flosmume/cpp-cuda-deepvision-rtx-starter

CUDA C++ practice project for RTX 4070 SUPER — explore GPU concurrency, pinned memory, and Nsight profiling. Includes SAXPY and 2D blur kernels to train optimization, stream overlap, and timing analysis for NVIDIA Developer Technology Engineering skillset.

cpp cuda cuda-kernels cuda-streams deep-learning-inference gpu gpu-optimization gpu-profiling high-performance-computing nsight nvidia parrallel-computing pinned-memory

Last synced: 16 May 2026

https://github.com/nwpu66/cookiekiss-engine

CookieKiss Engine includes a renderer and other small techniques related to computer graphics.

compute-graphics cpp cuda opengl vulkan

Last synced: 09 Apr 2026

https://github.com/ahmadrafidev/learn-cuda

A place where I learn about CUDA

cuda cuda-programming gpu os parallel-programming

Last synced: 13 Apr 2025

https://github.com/drilonaliu/parallel-fractal-tree

GPU-accelerated fractal tree generation with CUDA and OpenGL interoperability.

cuda fractal-tree fractals gpu

Last synced: 19 May 2026

https://github.com/ibrar-syed/complete_deep-learning-nvidia_gpu-setup-linux

Full setup for a deep learning environment on Ubuntu Linux with CUDA, cuDNN, TensorRT, and TensorFlow GPU. Includes scripts, test code, and environment configuration

ai bash conda cuda cudnn deep-learning environment-setup gcc gpu jupyter linux machine-learning nvidia-cuda nvidia-gpu pytorch setup-script tensorflow tensorrt

Last synced: 09 Apr 2026

https://github.com/tomtolleson/cuda-kernel-benchmarking-tool

A benchmarking tool in C++ that creates CUDA kernels and compares overall system performance between CPU and GPU

cuda cuda-kernels cuda-support cuda-toolkit nvidia nvidia-cuda nvidia-gpu

Last synced: 30 Mar 2025

https://github.com/timdev-r/cv-ground-truth-extraction

(Dump) Helper for ground truth extraction, movement analytics and silhouette visual demonstration

computer-vision cuda ground-truth intel-realsense pandas python

Last synced: 18 Apr 2026

https://github.com/grindelfp/cuda-n-body-simulation

Simulation of N-Body movement using CUDA.

cuda n-body-simulation

Last synced: 06 Apr 2025

https://github.com/drbh/quemer

GPU accelerated k-mer counter

biology cuda gpu

Last synced: 07 May 2025

https://github.com/sephiroth7712/k-nearest-neigbours

Implementation of K-Nearest Neighbors algorithm using multiple parallel computing approaches: CUDA (GPU), Hadoop, Spark, MPI, OpenMP, and PThreads. Demonstrates scalable machine learning across different parallel computing paradigms from GPU to distributed frameworks.

cuda cuda-programming hadoop-mapreduce java mpi multiprocessing multithreading openmp pthreads scala spark

Last synced: 12 Apr 2026

https://github.com/datasagess/fic

NLP Hackathon w/ NN + FastAPI + Docker

catboost cuda docker fastapi lstm python pytorch rapidfuzz tensorflow

Last synced: 08 Aug 2025

https://github.com/notargets/gocca

Go bindings for OCCA - Portable parallel programming framework

bindings cfd cgo cuda golang gpu hpc occa opencl parallel-computing

Last synced: 20 Jan 2026

https://github.com/dmitryyurov/bitonic-cuda

An implementation of bitonic sort on CUDA

cuda gpu-programming sorting-algorithms

Last synced: 02 Oct 2025

https://github.com/TeamBipartite/bipartite-gemm

High-throughput data-parallel GEMM implementations in CUDA using CUDA cores and Tensor Cores

cuda data-parallelism gemm

Last synced: 14 Jan 2026

https://github.com/aeyage/intraday_prices

GPU-accelerated portfolio optimisation

cuda cupy nvidia-gpu

Last synced: 05 Apr 2025

https://github.com/drilonaliu/parallel-mandelbrot-set

GPU-accelerated Mandelbrot Set generation with CUDA and OpenGL interoperability.

cuda fractals gpu mandelbrot-fractal parallel-programming

Last synced: 12 Apr 2026

https://github.com/farukalamai/cpp-for-cuda

A structured C++ learning path designed specifically for developers preparing to learn CUDA programming.

cpp cuda gpu nvidia

Last synced: 09 Jun 2026

https://github.com/conan-kiln/kiln

An actively maintained fork of ConanCenter with an emphasis on CV, ML and robotics capabilities on edge devices

computer-vision conan cuda machine-learning oneapi packaging robotics rust scientific-computing

Last synced: 02 Oct 2025

https://github.com/aurelienperez/gpu-heston-monte-carlo

GPU-accelerated Monte Carlo simulation for option pricing under the Heston model using CUDA.

cuda gpu heston-model

Last synced: 01 Apr 2025

https://github.com/nikhilrout/thetensorcoreproject

Microarchitecture implementation of Nvidia's Tensor Cores

cuda floating-point gpgpu hybrid-precision-training tensorcore

Last synced: 01 Apr 2025

https://github.com/akira4o4/cuda-program

CUDA YOLO Processing

cuda yolo

Last synced: 22 Jul 2025

https://github.com/brave-tarnished/gpu-accelerated-opc

Optical Proximity Correction (OPC) is a photolithography technique that modifies photomask geometry to counteract diffraction and process effects, ensuring accurate printing of patterns on the wafer. This work demonstrates a proof of concept showing how using a GPU-based approach can significantly speed up these modifications compared to a CPU.

cpp cuda gpu-acceleration photolithography semiconductors

Last synced: 02 Oct 2025

https://github.com/jaderock/cuda-by-example

Sample CUDA projects for the CUDA by Example book

bazel c cpp cuda gpu

Last synced: 05 May 2026

https://github.com/morristai/kvik-rs

KvikIO Rust implementation

cuda cufile gds kvikio nvidia rust

Last synced: 02 Apr 2026

https://github.com/yutakseo/docker_ubuntu-cuda_environment

🐳 A ready-to-use Docker environment for deep learning development with Ubuntu 22.04 and CUDA 11.8.

container cuda docker environment ubuntu

Last synced: 12 Apr 2026

https://github.com/sankeer28/pptx-text-audio-transcriber

Extract text and transcribe audio from PowerPoint presentations using OpenAI Whisper.

audio-transcription cuda openai-whisper powerpoint pptx-parser

Last synced: 02 Oct 2025

https://github.com/isquicha/cuda-parallel-studies

Learning CUDA programming here =D

cuda cuda-programming cuda-toolkit

Last synced: 03 Jul 2025

https://github.com/desmondjs/cuda_mceliece_kem

CUDA-Accelerated McEliece KEM 🔑 | Post-Quantum Cryptography on GPU. Implementation of Classic McEliece key encapsulation, encryption, decryption, and decapsulation on CPU & GPU with CUDA, including benchmarking scripts and the full FYP2 report.

academic-project benchmarking classic-mceliece cuda fyp gpu-acceleration kem pqc

Last synced: 02 Oct 2025

https://github.com/nvaranki/cmmx

CUDA matrix multiplication (official guide, modified)

cuda cuda-kernels

Last synced: 08 Aug 2025

https://github.com/fikri-rouzan/cuda-c-program-part-1

CUDA C program from NVIDIA course.

c cuda

Last synced: 12 Apr 2026

https://github.com/lixk28/knn-cuda

cuda knn

Last synced: 01 Apr 2025

https://github.com/alpinebuster/meshlib

Mesh processing library with extra `C/C#/JS/TS/PYTHON` bindings.

cuda dicom electron emscripten mesh mesh-modelling pybind11 stl stomatology threejs wasm

Last synced: 03 Jul 2025

https://github.com/ne0nwinds/gpupuzzles

My solutions to srush/GPU-Puzzles using CUDA

cpp cuda gpgpu

Last synced: 16 May 2026

https://github.com/ivanfioravanti/tflops_mps

TFLOPs testing on MPS and CUDA

cuda mps tflops

Last synced: 19 May 2026

https://github.com/pipecruz/cuda-flocking-sim

CPU and GPU (CUDA) implementations of naive/optimized flocking algorithms

cuda

Last synced: 07 May 2026