An open API service indexing awesome lists of open source software.

CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphical processing units (GPUs). With CUDA, developers are able to dramatically speed up computing applications by harnessing the power of GPUs.

https://github.com/gama1903/cuda_programming

Practice of cuda programming

cuda parallel-computing

Last synced: 01 Nov 2025

https://github.com/maltsev-andrey/cuda-nn-inference

GPU-accelerated neural network inference using custom CUDA kernels. Achieves 97.82% accuracy on MNIST.

cuda deep-learning gpu-programming neural-networks numba nvidia parallel-computing parallel-programming performance-optimization python3 pytorch rhel9 tesla-p100

Last synced: 07 Mar 2026

https://github.com/voduchuy/cudafsp

CUDA-based implementation of the Finite State Projection (FSP) algorithm.

chemical-master-equation cuda stochastic-reaction-networks sundials

Last synced: 20 Jan 2026

https://github.com/branebb/nn-framework

Framework for creating neural networks using C++ and the CUDA platform. This project is part of my final university assignment for my bachelor's degree.

cmake cpp cuda cuda-programming

Last synced: 20 Jan 2026

https://github.com/ergus/cuda-ts-mode

An Emacs CUDA mode powered by tree-sitter

cuda emacs treesitter

Last synced: 20 May 2026

https://github.com/yangfengzzz/tardis

Travel space and time by using autodiff and codegen

autodiff codegen cuda

Last synced: 03 May 2026

https://github.com/ojaswithag/opencv-doc

A comprehensive Turkish-language guide to image and video processing, machine learning, and project applications with OpenCV. 🐙 Learn with step-by-step code examples and build projects.

arm-architecture cuda cuda-support deployment django docker-image docker-images heroku image-processing javascript nodejs nvidia opencv-contrib opencv3 production python scanner tutorial

Last synced: 08 Apr 2026

https://github.com/lanceberge/cuda-newton-fractals

Parallelize and visualize the Newton Iteration

cpp cuda mathematical-modelling visualization

Last synced: 16 May 2026

https://github.com/tylerfaulkner/n-body_simulation

CUDA N-body gravitational simulation with rendering in Python with Matplotlib

cuda simulation

Last synced: 20 May 2026

https://github.com/larygwil/cuda-samples-old

Old NVIDIA CUDA samples (5.0 - 7.5)

cuda nvidia

Last synced: 03 May 2026

https://github.com/alkaifaftab000/autonomous-maze-solver

Building an Autonomous Maze Solver using reinforcement learning to train agents for decision-making in dynamic grid-based environments

agent criticism cuda gymnasium-environment maze-solving-bot pytorch reinforcement-learning reward-functions

Last synced: 12 Apr 2026

https://github.com/adesoji1/youtubesummaryai

Python script for YouTube summaries. The service summarizes a YouTube video given its URL. It should work for long videos and for different languages.

cuda googleapi python3 speech-recognition transformers youtube-api-v3 youtube-dl

Last synced: 04 Apr 2025

https://github.com/AndreasKaratzas/orin

Setting up the NVIDIA Jetson Orin Nano Developer Kit

cuda cudnn jetpack6 nvidia-jetson nvidia-sdkmanager orin-nano

Last synced: 25 Feb 2025

https://github.com/voschezang/holographic-projector-simulations

Optimizations of Simulations of Holographic Projectors using CUDA

cuda gpu holography parallel-computing photonics

Last synced: 16 May 2026

https://github.com/sergeipapina/color2graycuda

NVIDIA CUDA kernel implementation of color-to-grayscale image conversion, compiled and linked with make or CMake

cmake cuda cuda-kernels cuda-programming link makefile nvidia

Last synced: 06 Apr 2025

https://github.com/uefi-code/bachelorgraduationdesign

I developed a PyTorch_For_PoorGuys framework and used it to train an LLM on an NVIDIA GeForce 2080Ti GPU as my bachelor's graduation design project

chatbot cuda gpu hacking large-language-models pytorch

Last synced: 03 May 2026

https://github.com/belrbez/ship-graphic-qt-qml-cuda-c

Client-Server application for Rocket driving in QML graphics

c client-server cpp cuda qml qt5 rocket

Last synced: 08 Apr 2026

https://github.com/td99/ai-sandbox

A collection of AI tools and prototypes.

ai cuda docker image-generation-ai nvidia python

Last synced: 08 Apr 2026

https://github.com/cs550-epfl/review

Review of the paper A Formal Analysis of the NVIDIA PTX Memory Consistency Model

cuda formal-verification gpu memory-consistency ptx simt

Last synced: 30 Mar 2025

https://github.com/fedesky25/hpc-project-2024

Project for the 2024 HPC course: a generator of streamplots of complex-valued functions

complex-numbers cuda openmp

Last synced: 30 Mar 2025

https://github.com/anne-andresen/autoencoder_3d_c_cuda

3D Autoencoder training in raw C/CUDA

3d autoencoder c cuda nifti

Last synced: 28 Apr 2026

https://github.com/daelsepara/hipnewton

GPU Implementation of Newton Fractal Generator with Benchmarking

amd cuda fractal gpu gpu-compute gpu-computing hip newton parallel-computing rocm sdk

Last synced: 03 May 2026

https://github.com/dasbd72/nthu-ipc-2022

National Tsing Hua University - Introduction to Parallel Computing - 2022

cuda cuda-programming hpc mpi openmp pthreads

Last synced: 30 Mar 2025

https://github.com/bjornmelin/ml-algorithm-playground

🧪 Core ML algorithm implementations with GPU acceleration. Featuring optimized implementations across various libraries with comprehensive analysis. 📈

algorithms cuda gpu-computing lightgbm machine-learning python scikit-learn xgboost

Last synced: 13 May 2026

https://github.com/shermanlo77/poisson_icing

Gibbs sampling on the Poisson-Ising model. The Poisson-Ising model is a 2D image of Poisson distributed random variables but has a dependency on their four neighbours. This causes the Poisson random variables to be similar (or dissimilar) to their neighbours.

cuda cupy gibbs-sampling gpu ising-model mcmc monte-carlo poisson poisson-ising

Last synced: 21 May 2026

https://github.com/sbstndb/nbody_k

A simple 3D naïve NBody simulation using Kokkos enabling CUDA or OpenMP backend

cuda kokkos nbody openmp simulation

Last synced: 21 May 2026

https://github.com/maxenceleguery/jare

3D Render engine accelerated with CUDA

3d cuda engine raytracing

Last synced: 21 May 2026

https://github.com/rbuj-uoc/m1.209

PAC 1, PAC 2, PAC 3, and PAC 4 for the High-Performance Computing course (Computació d'altes prestacions) of the MUEI

cuda mpi openmp sge

Last synced: 21 May 2026

https://github.com/maltsev-andrey/julia_set_cuda

High-performance Julia set fractal computation in pure CUDA C, achieving 2.78 billion pixels/second on Tesla P100. Demonstrates GPU kernel programming, memory optimization, and massive parallelization (16M+ threads).

cuda fractals gpu-programming high-performance-computing nvidia parallel-computing science visualization

Last synced: 03 Nov 2025

https://github.com/shermanlo77/oxwasp_phd

Code for the PhD thesis. The topic was on defect detection of 3D printing using x-rays. The repository includes an implementation of the mode filter and empirical null filter.

3d-printing applied-statistics computational-statistics cuda empirical-null imagej mode-filter statistics xray-projection

Last synced: 27 Mar 2025

https://github.com/bergolho/sycl

Repository with simple programs to learn SYCL.

cpp cuda sycl

Last synced: 16 May 2026

https://github.com/dragonscypher/prompty

Tool for generating smart and secure prompts for language models!

autotokenizer bert-model cuda google-t5 llm python3 tensorflow threading

Last synced: 02 Jan 2026

https://github.com/thesoenke/deeplearning-docker

Setup for Deep Learning experiments in Docker with Cuda

cuda docker fastai jupyter

Last synced: 11 May 2026

https://github.com/grindelfp/cuda-texture-memory

Exercise on using texture memory in CUDA.

cuda texture-memory

Last synced: 30 Mar 2025

https://github.com/thesupercd/cuda_sort

A simple project implementing and measuring the runtime performance metrics related to massively parallel algorithms (radix sort) on an NVIDIA GPU device.

benchmarking c cpp cuda cuda-programming gpu-acceleration gpu-programming multithreading parallel-processing radix-sort sorting-algorithms

Last synced: 10 May 2026

https://github.com/fanziyang-v/parallel-computing

Parallel Computing course materials from Harbin Institute of Technology (Shenzhen).

cuda openmp openmpi parallel-computing

Last synced: 27 Mar 2025

https://github.com/curiousci/wind

Multicore Systems Programming project

cuda mpi openmp pthreads

Last synced: 25 Dec 2025

https://github.com/tzervas/unsloth-rs

Memory-optimized GPU kernels for LLM fine-tuning in Rust (2-5x speedup, 70-80% less VRAM)

cuda gpu machine-learning optimization rust

Last synced: 25 Jan 2026

https://github.com/awikramanayake/optimized-matrix-mult

Optimizing matrix multiplication using parallelism and SIMD (AVX2, CUDA)

avx2 cuda matrix-multiplication

Last synced: 22 May 2026

https://github.com/lttofu/cosmic

Fast, lightweight GUI-based C++ Ethereum ERC918 token miner for Win64 | CUDA GPUs | CPUs | Pool | Solo Mining

0xbitcoin 0xbtc cplusplus cplusplus-cli cpuminer cuda erc20 erc918 ethereum ethereum-token gpuminer gui pool-mining solo-mining windows windows-10 windows-7 windows-gui winforms

Last synced: 08 Apr 2026

https://github.com/drc0ns0le/rtxvideoprocessor

CLI tool to apply NVIDIA RTX VSR and TrueHDR processing to video files

cuda ffmpeg hdr nvidia rtx upscale

Last synced: 20 Apr 2026

https://github.com/illagrenan/cuda-80-cudnn6-runtime-1604-py36

Ubuntu 16.04 with Python 3.6 and CUDA Dockerfile

cuda dockerfile ubuntu

Last synced: 22 Jun 2025

https://github.com/danieljvickers/fluid_simulation

An educational example for learning the Navier-Stokes equations. Also included is a C++ and CUDA shared object library, buildable with CMake, for use in your personal projects.

cpp cuda differential-equations navier-stokes numpy physics python simulation

Last synced: 04 May 2026

https://github.com/shineiarakawa/particle-stabilizer

A C++ and CUDA-based program for simulating the motion of particles.

cpp cuda n-body particles

Last synced: 12 May 2026

https://github.com/faresargus/artaxerxes

Adaptive high-performance stress tester "artaxerxes" supports GPU, io_uring, DPDK, and eBPF/XDP for advanced cybersecurity labs. Ideal for network testing. 🚀🛠️

cuda cuda-programming cybersecurity cybersecurity-education cybersecurity-tools dpdk ebpf educational github-config high-performance network-security network-security-tool penetration-testing penetration-testing-framework penetration-testing-tools stress-testing

Last synced: 24 Jul 2025

https://github.com/malolm/football-player-detection-with-yolov8

Football player detection YOLOv8 fine-tuning

cuda jupyterlab python3 yolov8-detection

Last synced: 07 May 2026

https://github.com/macaycz/nn

A lightweight, GPU-accelerated machine learning library built with CUDA.

cuda deep-learning gpu machine-learning neural-network

Last synced: 25 Jul 2025

https://github.com/luis-kr/depthmap

Depth map estimation tool using Depth-Anything-V2. Generate accurate depth maps from images with support for both relative and metric depth measurements.

cuda depth-anything depth-estimation depth-map image-processing python pytorch

Last synced: 08 Feb 2026

https://github.com/shambac/shamboflow

Fierce tensorflow competitor

cuda cupy machine-learning numpy pypi-package

Last synced: 19 Feb 2026

https://github.com/jarmak-personal/vibespatial

GPU-first spatial analytics for Python. Drop-in GeoPandas replacement powered by runtime-compiled CUDA kernels

cccl cuda geodataframe geopandas geospatial gpu gpu-computing nvrtc python spatial-analytics

Last synced: 21 Apr 2026

https://github.com/9prady9/imageconvolve

Qt app for previewing Image convolution. Uses CUDA for convolution.

c-plus-plus convolution cuda desktop-app qt

Last synced: 03 May 2026

https://github.com/tornikeo/sample-openmp-in-cuda

Sample of using OpenMP and CUDA: single GPU, multiple CPU

cuda meson openmp

Last synced: 01 Aug 2025

https://github.com/empenoso/doorcam-face-report

An example face recognition project with CUDA acceleration. Includes scripts for automatically building dlib and analyzing video on the GPU

cuda dlib dlib-face-detection

Last synced: 19 May 2026

https://github.com/cerit-sc/scipion-docker

Scipion, a cryo-EM image processing framework (https://scipion.i2pc.es/), adapted to run in Kubernetes.

cryo-em cryoem cuda desktop kubernetes scipion vnc

Last synced: 02 Aug 2025

https://github.com/f-koehler/itesol

WIP: Iterative eigensolvers for C++20, Python and CUDA

cpp20 cuda eigenvalues linear-algebra python

Last synced: 08 Nov 2025

https://github.com/oaslananka/cv_cuda_cpp_sample

This is a sample project demonstrating how to use OpenCV and CUDA in C++ for detecting people in drone footage with YOLO. The project aims to be simple and understandable for those who want to learn how to use OpenCV and CUDA in C++.

computervision cpp cuda opencv

Last synced: 01 May 2026

https://github.com/sergiomarquezdev/yt-transcriber

🛠️ CLI tool to transcribe YouTube videos using OpenAI Whisper with CUDA acceleration, generate AI summaries (EN/ES) with Gemini, and create LinkedIn/Twitter content. Supports YouTube, Google Drive, and local files.

ai cli cuda gemini python transcription whisper youtube

Last synced: 15 May 2026

https://github.com/nvaranki/cmmx

CUDA matrix multiplication (official guide, modified)

cuda cuda-kernels

Last synced: 08 Aug 2025

https://github.com/desmondjs/cuda_mceliece_kem

CUDA-Accelerated McEliece KEM 🔑 | Post-Quantum Cryptography on GPU. Implementation of Classic McEliece key encapsulation, encryption, decryption, and decapsulation on CPU & GPU with CUDA, including benchmarking scripts and full FYP2 report

academic-project benchmarking classic-mceliece cuda fyp gpu-acceleration kem pqc

Last synced: 02 Oct 2025

https://github.com/sankeer28/pptx-text-audio-transcriber

Extract text and transcribe audio from PowerPoint presentations using OpenAI Whisper.

audio-transcription cuda openai-whisper powerpoint pptx-parser

Last synced: 02 Oct 2025

https://github.com/brave-tarnished/gpu-accelerated-opc

Optical Proximity Correction (OPC) is a photolithography technique that modifies photomask geometry to counteract diffraction and process effects, ensuring accurate printing of patterns on the wafer. This work demonstrates a proof of concept showing how using a GPU-based approach can significantly speed up these modifications compared to a CPU.

cpp cuda gpu-acceleration photolithography semiconductors

Last synced: 02 Oct 2025

https://github.com/conan-kiln/kiln

An actively maintained fork of ConanCenter with an emphasis on CV, ML and robotics capabilities on edge devices

computer-vision conan cuda machine-learning oneapi packaging robotics rust scientific-computing

Last synced: 02 Oct 2025

https://github.com/dmitryyurov/bitonic-cuda

An implementation of bitonic sort on CUDA

cuda gpu-programming sorting-algorithms

Last synced: 02 Oct 2025

https://github.com/notargets/gocca

Go bindings for OCCA - Portable parallel programming framework

bindings cfd cgo cuda golang gpu hpc occa opencl parallel-computing

Last synced: 20 Jan 2026

https://github.com/datasagess/fic

NLP hackathon with NN + FastAPI + Docker

catboost cuda docker fastapi lstm python pytorch rapidfuzz tensorflow

Last synced: 08 Aug 2025

https://github.com/timdev-r/cv-ground-truth-extraction

(Dump) Helper for ground truth extraction, movement analytics and silhouette visual demonstration

computer-vision cuda ground-truth intel-realsense pandas python

Last synced: 18 Apr 2026

https://github.com/separatrixxx/pgp_labs_7_sem

👓 Laboratory work for the 7th semester at MAI on PGP and PDP

cpp cuda nvidia

Last synced: 15 May 2026

https://github.com/ibrar-syed/complete_deep-learning-nvidia_gpu-setup-linux

Full setup for a deep learning environment on Ubuntu Linux with CUDA, cuDNN, TensorRT, and TensorFlow GPU. Includes scripts, test code, and environment configuration

ai bash conda cuda cudnn deep-learning environment-setup gcc gpu jupyter linux machine-learning nvidia-cuda nvidia-gpu pytorch setup-script tensorflow tensorrt

Last synced: 09 Apr 2026

https://github.com/nwpu66/cookiekiss-engine

CookieKiss Engine includes a renderer and other small techniques related to computer graphics.

compute-graphics cpp cuda opengl vulkan

Last synced: 09 Apr 2026

https://github.com/andreeo/parallel-computing-cuda

Terminal programs applying the parallel programming model with the CUDA architecture

c cpp cuda docker lineal-search parallel-computing parallel-reduction rank-sort-algorithm

Last synced: 09 Apr 2026

https://github.com/i-m-iron-man/abmax

Abmax is an agent-based modelling framework in Jax, focused on dynamic population size

abm agent agent-based agent-based-modeling agent-based-simulation agents cuda jax python

Last synced: 04 Oct 2025

https://github.com/alessiobugetti/integral-image-processing

Implements sequential and parallel integral image computation in C++ and Python, utilizing CUDA for parallel computation on GPU

cuda gpu-acceleration integral-image numba parallel-computing pycuda

Last synced: 24 May 2026

https://github.com/ojeda-e/fokker-planck

Numerical solution of the Fokker-Planck equation in large times using CUDA/C.

cuda fokker-planck-equations

Last synced: 17 Aug 2025

https://github.com/rmeli/cuda-pg

CUDA C++ Playground

cpp cuda gpu

Last synced: 16 Apr 2026

https://github.com/dmalexx/cuda_check

How to check whether CUDA is available in TensorFlow

cuda python tensorflow

Last synced: 10 Apr 2026

https://github.com/pvgupta24/parallel-programming

Basic algorithms for parallel programming in CUDA C++, Java and OpenMP

cuda openmp parallel-programming

Last synced: 19 Aug 2025

https://github.com/sid911/neuralnetworkcpp

A small experiment to learn about neural networks and their runtimes in cpp

cpp cuda machine-learning neural-network

Last synced: 20 Aug 2025

https://github.com/1ytic/cuda-gpu-zoo

Properties of the CUDA devices

cuda gpu

Last synced: 20 Aug 2025

https://github.com/camille-004/cusprec

🏁 Sparse signal recovery library written in PyCUDA.

cuda ml python signal-processing sparse-recovery

Last synced: 18 Jan 2026

https://github.com/sshoecraft/shepherd

An interactive multi-backend LLM runtime with intelligent cache eviction and persistent retrieval-augmented memory.

anthropic cli cpp cuda gemini grok inference kv-cache llama-cpp llm mcp ollama openai openai-server rag smart-evictions tensorrt tool-calling ulimited-context

Last synced: 10 Apr 2026

https://github.com/maneeshsit/pcie

Modify run:ai and other FOSS projects code for use with PCIe card-based AI accelerators for both inference and training

cuda cxl cxl-mem distro exo k3s k8s kestra llamacpp llm-d mpi4py mpio onnxoptimizer opentelemetry-ebpf-profiler paxos-cluster pcie photonics-computing runai visualize vllm

Last synced: 24 Aug 2025

https://github.com/quik-fe/node-nvidia-smi

Node wrapper around nvidia-smi.

cuda gpu nodejs nvidia nvidia-smi typescript

Last synced: 19 Feb 2026

https://github.com/elymsyr/auv_ws

An open-source simulation and control workspace for an Autonomous Underwater Vehicle (AUV) built on ROS 2 Humble and Gazebo. It features a high-fidelity dynamics model and an advanced AI-based motion controller (FossenNet) that uses a pre-trained LibTorch model to imitate a NL-MPC for real-time, high-performance manoeuvring.

autonomous-vehicles auv control-systems cpp cuda deep-learning gazebo imitation-learning libtorch mpc python robotics ros2 simulation

Last synced: 15 Apr 2026

https://github.com/derek-palmer/dvr-scan-file-organizer

DVR-Scan-Organizer is a Dockerized extension for DVR-Scan, designed to process multiple video files and organize output in a structured format.

cuda dvr dvr-scan multimedia opencv opencv-python python video video-processing

Last synced: 01 May 2026

https://github.com/bjornmelin/ai-system-design

🎨 Large-scale AI system architectures and implementations. Features distributed training systems, multi-GPU pipelines, and efficient resource management. 🏗️

architecture cuda distributed-systems engineering gpu-computing production scalability system-design

Last synced: 23 Jul 2025

https://github.com/kis-balazs/cuda-research

CUDA Research & Code. Course-style structured. Inspiration from @Infatoshi.

cuda

Last synced: 14 May 2025

https://github.com/kylesayrs/pttp

PyTorch Tensor Profiler with fully-supported memory timelines and events

cuda memory profiling pytorch

Last synced: 07 May 2026

https://github.com/m-torhan/advent-of-code

🎄 Solutions for the Advent of Code

advent-of-code advent-of-code-2024 cuda

Last synced: 07 Apr 2025

https://github.com/abhiksark/gluon-by-example

Learn Triton's Gluon by example — the same GPU kernels written in Triton and Gluon, benchmarked

cuda deep-learning gluon gpu gpu-kernels triton tutorial

Last synced: 01 Jul 2026

https://github.com/tornikeo/minimal-vscode-cuda-meson

Minimal sample of using VSCode and Meson to build CUDA applications

cuda meson template vscode

Last synced: 08 Sep 2025

https://github.com/moshidev/acap

Lab assignments for the course Architecture and High-Performance Computing (Arquitectura y Computación de Altas Prestaciones)

cuda homework-assignments mpi pthreads

Last synced: 30 Mar 2025