An open API service indexing awesome lists of open source software.

CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphical processing units (GPUs). With CUDA, developers are able to dramatically speed up computing applications by harnessing the power of GPUs.

https://github.com/lordofhyphens/gpu-path-delay-coverage

CUDA-based Path Delay Fault Coverage

cpp cuda gpgpu moderngpu

Last synced: 04 May 2026

https://github.com/neugence/acehub

AI Champions for Excellence: Fresh, informative courses and content designed to help developers, researchers, and leaders advance in the field of AI.

ai cuda cv ml mlops nlp pytorch rl rlhf tensorflow

Last synced: 05 Jan 2026

https://github.com/hit07/ml-dl-torch

This repository contains a comprehensive overview of machine learning and deep learning using PyTorch

computer-vision convolutional-neural-networks cuda neural-networks pytorch

Last synced: 28 Feb 2025

https://github.com/gaaniruddha/mphil-gpu-imager

This repository contains code for project #1 of MPhil: a test version of a GPU imager for a single time-step, single-channel case and a single time-step, multi-channel case.

astronomy benchmarks cuda cufft google-sheets gpu-imager imaging-astronomy interferometry radio-astronomy

Last synced: 11 Jun 2026

https://github.com/alan-cooney/python-cuda-starter-template

Python CUDA Starter Template

cuda deep-learning

Last synced: 30 Mar 2025

https://github.com/jesuscopado/parallel-programming

My solutions for the course Programming Parallel Computers at Aalto University (http://ppc.cs.aalto.fi/). Grade: 5/5

cpp cuda image-segmentation median-filter sorting-algorithms

Last synced: 19 Apr 2026

https://github.com/h4ck3r-04/fpassword

Fpassword merges Hashcat's hash-cracking precision with Hydra's parallelized network login, offering penetration testers a powerful tool for swift hash deciphering and simultaneous login attempts across diverse protocols.

brute-force brute-force-attacks c cracking cuda gpgpu hashcat hashes hydra network-security opencl password penetration-testing

Last synced: 16 Jan 2026

https://github.com/sonhm3029/setup-experience

This project stores my setup experience and the errors I met and solved while developing end-to-end AI and software projects

ai computer-vision cuda deep-learning software

Last synced: 10 Jun 2026

https://github.com/shineiarakawa/particle-stabilizer

A C++ and CUDA-based program for simulating the motion of particles.

cpp cuda n-body particles

Last synced: 12 May 2026

https://github.com/promptromp/aws-bootstrap-g4dn

Fast and easy bootstrapping of AWS EC2 instances for CUDA development. Use as a CLI, as a programmatic SDK, or as an Agent Skill!

aws cuda ec2 jupyter-notebook machine-learning mlops python

Last synced: 21 Feb 2026

https://github.com/lucatedeschini/feedforwardnn

This project is my submission for the exam "Project Work in Architecture and Platform for Artificial Intelligence"

c cuda neural-networks openmp scratch-implementation

Last synced: 20 Apr 2026

https://github.com/akshaysinhaaa/emova

A deep learning framework designed for emotion and sentiment recognition using text, audio, and video modalities. This project leverages the MELD (Multimodal EmotionLines Dataset) to train a robust and flexible model that reflects human communication more accurately than unimodal models.

bert cnn cuda deep-learning multimodal python pytorch resnet-18 tensorboard transformers

Last synced: 05 May 2026

https://github.com/flavienbwk/tensorflow2-cuda-10.2-docker

Tensorflow 2.3, CUDA 10.2, Docker compatible image

cuda docker python3 tensorflow ubuntu1804

Last synced: 11 Apr 2026

https://github.com/kmock930/texture-image-comparison

This project aims to build a model that classifies the type of an unseen image as accurately as possible by implementing, evaluating, and comparing two different multi-layer perceptron neural networks.

computer-vision conda confusion-matrix convolutional-neural-networks cuda image-preprocessing keras keras-tensorflow learning-curve-analysis matplotlib multi-layer-perceptron neural-network pickle-file python3 skimage

Last synced: 12 Apr 2026

https://github.com/mjun0812/setup-cuda

Set up a specific version of NVIDIA CUDA in GitHub Actions on Linux x86_64 and arm64 (Debian- and Fedora-based distributions) and Windows

action cuda cuda-toolkit github-actions

Last synced: 13 Jan 2026

https://github.com/cmazakas/cuda-stuff

A CUDA-based playground

cmake cuda delaunay-triangulation vscode

Last synced: 24 Mar 2025

https://github.com/ghusta/jcuda-demo

JCUDA demo

cuda java nvidia

Last synced: 14 May 2026

https://github.com/deepschneider/tinygrad-universal

Universal version of Tinygrad with CUDA and OpenCL support

autograd automatic-differentiation cuda pycuda pyopencl tinygrad tinygrad-cuda

Last synced: 06 Mar 2025

https://github.com/bjornmelin/llm-gpu-optimization

🚄 Advanced LLM optimization techniques using CUDA. Features efficient attention mechanisms, custom CUDA kernels for transformers, and memory-efficient training strategies. ⚡

cuda deep-learning gpu-acceleration llm-optimization machine-learning memory-optimization parallel-computing transformers

Last synced: 18 Mar 2025

https://github.com/ngoma1713/rushirb2001

🤖 Explore advanced AI and machine learning solutions for protein modeling and medical applications, developed by a dedicated data science graduate student.

computer-vision-opencv cuda data-science-portfolio deep-learning generative-ai machine-learning medical-ai protein-modeling published-researcher pytorch quantum-ml rag-chatbot tensorflow

Last synced: 02 May 2026

https://github.com/Parxd/cuda-optim

various CUDA kernels optimized for specific ML algos

cuda machine-learning

Last synced: 02 Sep 2025

https://github.com/mattjesc/federated-learning-simulation-1gpu-mi-is

Federated Learning Simulation on a Single GPU with Model Interpretability and Interactive Visualization

ai cuda deep-learning distributed-systems federated-learning gpu hpc keras machine-learning ml model-interpretability python pytorch simulation streamlit tensorflow

Last synced: 05 Jan 2026

https://github.com/himeyama/cuda-convolve

convolve + cuda + ruby (1D only)

cuda filter gem ruby

Last synced: 19 Apr 2026

https://github.com/minseoc03/cuda-100-days

A 100-day journey to master CUDA programming, inspired by the CUDA-120-DAYS--CHALLENGE project. This repo contains daily CUDA exercises and code folders, with learning notes hosted on Notion. Practicing on leetgpu.com due to lack of local NVIDIA GPU.

100daysofcode cuda deeplearning gpgpu gpu hpc nvidia parallel-computing

Last synced: 19 Apr 2025

https://github.com/moesio-f/cla

C Linear Algebra (CLA) library. A simple toy library for basic vector/matrix operations with CUDA support and Python bindings.

c cuda linear-algebra python

Last synced: 09 May 2026

https://github.com/ndgigliotti/torch-ipca

GPU-accelerated Incremental PCA for PyTorch

cuda dimensionality-reduction gpu incremental-pca machine-learning pca pytorch

Last synced: 26 Jan 2026

https://github.com/ionmich/cs149-local-dev

Provides `conda` installation instructions for Stanford's CS149 (Parallel Computing) programming assignments

conda cs149 cuda ispc parallel-computing

Last synced: 31 Mar 2025

https://github.com/jpodivin/gputomata

Cellular automata running on CUDA capable GPUs

cellular-automata cellular-automaton cuda

Last synced: 07 Nov 2025

https://github.com/mathiasotnes/gemm

General Matrix Multiplication (GEMM) optimization in CUDA.

cuda gpu

Last synced: 26 Mar 2025

https://github.com/unknownnuts/meshsdk

Mesh processing library with extra `C/C#/JS/TS/PYTHON` bindings.

cuda dicom electron emscripten mesh modelling pybind11 stl stomatology threejs wasm

Last synced: 10 Apr 2026

https://github.com/ragu-manjegowda/parallel-programming

Assignments and Projects of Udacity's Introduction to Parallel Programming Course

cuda gpu-programming nvidia-cuda nvidia-gpu udacity-parallel-programming

Last synced: 25 May 2026

https://github.com/d-krylov/cuda_to_opengl

Simple examples for CUDA OpenGL interoperability

cuda cuda-opengl opengl

Last synced: 01 May 2026

https://github.com/xueeinstein/udacity-cs344-cuda8

Code for Udacity CS344 (Intro to Parallel Programming) using CUDA 8.0

cuda cuda-8 parallel-computing

Last synced: 02 May 2026

https://github.com/kenmalik/cuda-dr-bcg

CUDA C++ implementation of the DR-BCG algorithm for numerically solving linear systems.

cpp cuda hpc numerical-methods

Last synced: 19 Apr 2026

https://github.com/nourmorsy/convolution-neural-network-cuda

Code for optimizing a CNN using CUDA

c cnn cuda

Last synced: 13 May 2026

https://github.com/yinguobing/opencv-docker

Dockerfiles for OpenCV build.

cuda docker ffmpeg opencv

Last synced: 10 Apr 2026

https://github.com/prateekshukla1108/thunderkittens-docs

Documentation for ThunderKittens framework

cuda deep-le

Last synced: 18 Mar 2025

https://github.com/moshidev/acap

Coursework for the course Arquitectura y Computación de Altas Prestaciones (High-Performance Architecture and Computing)

cuda homework-assignments mpi pthreads

Last synced: 30 Mar 2025

https://github.com/cuda8/brainwords2

GPU brainflayer for sale $250

brain brainflayer brainwords cuda gpu key pass passphrase private

Last synced: 10 Mar 2025

https://github.com/shtrophic/wicuvanity

Generate wireguard vanity keys on your Nvidia GPU

cuda gpu vanity-address vanity-addresses vanitygen wireguard

Last synced: 10 Mar 2025

https://github.com/cserajdeep/dnn-iris-pytorch

Deep Neural Network with batch normalization for tabular datasets.

batch batch-normalization classification cuda deep-learning dnn iris-dataset

Last synced: 02 May 2026

https://github.com/Neuro-Mechatronics-Interfaces/python-intan

Tools and demos for working with EMG data from Intan using Python

circuitpython cuda emg pico python realtime tensorflow

Last synced: 13 Jan 2026

https://github.com/kataglyphis/machinelearningalgorithms

Basic Machine Learning Algorithms

cuda machine-learning python tensorflow

Last synced: 31 Mar 2025

https://github.com/m-torhan/advent-of-code

🎄 Solutions for the Advent of Code

advent-of-code advent-of-code-2024 cuda

Last synced: 07 Apr 2025

https://github.com/codename-detective/cuda_gpgpus_shared_memory_systems_pdp

CUDA GPGPUs Shared Memory Systems Parallel & Distributed Programming

cuda cuda-programming numa parallel-programming

Last synced: 30 Mar 2025

https://github.com/voltr0x/raytracing-cuda

Raytracing in a weekend using CUDA

cpp11 cuda raytracing sdl2

Last synced: 01 Apr 2026

https://github.com/kylesayrs/pttp

PyTorch Tensor Profiler with fully-supported memory timelines and events

cuda memory profiling pytorch

Last synced: 07 May 2026

https://github.com/kis-balazs/cuda-research

CUDA Research & Code. Course-style structured. Inspiration from @Infatoshi.

cuda

Last synced: 14 May 2025

https://github.com/bjornmelin/ai-system-design

🎨 Large-scale AI system architectures and implementations. Features distributed training systems, multi-GPU pipelines, and efficient resource management. 🏗️

architecture cuda distributed-systems engineering gpu-computing production scalability system-design

Last synced: 23 Jul 2025

https://github.com/derek-palmer/dvr-scan-file-organizer

DVR-Scan-Organizer is a Dockerized extension for DVR-Scan, designed to process multiple video files and organize output in a structured format.

cuda dvr dvr-scan multimedia opencv opencv-python python video video-processing

Last synced: 01 May 2026

https://github.com/elymsyr/auv_ws

An open-source simulation and control workspace for an Autonomous Underwater Vehicle (AUV) built on ROS 2 Humble and Gazebo. It features a high-fidelity dynamics model and an advanced AI-based motion controller (FossenNet) that uses a pre-trained LibTorch model to imitate a NL-MPC for real-time, high-performance manoeuvring.

autonomous-vehicles auv control-systems cpp cuda deep-learning gazebo imitation-learning libtorch mpc python robotics ros2 simulation

Last synced: 15 Apr 2026

https://github.com/quik-fe/node-nvidia-smi

Node wrapper around nvidia-smi.

cuda gpu nodejs nvidia nvidia-smi typescript

Last synced: 19 Feb 2026

https://github.com/kronbii/thermal-super-resolution

State-of-the-art thermal super-resolution system (IMDN) with RGB→thermal adaptation, custom multi-component loss, 29.6 dB PSNR, 0.713 SSIM, 250+ FPS, production-ready PyTorch + CUDA implementation.

computer-vision cuda deep-learning image-enhancement imdn model-optimization production-machine-learning pytorch real-time real-time-processing research super-resolution thermal-imaging

Last synced: 18 Apr 2026

https://github.com/asadiahmad/100_sports_image_classification

A deep learning project for sport image classification using a custom VGG19-based architecture with integrated Grad-CAM heatmap visualization for model interpretability.

computer-vision cuda data-augmentation deep-learning explainable-ai gpu-acceleration grad-cam heatmap-visualization image-classification mixed-precision-training pytorch pytorch-grad-cam sports-analytics sports-classification transfer-learning vgg19

Last synced: 11 Jun 2025

https://github.com/ysl1016/cudadigitfilter

CUDA-based parallel image filtering system for MNIST dataset

computer-vision cuda deep-learning gpu-acceleration image-processing mnist parallel-computing

Last synced: 28 Mar 2025

https://github.com/githubfoam/cuda-travisci

cuda miniconda pytorch

cuda miniconda pytroch

Last synced: 30 Mar 2025

https://github.com/maneeshsit/pcie

Modify run:ai and other FOSS projects code for use with PCIe card-based AI accelerators for both inference and training

cuda cxl cxl-mem distro exo k3s k8s kestra llamacpp llm-d mpi4py mpio onnxoptimizer opentelemetry-ebpf-profiler paxos-cluster pcie photonics-computing runai visualize vllm

Last synced: 24 Aug 2025

https://github.com/sshoecraft/shepherd

An interactive multi-backend LLM runtime with intelligent cache eviction and persistent retrieval-augmented memory.

anthropic cli cpp cuda gemini grok inference kv-cache llama-cpp llm mcp ollama openai openai-server rag smart-evictions tensorrt tool-calling ulimited-context

Last synced: 10 Apr 2026

https://github.com/camille-004/cusprec

🏁 Sparse signal recovery library written in PyCUDA.

cuda ml python signal-processing sparse-recovery

Last synced: 18 Jan 2026

https://github.com/1ytic/cuda-gpu-zoo

Properties of the CUDA devices

cuda gpu

Last synced: 20 Aug 2025

https://github.com/sid911/neuralnetworkcpp

A small experiment to learn about neural networks and their runtimes in cpp

cpp cuda machine-learning neural-network

Last synced: 20 Aug 2025

https://github.com/pvgupta24/parallel-programming

Basic algorithms for parallel programming in CUDA C++, Java and OpenMP

cuda openmp parallel-programming

Last synced: 19 Aug 2025

https://github.com/dmalexx/cuda_check

How to check whether CUDA is available in TensorFlow

cuda python tensorflow

Last synced: 10 Apr 2026

https://github.com/rmeli/cuda-pg

CUDA C++ Playground

cpp cuda gpu

Last synced: 16 Apr 2026

https://github.com/ojeda-e/fokker-planck

Numerical solution of the Fokker-Planck equation at large times using CUDA/C.

cuda fokker-planck-equations

Last synced: 17 Aug 2025

https://github.com/alessiobugetti/integral-image-processing

Implements sequential and parallel integral image computation in C++ and Python, utilizing CUDA for parallel computation on GPU

cuda gpu-acceleration integral-image numba parallel-computing pycuda

Last synced: 24 May 2026

https://github.com/i-m-iron-man/abmax

Abmax is an agent-based modelling framework in Jax, focused on dynamic population size

abm agent agent-based agent-based-modeling agent-based-simulation agents cuda jax python

Last synced: 04 Oct 2025

https://github.com/andreeo/parallel-computing-cuda

Terminal programs applying the parallel programming model with the CUDA architecture

c cpp cuda docker lineal-search parallel-computing parallel-reduction rank-sort-algorithm

Last synced: 09 Apr 2026

https://github.com/nwpu66/cookiekiss-engine

CookieKiss Engine includes a renderer and other small techniques related to computer graphics.

compute-graphics cpp cuda opengl vulkan

Last synced: 09 Apr 2026

https://github.com/tomtolleson/cuda-kernel-benchmarking-tool

A benchmarking tool in C++ that creates CUDA kernels and compares overall system performance between CPU and GPU

cuda cuda-kernels cuda-support cuda-toolkit nvidia nvidia-cuda nvidia-gpu

Last synced: 30 Mar 2025

https://github.com/ibrar-syed/complete_deep-learning-nvidia_gpu-setup-linux

Full setup for a deep learning environment on Ubuntu Linux with CUDA, cuDNN, TensorRT, and TensorFlow GPU. Includes scripts, test code, and environment configuration

ai bash conda cuda cudnn deep-learning environment-setup gcc gpu jupyter linux machine-learning nvidia-cuda nvidia-gpu pytorch setup-script tensorflow tensorrt

Last synced: 09 Apr 2026

https://github.com/timdev-r/cv-ground-truth-extraction

(Dump) Helper for ground truth extraction, movement analytics and silhouette visual demonstration

computer-vision cuda ground-truth intel-realsense pandas python

Last synced: 18 Apr 2026

https://github.com/datasagess/fic

NLP Hackathon w/ NN + FastAPI + Docker

catboost cuda docker fastapi lstm python pytorch rapidfuzz tensorflow

Last synced: 08 Aug 2025

https://github.com/notargets/gocca

Go bindings for OCCA - Portable parallel programming framework

bindings cfd cgo cuda golang gpu hpc occa opencl parallel-computing

Last synced: 20 Jan 2026

https://github.com/dmitryyurov/bitonic-cuda

An implementation of bitonic sort on CUDA

cuda gpu-programming sorting-algorithms

Last synced: 02 Oct 2025

https://github.com/sephiroth7712/k-nearest-neigbours

Implementation of K-Nearest Neighbors algorithm using multiple parallel computing approaches: CUDA (GPU), Hadoop, Spark, MPI, OpenMP, and PThreads. Demonstrates scalable machine learning across different parallel computing paradigms from GPU to distributed frameworks.

cuda cuda-programming hadoop-mapreduce java mpi multiprocessing multithreading openmp pthreads scala spark

Last synced: 12 Apr 2026

https://github.com/conan-kiln/kiln

An actively maintained fork of ConanCenter with an emphasis on CV, ML and robotics capabilities on edge devices

computer-vision conan cuda machine-learning oneapi packaging robotics rust scientific-computing

Last synced: 02 Oct 2025

https://github.com/brave-tarnished/gpu-accelerated-opc

Optical Proximity Correction (OPC) is a photolithography technique that modifies photomask geometry to counteract diffraction and process effects, ensuring accurate printing of patterns on the wafer. This work demonstrates a proof of concept showing how using a GPU-based approach can significantly speed up these modifications compared to a CPU.

cpp cuda gpu-acceleration photolithography semiconductors

Last synced: 02 Oct 2025

https://github.com/sankeer28/pptx-text-audio-transcriber

Extract text and transcribe audio from PowerPoint presentations using OpenAI Whisper.

audio-transcription cuda openai-whisper powerpoint pptx-parser

Last synced: 02 Oct 2025

https://github.com/desmondjs/cuda_mceliece_kem

CUDA-Accelerated McEliece KEM 🔑 | Post-Quantum Cryptography on GPU. Implementation of Classic McEliece key encapsulation, encryption, decryption, and decapsulation on CPU & GPU with CUDA, including benchmarking scripts and full FYP2 report

academic-project benchmarking classic-mceliece cuda fyp gpu-acceleration kem pqc

Last synced: 02 Oct 2025

https://github.com/TeamBipartite/bipartite-gemm

High-throughput data-parallel GEMM implementations in CUDA using CUDA cores and Tensor Cores

cuda data-parallelism gemm

Last synced: 14 Jan 2026

https://github.com/nvaranki/cmmx

CUDA matrix multiplication (official guide, modified)

cuda cuda-kernels

Last synced: 08 Aug 2025

https://github.com/drilonaliu/parallel-mandelbrot-set

GPU-accelerated Mandelbrot Set generation with CUDA and OpenGL interoperability.

cuda fractals gpu mandelbrot-fractal parallel-programming

Last synced: 12 Apr 2026

https://github.com/aurelienperez/gpu-heston-monte-carlo

GPU-accelerated Monte Carlo simulation for option pricing under the Heston model using CUDA.

cuda gpu heston-model

Last synced: 01 Apr 2025

https://github.com/nikhilrout/thetensorcoreproject

Microarchitecture implementation of Nvidia's Tensor Cores

cuda floating-point gpgpu hybrid-precision-training tensorcore

Last synced: 01 Apr 2025

https://github.com/akira4o4/cuda-program

CUDA YOLO Processing

cuda yolo

Last synced: 22 Jul 2025

https://github.com/f-koehler/itesol

WIP: Iterative eigensolvers for C++20, Python and CUDA

cpp20 cuda eigenvalues linear-algebra python

Last synced: 08 Nov 2025

https://github.com/jaderock/cuda-by-example

Sample CUDA projects for the CUDA by Example book

bazel c cpp cuda gpu

Last synced: 05 May 2026

https://github.com/yutakseo/docker_ubuntu-cuda_environment

🐳 A ready-to-use Docker environment for deep learning development with Ubuntu 22.04 and CUDA 11.8.

container cuda docker environment ubuntu

Last synced: 12 Apr 2026