An open API service indexing awesome lists of open source software.

CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.

https://github.com/debowin/gpu-parallel-recommender-system

GPGPU Parallel User-User Collaborative Filtering System in CUDA C

collaborative-filtering cuda gpu-programming movielens-dataset recommender-system

Last synced: 24 Apr 2026

https://github.com/cfries/javagpuexperiments

Repository used to demo OpenCL, JOCL, JCuda.

cuda

Last synced: 25 Apr 2026

https://github.com/kpetridis24/four-russians-algorithm

Boolean matrix multiplication accelerated by the four-Russians algorithm

c cuda gpu high-performance matrix-multiplication preprocess

Last synced: 29 May 2026

https://github.com/kagof/julia-image-processing

Image processing programs written in Julia

cuda image-processing julia

Last synced: 18 May 2026

https://github.com/juntyr/necsim-rust

Spatially explicit biodiversity simulations using a parallel library written in Rust

biodiversity cuda mpi necsim rust simulation

Last synced: 22 Mar 2025

https://github.com/stdogpkg/cukuramoto

A Python/CUDA package that numerically solves the Kuramoto model using Heun's method

complex-networks cuda kuramoto-model

Last synced: 28 Jan 2026

https://github.com/acrlakshman/gradient-augmented-levelset-cuda

Implementation of Gradient Augmented Levelset method for CPU and GPU

cfd cuda levelset

Last synced: 17 Feb 2026

https://github.com/demoriarty/doksparse

Sparse DOK tensors on GPU, for PyTorch

cuda pytorch sparse

Last synced: 28 Jun 2026

https://github.com/markdtw/parallel-programming

Basic Pthread, OpenMP, CUDA examples

cuda openmp parallel-programming pthreads

Last synced: 20 Apr 2026

https://github.com/kilamper/matrix-multiplication

AC - Matrix multiplication using OpenMP, MPI and CUDA

cuda ms-mpi openmp

Last synced: 16 May 2026

https://github.com/alpha74/cuda_basics

NVIDIA NVCC CUDA programs for beginners.

c cpp cuda cuda-programs nvcc nvidia parallel-computing parallel-programming

Last synced: 08 May 2026

https://github.com/pothosware/pothosgpu

Pothos toolkit for ArrayFire API support

arrayfire cuda dataflow dataflow-programming gpu opencl pothos

Last synced: 19 Apr 2026

https://github.com/muhac/jupyter-pytorch-docker

JupyterLab for AI in Docker! Anaconda and PyTorch GPU supported.

conda-environment cuda docker jupyterlab pytorch

Last synced: 01 Oct 2025

https://github.com/andreasholt/cusmc

A CUDA-accelerated Statistical Model Checker for Stochastic Timed Automata

cuda smc

Last synced: 11 Feb 2026

https://github.com/alpinebuster/arkime-docker-compose

Deploy Arkime with GPU-accelerated Rust/Python parsers and custom plugins using Docker Compose.

arkime c cuda deep-neural-networks docker docker-compose llm machine-learning networking pcap pcapng python rust traffic-analysis

Last synced: 16 Apr 2026

https://github.com/tvanfossen/entropic

Local-first agentic inference engine in C/C++. Multi-tier model routing, grammar-constrained output, MCP tool servers. Embeddable via C ABI.

agentic-ai agentic-framework cpp cpp20 cuda edge-ai embedded-ai gbnf gguf grammar-constrained-decoding inference-engine llama-cpp llm local-llm mcp on-device-ai privacy-first tool-calling

Last synced: 30 May 2026

https://github.com/szymon423/tsp-cpu-vs-gpu

Simple brute force approach to solve travelling salesman problem with CPU and GPU

cuda tsp

Last synced: 11 Mar 2025

https://github.com/kar-dim/watermarking-gpu

Code for my Diploma thesis at Information and Communication Systems Engineering (University of the Aegean, School of Engineering) with title "Efficient implementation of watermark and watermark detection algorithms for image and video using the graphics processing unit". Part 2 / GPU

arrayfire cpp cuda ffmpeg gpu image-processing opencl parallel-computing video-processing watermark-image watermarking

Last synced: 09 Apr 2025

https://github.com/prithivsakthiur/vlm-parsing

VLM-Parsing is a Gradio-based web application for parsing documents and images into structured HTML and Markdown formats using advanced Vision Language Models (VLMs).

cuda gradio html huggingface-models huggingface-spaces huggingface-transformers logics markdown ocr-recognition pytorch qwen2-5-vl spaces vlm

Last synced: 05 Apr 2026

https://github.com/kishore-narendran/eecs221-highperformancecomputing

Assignments done during the graduate course EECS 221 - Introduction to HPC that I took in the Spring Quarter of 2016 at University of California, Irvine. Involves assignments that use OpenMP, MPI and CUDA.

cuda hpc mpi openmp

Last synced: 17 May 2026

https://github.com/l30nardosv/reproduce-parcosi-moleculardocking

Reproducing paper: "Benchmarking the Performance of Irregular Computations in AutoDock-GPU Molecular Docking"

autodock-gpu cpu cuda gpu molecular-docking molecular-docking-scripts opencl paper reproducible-research

Last synced: 16 Feb 2026

https://github.com/tthebc01/cudaconda3

Lightweight container environment with CUDA, Miniconda3, and JupyterLab.

cuda docker gpu jupyterlab marimo-notebook miniconda3 reverse-proxy-application

Last synced: 11 Feb 2026

https://github.com/xiongsp/pytorch-docker

Pure PyTorch Docker images. Supports almost all combinations of PyTorch, Python, Ubuntu, CentOS, and CUDA versions.

centos cuda docker docker-image python3 pytorch ubuntu

Last synced: 17 Apr 2026

https://github.com/droduit/multiprocessor-architecture

Introduction to Multiprocessor Architecture @ EPFL

cuda multiprocessor multithreading openmp-parallelization

Last synced: 17 Apr 2026

https://github.com/galaxies99/inception-cuda

CUDA Implementation of Inception

cuda inception-v3

Last synced: 12 Apr 2025

https://github.com/true-real-michael/python-plane-ransac

Parallel RANSAC for plane detection for multiple point clouds using Python and CUDA

cuda numba plane-detection python ransac

Last synced: 14 Mar 2025

https://github.com/lintenn/cudaaddvectors-explicit-vs-unified-memory

Performance comparison of two different forms of memory management in CUDA

c cuda explicit memory memory-management performance unified-memory

Last synced: 17 May 2026

https://github.com/bonj4/wiki

This repository contains documentation and installation scripts for various tools and libraries.

cuda pangolin pybind11 sfm tensorrt

Last synced: 17 Jan 2026

https://github.com/mulx10/firefly

Enhancing object detection using thermal imaging for thin cross-section, unidentifiable objects (e.g., cyclists, pedestrians).

autonomous-cars autonomous-navigation autonomous-vehicles c cuda object-detection thermal-camera yolov3

Last synced: 03 Sep 2025

https://github.com/programmer-rd-ai/detectx

A Pythonic approach to object detection using Detectron2, a clean, modular framework for training and deploying computer vision models. DetectX simplifies the complexity of object detection while maintaining high performance and extensibility.

coco-dataset computer-vision computer-vision-library cuda deep-learning detectron2 faster-rcnn gpu-accelerated machine-learning ml-framework object-detection object-recognition python3 pytorch retinanet

Last synced: 10 Jun 2025

https://github.com/tortillazhawaii/rr_sort

Various sorting implementations using distributed and parallel methods

bazel cpp cuda java openmp spark threads

Last synced: 14 Apr 2026

https://github.com/xlite-dev/HGEMM

⚡️Write HGEMM from scratch using Tensor Cores with WMMA, MMA PTX and CuTe API. 🎉🎉

cuda hgemm tensor-cores

Last synced: 30 Jul 2025

https://github.com/B1-663R/docker-mining

Dockerfiles to build docker images to start mining with an NVIDIA Docker architecture

cryptocurrency cuda docker-image docker-nvidia mining

Last synced: 28 Mar 2025

https://github.com/matthewfeickert/cuda-tf-torch

An Ubuntu 18.04 NVIDIA Docker image with CUDA 10.1 CuDNN 7 with TensorFlow and PyTorch

cuda cuda-101 cudnn cudnn-v7 docker docker-image gpu nvidia-docker nvidia-gpu pytorch tensorflow torch

Last synced: 07 Jan 2026

https://github.com/capelliexp/sc2-im-pf-pathfinding-thesis

Master of science thesis project. Using CUDA to utilize a system's GPU to create pathfinding data (IM+PF), usable by multiple agents in the same environment.

ai cplusplus cuda gpgpu pathfinding starcraft2

Last synced: 15 May 2026

https://github.com/bdwhst/fluora

A CUDA PBR path tracer

cpp cuda pathtracing pbr rendering

Last synced: 13 Feb 2026

https://github.com/peri044/cuda

GPU implementations of algorithms

cuda gauss-jordan parallel-programming

Last synced: 14 Jul 2025

https://github.com/andreabak/whispersubs

Generate subtitles for your video or audio files using the power of AI

ai cuda deep-learning gpu-acceleration machine-learning srt subtitles transcribe transcription translate whisper

Last synced: 15 Feb 2026

https://github.com/terrylindev/image-to-ASCII

🖼️ A command-line tool for converting images to ASCII art

ascii ascii-art cli command-line cpp cuda docker image-processing image-to-ascii mpi opencv terminal

Last synced: 12 Jul 2025

https://github.com/szaghi/adam

Multi-physics AMR SDK and apps for High Performance Computing — from laptop to exascale device-accelerated superpc

amr cfd cuda fluid-dynamics fortran gas-dynamics hpc hydro-dynamics mpi openacc openmp plasma-dynamics

Last synced: 04 Apr 2026

https://github.com/nellogan/distributed_compy

Distributed_compy is a distributed computing library that offers multi-threading, heterogeneous (CPU + multi-GPU), and multi-node support

cluster cuda heterogeneous-parallel-programming multi-threading multigpu openmp openmpi

Last synced: 16 Aug 2025

https://github.com/navdeep-g/dimreduce4gpu

Dimensionality reduction ("dimreduce") on GPUs ("4gpu")

cplusplus cuda dimensionality-reduction gpu linear-algebra pca python svd unsupervised-learning

Last synced: 14 Apr 2025

https://github.com/tank3-tk3/pi-calculation-cpu-gpu

PI calculation with CPU and GPU

c cpp cuda parallel-computing pi

Last synced: 13 Apr 2026

https://github.com/copperfr/blendervxkex

Windows 7 CUDA & OptiX support for Blender 4.x

blender cuda cycles-renderer optix vxkex windows-7

Last synced: 20 Jan 2026

https://github.com/boltzmannentropy/vllm-5090

vLLM-5090: Docker Container for RTX 5090 on WSL2/Windows

5090 cuda docker vllm

Last synced: 08 Oct 2025

https://github.com/dujonwalker/nixos-config-x86_64-cuda

This repository contains my NixOS configuration optimized for 64-bit x86 systems with NVIDIA CUDA support, featuring a Plasma 6 desktop environment and a variety of essential applications for development, multimedia, and productivity. It serves as a backup for easy restoration and setup on new installations.

cuda flatpak nix nixos nixos-configuration ollama

Last synced: 17 Jan 2026

https://github.com/tyler-romero/aegae

Learning Triton / CUDA

cuda triton

Last synced: 11 Apr 2026

https://github.com/mu7annad0/100gpu

100 Days of CUDA: Optimizing My Life, One Kernel at a Time. 🔄🔥

cuda gpu

Last synced: 08 Mar 2026

https://github.com/betarixm/cuecc

POSTECH: Heterogeneous Parallel Computing (Fall 2023)

cryptography ctypes cuda ecc postech secp256k1

Last synced: 12 May 2025

https://github.com/ginkgo-project/cudaarchitectureselector

A CMake module simplifying the specification of CUDA architectures

cmake cmake-modules cuda

Last synced: 05 Nov 2025

https://github.com/kim-hwiwon/T-espresso

A CUDA Library for Low-overhead Host-to-Device Transmission of Patterned Profile Data

cuda profiler

Last synced: 10 Apr 2025

https://github.com/miniex/maidenx

Rust-based CUDA library designed for learning purposes and building my AI engines named Maiden Engine

ai cuda rust

Last synced: 20 Mar 2025

https://github.com/stanczakdominik/cuda_poisson

A 2D Poisson solver via CUDA

cuda electromagnetism pde

Last synced: 29 Jun 2025

https://github.com/saiccoumar/cuda-programming-exercises

Brief collection of GPU exercises (my reimplementation). Comes with relevant resources.

cuda cuda-programming nvcc nvidia

Last synced: 25 May 2026

https://github.com/dansolombrino/gphungarian

A GPU-accelerated implementation of the Hungarian Algorithm, written in CUDA

cuda gpu hpc opencl

Last synced: 31 Aug 2025

https://github.com/SanaeProject/Matrix-for-Cpp

This repository has types that handle matrices.

cpp14 cpp14-library cuda matrix-library

Last synced: 15 May 2025

https://github.com/xlisp/learn-vllm

Learning vLLM

cuda nvidia pytorch vllm

Last synced: 10 May 2026

https://github.com/thunder-compute/thunder-compute-documentation

Documentation for Thunder Compute, a cloud platform creating technology to virtualize GPUs over TCP

ai artificial-intelligence cloud cloud-computing cuda gpu llm machine-learning nvidia pytorch tensorflow thunder-compute virtualization

Last synced: 15 Oct 2025

https://github.com/vipaka2/sdforge-docker

Latest SD Forge Docker image.

cuda docker nvidia python

Last synced: 24 Jul 2025

https://github.com/bl33h/productoftwovectors

This code uses CUDA for parallel vector multiplication on a GPU, demonstrating the GPU's acceleration capabilities.

cuda gpu kernel paralelism parallel-programming product vector

Last synced: 16 May 2026

https://github.com/thisalmandula/gpu_accelerated_lpt_cfd_code

This repository contains a GPU-accelerated version of the particle tracking model developed by Merel Kooi for biofouled microplastic particles (available at: https://pubs.acs.org/doi/10.1021/acs.est.6b04702), written in CUDA Fortran and CUDA Python. This repository is intended as a learning tool for GPU programming.

biofouling computational-fluid-dynamics cuda fortran lagrangian-particle-tracking microplastics python

Last synced: 02 May 2026

https://github.com/vishwamartur/btc_recovery

High-performance Bitcoin wallet password recovery system with GPU acceleration and integrated graphics support. Recover Bitcoin Core wallet.dat files without blockchain download using advanced algorithms and blockchain APIs.

bitcoin bitcoin-core blockchain blockchain-api cpp cryptocurrency cuda electrum gpu-acceleration integrated-graphics multithreading opencl password-recovery private-keys recovery-tools wallet-dat wallet-recovery

Last synced: 14 Apr 2026

https://github.com/satyajitghana/gpu-programming

Contains the contents of GPU Architecture and Programming course done on NPTEL

c cpp cuda cuda-programming gpu-programming nptel nvidia

Last synced: 09 Mar 2026

https://github.com/enp1s0/curand_fp16

FP16 pseudo random number generator on GPU

cuda gpu half-precision random-number-generators

Last synced: 20 Aug 2025

https://github.com/a-nau/python-cuda-envs

Script to automatically map a specific CUDA version to a Conda Python environment.

anaconda anaconda-environment cuda installation installation-script python python-environment python3

Last synced: 18 Apr 2026

https://github.com/renatomaynard/a-multiple-population-coarse-grained-genetic-algorithm-to-solve-the-quadratic-assignment-problem-

A Multiple-population coarse-grained Genetic Algorithm to solve the Quadratic Assignment Problem

c cuda genetic-algorithm quadratic-assignment-problem

Last synced: 09 May 2026

https://github.com/emmanuelmess/firstcollisiontimesteprarefiedgassimulator

This simulator computes all possible intersections for a very small timestep for a particle model

cpp20 cuda simulator

Last synced: 17 Apr 2026

https://github.com/daelsepara/hipmandelbrot

GPU Implementation of Mandelbrot Fractal Generator with Benchmarking

amd cuda fractal gpu gpu-compute gpu-computing hip mandelbrot parallel-computing rocm sdk

Last synced: 20 Feb 2026

https://github.com/dvhh/masscorrelation

An exercise in writing an efficient correlation calculator

calculations correlation-calculation cuda matrix multi-threading openmp

Last synced: 15 May 2026

https://github.com/blazekill/hello-cuda

Cpp + Vcpkg + CUDA + VsCode starter project.

cpp cuda vcpkg vscode

Last synced: 18 May 2026

https://github.com/rnabla/cuda-des

Bruteforcing DES using CUDA

bruteforce cuda data des encryption gpu parallel standard

Last synced: 27 Oct 2025

https://github.com/xihuai18/image-processing-in-cuda

Implementation of Image Processing Method

cuda imageprocessing

Last synced: 04 Oct 2025

https://github.com/andrewboessen/bitonic-merge-sort

Bitonic Merge Sort algorithm optimized for GPU execution

bitonic-merge-sort cuda sorting-network

Last synced: 16 May 2026

https://github.com/andih/cuda-fortran-stream

Variant of STREAM Benchmark in CUDA Fortran

cuda cuda-fortran gpu stream-benchmarks variants

Last synced: 02 Mar 2025

https://github.com/rjected/cuda-timelock

Solving a large number of timelock puzzles in parallel using GPU acceleration

c cgbn concurrent cpp cuda gmp graphics nvidia parallel puzzle timelock

Last synced: 14 Apr 2026

https://github.com/liuyuweitarek/pytorch-docker-builder

Automate PyTorch Docker image builds with compatible Python, CUDA, and Poetry versions, including CI/CD for testing.

cicd containerd cuda docker docker-image poetry-python python python3 pytorch pytorch-docker

Last synced: 06 Feb 2026

https://github.com/hubenchang0515/fft-benchmark

Performance benchmarks of several FFT libraries

cuda fft

Last synced: 27 Oct 2025

https://github.com/lhldev/rust-neural-network

Neural network implementation in Rust

cuda feedforward-neural-network

Last synced: 16 May 2026

https://github.com/bjornmelin/ml-vision-lab

👁️ Production-grade computer vision implementations. Real-world applications in image processing, object detection, and video analytics with GPU acceleration. 📸

computer-vision cuda deep-learning image-processing object-detection opencv pytorch video-analytics

Last synced: 04 Apr 2026

https://github.com/michaelfranzl/image_debian-gpgpu

Dockerfile for a Debian base image with AMD and Nvidia GPGPU support

amd container container-image cuda debian docker gpgpu nvidia opencl

Last synced: 10 May 2026

https://github.com/mre/talks

...mostly Computer Science related.

computer-science cuda talks tech-talks

Last synced: 28 Apr 2026

https://github.com/dafadey/GPGPU_OpenCL_vs_CUDA

This is a repository with sample codes for testing memory bandwidth, arithmetic latency hiding and shared/local memory performance on AMD and nVidia devices

cuda gpgpu gpgpu-computing opencl

Last synced: 16 May 2025

https://github.com/sarah627/horus_eye_fcih_graduation_project

An AI-powered tourism website using YOLOv7 for real-time landmark detection in images. Built with Flask, PyTorch, and Roboflow for seamless tourist interaction.

computer-vision cuda flask jupyter-notebook kaggle matplotlib object-detection opencv python pytorch roboflow

Last synced: 14 Apr 2026

https://github.com/alekseyscorpi/vacancies_server

This is a server for vacancies generation using LLM (Saiga3)

code cuda cuda-toolkit docker dockerfile flask llama3 llamacpp llm ngrok pydantic saiga

Last synced: 06 Feb 2026