An open API service indexing awesome lists of open source software.

CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.

https://github.com/hrntsm/ghgpucomputingtest

Test using CUDA with Alea GPU in Grasshopper.

cuda grasshopper3d

Last synced: 14 Apr 2025

https://github.com/rmiguelkelly/quickcluster

A KMeans implemented in C++ with Python bindings and GPU acceleration

clustering clustering-algorithm cpp cuda gpu kmeans kmeans-clustering metal objective-c python python3 unsupervised-learning

Last synced: 26 Jul 2025

https://github.com/benediktalkin/kappaprofiler

lightweight simple profiling for python/pytorch

cuda profiler python pytorch

Last synced: 19 Jul 2025

https://github.com/marcogarlet/cuda_cubeattack

CUDA implementation of Cube Attack

cryptography cubeattack cuda

Last synced: 28 Oct 2025

https://github.com/chrxh/alien-docs

Documentation for ALIEN

cuda evolution physics-simulation simulation

Last synced: 24 Jun 2025

https://github.com/aresio/cupsoda

cupSODA is a CUDA-powered coarse-grained deterministic simulator of mass-action kinetics models

biochemical cuda gpu-computing mass-action simulation

Last synced: 21 Feb 2026

https://github.com/jonasricker/autocvd

Tool to automatically set CUDA_VISIBLE_DEVICES based on GPU utilization. Usable from command line and code.

cuda cuda-visible-devices gpu keras machine-learning nvidia python pytorch tensorflow

Last synced: 26 Feb 2026

https://github.com/NCAR/micm

A model-independent chemistry module for atmosphere models

atmospheric-chemistry atmospheric-modeling atmospheric-science cuda gpu gpu-acceleration hpc ode-solver

Last synced: 20 Jul 2025

https://github.com/nyo16/llama_cpp_ex

Elixir bindings for llama.cpp — run LLMs locally with Metal, CUDA, Vulkan, or CPU. Streaming, chat templates, embeddings, structured output, and concurrent batched inference.

cuda elixir llamacpp llm

Last synced: 04 Jun 2026

https://github.com/ashvardanian/scaling-democracy

GPU-accelerated Schulze voting method in Python, Numba, and CUDA, using ideas from Algebraic Graph Theory

cuda cuda-kernels dynamic-programming gpgpu graph-algorithms graph-theory pybind11 python voting

Last synced: 12 Apr 2025

https://github.com/yomi4486/zundamon_v3

Master, a shot of cold water, please.

cuda discord-bot discord-py docker docker-compose python tts voicevox zundamon

Last synced: 14 Apr 2025

https://github.com/pyhf/cuda-images

pyhf Docker images built on Nvidia Container Toolkit enabled base images

cuda jax nvidia nvidia-cuda nvidia-docker pyhf

Last synced: 15 Jul 2025

https://github.com/rapidsai/cugraph-docs

cuGraph Docs - RAPIDS Graph Analytics Documentation

cuda cugraph documentation graph rapids

Last synced: 12 Sep 2025

https://github.com/neoblizz/cudagl

CUDA based Graphics Library for NVIDIA's GPUs.

cuda graphics-library graphics-programming opengl

Last synced: 18 Jun 2025

https://github.com/lanzani/opencv-cuda-docker

Docker with opencv with cuda support.

cuda docker nvidia-docker nvidia-gpu opencv opencv-cuda opencv-dnn

Last synced: 12 Oct 2025

https://github.com/belval/raytracing

Using CUDA to implement "Raytracing in one weekend" by Peter Shirley

cuda raytracing raytracing-in-one-weekend

Last synced: 12 Apr 2025

https://github.com/cascadingradium/cuda-hungarian-clustering

A GPU-Accelerated Clustering Algorithm that uses the Hungarian method

clustering cpp cuda gpu hungarian-algorithm parallel-computing

Last synced: 16 May 2025

https://github.com/hbseong97/tf-c-api

Using tensorflow c api, c++ api, tf lite, tf js, model conversion in Windows

bazel checkpoint cuda cudnn tensorflow

Last synced: 09 Apr 2025

https://github.com/pinto0309/realsense-cuda-opengl-docker

RealSense execution environment built on a Docker container on Ubuntu 20.04. NVIDIA GPU and OpenGL capable. CUDA 11.4.

cuda docker opengl realsense realsense2 ubuntu wsl2

Last synced: 24 Mar 2025

https://github.com/jpuigcerver/nnutils

CPU & CUDA implementation of several neural network utils

cuda deep-learning neural-networks openmp pytorch

Last synced: 11 Apr 2025

https://github.com/sbaldu/neural_network_hep

Implementation of a neural network framework from scratch in C++ applied to particle physics

cpp cuda high-energy-physics neural-networks

Last synced: 20 Jul 2025

https://github.com/axnjr/snn_be_pro

A state of the art AI framework for no/low-code (visually - drag & drop) building, testing, deploying, integrating latest deep learning models with privacy & security compliance using ollama, as a final year project!

ai cplusplus cpp cuda deep-neural-networks kernel-driver ml mlops python

Last synced: 06 Oct 2025

https://github.com/pratikvn/schwarz-lib

Repository for testing asynchronous schwarz methods.

asynchronous cuda domain-decomposition ginkgo schwarz

Last synced: 14 Apr 2025

https://github.com/eunomia-bpf/basic-cuda-tutorial

A collection of CUDA programming examples to learn GPU programming

cuda tutorial

Last synced: 15 Jun 2025

https://github.com/gapi505/sparky-2

This is a Discord bot running on llama.cpp with the Llama 3 model and image generation

ai cuda llama3 llamacpp stable-diffusion torch transformers

Last synced: 07 Oct 2025

https://github.com/anantzoid/cuda-genetic-algorithm-travelling-salesman-problem

Implementation of Parallel Genetic Algorithm in CUDA to solve TSP (Berlin52)

c cuda genetic-algorithm tsp tsp-solver

Last synced: 25 Jul 2025

https://github.com/neural-bits/ai-programming-hub

Learn and experiment with new techniques and programming languages with a focus on ML

cpp cuda cython openai-triton python rust

Last synced: 12 Apr 2025

https://github.com/jmuwrobotics/libbicos

GPU-Accelerated Binary Correspondence Search for Multishot Stereo Vision

computer-vision cuda depth-map stereo-camera stereo-matching stereo-vision

Last synced: 14 Oct 2025

https://github.com/abaksy/cuda-examples

A repository of examples coded in CUDA C/C++

cuda

Last synced: 31 May 2026

https://github.com/tudasc/cusan

A data race detector for CUDA C and C++ based on ThreadSanitizer

c cpp cuda datarace threadsanitizer

Last synced: 12 Aug 2025

https://github.com/harrydobbs/torch_ransac3d

A high-performance implementation of 3D RANSAC (Random Sample Consensus) algorithm using PyTorch and CUDA.

3d cloud cubiod cuda cylinder plane plane-detection point point-cloud ransac segmentation

Last synced: 03 Oct 2025

https://github.com/coderonion/cuda-beginner-course-python-version

Companion code for the bilibili video series "Introduction to CUDA 12.x Parallel Programming (Python Edition)"

cpp cublas cuda cuda-programming cudnn cupy gpu gpu-programming nvcc nvidia parallel-programming python rust

Last synced: 19 Oct 2025

https://github.com/taeguk/dist-prog-assignment

Sogang Univ. Distributed Programming (CSE5414) Assignments.

assignment cuda distributed mpi-library openmp parallel pthreads sogang

Last synced: 13 Jun 2025

https://github.com/pkestene/tsp

traveling salesman problem solved with different programming models

cea cpp cuda kokkos nvidia-gpu openacc openmp performance-portability stdpar sycl

Last synced: 19 Aug 2025

https://github.com/rocm/hipmm

HIP Memory Manager (ROCm-DS)

amd cuda gpu hip memory-management radeon-instinct-mi-series rocm

Last synced: 12 Apr 2025

https://github.com/rogerallen/smandelbrotr

SDL2 CUDA OpenGL Mandelbrot explorer.

cuda mandelbrot-viewer opengl sdl2

Last synced: 08 Mar 2026

https://github.com/almirneeto99/leetgpu-challenges

This repository contains the solution for LeetGPU Challenges

cpp cuda gpu hpc

Last synced: 18 Apr 2026

https://github.com/rocm/numba-hip

HIP backend patch for Numba, the NumPy aware dynamic Python compiler using LLVM.

ai compiler cuda gpu hip hpc jit ml numba python radeon-instinct-mi-series rocm

Last synced: 31 Aug 2025

https://github.com/k-hengzhou/hphoto

An AI-based smart photo management tool supporting face recognition, automatic clustering of similar faces, and NSFW detection

cuda insightface nsfw nsfw-detection nudenet photos

Last synced: 26 Feb 2025

https://github.com/wzqvip/jetson-pytorch-builder

build PyTorch with CUDA for Jetson Orin and Thor.

cuda jetson pytorch

Last synced: 01 Dec 2025

https://github.com/fatlipp/cuda-tree

CUDA-based Tree builder

algorithms cpp cuda octree quadtree tree

Last synced: 19 Jun 2025

https://github.com/nikhilrout/thegemmcoreproject

SystemVerilog Implementation of Nvidia's CUDA/Tensor Core GEMM Operations

cuda floating-point gemm gpgpu hybrid-precision-training sparse-matrix systolic-array tensorcore tpu

Last synced: 17 Aug 2025

https://github.com/pietroglyph/argustag

A C++17 wrapper for Nvidia Argus with support for zero-copy frame transfers to CUDA kernels and CUDA-accelerated AprilTag detection with ISAAC (no ROS required).

apriltag argus cuda isaac nvargus

Last synced: 04 Oct 2025

https://github.com/andydevs/cudafractal

Fractal Generator using Nvidia's CUDA framework

cplusplus cuda nvidia

Last synced: 23 Apr 2025

https://github.com/skizzy-create/ayurvedic_his

🩺 A personalized app that serves as your personal Ayurvedic assistant, providing tailored advice and guidance based on Ayurvedic principles. 🩺

cuda gpt python pytorch transformers

Last synced: 04 Oct 2025

https://github.com/antoniopelusi/lu-solver

Assignments for High Performance Computing exam at Unimore, Modena, IT.

cuda lu-decomposition openmp

Last synced: 27 Feb 2026

https://github.com/lukoshkin/hpc

Skoltech HPC course

cuda curand hpc mpi omp

Last synced: 10 Apr 2025

https://github.com/timothystewart6/vllm-gb10

Bleeding edge vLLM Docker image for the NVIDIA DGX Spark (GB10 / sm_121a).

arm64 cuda dgx-spark docker gb10 inference llm nvidia pytorch vllm

Last synced: 26 Jun 2026

https://github.com/donpablonows/coin

🪙 Crypto Optimization Interface Network (aka COIN) is a high-performance Bitcoin address generator using CUDA acceleration and multi-threading. It optimizes GPU and CPU resources for fast address generation, ensures secure private key creation, and includes real-time monitoring and automatic system optimizations.

bitcoin blockchain cryptography cuda gpu-acceleration

Last synced: 07 May 2026

https://github.com/hope2333/tsac-ng

Multi-backend neural audio codec. CPU (AVX/AVX2/AVX-512, NEON/SVE, RVV), GPU (CUDA, HIP/ROCm, Vulkan), LLVM JIT. Clean-room implementation.

arm64 audio-codec avx c cuda dac hip llvm-jit neural-audio riscv simd vulkan

Last synced: 29 Jun 2026

https://github.com/sean-bradley/cudalookupripemd60

RipeMD160 Lookup using parallel processing on an NVIDIA CUDA graphics card

cuda parallel-processing ripemd160

Last synced: 05 May 2025

https://github.com/sean-bradley/cudalookupsha256

SHA256 Lookup using parallel processing on an NVIDIA CUDA-compatible graphics card

cuda parallel-processing sha256

Last synced: 05 May 2025

https://github.com/ellite/anchor-sub-sync

Anchor: A universal, hardware-accelerated CLI tool for subtitle synchronization (Whisper) and context-aware translation (NLLB)

ai audio-transcription automation cli cuda nllb python pytorch srt subtitle-sync subtitle-translation subtitles synchronization translation whisper

Last synced: 24 Feb 2026

https://github.com/ivanrs297/pycuda-covariance-matrix

A PyCUDA covariance matrix parallel implementation

covariance-matrix cuda pycuda

Last synced: 25 Oct 2025

https://github.com/appsolves/lanepilot

The world's first real-time AI-powered traffic management system, featuring automated vehicle detection, lane allocation optimization, and dynamic control for (autonomous) cars!

ai ai-traffic-management autonomous-driving computer-vision cuda edge-computing embedded-systems jetson-orin-nano-super lane-detection pytorch

Last synced: 29 Apr 2026

https://github.com/zeloe/rtconvolver

A realtime convolution VST3

c convolution cplusplus cuda juce

Last synced: 22 Apr 2025

https://github.com/rogerallen/qtmandelbrotr

Qt CUDA Mandelbrot explorer

cuda cuda-opengl mandelbrot-viewer qt5

Last synced: 02 Aug 2025

https://github.com/rupeshs/anomalydetection

Anomaly Detection Using Anomalib and OpenVINO – Step-by-Step Guide

anomalib anomaly anomalydetection computer-vision cpu cuda gpu intel onnx opencv pytorch

Last synced: 13 Apr 2025

https://github.com/guilt/rocm-programming-masterclass

Udemy's CUDA programming Masterclass with Examples in ROCM/HIP.

cuda easy hip learning-by-doing masterclass rocm

Last synced: 04 Aug 2025

https://github.com/willigarneau/astar-pathfinding

🗺📌 Implementation of the A* pathfinding algorithm with OpenCV and Cuda in C++ 💪

a-star algorithm axis-camera cuda detection implementation opencv pathfinding

Last synced: 14 Jul 2025

https://github.com/pmeier/tox-ltt

Install PyTorch distributions with light-the-torch

cuda install light-the-torch pip plugin pytorch tox

Last synced: 25 Aug 2025

https://github.com/kuroko1t/gocuda

Go binding for Cuda Driver API

cuda go golang

Last synced: 02 May 2026

https://github.com/luismisanve/gguf-to-pytorchtensor

Simple Python Script that converts the Weight of a GGUF Model to a PyTorch Tensor

cuda gguf-models huggingface llamacpp numpy python pytorch tensor

Last synced: 20 Apr 2026

https://github.com/yuvix25/py2cuda

Convert Python 3 code to CUDA code.

converter cuda gpu gpu-acceleration python python3

Last synced: 11 Sep 2025

https://github.com/enfiskutensykkel/cuda-rdma-bench

NVIDIA GPU direct RDMA using SISCI API

cuda dma gpudirect-rdma pcie rdma sisci

Last synced: 30 Mar 2025

https://github.com/webis-de/pytorch-window-matmul

a custom CUDA kernel for windowed matrix multiplication

cuda cuda-kernel pytorch

Last synced: 31 Oct 2025

https://github.com/gpuengineering/gputils

A C++ header-only library for parallel linear algebra on GPUs (CUDA/cuBLAS under the hood)

cplusplus-17 cplusplus-20 cpp cuda cuda-c cuda-cpp cuda-programming header-only linear-algebra

Last synced: 13 Aug 2025

https://github.com/coderonion/moblas

BLAS (Basic Linear Algebra Subprograms) library written in the Mojo programming language.

blas blis cublas cuda eigen fortran gemm gonum hpc lapack linear-algebra math mkl mojo numpy openblas pytorch scientific-computing simd tensor

Last synced: 15 Jun 2025

https://github.com/brosnanyuen/raybnn_raytrace

Ray tracing library using GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI

arrayfire cuda gpu gpu-computing opencl parallel parallel-computing ray ray-tracing raybnn raylib raytracer raytracing rust

Last synced: 26 Aug 2025

https://github.com/official-imvoiid/portable-miniconda-setup-for-window

Portable Miniconda Setup for Windows 🐍 Easily create a portable Conda environment with automated scripts for flexible Python version management and CUDA support. 🚀

conda conda-environment cuda datascience machinelearning nvidia nvidia-cuda portable python

Last synced: 16 Apr 2026

https://github.com/statikfintechllc/godcore

All-in-one local AI stack for Mistral-13B and Llama.cpp, with one-step CUDA wheel install, OpenAI-compatible API, and modern web dashboard. Switch between local and cloud chat, run on your own GPU, and deploy instantly—no API keys or paywalls. Designed for easy install, custom builds, and fast remote access. Enjoy!

ai chatbot chatgpt cuda dashboard fastapi llama-cpp llm local-ai mistral openai-compatible react selfhosted webui

Last synced: 25 Jun 2025

https://github.com/kareimgazer/mat-transpose-cuda

series of trials for optimizing matrix transpose with CUDA

cuda hpc matrix parallel-computing simd

Last synced: 29 Mar 2025

https://github.com/jaxony/pynvidia

⚙️ NVIDIA GPU utilities for Python 🔧

cuda deep-learning nvidia-gpu pip python utility

Last synced: 07 May 2025

https://github.com/jackeylea/cuda_linux

CUDA/Qt tutorial on Linux

cpp cuda cudnn qt5

Last synced: 26 Jul 2025

https://github.com/648trindade/sbac-pad-marathon-problems

Repository containing problems of the SBAC-PAD Marathon of Parallel Programming and some parallel solutions to them.

cuda high-performance-computing mpi openmp parallel-computing

Last synced: 01 May 2025

https://github.com/fynv/curandrtc

CURandRTC is a GPU random number generation module based on ThrustRTC.

cuda nvrtc random-number-generators thrust

Last synced: 05 May 2025

https://github.com/rfsantacruz/mycudasamples

This is a series of CUDA C++ programming samples developed to study CUDA technology and its parallel programming model.

cpp cuda gpgpu

Last synced: 13 Apr 2025

https://github.com/pnnl/cuvite

Multi-GPU Graph Community Detection using CUDA

community-detection cuda graph-clustering mpi

Last synced: 25 Jul 2025