An open API service indexing awesome lists of open source software.

CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.

https://github.com/lanl/stcuda

StCUDA allows Smalltalk to call CUDA Driver APIs to do GPU computing

cuda smalltalk visualworks

Last synced: 12 Apr 2025

https://github.com/phineas-pta/nvidia-win

NVIDIA’s deep learning stack on Windows: CUDA toolkit + cuDNN + TensorRT

cuda cudnn guide tensorrt tutorial windows

Last synced: 12 Apr 2025

https://github.com/chrxh/alien-docs

Documentation for ALIEN

cuda evolution physics-simulation simulation

Last synced: 24 Jun 2025

https://github.com/p-ranav/vulkan-earth

Vulkan-based 3D Rendering of Earth

3d cuda engine gpu rendering simulation vulkan

Last synced: 05 May 2025

https://github.com/koesie10/gpjson

GPU-based JSON data processing system accessible via all GraalVM languages

cuda gpu graalvm json jsonpath

Last synced: 20 Jun 2025

https://github.com/xiangronglin/grayscale-conversion

grayscale conversion optimized with OpenMP, SIMD and CUDA

cuda grayscale hpc openmp simd

Last synced: 23 Mar 2025

https://github.com/radenmuaz/slope-ad

A small automatic differentiation engine, supporting higher-order derivatives

array autograd automatic-differentiation cuda gradient iree jvp machine-learning metal mlir onnx onnxruntime tensor vjp

Last synced: 26 Jun 2025

https://github.com/giovaneiwamoto/cuda-shortest-paths

🧩 Cuda Shortest Paths - Parallel Dijkstra and Floyd algorithms using Nvidia CUDA to calculate All-Pairs Shortest Path (APSP) in a given graph represented by its adjacency matrix.

all-pairs-shortest-path cuda nvidia

Last synced: 29 Apr 2025
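
For readers unfamiliar with the all-pairs formulation named in the entry above, here is a minimal sequential sketch of the Floyd-Warshall recurrence on an adjacency matrix. It is a CPU reference in Python, not the repository's CUDA code; the function name `floyd_warshall` and the use of `math.inf` for missing edges are assumptions made for illustration, and the graph is assumed to have no negative cycles.

```python
import math

INF = math.inf  # marks "no edge" in the adjacency matrix (illustrative convention)


def floyd_warshall(adj):
    """Return the all-pairs shortest-path distances for a dense adjacency matrix.

    adj[i][j] is the weight of the edge i -> j, or INF when there is no edge.
    Assumes the graph has no negative cycles.
    """
    n = len(adj)
    dist = [row[:] for row in adj]  # copy so the caller's matrix is untouched
    for i in range(n):
        dist[i][i] = min(dist[i][i], 0)  # a vertex reaches itself at cost 0

    for k in range(n):  # outer loop is sequential: step k builds on step k-1
        row_k = dist[k]
        for i in range(n):  # for a fixed k, the (i, j) updates are independent
            d_ik = dist[i][k]
            if d_ik == INF:
                continue  # no path i -> k, so nothing can improve via k
            row_i = dist[i]
            for j in range(n):
                candidate = d_ik + row_k[j]
                if candidate < row_i[j]:
                    row_i[j] = candidate
    return dist
```

Only the outer `k` loop carries a dependency. The inner `(i, j)` updates for a fixed `k` are the part a GPU implementation would typically spread across threads.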

https://github.com/abus-aikorea/aria-coversong

The best gradio web-ui for creating cover songs using mdx-net and rvc. Easy one-click installation. Fully portable.

cuda demucs gradio karaoke mdx-net nvidia python pytorch rvc song-covers uvr vocal-remover voice-conversion

Last synced: 25 Apr 2025

https://github.com/jonasricker/autocvd

Tool to automatically set CUDA_VISIBLE_DEVICES based on GPU utilization. Usable from command line and code.

cuda cuda-visible-devices gpu keras machine-learning nvidia python pytorch tensorflow

Last synced: 26 Feb 2026
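
The idea in the entry above can be sketched as follows: choose the least-busy GPUs, then write their indices to the `CUDA_VISIBLE_DEVICES` environment variable. This is a hedged illustration, not autocvd's API. The helper names `pick_least_utilized` and `set_visible_devices`, the `max_utilization` threshold, and the utilization mapping passed in are all hypothetical.

```python
import os


def pick_least_utilized(utilization, count=1, max_utilization=50):
    """Return indices of up to `count` GPUs with the lowest utilization.

    `utilization` maps GPU index -> utilization percentage. This input is
    hypothetical; in practice the numbers would come from a monitoring tool.
    GPUs busier than `max_utilization` percent are skipped.
    """
    candidates = [
        (util, idx) for idx, util in utilization.items() if util <= max_utilization
    ]
    candidates.sort()  # lowest utilization first, ties broken by GPU index
    return [idx for _, idx in candidates[:count]]


def set_visible_devices(indices, environ=os.environ):
    """Write the chosen indices to CUDA_VISIBLE_DEVICES as a comma-separated list."""
    value = ",".join(str(i) for i in indices)
    environ["CUDA_VISIBLE_DEVICES"] = value
    return value
```

The variable only takes effect if it is set before the CUDA runtime is initialized in the process, so a helper like this would need to run before the first framework call that touches the GPU.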

https://github.com/torinos-yt/nnonnx

Using CUDA for Faster Machine Learning Inference on Unity

cuda machine-learning onnxruntime unity

Last synced: 09 Jul 2025

https://github.com/anantzoid/cuda-genetic-algorithm-travelling-salesman-problem

Implementation of Parallel Genetic Algorithm in CUDA to solve TSP (Berlin52)

c cuda genetic-algorithm tsp tsp-solver

Last synced: 25 Jul 2025

https://github.com/belval/raytracing

Using CUDA to implement "Raytracing in one weekend" by Peter Shirley

cuda raytracing raytracing-in-one-weekend

Last synced: 12 Apr 2025

https://github.com/neural-bits/ai-programming-hub

Learn and experiment with new techniques and programming languages with a focus on ML

cpp cuda cython openai-triton python rust

Last synced: 12 Apr 2025

https://github.com/abaksy/cuda-examples

A repository of examples coded in CUDA C/C++

cuda

Last synced: 31 May 2026

https://github.com/antoniopelusi/lu-solver

Assignments for High Performance Computing exam at Unimore, Modena, IT.

cuda lu-decomposition openmp

Last synced: 27 Feb 2026

https://github.com/rocm/numba-hip

HIP backend patch for Numba, the NumPy aware dynamic Python compiler using LLVM.

ai compiler cuda gpu hip hpc jit ml numba python radeon-instinct-mi-series rocm

Last synced: 31 Aug 2025

https://github.com/k-hengzhou/hphoto

An AI-based smart photo management tool that supports face recognition, automatic clustering of similar faces, and NSFW detection

cuda insightface nsfw nsfw-detection nudenet photos

Last synced: 26 Feb 2025

https://github.com/pratikvn/schwarz-lib

Repository for testing asynchronous schwarz methods.

asynchronous cuda domain-decomposition ginkgo schwarz

Last synced: 14 Apr 2025

https://github.com/pkestene/tsp

traveling salesman problem solved with different programming models

cea cpp cuda kokkos nvidia-gpu openacc openmp performance-portability stdpar sycl

Last synced: 19 Aug 2025

https://github.com/rapidsai/cugraph-docs

cuGraph Docs - RAPIDS Graph Analytics Documentation

cuda cugraph documentation graph rapids

Last synced: 12 Sep 2025

https://github.com/gapi505/sparky-2

This is a Discord bot running on llama.cpp with the Llama 3 model and image generation

ai cuda llama3 llamacpp stable-diffusion torch transformers

Last synced: 07 Oct 2025

https://github.com/jpuigcerver/nnutils

CPU & CUDA implementation of several neural network utils

cuda deep-learning neural-networks openmp pytorch

Last synced: 11 Apr 2025

https://github.com/almirneeto99/leetgpu-challenges

This repository contains the solution for LeetGPU Challenges

cpp cuda gpu hpc

Last synced: 18 Apr 2026

https://github.com/axnjr/snn_be_pro

A state of the art AI framework for no/low-code (visually - drag & drop) building, testing, deploying, integrating latest deep learning models with privacy & security compliance using ollama, as a final year project!

ai cplusplus cpp cuda deep-neural-networks kernel-driver ml mlops python

Last synced: 06 Oct 2025

https://github.com/nikhilrout/thegemmcoreproject

SystemVerilog Implementation of Nvidia's CUDA/Tensor Core GEMM Operations

cuda floating-point gemm gpgpu hybrid-precision-training sparse-matrix systolic-array tensorcore tpu

Last synced: 17 Aug 2025

https://github.com/pyhf/cuda-images

pyhf Docker images built on Nvidia Container Toolkit enabled base images

cuda jax nvidia nvidia-cuda nvidia-docker pyhf

Last synced: 15 Jul 2025

https://github.com/jmuwrobotics/libbicos

GPU-Accelerated Binary Correspondence Search for Multishot Stereo Vision

computer-vision cuda depth-map stereo-camera stereo-matching stereo-vision

Last synced: 14 Oct 2025

https://github.com/neoblizz/cudagl

CUDA based Graphics Library for NVIDIA's GPUs.

cuda graphics-library graphics-programming opengl

Last synced: 18 Jun 2025

https://github.com/tudasc/cusan

A data race detector for CUDA C and C++ based on ThreadSanitizer

c cpp cuda datarace threadsanitizer

Last synced: 12 Aug 2025

https://github.com/pietroglyph/argustag

A C++17 wrapper for Nvidia Argus with support for zero-copy frame transfers to CUDA kernels and CUDA-accelerated AprilTag detection with ISAAC (no ROS required).

apriltag argus cuda isaac nvargus

Last synced: 04 Oct 2025

https://github.com/skizzy-create/ayurvedic_his

🩺 A personalized app that serves as your personal Ayurvedic assistant, providing tailored advice and guidance based on Ayurvedic principles. 🩺

cuda gpt python pytorch transformers

Last synced: 04 Oct 2025

https://github.com/pinto0309/realsense-cuda-opengl-docker

RealSense execution environment built on a Docker container on Ubuntu 20.04. NVIDIA GPU and OpenGL capable. CUDA 11.4.

cuda docker opengl realsense realsense2 ubuntu wsl2

Last synced: 24 Mar 2025

https://github.com/lukoshkin/hpc

Skoltech HPC course

cuda curand hpc mpi omp

Last synced: 10 Apr 2025

https://github.com/harrydobbs/torch_ransac3d

A high-performance implementation of 3D RANSAC (Random Sample Consensus) algorithm using PyTorch and CUDA.

3d cloud cubiod cuda cylinder plane plane-detection point point-cloud ransac segmentation

Last synced: 03 Oct 2025

https://github.com/cascadingradium/cuda-hungarian-clustering

A GPU-Accelerated Clustering Algorithm that uses the Hungarian method

clustering cpp cuda gpu hungarian-algorithm parallel-computing

Last synced: 16 May 2025

https://github.com/andydevs/cudafractal

Fractal Generator using Nvidia's CUDA framework

cplusplus cuda nvidia

Last synced: 23 Apr 2025

https://github.com/eunomia-bpf/basic-cuda-tutorial

A collection of CUDA programming examples to learn GPU programming

cuda tutorial

Last synced: 15 Jun 2025

https://github.com/hbseong97/tf-c-api

Using tensorflow c api, c++ api, tf lite, tf js, model conversion in Windows

bazel checkpoint cuda cudnn tensorflow

Last synced: 09 Apr 2025

https://github.com/fatlipp/cuda-tree

CUDA-based Tree builder

algorithms cpp cuda octree quadtree tree

Last synced: 19 Jun 2025

https://github.com/taeguk/dist-prog-assignment

Sogang Univ. Distributed Programming (CSE5414) Assignments.

assignment cuda distributed mpi-library openmp parallel pthreads sogang

Last synced: 13 Jun 2025

https://github.com/ROCm/hipMM

HIP Memory Manager (ROCm-DS)

amd cuda gpu hip memory-management radeon-instinct-mi-series rocm

Last synced: 12 Apr 2025

https://github.com/coderonion/cuda-beginner-course-python-version

Companion code for the bilibili video series "Introduction to CUDA 12.x Parallel Programming (Python Edition)"

cpp cublas cuda cuda-programming cudnn cupy gpu gpu-programming nvcc nvidia parallel-programming python rust

Last synced: 19 Oct 2025

https://github.com/lanzani/opencv-cuda-docker

Docker with opencv with cuda support.

cuda docker nvidia-docker nvidia-gpu opencv opencv-cuda opencv-dnn

Last synced: 12 Oct 2025

https://github.com/rogerallen/smandelbrotr

SDL2 CUDA OpenGL Mandelbrot explorer.

cuda mandelbrot-viewer opengl sdl2

Last synced: 08 Mar 2026

https://github.com/wzqvip/jetson-pytorch-builder

build PyTorch with CUDA for Jetson Orin and Thor.

cuda jetson pytorch

Last synced: 01 Dec 2025

https://github.com/sbaldu/neural_network_hep

Implementation of a neural network framework from scratch in C++ applied to particle physics

cpp cuda high-energy-physics neural-networks

Last synced: 20 Jul 2025

https://github.com/648trindade/sbac-pad-marathon-problems

Repository containing problems of the SBAC-PAD Marathon of Parallel Programming and some parallel solutions to them.

cuda high-performance-computing mpi openmp parallel-computing

Last synced: 01 May 2025

https://github.com/matthewhaynesonline/ai-server-setup

Setup AWS EC2 instance from scratch with NVIDIA CUDA, Docker, Packer for AI / ML.

ai ami aws cuda devops docker ml mlops packer

Last synced: 12 Apr 2025

https://github.com/luismisanve/gguf-to-pytorchtensor

Simple Python Script that converts the Weight of a GGUF Model to a PyTorch Tensor

cuda gguf-models huggingface llamacpp numpy python pytorch tensor

Last synced: 20 Apr 2026

https://github.com/cascadingradium/air-traffic-distribution

A GPU-Accelerated Multi-Objective Genetic Algorithm for Air Traffic Management

air-traffic-control air-traffic-management c cuda genetic-algorithm gpu-acceleration

Last synced: 16 May 2025

https://github.com/yuvix25/py2cuda

Convert Python 3 code to CUDA code.

converter cuda gpu gpu-acceleration python python3

Last synced: 11 Sep 2025

https://github.com/guilt/rocm-programming-masterclass

Udemy's CUDA programming Masterclass with Examples in ROCM/HIP.

cuda easy hip learning-by-doing masterclass rocm

Last synced: 04 Aug 2025

https://github.com/appsolves/lanepilot

The world's first real-time AI-powered traffic management system, featuring automated vehicle detection, lane allocation optimization, and dynamic control for (autonomous) cars!

ai ai-traffic-management autonomous-driving computer-vision cuda edge-computing embedded-systems jetson-orin-nano-super lane-detection pytorch

Last synced: 29 Apr 2026

https://github.com/rupeshs/anomalydetection

Anomaly Detection Using Anomalib and OpenVINO – Step-by-Step Guide

anomalib anomaly anomalydetection computer-vision cpu cuda gpu intel onnx opencv pytorch

Last synced: 13 Apr 2025

https://github.com/caps-umu/fideslib

A server-side CKKS GPU library fully interoperable with OpenFHE.

ckks cuda gpu homomorphic-encryption openfhe

Last synced: 08 Oct 2025

https://github.com/skailasa/pyrsvd

Accelerated Randomised SVD in Python

cuda numba python3 randomised-algorithms svd

Last synced: 07 May 2025

https://github.com/shunk031/nvinfo-go

Rewrite of ikr7/nvinfo, a simple utility for monitoring your CUDA-enabled GPUs, with Golang

cli cuda go golang gpu nvidia nvidia-smi

Last synced: 02 Apr 2025

https://github.com/hyeonsangjeon/pdf2llm-tuning-studio

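A solution that automatically generates high-quality question-answering (QA) data from PDF documents using GPU-accelerated processing and efficiently fine-tunes LLMs. It generates domain-specific QA pairs with the Unstructured library and AWS Bedrock Claude, and trains lightweight models with the LoRA technique.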

aws bedrock claude cuda data-argumantation data-extraction distillation docker finetuning gpu llm pdf-generation pdf-text-extraction processing processing-job sagemaker text-disti unsloth unstructured

Last synced: 15 Jun 2025

https://github.com/pnnl/cuvite

Multi-GPU Graph Community Detection using CUDA

community-detection cuda graph-clustering mpi

Last synced: 25 Jul 2025

https://github.com/zeloe/rtconvolver

A realtime convolution VST3

c convolution cplusplus cuda juce

Last synced: 22 Apr 2025

https://github.com/ergus/gpukalmanfilter

Kalman Filter test code using C, C++, Cuda and OpenCL.

cpp cuda gpgpu kalman-filter makefile opencl performance vectorization

Last synced: 28 Oct 2025

https://github.com/kuroko1t/gocuda

Go binding for Cuda Driver API

cuda go golang

Last synced: 02 May 2026

https://github.com/rogerallen/qtmandelbrotr

Qt CUDA Mandelbrot explorer

cuda cuda-opengl mandelbrot-viewer qt5

Last synced: 02 Aug 2025

https://github.com/ellite/anchor-sub-sync

Anchor: A universal, hardware-accelerated CLI tool for subtitle synchronization (Whisper) and context-aware translation (NLLB)

ai audio-transcription automation cli cuda nllb python pytorch srt subtitle-sync subtitle-translation subtitles synchronization translation whisper

Last synced: 24 Feb 2026

https://github.com/brosnanyuen/raybnn_raytrace

Ray tracing library using GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI

arrayfire cuda gpu gpu-computing opencl parallel parallel-computing ray ray-tracing raybnn raylib raytracer raytracing rust

Last synced: 26 Aug 2025

https://github.com/official-imvoiid/portable-miniconda-setup-for-window

Portable Miniconda Setup for Windows 🐍 Easily create a portable Conda environment with automated scripts for flexible Python version management and CUDA support. 🚀

conda conda-environment cuda datascience machinelearning nvidia nvidia-cuda portable python

Last synced: 16 Apr 2026

https://github.com/vorticity-inc/vtensor

VTensor, a C++ library, facilitates tensor manipulation on GPUs, emulating the python-numpy style for ease of use. It leverages RMM (RAPIDS Memory Manager) for efficient device memory management. It also supports xtensor for host memory operations.

cublas cuda curand cusolver gpu numpy rmm tensor xarray xtensor

Last synced: 14 Apr 2025

https://github.com/amirhoseinmasoumi/onnx-cuda-inference

A C++ project for running CUDA-accelerated ONNX model inference, using ONNX Runtime and OpenCV for image segmentation tasks.

cpp cuda inference onnxruntime onnxruntime-gpu opencv segmentation

Last synced: 12 Apr 2025

https://github.com/codingonion/moblas

BLAS (Basic Linear Algebra Subprograms) library written in mojo programming language.

blas blis cublas cuda eigen fortran gemm gonum hpc lapack linear-algebra math mkl mojo numpy openblas pytorch scientific-computing simd tensor

Last synced: 25 Apr 2025

https://github.com/ivanrs297/pycuda-covariance-matrix

A PyCUDA covariance matrix parallel implementation

covariance-matrix cuda pycuda

Last synced: 25 Oct 2025

https://github.com/pkestene/cuda_mpi_autotools_proj_template

A template project for CUDA+MPI with autotools build system

automake autotools cuda cuda-mpi mpi

Last synced: 25 Oct 2025

https://github.com/sean-bradley/cudalookupsha256

SHA256 Lookup using parallel processing on an NVIDIA CUDA-compatible graphics card

cuda parallel-processing sha256

Last synced: 05 May 2025

https://github.com/sean-bradley/cudalookupripemd60

RipeMD160 Lookup using parallel processing on an NVIDIA CUDA graphics card

cuda parallel-processing ripemd160

Last synced: 05 May 2025

https://github.com/statikfintechllc/godcore

All-in-one local AI stack for Mistral-13B and Llama.cpp, with one-step CUDA wheel install, OpenAI-compatible API, and modern web dashboard. Switch between local and cloud chat, run on your own GPU, and deploy instantly—no API keys or paywalls. Designed for easy install, custom builds, and fast remote access. Enjoy!

ai chatbot chatgpt cuda dashboard fastapi llama-cpp llm local-ai mistral openai-compatible react selfhosted webui

Last synced: 25 Jun 2025

https://github.com/jaxony/pynvidia

⚙️ NVIDIA GPU utilities for Python 🔧

cuda deep-learning nvidia-gpu pip python utility

Last synced: 07 May 2025

https://github.com/aespinosadev/opengl-renderer

OpenGL renderer showcasing all basic functionality to render 3D scenes.

computer-graphics cuda gpgpu graphics-engine graphics-programming opengl rendering rendering-3d-graphics shaders video-game

Last synced: 24 Jul 2025

https://github.com/enfiskutensykkel/cuda-rdma-bench

NVIDIA GPU direct RDMA using SISCI API

cuda dma gpudirect-rdma pcie rdma sisci

Last synced: 30 Mar 2025

https://github.com/bokutotu/curs

cuda&cublas&cudnn wrapper for Rust

cuda deep-learning high-performance-computing hpc rust

Last synced: 20 May 2026

https://github.com/chiang-yuan/culsm

CUDA C++ code implementing GPU-accelerated Lattice Spring Model (CuLSM) simulations.

cuda gpu parallel-computing particles

Last synced: 07 Sep 2025

https://github.com/coderonion/moblas

BLAS (Basic Linear Algebra Subprograms) library written in mojo programming language.

blas blis cublas cuda eigen fortran gemm gonum hpc lapack linear-algebra math mkl mojo numpy openblas pytorch scientific-computing simd tensor

Last synced: 15 Jun 2025