CUDA
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general-purpose computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
- GitHub: https://github.com/topics/cuda
- Wikipedia: https://en.wikipedia.org/wiki/CUDA
- Created by: Nvidia
- Released: June 23, 2007
- Related Topics: nvcc
- Last updated: 2026-06-20 00:07:16 UTC
https://github.com/lanl/stcuda
StCUDA allows Smalltalk to call CUDA Driver APIs to do GPU computing
Last synced: 12 Apr 2025
https://github.com/chrxh/alien-docs
Documentation for ALIEN
cuda evolution physics-simulation simulation
Last synced: 24 Jun 2025
https://github.com/stonerlab/jams
JAMS: a GPU accelerated atomistic spin dynamics code
c-plus-plus cuda heisenberg-model leeds-university magnetism physics-simulation simulation spin-dynamics
Last synced: 28 Feb 2026
https://github.com/basemax/predictionwikipediamathematicsvisitsresearch
Improving Prediction of Daily Visits of Wikipedia Mathematics Topics using Graph Neural Networks
article convgru cuda gconvgru graph-neural-network graph-neural-networks neural-network neural-network-architectures neural-network-example neural-network-tutorials neural-networks python pytorch research torch wikipedia
Last synced: 05 May 2025
https://github.com/p-ranav/vulkan-earth
Vulkan-based 3D Rendering of Earth
3d cuda engine gpu rendering simulation vulkan
Last synced: 05 May 2025
https://github.com/radenmuaz/slope-ad
A small automatic differentiation engine, supporting higher-order derivatives
array autograd automatic-differentiation cuda gradient iree jvp machine-learning metal mlir onnx onnxruntime tensor vjp
Last synced: 26 Jun 2025
https://github.com/giovaneiwamoto/cuda-shortest-paths
🧩 Cuda Shortest Paths - Parallel Dijkstra and Floyd algorithms using Nvidia CUDA to calculate All-Pairs Shortest Path (APSP) in a given graph represented by its adjacency matrix.
all-pairs-shortest-path cuda nvidia
Last synced: 29 Apr 2025
https://github.com/abus-aikorea/aria-coversong
The best Gradio web UI for creating cover songs using MDX-Net and RVC. Easy one-click installation. Fully portable.
cuda demucs gradio karaoke mdx-net nvidia python pytorch rvc song-covers uvr vocal-remover voice-conversion
Last synced: 25 Apr 2025
https://github.com/jonasricker/autocvd
Tool to automatically set CUDA_VISIBLE_DEVICES based on GPU utilization. Usable from command line and code.
cuda cuda-visible-devices gpu keras machine-learning nvidia python pytorch tensorflow
Last synced: 26 Feb 2026
https://github.com/torinos-yt/nnonnx
Using CUDA for Faster Machine Learning Inference on Unity
cuda machine-learning onnxruntime unity
Last synced: 09 Jul 2025
https://github.com/anantzoid/cuda-genetic-algorithm-travelling-salesman-problem
Implementation of Parallel Genetic Algorithm in CUDA to solve TSP (Berlin52)
c cuda genetic-algorithm tsp tsp-solver
Last synced: 25 Jul 2025
https://github.com/belval/raytracing
Using CUDA to implement "Raytracing in one weekend" by Peter Shirley
cuda raytracing raytracing-in-one-weekend
Last synced: 12 Apr 2025
https://github.com/ehwan/r-star-tree
Header-only, STL-like, templated N-dimensional R*-tree implementation in C++14
algorithm cplusplus-14 cuda eigen3 geometric-algorithms gpgpu header-only linear-algebra modern-cpp opencl rtree spatial spatial-index stl-like template traits tree-structure
Last synced: 09 Oct 2025
https://github.com/neural-bits/ai-programming-hub
Learn and experiment with new techniques and programming languages with a focus on ML
cpp cuda cython openai-triton python rust
Last synced: 12 Apr 2025
https://github.com/abaksy/cuda-examples
A repository of examples coded in CUDA C/C++
Last synced: 31 May 2026
https://github.com/mr-technologies/crowsnest
MRTech IFF SDK web interface sample
camera cuda demosaicing dng frontend genicam gpu h264 image-processing jetson json low-latency machine-vision mipi rest-api rtsp tiff vulkan webrtc ximea
Last synced: 06 Sep 2025
https://github.com/antoniopelusi/lu-solver
Assignments for the High Performance Computing exam at Unimore, Modena, IT.
Last synced: 27 Feb 2026
https://github.com/k-hengzhou/hphoto
An AI-based smart photo management tool that supports face recognition, automatic clustering of similar faces, and NSFW detection
cuda insightface nsfw nsfw-detection nudenet photos
Last synced: 26 Feb 2025
https://github.com/pratikvn/schwarz-lib
Repository for testing asynchronous schwarz methods.
asynchronous cuda domain-decomposition ginkgo schwarz
Last synced: 14 Apr 2025
https://github.com/pkestene/tsp
Traveling salesman problem solved with different programming models
cea cpp cuda kokkos nvidia-gpu openacc openmp performance-portability stdpar sycl
Last synced: 19 Aug 2025
https://github.com/rapidsai/cugraph-docs
cuGraph Docs - RAPIDS Graph Analytics Documentation
cuda cugraph documentation graph rapids
Last synced: 12 Sep 2025
https://github.com/gapi505/sparky-2
A Discord bot running on llama.cpp with the Llama 3 model and image generation
ai cuda llama3 llamacpp stable-diffusion torch transformers
Last synced: 07 Oct 2025
https://github.com/jpuigcerver/nnutils
CPU & CUDA implementation of several neural network utils
cuda deep-learning neural-networks openmp pytorch
Last synced: 11 Apr 2025
https://github.com/almirneeto99/leetgpu-challenges
This repository contains the solution for LeetGPU Challenges
Last synced: 18 Apr 2026
https://github.com/axnjr/snn_be_pro
A state-of-the-art AI framework for no/low-code (visual drag-and-drop) building, testing, deploying, and integrating the latest deep learning models with privacy and security compliance using Ollama, built as a final-year project.
ai cplusplus cpp cuda deep-neural-networks kernel-driver ml mlops python
Last synced: 06 Oct 2025
https://github.com/jasmcaus/hazel
A Tensor Library written in C++.
artificial-intelligence autodiff autograd automatic-differentiation computing cpp cuda deep-learning differentiation gpu hazel-lang ml neural neural-network python pytorch scientific-computing tensor tensor-library
Last synced: 26 Apr 2025
https://github.com/nikhilrout/thegemmcoreproject
SystemVerilog Implementation of Nvidia's CUDA/Tensor Core GEMM Operations
cuda floating-point gemm gpgpu hybrid-precision-training sparse-matrix systolic-array tensorcore tpu
Last synced: 17 Aug 2025
https://github.com/pyhf/cuda-images
pyhf Docker images built on Nvidia Container Toolkit enabled base images
cuda jax nvidia nvidia-cuda nvidia-docker pyhf
Last synced: 15 Jul 2025
https://github.com/jmuwrobotics/libbicos
GPU-Accelerated Binary Correspondence Search for Multishot Stereo Vision
computer-vision cuda depth-map stereo-camera stereo-matching stereo-vision
Last synced: 14 Oct 2025
https://github.com/neoblizz/cudagl
CUDA based Graphics Library for NVIDIA's GPUs.
cuda graphics-library graphics-programming opengl
Last synced: 18 Jun 2025
https://github.com/tudasc/cusan
A data race detector for CUDA C and C++ based on ThreadSanitizer
c cpp cuda datarace threadsanitizer
Last synced: 12 Aug 2025
https://github.com/sammwyy/cuda.js
CUDA bindings for Node.js
bindings bun bunjs cuda cuda-kernels cuda-library javascript library nodejs nvidia typescript
Last synced: 06 Oct 2025
https://github.com/skizzy-create/ayurvedic_his
🩺 A personalized app that serves as your personal Ayurvedic assistant, providing tailored advice and guidance based on Ayurvedic principles. 🩺
cuda gpt python pytorch transformers
Last synced: 04 Oct 2025
https://github.com/coderonion/cuda-beginner-course-rust-version
Companion code for the bilibili video series "Introduction to CUDA 12.x Parallel Programming (Rust Edition)"
candle cpp cublas cuda cuda-programming cudarc cudnn gpu gpu-programming nvcc nvidia parellel-programming python rust
Last synced: 15 Jun 2025
https://github.com/pinto0309/realsense-cuda-opengl-docker
RealSense execution environment built on a Docker container on Ubuntu 20.04. NVIDIA GPU and OpenGL capable. CUDA 11.4.
cuda docker opengl realsense realsense2 ubuntu wsl2
Last synced: 24 Mar 2025
https://github.com/harrydobbs/torch_ransac3d
A high-performance implementation of 3D RANSAC (Random Sample Consensus) algorithm using PyTorch and CUDA.
3d cloud cubiod cuda cylinder plane plane-detection point point-cloud ransac segmentation
Last synced: 03 Oct 2025
https://github.com/aryagxr/cuda
100 Days of CUDA!!!
cuda gpu-programming kernels parallel-programming
Last synced: 05 Oct 2025
https://github.com/cascadingradium/cuda-hungarian-clustering
A GPU-Accelerated Clustering Algorithm that uses the Hungarian method
clustering cpp cuda gpu hungarian-algorithm parallel-computing
Last synced: 16 May 2025
https://github.com/nguyenphuminh/planckgpt
Train a GPT from scratch on your laptop
ai attention cuda deep-learning dl gpt gpu language-model llm machine-learning ml nlp torch transformer
Last synced: 16 May 2026
https://github.com/pedro-avalos/gpu-burn-snap
Unofficial snap for GPU Burn
cuda gpu gpu-burn linux package snap snapcraft stress-test stress-testing
Last synced: 23 Feb 2026
https://github.com/andydevs/cudafractal
Fractal Generator using Nvidia's CUDA framework
Last synced: 23 Apr 2025
https://github.com/eunomia-bpf/basic-cuda-tutorial
A collection of CUDA programming examples to learn GPU programming
Last synced: 15 Jun 2025
https://github.com/hbseong97/tf-c-api
Using the TensorFlow C API, C++ API, TF Lite, TF.js, and model conversion on Windows
bazel checkpoint cuda cudnn tensorflow
Last synced: 09 Apr 2025
https://github.com/fatlipp/cuda-tree
CUDA-based Tree builder
algorithms cpp cuda octree quadtree tree
Last synced: 19 Jun 2025
https://github.com/taeguk/dist-prog-assignment
Sogang Univ. Distributed Programming (CSE5414) Assignments.
assignment cuda distributed mpi-library openmp parallel pthreads sogang
Last synced: 13 Jun 2025
https://github.com/ROCm/hipMM
HIP Memory Manager (ROCm-DS)
amd cuda gpu hip memory-management radeon-instinct-mi-series rocm
Last synced: 12 Apr 2025
https://github.com/coderonion/cuda-beginner-course-python-version
Companion code for the bilibili video series "Introduction to CUDA 12.x Parallel Programming (Python Edition)"
cpp cublas cuda cuda-programming cudnn cupy gpu gpu-programming nvcc nvidia parallel-programming python rust
Last synced: 19 Oct 2025
https://github.com/lanzani/opencv-cuda-docker
Docker with OpenCV and CUDA support.
cuda docker nvidia-docker nvidia-gpu opencv opencv-cuda opencv-dnn
Last synced: 12 Oct 2025
https://github.com/rogerallen/smandelbrotr
SDL2 CUDA OpenGL Mandelbrot explorer.
cuda mandelbrot-viewer opengl sdl2
Last synced: 08 Mar 2026
https://github.com/wzqvip/jetson-pytorch-builder
Build PyTorch with CUDA for Jetson Orin and Thor.
Last synced: 01 Dec 2025
https://github.com/sbaldu/neural_network_hep
Implementation of a neural network framework from scratch in C++ applied to particle physics
cpp cuda high-energy-physics neural-networks
Last synced: 20 Jul 2025
https://github.com/648trindade/sbac-pad-marathon-problems
Repository containing problems of the SBAC-PAD Marathon of Parallel Programming and some parallel solutions to them.
cuda high-performance-computing mpi openmp parallel-computing
Last synced: 01 May 2025
https://github.com/mr-technologies/iff
MRTech IFF SDK documentation
camera cuda demosaicing dng genicam gpu h264 h265 image-processing jetson json low-latency machine-vision mipi rest-api rtsp sdk tiff vulkan ximea
Last synced: 11 Apr 2025
https://github.com/luismisanve/gguf-to-pytorchtensor
Simple Python Script that converts the Weight of a GGUF Model to a PyTorch Tensor
cuda gguf-models huggingface llamacpp numpy python pytorch tensor
Last synced: 20 Apr 2026
https://github.com/mchatzakis/daisy
The DaiSy Library for Fast and Exact, Data Series and Vector Similarity Search
cuda data-series disk-based distributed-systems dynamic-time-warping euclidean-distance exact-searching gpu-acceleration in-memory-computing mpi pybind11 similarity-search time-series
Last synced: 01 Apr 2026
https://github.com/cascadingradium/air-traffic-distribution
A GPU-Accelerated Multi-Objective Genetic Algorithm for Air Traffic Management
air-traffic-control air-traffic-management c cuda genetic-algorithm gpu-acceleration
Last synced: 16 May 2025
https://github.com/yuvix25/py2cuda
Convert Python 3 code to CUDA code.
converter cuda gpu gpu-acceleration python python3
Last synced: 11 Sep 2025
https://github.com/guilt/rocm-programming-masterclass
Udemy's CUDA Programming Masterclass with examples in ROCm/HIP.
cuda easy hip learning-by-doing masterclass rocm
Last synced: 04 Aug 2025
https://github.com/appsolves/lanepilot
The world's first real-time AI-powered traffic management system, featuring automated vehicle detection, lane allocation optimization, and dynamic control for (autonomous) cars!
ai ai-traffic-management autonomous-driving computer-vision cuda edge-computing embedded-systems jetson-orin-nano-super lane-detection pytorch
Last synced: 29 Apr 2026
https://github.com/mr-technologies/imagebrokerpy
Example of image export from MRTech IFF Python SDK
camera cuda demosaicing dng genicam gpu h264 h265 image-processing jetson json low-latency machine-vision mipi opencv python rest-api rtsp tiff vulkan
Last synced: 12 Apr 2025
https://github.com/rupeshs/anomalydetection
Anomaly Detection Using Anomalib and OpenVINO – Step-by-Step Guide
anomalib anomaly anomalydetection computer-vision cpu cuda gpu intel onnx opencv pytorch
Last synced: 13 Apr 2025
https://github.com/caps-umu/fideslib
A server-side CKKS GPU library fully interoperable with OpenFHE.
ckks cuda gpu homomorphic-encryption openfhe
Last synced: 08 Oct 2025
https://github.com/skailasa/pyrsvd
Accelerated Randomised SVD in Python
cuda numba python3 randomised-algorithms svd
Last synced: 07 May 2025
https://github.com/shunk031/nvinfo-go
Rewrite of ikr7/nvinfo, a simple utility for monitoring your CUDA-enabled GPUs, with Golang
cli cuda go golang gpu nvidia nvidia-smi
Last synced: 02 Apr 2025
https://github.com/hyeonsangjeon/pdf2llm-tuning-studio
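A solution that automatically generates high-quality question-answer (QA) data from PDF documents using GPU-accelerated processing and fine-tunes LLMs efficiently. It generates domain-specific QA pairs with the Unstructured library and AWS Bedrock Claude, and trains lightweight models with the LoRA technique.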
aws bedrock claude cuda data-argumantation data-extraction distillation docker finetuning gpu llm pdf-generation pdf-text-extraction processing processing-job sagemaker text-disti unsloth unstructured
Last synced: 15 Jun 2025
https://github.com/toruniina/spray
Molecular viewer based on ray tracing
c-plus-plus computer-graphics cuda molecular-graphics molecular-viewer opengl raytracing
Last synced: 19 Jan 2026
https://github.com/pnnl/cuvite
Multi-GPU Graph Community Detection using CUDA
community-detection cuda graph-clustering mpi
Last synced: 25 Jul 2025
https://github.com/evanmcclure/hello_gpu
Hello world example for Rust on GPU
apple apple-silicon cuda cuda-programming example-project gpu gpu-programming gpu-support metal rust rust-lang
Last synced: 12 Apr 2025
https://github.com/zeloe/rtconvolver
A realtime convolution VST3
c convolution cplusplus cuda juce
Last synced: 22 Apr 2025
https://github.com/ergus/gpukalmanfilter
Kalman Filter test code using C, C++, CUDA, and OpenCL.
cpp cuda gpgpu kalman-filter makefile opencl performance vectorization
Last synced: 28 Oct 2025
https://github.com/tyler-hilbert/cuda-linearregression
Linear Regression in CUDA
ai cublas cuda gpu linear-regression nsight
Last synced: 30 Mar 2025
https://github.com/rogerallen/qtmandelbrotr
Qt CUDA Mandelbrot explorer
cuda cuda-opengl mandelbrot-viewer qt5
Last synced: 02 Aug 2025
https://github.com/ellite/anchor-sub-sync
Anchor: A universal, hardware-accelerated CLI tool for subtitle synchronization (Whisper) and context-aware translation (NLLB)
ai audio-transcription automation cli cuda nllb python pytorch srt subtitle-sync subtitle-translation subtitles synchronization translation whisper
Last synced: 24 Feb 2026
https://github.com/valohai/dl4j-nlp-cuda-example
A git repository containing an NLP example using DL4J (cuda) in Java
cuda cuda-details cudnn deep-learning deeplearning4j dl4j docker-container java jvm machine-learning natural-language-processing nlp nvidia nvidia-drivers nvidia-gpu valohai-cli valohai-platform
Last synced: 02 Aug 2025
https://github.com/brosnanyuen/raybnn_raytrace
Ray tracing library using GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI
arrayfire cuda gpu gpu-computing opencl parallel parallel-computing ray ray-tracing raybnn raylib raytracer raytracing rust
Last synced: 26 Aug 2025
https://github.com/official-imvoiid/portable-miniconda-setup-for-window
Portable Miniconda Setup for Windows 🐍 Easily create a portable Conda environment with automated scripts for flexible Python version management and CUDA support. 🚀
conda conda-environment cuda datascience machinelearning nvidia nvidia-cuda portable python
Last synced: 16 Apr 2026
https://github.com/vorticity-inc/vtensor
VTensor, a C++ library, facilitates tensor manipulation on GPUs, emulating the python-numpy style for ease of use. It leverages RMM (RAPIDS Memory Manager) for efficient device memory management. It also supports xtensor for host memory operations.
cublas cuda curand cusolver gpu numpy rmm tensor xarray xtensor
Last synced: 14 Apr 2025
https://github.com/amirhoseinmasoumi/onnx-cuda-inference
A C++ project for running CUDA-accelerated ONNX model inference, using ONNX Runtime and OpenCV for image segmentation tasks.
cpp cuda inference onnxruntime onnxruntime-gpu opencv segmentation
Last synced: 12 Apr 2025
https://github.com/ivanrs297/pycuda-covariance-matrix
A PyCUDA covariance matrix parallel implementation
Last synced: 25 Oct 2025
https://github.com/umitkacar/onnx-tensorrt-optimization
40x faster AI inference: ONNX to TensorRT optimization with FP16/INT8 quantization, multi-GPU support, and deployment
cuda deep-learning edge-computing fp16 gpu-acceleration inference-acceleration int8 latency-optimization mlops model-deployment model-optimization nvidia-gpu onnx onnxruntime production-ai pytorch-to-onnx quantization real-time-inference tensorflow-to-onnx tensorrt
Last synced: 18 Feb 2026
https://github.com/sean-bradley/cudalookupsha256
SHA256 lookup using parallel processing on an NVIDIA CUDA-compatible graphics card
cuda parallel-processing sha256
Last synced: 05 May 2025
https://github.com/sean-bradley/cudalookupripemd60
RIPEMD-160 lookup using parallel processing on an NVIDIA CUDA graphics card
cuda parallel-processing ripemd160
Last synced: 05 May 2025
https://github.com/statikfintechllc/godcore
All-in-one local AI stack for Mistral-13B and Llama.cpp, with one-step CUDA wheel install, OpenAI-compatible API, and modern web dashboard. Switch between local and cloud chat, run on your own GPU, and deploy instantly—no API keys or paywalls. Designed for easy install, custom builds, and fast remote access. Enjoy!
ai chatbot chatgpt cuda dashboard fastapi llama-cpp llm local-ai mistral openai-compatible react selfhosted webui
Last synced: 25 Jun 2025
https://github.com/jaxony/pynvidia
⚙️ NVIDIA GPU utilities for Python 🔧
cuda deep-learning nvidia-gpu pip python utility
Last synced: 07 May 2025
https://github.com/aespinosadev/opengl-renderer
OpenGL renderer showcasing all basic functionality to render 3D scenes.
computer-graphics cuda gpgpu graphics-engine graphics-programming opengl rendering rendering-3d-graphics shaders video-game
Last synced: 24 Jul 2025
https://github.com/enfiskutensykkel/cuda-rdma-bench
NVIDIA GPU direct RDMA using SISCI API
cuda dma gpudirect-rdma pcie rdma sisci
Last synced: 30 Mar 2025
https://github.com/bokutotu/curs
CUDA, cuBLAS, and cuDNN wrapper for Rust
cuda deep-learning high-performance-computing hpc rust
Last synced: 20 May 2026
https://github.com/chiang-yuan/culsm
CUDA C++ code implementing GPU-accelerated Lattice Spring Model (CuLSM) simulations.
cuda gpu parallel-computing particles
Last synced: 07 Sep 2025