Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
CUDA
![](https://explore-feed.github.com/topics/cuda/cuda.png)
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
- GitHub: https://github.com/topics/cuda
- Wikipedia: https://en.wikipedia.org/wiki/CUDA
- Created by: NVIDIA
- Released: June 23, 2007
- Related Topics: nvcc
- Last updated: 2025-02-06 00:06:50 UTC
https://github.com/kareimgazer/mat-transpose-cuda
A series of trials for optimizing matrix transpose with CUDA
cuda hpc matrix parallel-computing simd
Last synced: 03 Feb 2025
https://github.com/shunk031/nvinfo-go
A Go rewrite of ikr7/nvinfo, a simple utility for monitoring your CUDA-enabled GPUs
cli cuda go golang gpu nvidia nvidia-smi
Last synced: 15 Dec 2024
https://github.com/andydevs/cudafractal
Fractal Generator using Nvidia's CUDA framework
Last synced: 08 Nov 2024
https://github.com/rupeshs/anomalydetection
Anomaly Detection Using Anomalib and OpenVINO – Step-by-Step Guide
anomalib anomaly anomalydetection computer-vision cpu cuda gpu intel onnx opencv pytorch
Last synced: 21 Jan 2025
https://github.com/pratikvn/schwarz-lib
Repository for testing asynchronous Schwarz methods.
asynchronous cuda domain-decomposition ginkgo schwarz
Last synced: 27 Nov 2024
https://github.com/ctknight/fluidsimulator
A CUDA-accelerated SPH Fluid Simulator capable of simulating millions of particles in seconds
computer-animation computer-graphics cuda fluid-simulation hydrostatics simulation-engine
Last synced: 08 Nov 2024
https://github.com/lucaangioloni/parallelcomputingexam
Parallel Computing Exam
c cuda histogram-equalization integral-image java java-thread openmp parallel-computing
Last synced: 03 Feb 2025
https://github.com/648trindade/sbac-pad-marathon-problems
Repository containing problems of the SBAC-PAD Marathon of Parallel Programming and some parallel solutions to them.
cuda high-performance-computing mpi openmp parallel-computing
Last synced: 10 Jan 2025
https://github.com/toruniina/spray
Molecular viewer based on ray tracing
c-plus-plus computer-graphics cuda molecular-graphics molecular-viewer opengl raytracing
Last synced: 20 Dec 2024
https://github.com/mr-technologies/crowsnest
MRTech IFF SDK web interface sample
camera cuda demosaicing dng frontend genicam gpu h264 image-processing jetson json low-latency machine-vision mipi rest-api rtsp tiff vulkan webrtc ximea
Last synced: 24 Jan 2025
https://github.com/bencardoen/singularity_slurm_cuda
Example of how to get started with Singularity and CUDA on a SLURM cluster
cuda nvidia singularity-container slurm-cluster tensorflow
Last synced: 23 Oct 2024
https://github.com/neoheartbeats/neoheartbeats-kernel
An architecture for continual learning and long-term memory in LLMs
cuda fine-tuning llama-factory llm
Last synced: 23 Oct 2024
https://github.com/skizzy-create/ayurvedic_his
🩺 A personalized app that serves as your personal Ayurvedic assistant, providing tailored advice and guidance based on Ayurvedic principles. 🩺
cuda gpt python pytorch transformers
Last synced: 06 Nov 2024
https://github.com/jaxony/pynvidia
⚙️ NVIDIA GPU utilities for Python 🔧
cuda deep-learning nvidia-gpu pip python utility
Last synced: 13 Dec 2024
https://github.com/enfiskutensykkel/cuda-rdma-bench
NVIDIA GPU direct RDMA using SISCI API
cuda dma gpudirect-rdma pcie rdma sisci
Last synced: 01 Nov 2024
https://github.com/aespinosadev/opengl-renderer
OpenGL renderer showcasing all basic functionality to render 3D scenes.
computer-graphics cuda gpgpu graphics-engine graphics-programming opengl rendering rendering-3d-graphics shaders video-game
Last synced: 23 Jan 2025
https://github.com/cloudmercato/python-fpb
Python Floating Point Benchmark
benchmark cuda floating-point numpy pandas python
Last synced: 22 Jan 2025
https://github.com/yuvix25/py2cuda
Convert Python 3 code to CUDA code.
converter cuda gpu gpu-acceleration python python3
Last synced: 05 Jan 2025
https://github.com/neural-bits/ai-programming-hub
Learn and experiment with new techniques and programming languages with a focus on ML
cpp cuda cython openai-triton python rust
Last synced: 05 Feb 2025
https://github.com/mre/cudampi
Large hybrid CPU/GPU sorting network using CUDA and MPI
algorithms bucket bucketsort cuda filesystem gpu hybrid-cpu mpi parallel sorting-network
Last synced: 13 Dec 2024
https://github.com/pmeier/tox-ltt
Install PyTorch distributions with light-the-torch
cuda install light-the-torch pip plugin pytorch tox
Last synced: 22 Dec 2024
https://github.com/fynv/curandrtc
CURandRTC is a GPU random number generation module based on ThrustRTC.
cuda nvrtc random-number-generators thrust
Last synced: 23 Oct 2024
https://github.com/mr-technologies/farsight
Basic MRTech IFF SDK sample application
basler camera cuda demosaicing dng gpu h264 h265 image-processing jetson json low-latency machine-vision mipi nvidia-gpu rest-api rtsp sdk tiff ximea
Last synced: 24 Nov 2024
https://github.com/rogerallen/qtmandelbrotr
Qt CUDA Mandelbrot explorer
cuda cuda-opengl mandelbrot-viewer qt5
Last synced: 25 Nov 2024
https://github.com/sbaldu/neural_network_hep
Implementation of a neural network framework from scratch in C++ applied to particle physics
cpp cuda high-energy-physics neural-networks
Last synced: 30 Oct 2024
https://github.com/cascadingradium/air-traffic-distribution
A GPU-Accelerated Multi-Objective Genetic Algorithm for Air Traffic Management
air-traffic-control air-traffic-management c cuda genetic-algorithm gpu-acceleration
Last synced: 19 Nov 2024
https://github.com/gurbaaz27/cs433a-design-exercises
Solutions of design exercises in CS433A: Parallel Programming, Spring Semester 2021-22
barriers cuda gpu-programming locks openmp parallel-programming posix-threads semaphores
Last synced: 14 Nov 2024
https://github.com/mr-technologies/iff
MRTech IFF SDK documentation
basler camera cuda demosaicing dng gpu h264 h265 image-processing jetson json low-latency machine-vision mipi nvidia-gpu rest-api rtsp sdk tiff ximea
Last synced: 24 Nov 2024
https://github.com/meetps/me-766
Assignment Solutions to course ME766 High Performance Scientific Computing.
cuda gpu-computing opencl openmp parallel-computing
Last synced: 04 Jan 2025
https://github.com/brosnanyuen/raybnn_diffeq
Differential Equation Solver using GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI
arrayfire cuda differential differential-equations gpu gpu-computing opencl parallel parallel-computing parallel-programming raybnn rust
Last synced: 13 Nov 2024
https://github.com/garciparedes/parallel-scan-sky
Parallel Computing work
c cuda high-performance-computing hpc mpi openmp parallel parallel-algorithm parallel-computing parallel-processing parallel-programming parallelism parallelization university-of-valladolid
Last synced: 16 Jan 2025
https://github.com/deftruth/ptx-isa-8.2-zh
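🎉 Continuously updated: CUDA 12.2 PTX ISA 8.2 study notes, with partial Chinese translation, personal commentary, and inline assembly examples explaining the CUDA 12.2 PTX ISA 8.2 assembly instructions; work in progress.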
Last synced: 17 Dec 2024
https://github.com/3zrv/raytracerincpp
A ray tracer that renders in 16-color VGA palette at 640x480 resolution.
Last synced: 06 Jan 2025
https://github.com/ammaryasirnaich/deeplearning_playland
This repository contains Docker image files that support the common frameworks required for deep learning. The images support both the latest GPUs (NVIDIA CUDA) and CPUs.
cuda cuda11 cudnn cudnn8 deep-learning docker docker-image dockerfile gpu kersa opencv pytorch pytorch-cnn scikit-learn tensorflow2
Last synced: 15 Dec 2024
https://github.com/ai-dock/python
Python docker images for use in GPU cloud and local environments. Includes AI-Dock base for authentication and improved user experience.
ai cuda docker machine-learning python rocm runpod vast
Last synced: 18 Nov 2024
https://github.com/brosnanyuen/raybnn_raytrace
Ray tracing library using GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI
arrayfire cuda gpu gpu-computing opencl parallel parallel-computing ray ray-tracing raybnn raylib raytracer raytracing rust
Last synced: 13 Nov 2024
https://github.com/antoniopelusi/lu-solver
Assignments for the High Performance Computing exam at Unimore, Modena, IT.
Last synced: 21 Jan 2025
https://github.com/alejandrogallo/atrip
High Performance library for the CCSD(T) algorithm in quantum chemistry
asynchronous-programming coupled-cluster cuda literate-programming mpi quantum-chemistry
Last synced: 26 Nov 2024
https://github.com/hiway-media/ffmpeg-nvenc-static
FFmpeg supports NVENC encoding
cuda ffmpeg ffmpeg-cuda ffmpeg-nvenc nvidia-gpu
Last synced: 22 Dec 2024
https://github.com/simmsb/p4haskell
P4 backend in Haskell
compiler cuda gpu p4 p4c p4language
Last synced: 07 Jan 2025
https://github.com/pnnl/cuvite
Multi-GPU Graph Community Detection using CUDA
community-detection cuda graph-clustering mpi
Last synced: 25 Nov 2024
https://github.com/tristanpenman/cuda-examples
A collection of CUDA example code
Last synced: 08 Dec 2024
https://github.com/mr-technologies/farsightpy
Basic MRTech IFF Python SDK sample application
camera cuda demosaicing dng genicam gpu h264 h265 image-processing jetson json low-latency machine-vision mipi python rest-api rtsp sdk tiff vulkan
Last synced: 05 Feb 2025
https://github.com/mr-technologies/imagebrokerpy
Example of image export from MRTech IFF Python SDK
camera cuda demosaicing dng genicam gpu h264 h265 image-processing jetson json low-latency machine-vision mipi opencv python rest-api rtsp tiff vulkan
Last synced: 05 Feb 2025
https://github.com/jtriley/gpucrate
Creates hard-linked GPU driver (currently just NVIDIA) volumes for use with docker, singularity, etc.
container cuda docker gpu singularity
Last synced: 07 Nov 2024
https://github.com/mr-technologies/imagebrokercpp
Example of image export from MRTech IFF C++ SDK
camera cpp cuda demosaicing dng genicam gpu h264 h265 image-processing jetson json low-latency machine-vision mipi opencv rest-api rtsp tiff vulkan
Last synced: 05 Feb 2025
https://github.com/phrutis/brainwords_for_sale
GPU program for brute-forcing brainwallets
bitcoin brainwallet brute-force btc cuda gpu passphrases program
Last synced: 02 Feb 2025
https://github.com/hrolive/fundamentals-of-accelerated-computing-with-cuda-c-cpp
Accelerate and optimize existing C/C++ CPU-only applications using the most essential CUDA tools and techniques.
cpp cuda cuda-kernels cuda-programming nsight nvidia profilling
Last synced: 04 Jan 2025
https://github.com/amirhoseinmasoumi/onnx-cuda-inference
A C++ project for running CUDA-accelerated ONNX model inference, using ONNX Runtime and OpenCV for image segmentation tasks.
cpp cuda inference onnxruntime onnxruntime-gpu opencv segmentation
Last synced: 05 Feb 2025
https://github.com/neomatrix369/dl4j-nlp-cuda-example
A git repository containing an NLP example using DL4J (cuda) in Java
cuda cuda-details cudnn deep-learning deeplearning4j dl4j docker-container java jvm machine-learning natural-language-processing nlp nvidia nvidia-drivers nvidia-gpu valohai-cli valohai-platform
Last synced: 21 Dec 2024
https://github.com/mr-technologies/farsightcpp
Basic MRTech IFF C++ SDK sample application
camera cpp cuda demosaicing dng genicam gpu h264 h265 image-processing jetson json low-latency machine-vision mipi rest-api rtsp sdk tiff vulkan
Last synced: 05 Feb 2025
https://github.com/zhihu/ZhiLight
A highly optimized inference acceleration engine for Llama and its variants.
cuda gpt inference-engine llama llm llm-serving pytorch
Last synced: 14 Dec 2024
https://github.com/bokutotu/curs
CUDA, cuBLAS, and cuDNN wrapper for Rust
cuda deep-learning high-performance-computing hpc rust
Last synced: 22 Dec 2024
https://github.com/definetlynotai/llm_data
Source code from a number of well-known Python repositories, collected as plain local documents in one repo for training code AI
c code-examples cpp cuda data data-dum jupyter-notebook llm llm-code llm-datasets programming-data programming-data-sets python3
Last synced: 26 Jan 2025
https://github.com/chiang-yuan/culsm
CUDA C++ code implementing GPU-accelerated Lattice Spring Model (CuLSM) simulations.
cuda gpu parallel-computing particles
Last synced: 18 Dec 2024
https://github.com/bruce-lee-ly/cuda_back2back_hgemm
Use tensor core to calculate back-to-back HGEMM (half-precision general matrix multiplication) with MMA PTX instruction.
back2back-gemm back2back-hgemm cublas cuda fused-gemm fused-hgemm gemm gpu hgemm matrix-multiply nvidia tensor-core
Last synced: 15 Nov 2024
https://github.com/tensorush/my-dev-containers
🐳 My development environments wrapped into VS Code Dev Containers (15.02.2022).
containers cuda cuda-programming dev-container development docker docker-container jax mamba micromamba python python3 vscode vscode-devcontainer
Last synced: 25 Dec 2024
https://github.com/cniweb/srbminer-multi-cuda
Docker containing SRBMiner-Multi and CUDA
cpu-miner cpu-mining cuda gpu-miner gpu-mining miner srbminer srbminer-multi yespower yespoweric
Last synced: 13 Dec 2024
https://github.com/maximedebarbat/dolphin
Dolphin is a Python toolkit meant to speed up TensorRT inference by providing CUDA-accelerated processing.
cuda python tensorrt-inference
Last synced: 01 Nov 2024
https://github.com/shalithasuranga/cudaperformance
Compare the performance of matrix multiplication among GPU shared memory, GPU global memory and CPU
cuda cuda-demo matrix-multiplication nvidia
Last synced: 19 Dec 2024
https://github.com/firaja/parallel-floydwarshall
Various parallel implementations of Floyd-Warshall algorithm
algorithms c cuda distributed-computing floyd-warshall gpu-computing mpi multiprocessing openmp parallel-computing parallel-programming
Last synced: 19 Jan 2025
https://github.com/harrydobbs/torch_ransac3d
A high-performance implementation of 3D RANSAC (Random Sample Consensus) algorithm using PyTorch and CUDA.
3d cloud cubiod cuda cylinder plane plane-detection point point-cloud ransac segmentation
Last synced: 22 Nov 2024
https://github.com/silviopaganini/darknet-docker-nvidia
Docker image to run Darknet on NVIDIA GPUs with CUDA 9.0 and OpenCV 3.4.0
cuda darknet docker nvidia-docker opencv
Last synced: 27 Nov 2024
https://github.com/fatlipp/cuda-tree
CUDA-based Tree builder
algorithms cpp cuda octree quadtree tree
Last synced: 21 Jan 2025
https://github.com/jedbrooke/cuda_bwt
CUDA-accelerated Burrows-Wheeler transform
bioinformatics burrows-wheeler-transform bwt compression cuda
Last synced: 19 Jan 2025
https://github.com/franneck94/cuda-aes
AES Implementation (Counter Mode) in C++, OpenMP and CUDA.
aes c-plus-plus counter cuda encryption openmp parallel
Last synced: 31 Oct 2024
https://github.com/marcoplaitano/counting-sort-cuda
Parallelized version of Counting Sort using CUDA
counting-sort cuda cuda-kernels cuda-programming gpu gpu-programming sort sorting sorting-algorithms
Last synced: 26 Dec 2024
https://github.com/mr-technologies/lensprofiler
MRTech IFF SDK lens profiling tool
c calibration camera camera-calibration cuda distortion distortion-correction genicam gpu image-processing jetson json lens low-latency machine-vision mipi opencv python sdk tiff
Last synced: 20 Dec 2024
https://github.com/rkv0id/boltzmanumba
GPU parallelization of a sequential lattice Boltzmann gist on CUDA-capable devices using Numba.
Last synced: 03 Jan 2025
https://github.com/prince781/libgpublas
Drop-in GPU acceleration for linear algebra.
blas blas-kernels c cblas clblas cuda gpu gpu-acceleration hpc interposition linear-algebra nvidia opencl
Last synced: 14 Jan 2025
https://github.com/ragibson/cuda-k-means
An implementation of Lloyd's algorithm for data clustering on GPUs and computational accelerators.
clustering cuda gpu k-means unsupervised-clustering
Last synced: 05 Jan 2025
https://github.com/ran-2012/inversion
Solve geophysical inversion problems using CUDA and TensorFlow
cpp cuda geophysics inversion-method python
Last synced: 10 Jan 2025
https://github.com/jasmcaus/hazel
A Tensor Library written in C++.
artificial-intelligence autodiff autograd automatic-differentiation computing cpp cuda deep-learning differentiation gpu hazel-lang ml neural neural-network python pytorch scientific-computing tensor tensor-library
Last synced: 11 Nov 2024
https://github.com/sean-bradley/cudalookupripemd60
RIPEMD-160 lookup using parallel processing on an NVIDIA CUDA graphics card
cuda parallel-processing ripemd160
Last synced: 13 Nov 2024
https://github.com/dayyass/hpc
My experiments with MPI and OpenMP
cpp cuda gpu high-performance-computing hpc mpi nvidia openmp parallel-computing super-computing
Last synced: 14 Oct 2024
https://github.com/lu-zero/nvidia-video-codec
Redistributable headers to build cuvid and nvenc
cuda cuvid nvenc nvidia nvidia-video-codec
Last synced: 16 Nov 2024
https://github.com/yashkathe/image-noise-reduction-with-cuda
This project conducts an analysis of image denoising technique - median blur, comparing GPU-accelerated (Numba) and CPU-based (OpenCV) processing speeds.
cuda cuda-programming gpu-programming hardware-speed-analysis image-analysis image-processing numba nvidia nvidia-cuda nvidia-gpu opencv parallel-programming
Last synced: 25 Dec 2024
https://github.com/redhat-na-ssa/gpu-workshop
Using GPUs on Red Hat Platforms
Last synced: 04 Dec 2024
https://github.com/egororachyov/spbench
Benchmark for sparse linear algebra libraries for CPU and GPU platforms.
benchmark cpp cpu cuda gpu-computing graphblas opencl sparse-matrices
Last synced: 19 Nov 2024
https://github.com/harrism/nsys_easy
Easier, quicker command-line CUDA profiling
Last synced: 13 Oct 2024
https://github.com/guilt/rocm-programming-masterclass
Udemy's CUDA Programming Masterclass, with examples in ROCm/HIP.
Last synced: 15 Nov 2024
https://github.com/kdeps/kdeps
Kdeps reduces the complexity of building self-hosted RAG AI Agents and APIs powered by open-source LLMs
agents ai-agents api artificial-intelligence cuda docker dockerized fine-tuning huggingface llama llm llm-agent llmops mistral mlops model-inference multimodal nvidia opensource vicuna
Last synced: 02 Feb 2025
https://github.com/thomasjo/cudalicious
C++ header library intended to reduce CUDA boilerplate code
boilerplate cpp cuda header-only
Last synced: 21 Jan 2025