Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
CUDA
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
- GitHub: https://github.com/topics/cuda
- Wikipedia: https://en.wikipedia.org/wiki/CUDA
- Created by: Nvidia
- Released: June 23, 2007
- Related Topics: nvcc
- Last updated: 2025-01-31 00:06:47 UTC
https://github.com/roflmaostc/radonka.jl
A simple yet sufficiently fast (attenuated) Radon transform and backprojection implementation using KernelAbstractions.jl. Runs on CPU, CUDA, ...
automatic-differentiation computed-tomography ct cuda gpu julia julia-language optimization radon radon-transform tomography x-ray
Last synced: 12 Oct 2024
https://github.com/tcoppex/cudaraster-linux
Linux port of cudaraster, Nvidia's GPU rasterizer.
Last synced: 16 Nov 2024
https://github.com/erkaman/parle-cuda
A reference implementation of RLE in CUDA
c-plus-plus compression cuda data-compression demo gpgpu gpu parle rle run-length-encoding
Last synced: 12 Nov 2024
https://github.com/qengineering/tensorflow-addons-jetson-nano
TensorFlow Addons installation wheels for Jetson Nano
aarch64 cuda cudnn installation-wheel jetson-nano linux python3 tensorflow-addons wheel
Last synced: 27 Nov 2024
https://github.com/ktaletsk/gpu_dsm
🔗Accessible quantitative polymer rheology predictions with slip-links on GPU
c-plus-plus cuda gpu polymer rheology
Last synced: 31 Dec 2024
https://github.com/joaomlneto/cpds-heat
Heat Equation using different solvers (Jacobi, Red-Black, Gaussian) in C using different paradigms (sequential, OpenMP, MPI, CUDA) - Assignments for the Concurrent, Parallel and Distributed Systems course @ UPC 2013
cuda cuda-support gauss-seidel gaussian heat-equation jacobi mpi mpi-applications openmp openmp-applications openmp-parallelization openmp-support openmpi paradigms performance red-black solvers
Last synced: 09 Nov 2024
https://github.com/maxilevi/raytracer
C++ raytracer that supports custom models. Runs the calculations on the CPU using C++11 threads or on the GPU via CUDA.
bvh cuda graphics-programming intersection raytracer
Last synced: 11 Nov 2024
https://github.com/zephirfxec/hnanosolver
Houdini GPU Fluid Solver powered by NanoVDB
cpp cuda fluid-dynamics houdini nanovdb openvdb
Last synced: 23 Oct 2024
https://github.com/NAGAGroup/Scalix
Scalix is a data parallel compute library that automatically scales to the available compute resources.
Last synced: 02 Nov 2024
https://github.com/aniketsingh03/processing-history-of-images
💡 Detecting processing history of images by using Deep Learning
cuda deep-learning image-forensics matlab python3 pytorch
Last synced: 19 Dec 2024
https://github.com/enp1s0/shgemm
Fast multiplication of single-precision and half-precision matrices on Tensor Cores
Last synced: 26 Dec 2024
https://github.com/ghost---shadow/near-duplicate-image-detector
CUDA implementation of some perceptual hashing algorithms
Last synced: 11 Oct 2024
https://github.com/tillahoffmann/universal_tensorflow_image
Develop tensorflow models with or without a GPU accelerator using the same Docker image. 🥳
Last synced: 11 Oct 2024
https://github.com/adamtiger/tinygpulang
Tutorial on building a GPU compiler backend in LLVM
Last synced: 14 Oct 2024
https://github.com/yhmtsai/ci_windows_cuda
This repo creates the Dockerfiles for using CUDA in Windows Docker and provides the GitLab/GitHub Windows shared VM runner config.
continuous-integration cuda docker github-actions gitlab windows
Last synced: 27 Nov 2024
https://github.com/jacobtomlinson/advent-of-gpu-code-2020
Solutions for Advent of Code 2020 written for the GPU in Python
advent-of-code cuda gpu jupyter-notebooks numba python
Last synced: 29 Oct 2024
https://github.com/yalue/cudabrot
A CUDA renderer for the Buddhabrot fractal
amd buddhabrot buddhabrot-fractal cuda gpu hip mandelbrot mandelbrot-fractal rocm
Last synced: 23 Oct 2024
https://github.com/neka-nat/cuimage
Rust implementation of image processing library with CUDA
Last synced: 14 Oct 2024
https://github.com/pkestene/cuda-proj-tmpl
A minimal CMake-based project skeleton for developing a CUDA application
cea cmake cuda gpu gpu-computing parallel-computing parallel-programming template
Last synced: 18 Dec 2024
https://github.com/elftausend/gradients
Deep Learning library written in Rust (OpenCL, CUDA & CPU)
cpu cuda deep-learning gpu gpu-acceleration machine-learning mlp neural-networks opencl rust
Last synced: 08 Dec 2024
https://github.com/lanl/stcuda
StCUDA allows Smalltalk to call CUDA Driver APIs to do GPU computing
Last synced: 09 Dec 2024
https://github.com/bokutotu/zenu
Deep Learning Framework Written in Rust
ai autograd blas cublas cuda cudnn deep-learning deep-neural-networks gpu-computing hpc rust
Last synced: 15 Dec 2024
https://github.com/superlinear-ai/python-gpu
🐳 Python GPU adds a minimal install of CUDA and cuDNN on top of the official python:3.x-slim base image
cuda cudnn docker docker-image python
Last synced: 11 Nov 2024
https://github.com/eve-ning/glcm-cupy
GLCM in CUDA
computer-vision cuda cupy feature-engineering glcm python
Last synced: 27 Oct 2024
https://github.com/bhattbhavesh91/cudf-rapids-demo
A simple demo of cuDF, a RAPIDS GPU-accelerated DataFrame library!
arrow cuda cudf demo gpu gpu-dataframe pandas python rapids
Last synced: 16 Nov 2024
https://github.com/p-ranav/vulkan-earth
Vulkan-based 3D Rendering of Earth
3d cuda engine gpu rendering simulation vulkan
Last synced: 13 Nov 2024
https://github.com/dancing-ui/uestc_vhm
Object tracking using YOLOv8, fast-reid, and DeepSORT
cuda deepsort dockerfile fast-reid tensorrt yolov8n
Last synced: 23 Oct 2024
https://github.com/benediktalkin/kappaprofiler
Lightweight, simple profiling for Python/PyTorch
Last synced: 09 Nov 2024
https://github.com/abus-aikorea/aria-coversong
The best Gradio web UI for creating cover songs, using MDX-Net and RVC. Easy one-click installation. Fully portable.
cuda demucs gradio karaoke mdx-net nvidia python pytorch rvc song-covers uvr vocal-remover voice-conversion
Last synced: 10 Nov 2024
https://github.com/valentingol/deep-learning-installation
This tutorial provides a step-by-step pipeline for installing an effective Python setup optimized for deep learning on Ubuntu LTS. It covers the libraries needed to use the latest versions of TensorFlow and PyTorch efficiently with the GPU, plus a comfortable working environment with a flexible, highly customizable IDE (VSCode) and an environment manager (Virtualenv/VirtualenvWrapper).
cuda deep-learning deep-learning-library deep-learning-tutorial setup tutorial virtualenv virtualenvwrapper vscode
Last synced: 11 Oct 2024
https://github.com/mnicely/computeworks_examples
Matrix multiplication example performed with OpenMP, OpenACC, BLAS, cuBLAS, and CUDA
blas cublas cuda docker eclipse-plugin nsight nvidia nvidia-docker openacc openmp pgi-compiler
Last synced: 15 Oct 2024
https://github.com/dusanerdeljan/stereo-depth
Bachelor thesis - GPU accelerated single view passive stereo depth estimation pipeline
convolutional-neural-networks cuda depth-estimation pytorch real-time stereo-matching stereo-vision
Last synced: 11 Oct 2024
https://github.com/flyingfathead/dvr-yolov8-detection
Python+YOLOv8-based human/animal/object detection DVR framework with GUI, webUI and Telegram alerts
automation cctv cuda dvr dvr-tool human-activity-recognition human-detection object-detection opencv opencv-python opencv2 opencv2-python python rtmp security video-processing yolo yolo-detection-framework yolov8
Last synced: 12 Nov 2024
https://github.com/jundaf2/gpu-tensor-permute
Permute sequence data on the GPU with high bandwidth
cuda gpu-acceleration sequence-to-sequence
Last synced: 15 Nov 2024
https://github.com/torinos-yt/nnonnx
Using CUDA for Faster Machine Learning Inference on Unity
cuda machine-learning onnxruntime unity
Last synced: 20 Nov 2024
https://github.com/dendenxu/bvh-ray-tracing
CUDA Ray Tracing using BVH. Forked and modified from https://github.com/YuliangXiu/bvh-distance-queries
bvh cuda pytorch ray-tracing ray-triangle-intersection
Last synced: 27 Jan 2025
https://github.com/tigercosmos/simple-vgg16-cu
Simple VGG16 implemented in CUDA
Last synced: 15 Oct 2024
https://github.com/rmiguelkelly/quickcluster
A k-means implementation in C++ with Python bindings and GPU acceleration
clustering clustering-algorithm cpp cuda gpu kmeans kmeans-clustering metal objective-c python python3 unsupervised-learning
Last synced: 12 Oct 2024
https://github.com/frgfm/torch-cuda-template
Template for CUDA / C++ extension writing with PyTorch
cpp cuda pytorch pytorch-extension
Last synced: 05 Dec 2024
https://github.com/microsoft/hat
TOML-annotated C header file format for packaging binary files, from Microsoft Research
benchmarking cpp cprogramming cuda metadata platform-independent python-library rocm toml
Last synced: 12 Oct 2024
https://github.com/elftausend/nvjpeg-rs
Rust bindings to the nvJPEG library.
bindings cuda ffi ffi-bindings ffi-wrapper image-processing jpg nvjpeg rust rust-lang
Last synced: 08 Dec 2024
https://github.com/bybatkhuu/wiki
Personal wiki for public.
cuda docker docker-compose linux manuals nvidia-docker nvidia-gpu wiki
Last synced: 24 Oct 2024
https://github.com/alesiong/template-matching
Simple template matching by GPU (CUDA)
computer-vision cuda template-matching
Last synced: 29 Nov 2024
https://github.com/cms-patatrack/cluestering
Density-based clustering algorithm developed at CERN
alpaka cern clustering cpp cuda pybind11 python tbb
Last synced: 30 Oct 2024
https://github.com/romnn/microgpusim
Cycle-level, trace-driven, parallel GPU simulator for NVIDIA Pascal.
cuda cycle-level design-space-exploration gpgpu gpu nvbit nvidia performance-engineering rust simulation trace-driven
Last synced: 18 Nov 2024
https://github.com/mr-technologies/imagebroker
Example of image export from MRTech IFF SDK
c camera cuda demosaicing dng genicam gpu h264 h265 image-processing jetson json low-latency machine-vision mipi opencv rest-api rtsp tiff vulkan
Last synced: 24 Nov 2024
https://github.com/chrxh/alien-docs
Documentation for ALIEN
cuda evolution physics-simulation simulation
Last synced: 15 Oct 2024
https://github.com/nolmoonen/jpeggpu
Low-latency CUDA JPEG decoder by parallelizing Huffman decoding
Last synced: 23 Oct 2024
https://github.com/marcogarlet/cuda_cubeattack
CUDA implementation of Cube Attack
Last synced: 11 Oct 2024
https://github.com/neoblizz/hip_template
🖤 Template for starting a HIP/C++ project using CMake with GitHub Actions for CI.
cpp cuda cuda-programming gpgpu gpu hip rocm template-project template-repository
Last synced: 30 Oct 2024
https://github.com/radenmuaz/slope-ad
A small automatic differentiation engine, supporting higher-order derivatives
array autograd automatic-differentiation cuda gradient iree jvp machine-learning metal mlir onnx onnxruntime tensor vjp
Last synced: 08 Dec 2024
https://github.com/hrntsm/ghgpucomputingtest
Test using CUDA with Alea GPU in Grasshopper.
Last synced: 27 Nov 2024
https://github.com/NCAR/micm
A model-independent chemistry module for atmosphere models
atmospheric-chemistry atmospheric-modeling atmospheric-science cuda gpu gpu-acceleration hpc ode-solver
Last synced: 27 Nov 2024
https://github.com/pinto0309/realsense-cuda-opengl-docker
RealSense execution environment built on a Docker container on Ubuntu 20.04. NVIDIA GPU and OpenGL capable. CUDA 11.4.
cuda docker opengl realsense realsense2 ubuntu wsl2
Last synced: 29 Oct 2024
https://github.com/gapi505/sparky-2
A Discord bot running on llama.cpp with the Llama 3 model and image generation
ai cuda llama3 llamacpp stable-diffusion torch transformers
Last synced: 24 Jan 2025
https://github.com/pyhf/cuda-images
pyhf Docker images built on Nvidia Container Toolkit enabled base images
cuda jax nvidia nvidia-cuda nvidia-docker pyhf
Last synced: 23 Nov 2024
https://github.com/raymondcm/blockmatching
CPU and CUDA implementation of Full Exhaustive Block Matching Algorithm using Integral Images
block-matching-algorithm cuda integral-image parallel vision
Last synced: 21 Oct 2024
https://github.com/coderonion/cuda-beginner-course-rust-version
Companion code for the bilibili video series "Introduction to CUDA 12.x Parallel Programming (Rust Edition)"
candle cpp cublas cuda cuda-programming cudarc cudnn gpu gpu-programming nvcc nvidia parellel-programming python rust
Last synced: 19 Nov 2024
https://github.com/coderonion/cuda-beginner-course-python-version
Companion code for the bilibili video series "Introduction to CUDA 12.x Parallel Programming (Python Edition)"
cpp cublas cuda cuda-programming cudnn cupy gpu gpu-programming nvcc nvidia parallel-programming python rust
Last synced: 19 Nov 2024
https://github.com/theochem/cugbasis
High performance CUDA/Python library for computing quantum chemistry density-based descriptors for larger systems using GPUs.
atoms-in-molecules computational-chemistry conceptual-dft cuda electron-density gpu python qtaim quantum quantum-chemistry theoretical-chemistry
Last synced: 17 Nov 2024
https://github.com/rogerallen/smandelbrotr
SDL2 CUDA OpenGL Mandelbrot explorer.
cuda mandelbrot-viewer opengl sdl2
Last synced: 25 Nov 2024
https://github.com/drsnowbird/cuda-pytorch-docker
Nvidia CUDA for GPU + PyTorch (latest) in Docker
cuda deep-learning docker gpu jupyter-notebook nvidia-gpu pytorch ssl-proxy
Last synced: 14 Nov 2024
https://github.com/taeguk/dist-prog-assignment
Sogang Univ. Distributed Programming (CSE5414) Assignments.
assignment cuda distributed mpi-library openmp parallel pthreads sogang
Last synced: 03 Dec 2024
https://github.com/tudasc/cusan
A data race detector for CUDA C and C++ based on ThreadSanitizer
c cpp cuda datarace threadsanitizer
Last synced: 14 Dec 2024
https://github.com/anantzoid/cuda-genetic-algorithm-travelling-salesman-problem
Implementation of Parallel Genetic Algorithm in CUDA to solve TSP (Berlin52)
c cuda genetic-algorithm tsp tsp-solver
Last synced: 01 Dec 2024
https://github.com/bkraad47/fat_llama
fat_llama is a Python package for upscaling audio files to FLAC or WAV formats using advanced audio processing techniques. It utilizes CUDA-accelerated calculations to enhance audio quality by upsampling and adding missing frequencies through FFT, resulting in richer and more detailed audio.
audio audio-engineering audio-processing audiophile cuda cufft cupy fft flac hi-res hpc mp3 music nvidia ogg parallel-computing physics upscaling wav
Last synced: 23 Oct 2024
https://github.com/belval/raytracing
Using CUDA to implement "Ray Tracing in One Weekend" by Peter Shirley
cuda raytracing raytracing-in-one-weekend
Last synced: 14 Oct 2024
https://github.com/alpaka-group/bactria
Broadly Applicable C++ Tracing and Instrumentation API 🐫
cuda hardware-counters instrumentation-api metrics rocm tracing-events
Last synced: 09 Nov 2024
https://github.com/bryanoliveira/cellular-automata
A cellular automata program built with C++, OpenGL, CUDA and OpenMP.
cellular-automata cuda life opengl openmp
Last synced: 03 Jan 2025
https://github.com/elsa-lab/base-env
Basis of ELSA computational platform
cuda machine-learning server-utility ubuntu
Last synced: 11 Nov 2024
https://github.com/pkestene/tsp
Traveling salesman problem solved with different programming models
cea cpp cuda kokkos nvidia-gpu openacc openmp performance-portability stdpar sycl
Last synced: 18 Dec 2024
https://github.com/potato3d/grid
GPU-accelerated uniform grid construction for ray tracing
cuda glsl gpu grid ray-tracing
Last synced: 10 Jan 2025
https://github.com/neoheartbeats/neoheartbeats
An architecture for LLMs' continual-learning and long-term memories
cuda fine-tuning llama-factory llm
Last synced: 15 Nov 2024
https://github.com/jpuigcerver/nnutils
CPU & CUDA implementation of several neural network utils
cuda deep-learning neural-networks openmp pytorch
Last synced: 02 Dec 2024
https://github.com/cascadingradium/cuda-hungarian-clustering
A GPU-Accelerated Clustering Algorithm that uses the Hungarian method
clustering cpp cuda gpu hungarian-algorithm parallel-computing
Last synced: 19 Nov 2024
https://github.com/ashvardanian/scaling-democracy
GPU-accelerated Schulze voting method in Python, Numba, and CUDA, using ideas from Algebraic Graph Theory
cuda cuda-kernels dynamic-programming gpgpu graph-algorithms graph-theory pybind11 python voting
Last synced: 07 Nov 2024
https://github.com/mr-technologies/farsight
Basic MRTech IFF SDK sample application
basler camera cuda demosaicing dng gpu h264 h265 image-processing jetson json low-latency machine-vision mipi nvidia-gpu rest-api rtsp sdk tiff ximea
Last synced: 24 Nov 2024
https://github.com/pmeier/tox-ltt
Install PyTorch distributions with light-the-torch
cuda install light-the-torch pip plugin pytorch tox
Last synced: 22 Dec 2024
https://github.com/yuvix25/py2cuda
Convert Python 3 code to CUDA code.
converter cuda gpu gpu-acceleration python python3
Last synced: 05 Jan 2025
https://github.com/willigarneau/astar-pathfinding
🗺📌 Implementation of the A* pathfinding algorithm with OpenCV and CUDA in C++ 💪
a-star algorithm axis-camera cuda detection implementation opencv pathfinding
Last synced: 23 Nov 2024
https://github.com/bencardoen/singularity_slurm_cuda
Example of how to get started with Singularity and CUDA on a SLURM cluster
cuda nvidia singularity-container slurm-cluster tensorflow
Last synced: 23 Oct 2024
https://github.com/enp1s0/culip
Library for profiling the execution time of CUDA official library functions
Last synced: 06 Nov 2024
https://github.com/toruniina/spray
Molecular viewer based on ray tracing
c-plus-plus computer-graphics cuda molecular-graphics molecular-viewer opengl raytracing
Last synced: 20 Dec 2024
https://github.com/aespinosadev/opengl-renderer
OpenGL renderer showcasing all basic functionality to render 3D scenes.
computer-graphics cuda gpgpu graphics-engine graphics-programming opengl rendering rendering-3d-graphics shaders video-game
Last synced: 23 Jan 2025
https://github.com/cascadingradium/air-traffic-distribution
A GPU-Accelerated Multi-Objective Genetic Algorithm for Air Traffic Management
air-traffic-control air-traffic-management c cuda genetic-algorithm gpu-acceleration
Last synced: 19 Nov 2024
https://github.com/enfiskutensykkel/cuda-rdma-bench
NVIDIA GPUDirect RDMA using the SISCI API
cuda dma gpudirect-rdma pcie rdma sisci
Last synced: 01 Nov 2024
https://github.com/rfsantacruz/mycudasamples
This is a series of CUDA C++ programming samples developed to study CUDA technology and its parallel programming model.
Last synced: 07 Nov 2024
https://github.com/lucaangioloni/parallelcomputingexam
Parallel Computing Exam
c cuda histogram-equalization integral-image java java-thread openmp parallel-computing
Last synced: 08 Dec 2024