Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
CUDA
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general-purpose computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
- GitHub: https://github.com/topics/cuda
- Wikipedia: https://en.wikipedia.org/wiki/CUDA
- Created by: Nvidia
- Released: June 23, 2007
- Related Topics: nvcc
- Last updated: 2025-02-06 00:06:50 UTC
https://github.com/lebedov/cudamps
Python interface to CUDA Multi-Process Service
Last synced: 23 Oct 2024
https://github.com/ktaletsk/gpu_dsm
🔗 Accessible quantitative polymer rheology predictions with slip-links on GPU
c-plus-plus cuda gpu polymer rheology
Last synced: 31 Dec 2024
https://github.com/enp1s0/cumpsgemm
Fast SGEMM emulation on Tensor Cores
cuda fp32 gemm gpu half-precision mixed-precision tensorcore tensorcores
Last synced: 06 Nov 2024
https://github.com/tillahoffmann/universal_tensorflow_image
Develop tensorflow models with or without a GPU accelerator using the same Docker image. 🥳
Last synced: 11 Oct 2024
https://github.com/aknvictor/culingam
CULiNGAM accelerates LiNGAM analysis on GPUs.
Last synced: 01 Nov 2024
https://github.com/aniketsingh03/processing-history-of-images
💡 Detecting processing history of images by using Deep Learning
cuda deep-learning image-forensics matlab python3 pytorch
Last synced: 19 Dec 2024
https://github.com/elftausend/gradients
Deep Learning library written in Rust (OpenCL, CUDA & CPU)
cpu cuda deep-learning gpu gpu-acceleration machine-learning mlp neural-networks opencl rust
Last synced: 08 Dec 2024
https://github.com/neka-nat/cuimage
Rust implementation of image processing library with CUDA
Last synced: 14 Oct 2024
https://github.com/josonchan1998/opencv_install
Build OpenCV from sources with cuda in anaconda3
anaconda3 cuda opencv shell-script
Last synced: 29 Jan 2025
https://github.com/qengineering/tensorflow-addons-jetson-nano
TensorFlow Addons installation wheels for Jetson Nano
aarch64 cuda cudnn installation-wheel jetson-nano linux python3 tensorflow-addons wheel
Last synced: 27 Nov 2024
https://github.com/ikergarcia1996/matrix-benchmark
A cupy (GPU) / numpy benchmark to measure how fast different hardware can perform matrix operations.
benchmark cuda cupy embedding gpu matrix numpy python word-embeddings
Last synced: 23 Jan 2025
https://github.com/pkestene/cuda-proj-tmpl
A minimal CMake-based project skeleton for developing a CUDA application
cea cmake cuda gpu gpu-computing parallel-computing parallel-programming template
Last synced: 18 Dec 2024
https://github.com/yhmtsai/ci_windows_cuda
Creates Dockerfiles for using CUDA in Windows Docker containers and provides the GitLab/GitHub Windows shared VM runner configuration.
continuous-integration cuda docker github-actions gitlab windows
Last synced: 27 Nov 2024
https://github.com/ghost---shadow/near-duplicate-image-detector
CUDA implementation of some perceptual hashing algorithms
Last synced: 11 Oct 2024
https://github.com/zephirfxec/hnanosolver
Houdini GPU Fluid Solver powered by NanoVDB
cpp cuda fluid-dynamics houdini nanovdb openvdb
Last synced: 23 Oct 2024
https://github.com/pkestene/kokkos-proj-tmpl
A minimal CMake-based project skeleton for developing a Kokkos application
cea cuda gpu kokkos openmp parallel-computing parallelization performance-portability
Last synced: 18 Dec 2024
https://github.com/maxilevi/raytracer
C++ raytracer that supports custom models. Supports running the calculations on the CPU using C++11 threads or in the GPU via CUDA.
bvh cuda graphics-programming intersection raytracer
Last synced: 11 Nov 2024
https://github.com/erkaman/parle-cuda
A reference implementation of RLE in CUDA
c-plus-plus compression cuda data-compression demo gpgpu gpu parle rle run-length-encoding
Last synced: 12 Nov 2024
https://github.com/adamtiger/tinygpulang
Tutorial on building a gpu compiler backend in LLVM
Last synced: 14 Oct 2024
https://github.com/roflmaostc/radonka.jl
A simple yet sufficiently fast (attenuated) Radon and backproject implementation using KernelAbstractions.jl. Runs on CPU, CUDA, ...
automatic-differentiation computed-tomography ct cuda gpu julia julia-language optimization radon radon-transform tomography x-ray
Last synced: 12 Oct 2024
https://github.com/alankrantas/tensorflow-cuda-gpu-devcontainer
Tensorflow CUDA DevContainer Configuration for Supporting NVIDIA GPU
cuda cudnn deep-learning devcontainer gpu-acceleration keras machine-learning nvidia nvidia-gpu tensorflow
Last synced: 11 Nov 2024
https://github.com/bigsk1/podcast-ai
AI podcast summary from a YouTube video using Anthropic or xAI and ElevenLabs voices
ai-podcast anthropic-claude claude-ai claude-api cuda cudnn elevenlabs elevenlabs-api faster-whisper ffpmeg podcast review-tools xai xai-api youtube yt-dlp
Last synced: 11 Jan 2025
https://github.com/jacobtomlinson/advent-of-gpu-code-2020
Solutions for Advent of Code 2020 written for the GPU in Python
advent-of-code cuda gpu jupyter-notebooks numba python
Last synced: 29 Oct 2024
https://github.com/101001000/tfg-pathtracer
CUDA Path tracing render engine, with MIS and the Disney BRDF
cuda pathtracing raytracing renderer
Last synced: 14 Nov 2024
https://github.com/landslidesim/materialpointsolver.jl
🧮 High-performance Material Point Method (MPM) Solver in Julia.
cuda material-point-method mpm parallel-computing rocm
Last synced: 05 Feb 2025
https://github.com/torinos-yt/nnonnx
Using CUDA for Faster Machine Learning Inference on Unity
cuda machine-learning onnxruntime unity
Last synced: 20 Nov 2024
https://github.com/romnn/microgpusim
Cycle-level, trace-driven, parallel GPU simulator for NVIDIA Pascal.
cuda cycle-level design-space-exploration gpgpu gpu nvbit nvidia performance-engineering rust simulation trace-driven
Last synced: 18 Nov 2024
https://github.com/mr-technologies/imagebroker
Example of image export from MRTech IFF SDK
c camera cuda demosaicing dng genicam gpu h264 h265 image-processing jetson json low-latency machine-vision mipi opencv rest-api rtsp tiff vulkan
Last synced: 24 Nov 2024
https://github.com/daschr/cuda_firewall
Implementing a Firewall using dpdk and CUDA
Last synced: 03 Feb 2025
https://github.com/alesiong/template-matching
Simple template matching by GPU (CUDA)
computer-vision cuda template-matching
Last synced: 29 Nov 2024
https://github.com/chrxh/alien-docs
Documentation for ALIEN
cuda evolution physics-simulation simulation
Last synced: 15 Oct 2024
https://github.com/mnicely/computeworks_examples
Matrix multiplication example performed with OpenMP, OpenACC, BLAS, cuBLAS, and CUDA
blas cublas cuda docker eclipse-plugin nsight nvidia nvidia-docker openacc openmp pgi-compiler
Last synced: 15 Oct 2024
https://github.com/dendenxu/bvh-ray-tracing
CUDA Ray Tracing using BVH. Forked and modified from https://github.com/YuliangXiu/bvh-distance-queries
bvh cuda pytorch ray-tracing ray-triangle-intersection
Last synced: 27 Jan 2025
https://github.com/tigercosmos/simple-vgg16-cu
Simple VGG16 implemented in CUDA
Last synced: 15 Oct 2024
https://github.com/dusanerdeljan/stereo-depth
Bachelor thesis - GPU accelerated single view passive stereo depth estimation pipeline
convolutional-neural-networks cuda depth-estimation pytorch real-time stereo-matching stereo-vision
Last synced: 11 Oct 2024
https://github.com/eve-ning/glcm-cupy
GLCM in CUDA
computer-vision cuda cupy feature-engineering glcm python
Last synced: 27 Oct 2024
https://github.com/bybatkhuu/wiki
Personal wiki for public.
cuda docker docker-compose linux manuals nvidia-docker nvidia-gpu wiki
Last synced: 24 Oct 2024
https://github.com/valentingol/deep-learning-installation
This tutorial provides a step-by-step pipeline for installing an effective Python setup optimized for deep learning on Ubuntu LTS. It covers the libraries needed to use the latest versions of TensorFlow and PyTorch efficiently with the GPU, plus a comfortable working environment with a flexible, highly customizable IDE (VSCode) and environment manager (Virtualenv/VirtualenvWrapper).
cuda deep-learning deep-learning-library deep-learning-tutorial setup tutorial virtualenv virtualenvwrapper vscode
Last synced: 11 Oct 2024
https://github.com/cms-patatrack/cluestering
Density-based clustering algorithm developed at CERN
alpaka cern clustering cpp cuda pybind11 python tbb
Last synced: 30 Oct 2024
https://github.com/frgfm/torch-cuda-template
Template for CUDA / C++ extension writing with PyTorch
cpp cuda pytorch pytorch-extension
Last synced: 05 Dec 2024
https://github.com/benediktalkin/kappaprofiler
Lightweight, simple profiling for Python/PyTorch
Last synced: 09 Nov 2024
https://github.com/flyingfathead/dvr-yolov8-detection
Python+YOLOv8-based human/animal/object detection DVR framework with GUI, webUI and Telegram alerts
automation cctv cuda dvr dvr-tool human-activity-recognition human-detection object-detection opencv opencv-python opencv2 opencv2-python python rtmp security video-processing yolo yolo-detection-framework yolov8
Last synced: 12 Nov 2024
https://github.com/abus-aikorea/aria-coversong
The best Gradio web UI for creating cover songs using MDX-Net and RVC. Easy one-click installation. Fully portable.
cuda demucs gradio karaoke mdx-net nvidia python pytorch rvc song-covers uvr vocal-remover voice-conversion
Last synced: 10 Nov 2024
https://github.com/microsoft/hat
TOML-annotated C header file format for packaging binary files, from Microsoft Research
benchmarking cpp cprogramming cuda metadata platform-independent python-library rocm toml
Last synced: 12 Oct 2024
https://github.com/lanl/stcuda
StCUDA allows Smalltalk to call CUDA Driver APIs to do GPU computing
Last synced: 09 Dec 2024
https://github.com/superlinear-ai/python-gpu
🐳 Python GPU adds a minimal install of CUDA and cuDNN on top of the official python:3.x-slim base image
cuda cudnn docker docker-image python
Last synced: 11 Nov 2024
https://github.com/bhattbhavesh91/cudf-rapids-demo
A simple demo of cuDF which is a RAPIDS GPU-Accelerated Dataframe Library!
arrow cuda cudf demo gpu gpu-dataframe pandas python rapids
Last synced: 16 Nov 2024
https://github.com/p-ranav/vulkan-earth
Vulkan-based 3D Rendering of Earth
3d cuda engine gpu rendering simulation vulkan
Last synced: 13 Nov 2024
https://github.com/elftausend/nvjpeg-rs
Rust bindings to the nvJPEG library.
bindings cuda ffi ffi-bindings ffi-wrapper image-processing jpg nvjpeg rust rust-lang
Last synced: 08 Dec 2024
https://github.com/rmiguelkelly/quickcluster
A KMeans implemented in C++ with Python bindings and GPU acceleration
clustering clustering-algorithm cpp cuda gpu kmeans kmeans-clustering metal objective-c python python3 unsupervised-learning
Last synced: 12 Oct 2024
https://github.com/jundaf2/gpu-tensor-permute
permute sequence data on GPU with high bandwidth
cuda gpu-acceleration sequence-to-sequence
Last synced: 15 Nov 2024
https://github.com/dancing-ui/uestc_vhm
Object tracking using YOLOv8, fast-reid, and DeepSORT
cuda deepsort dockerfile fast-reid tensorrt yolov8n
Last synced: 23 Oct 2024
https://github.com/bokutotu/zenu
Deep Learning Framework Written in Rust
ai autograd blas cublas cuda cudnn deep-learning deep-neural-networks gpu-computing hpc rust
Last synced: 15 Dec 2024
https://github.com/elsa-lab/base-env
Basis of ELSA computational platform
cuda machine-learning server-utility ubuntu
Last synced: 11 Nov 2024
https://github.com/nolmoonen/jpeggpu
Low-latency CUDA JPEG decoder by parallelizing Huffman decoding
Last synced: 23 Oct 2024
https://github.com/radenmuaz/slope-ad
A small automatic differentiation engine, supporting higher-order derivatives
array autograd automatic-differentiation cuda gradient iree jvp machine-learning metal mlir onnx onnxruntime tensor vjp
Last synced: 08 Dec 2024
https://github.com/jpuigcerver/nnutils
CPU & CUDA implementation of several neural network utils
cuda deep-learning neural-networks openmp pytorch
Last synced: 02 Dec 2024
https://github.com/marcogarlet/cuda_cubeattack
CUDA implementation of Cube Attack
Last synced: 11 Oct 2024
https://github.com/neoheartbeats/neoheartbeats
An architecture for LLMs' continual-learning and long-term memories
cuda fine-tuning llama-factory llm
Last synced: 15 Nov 2024
https://github.com/gapi505/sparky-2
A Discord bot running on llama.cpp with the Llama 3 model and image generation
ai cuda llama3 llamacpp stable-diffusion torch transformers
Last synced: 24 Jan 2025
https://github.com/bryanoliveira/cellular-automata
A cellular automata program built with C++, OpenGL, CUDA and OpenMP.
cellular-automata cuda life opengl openmp
Last synced: 03 Jan 2025
https://github.com/tudasc/cusan
A data race detector for CUDA C and C++ based on ThreadSanitizer
c cpp cuda datarace threadsanitizer
Last synced: 14 Dec 2024
https://github.com/alpaka-group/bactria
Broadly Applicable C++ Tracing and Instrumentation API 🐫
cuda hardware-counters instrumentation-api metrics rocm tracing-events
Last synced: 09 Nov 2024
https://github.com/pinto0309/realsense-cuda-opengl-docker
RealSense execution environment built on a Docker container on Ubuntu 20.04. NVIDIA GPU and OpenGL capable. CUDA 11.4.
cuda docker opengl realsense realsense2 ubuntu wsl2
Last synced: 29 Oct 2024
https://github.com/ashvardanian/scaling-democracy
GPU-accelerated Schulze voting method in Python, Numba, and CUDA, using ideas from Algebraic Graph Theory
cuda cuda-kernels dynamic-programming gpgpu graph-algorithms graph-theory pybind11 python voting
Last synced: 07 Nov 2024
https://github.com/neoblizz/hip_template
🖤 Template for starting a HIP/C++ project using CMake, with GitHub Actions for CI.
cpp cuda cuda-programming gpgpu gpu hip rocm template-project template-repository
Last synced: 30 Oct 2024
https://github.com/coderonion/cuda-beginner-course-rust-version
Companion code for the bilibili video series "CUDA 12.x Parallel Programming for Beginners (Rust Edition)"
candle cpp cublas cuda cuda-programming cudarc cudnn gpu gpu-programming nvcc nvidia parellel-programming python rust
Last synced: 19 Nov 2024
https://github.com/theochem/cugbasis
High performance CUDA/Python library for computing quantum chemistry density-based descriptors for larger systems using GPUs.
atoms-in-molecules computational-chemistry conceptual-dft cuda electron-density gpu python qtaim quantum quantum-chemistry theoretical-chemistry
Last synced: 17 Nov 2024
https://github.com/taeguk/dist-prog-assignment
Sogang Univ. Distributed Programming (CSE5414) Assignments.
assignment cuda distributed mpi-library openmp parallel pthreads sogang
Last synced: 03 Dec 2024
https://github.com/drsnowbird/cuda-pytorch-docker
Nvidia CUDA for GPU + PyTorch (latest) in Docker
cuda deep-learning docker gpu jupyter-notebook nvidia-gpu pytorch ssl-proxy
Last synced: 14 Nov 2024
https://github.com/bkraad47/fat_llama
fat_llama is a Python package for upscaling audio files to FLAC or WAV formats using advanced audio processing techniques. It utilizes CUDA-accelerated calculations to enhance audio quality by upsampling and adding missing frequencies through FFT, resulting in richer and more detailed audio.
audio audio-engineering audio-processing audiophile cuda cufft cupy fft flac hi-res hpc mp3 music nvidia ogg parallel-computing physics upscaling wav
Last synced: 23 Oct 2024
https://github.com/anantzoid/cuda-genetic-algorithm-travelling-salesman-problem
Implementation of Parallel Genetic Algorithm in CUDA to solve TSP (Berlin52)
c cuda genetic-algorithm tsp tsp-solver
Last synced: 01 Dec 2024
https://github.com/coderonion/cuda-beginner-course-python-version
Companion code for the bilibili video series "CUDA 12.x Parallel Programming for Beginners (Python Edition)"
cpp cublas cuda cuda-programming cudnn cupy gpu gpu-programming nvcc nvidia parallel-programming python rust
Last synced: 19 Nov 2024
https://github.com/rogerallen/smandelbrotr
SDL2 CUDA OpenGL Mandelbrot explorer.
cuda mandelbrot-viewer opengl sdl2
Last synced: 25 Nov 2024
https://github.com/potato3d/grid
GPU-accelerated uniform grid construction for ray tracing
cuda glsl gpu grid ray-tracing
Last synced: 10 Jan 2025
https://github.com/pkestene/tsp
Traveling salesman problem solved with different programming models
cea cpp cuda kokkos nvidia-gpu openacc openmp performance-portability stdpar sycl
Last synced: 18 Dec 2024
https://github.com/NCAR/micm
A model-independent chemistry module for atmosphere models
atmospheric-chemistry atmospheric-modeling atmospheric-science cuda gpu gpu-acceleration hpc ode-solver
Last synced: 27 Nov 2024
https://github.com/belval/raytracing
Using CUDA to implement "Ray Tracing in One Weekend" by Peter Shirley
cuda raytracing raytracing-in-one-weekend
Last synced: 14 Oct 2024
https://github.com/pyhf/cuda-images
pyhf Docker images built on Nvidia Container Toolkit enabled base images
cuda jax nvidia nvidia-cuda nvidia-docker pyhf
Last synced: 23 Nov 2024
https://github.com/cascadingradium/cuda-hungarian-clustering
A GPU-Accelerated Clustering Algorithm that uses the Hungarian method
clustering cpp cuda gpu hungarian-algorithm parallel-computing
Last synced: 19 Nov 2024
https://github.com/hrntsm/ghgpucomputingtest
Test of using CUDA with Alea GPU in Grasshopper.
Last synced: 27 Nov 2024
https://github.com/raymondcm/blockmatching
CPU and CUDA implementation of Full Exhaustive Block Matching Algorithm using Integral Images
block-matching-algorithm cuda integral-image parallel vision
Last synced: 21 Oct 2024
https://github.com/rogerallen/qtmandelbrotr
Qt CUDA Mandelbrot explorer
cuda cuda-opengl mandelbrot-viewer qt5
Last synced: 25 Nov 2024
https://github.com/yuvix25/py2cuda
Convert Python 3 code to CUDA code.
converter cuda gpu gpu-acceleration python python3
Last synced: 05 Jan 2025
https://github.com/mr-technologies/crowsnest
MRTech IFF SDK web interface sample
camera cuda demosaicing dng frontend genicam gpu h264 image-processing jetson json low-latency machine-vision mipi rest-api rtsp tiff vulkan webrtc ximea
Last synced: 24 Jan 2025
https://github.com/enfiskutensykkel/cuda-rdma-bench
NVIDIA GPU direct RDMA using SISCI API
cuda dma gpudirect-rdma pcie rdma sisci
Last synced: 01 Nov 2024
https://github.com/shunk031/nvinfo-go
Rewrite of ikr7/nvinfo, a simple utility for monitoring your CUDA-enabled GPUs, with Golang
cli cuda go golang gpu nvidia nvidia-smi
Last synced: 15 Dec 2024