Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
CUDA
![](https://explore-feed.github.com/topics/cuda/cuda.png)
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
- GitHub: https://github.com/topics/cuda
- Wikipedia: https://en.wikipedia.org/wiki/CUDA
- Created by: NVIDIA
- Released: June 23, 2007
- Related Topics: nvcc
- Last updated: 2025-02-13 00:07:16 UTC
https://github.com/neoblizz/spmv
Efficient Sparse Matrix-Vector Multiplication (SpMV) using ModernGPU (MTX + CSR formats).
csr cuda gpgpu load-balancing mtx spmv
Last synced: 09 Feb 2025
https://github.com/matthewfeickert/cuda-tf-torch
An Ubuntu 18.04 NVIDIA Docker image with CUDA 10.1, cuDNN 7, TensorFlow, and PyTorch
cuda cuda-101 cudnn cudnn-v7 docker docker-image gpu nvidia-docker nvidia-gpu pytorch tensorflow torch
Last synced: 01 Feb 2025
https://github.com/yingding/applyllm
A Python package for applying LLMs with LangChain and Hugging Face on a local CUDA/MPS host
accelerator batch cuda framework inference kubeflow langchain llm mps pipeline slurm transformers
Last synced: 22 Dec 2024
https://github.com/lchsk/ney
A header-only parallel functions library for Intel Xeon/Xeon Phi/GPUs
cuda gpu linux parallel phi scientific xeon xeonphi
Last synced: 08 Jan 2025
https://github.com/shikha-code36/cuda-programming-beginner-guide
A beginner's guide to CUDA programming
cuda cuda-basic cuda-basics cuda-cpp cuda-demo cuda-kernel cuda-kernels cuda-library cuda-programming cuda-support cuda-toolkit
Last synced: 13 Jan 2025
https://github.com/cfries/javagpuexperiments
Repository used to demo OpenCL, JOCL, JCuda.
Last synced: 27 Dec 2024
https://github.com/ehwan/r-star-tree
Header-only, STL-like, templated N-dimensional R*-tree implementation in C++14
algorithm cplusplus-14 cuda eigen3 geometric-algorithms gpgpu header-only linear-algebra modern-cpp opencl rtree stl-like template traits tree-structure
Last synced: 27 Oct 2024
https://github.com/isazi/aoflagger
AOFlagger Radio Frequency Interference mitigation algorithm.
Last synced: 30 Jan 2025
https://github.com/peri044/cuda
GPU implementations of algorithms
cuda gauss-jordan parallel-programming
Last synced: 08 Feb 2025
https://github.com/babak2/optimizedsum
Optimized Parallel Sum program demonstrating CPU vs GPU performance
cuda cuda-programming gpu-acceleration gpu-computing gpu-parallelism visual-studio
Last synced: 01 Feb 2025
https://github.com/dpbm/qml-course
Mini-course on quantum machine learning
cuda cuda-q cuquantum docker ml python qml quantum quantum-computing tensorflow
Last synced: 21 Dec 2024
https://github.com/nachovizzo/saxpy_openacc_cpp
My way of thinking about OpenACC, C++, and Parallel computing in general
Last synced: 30 Jan 2025
https://github.com/trilliwon/cuda-examples
CUDA examples
cuda gpu-computing nvidia-cuda parallel parallel-computing parallel-programming
Last synced: 30 Jan 2025
https://github.com/stdogpkg/cukuramoto
A Python/CUDA package that numerically solves the Kuramoto model using Heun's method
complex-networks cuda kuramoto-model
Last synced: 29 Jan 2025
https://github.com/dereklstinson/nccl
Go wrapper for NCCL
cuda deep-learning go nccl parallel-computing
Last synced: 15 Jan 2025
https://github.com/frozenassassine/neuralnetwork-fromscratch
Neural Network from scratch in C# with CUDA support
ai classification csharp cuda gpu gpu-acceleration neural-network neural-networks nvidia
Last synced: 13 Nov 2024
https://github.com/cppalliance/crypt
A C++20 module of cryptographic utilities for CPU and GPU
Last synced: 09 Jan 2025
https://github.com/B1-663R/docker-mining
Dockerfiles to build docker images to start mining with an NVIDIA Docker architecture
cryptocurrency cuda docker-image docker-nvidia mining
Last synced: 31 Oct 2024
https://github.com/mulx10/firefly
Enhancing object detection using thermal imaging for thin cross-section, hard-to-identify objects (e.g., cyclists, pedestrians).
autonomous-cars autonomous-navigation autonomous-vehicles c cuda object-detection thermal-camera yolov3
Last synced: 30 Dec 2024
https://github.com/dzimiks/cuda-matrix-multiplication
CUDA Matrix Multiplication
cuda matrix matrix-multiplication python
Last synced: 03 Jan 2025
https://github.com/grakshith/parallel-k-means
K-Means clustering for Image Colour Quantization and Image Compression
cuda image-color-quantization image-compression k-means mpi opencv openmp
Last synced: 06 Jan 2025
https://github.com/mrglaster/cuda-acfcalc
Calculation of the smallest ACF for signals of length N using CUDA technology.
acf c calculations cpp cuda google-colaboratory google-colaboratory-notebooks isu
Last synced: 15 Jan 2025
https://github.com/sagdrip/cudarrows
CUDA port of Logic Arrows
cellular-automata cuda gpu-acceleration logic-gates
Last synced: 21 Jan 2025
https://github.com/pfcclab/open3d
Open3D: A Modern Library for 3D Data Processing
3d 3d-perception arm computer-graphics cpp cuda gpu gui machine-learning mesh-processing odometry opengl paddle pointcloud python reconstruction registration rendering tensorflow visualization
Last synced: 21 Jan 2025
https://github.com/pothosware/pothosgpu
Pothos toolkit for ArrayFire API support
arrayfire cuda dataflow dataflow-programming gpu opencl pothos
Last synced: 15 Jan 2025
https://github.com/ilyasmoutawwakil/optimum-whisper-autobenchmark
A set of benchmarks on OpenAI's Whisper model, using AutoBenchmark and Optimum's OnnxRuntime Optimizations.
Last synced: 30 Jan 2025
https://github.com/programmer-rd-ai/digivis
A PyTorch-based deep learning implementation for MNIST digit recognition featuring CNNs, GPU acceleration, experiment tracking, and comprehensive testing capabilities.
cnn computer-vision cuda data-science deep-learning digit-recognition image-classification machine-learning mnist neural-networks python pytorch wandb
Last synced: 12 Jan 2025
https://github.com/andygeiss/machine-learning-golang
This repository provides a basic setup to do Machine Learning with Golang and Python, TensorFlow 1.15 and CUDA 10.0.
benchmark cuda docker go golang machine-learning python tensorflow
Last synced: 06 Feb 2025
https://github.com/snoopy3476/t-espresso
A CUDA Library for Low-overhead Host-to-Device Transmission of Patterned Profile Data
Last synced: 07 Nov 2024
https://github.com/kaierikniermann/hpc-uzh-notes
These are some notes for the High Performance Computing course taught at UZH
cuda high-performance-computing mpi openacc openmp
Last synced: 12 Jan 2025
https://github.com/gunrock/template
Template repository for essentials applications to get you started asap!
cpp cuda essentials gpu graph-algorithms graph-analytics gunrock
Last synced: 10 Jan 2025
https://github.com/tensorbfs/cutropicalgemm.jl
The fastest Tropical number matrix multiplication on GPU
Last synced: 20 Dec 2024
https://github.com/ashwani-rathee/imagesgpu.jl
Image Processing on GPU in Julia
cuda gpu image image-processing julia
Last synced: 21 Nov 2024
https://github.com/pvdberg1998/cufft_rust
A safe Rust wrapper around a subset of cuFFT.
Last synced: 12 Dec 2024
https://github.com/bjornmelin/deep-learning-evolution
🧠 Deep-Learning Evolution: Unified collection of TensorFlow & PyTorch projects, featuring custom CUDA kernels, distributed training, memory‑efficient methods, and production‑ready pipelines. Showcases advanced GPU optimizations, from foundational models to cutting‑edge architectures. 🚀
ai-research cuda data-science deep-learning distributed-training gan gpu-acceleration machine-learning model-optimization neural-networks python pytorch tensorflow training-pipeline transformers
Last synced: 05 Feb 2025
https://github.com/lcsb-biocore/cufluxsampler.jl
GPU-accelerated algorithms for flux sampling in CUDA.jl
cobra cuda gpu julia metabolic-network metabolism sampling
Last synced: 30 Jan 2025
https://github.com/matx64/rs-netbot
Old School Runescape (MMORPG) Bot created using a Convolutional Neural Network for object identification
Last synced: 09 Feb 2025
https://github.com/abdulfatir/subkmeans
NumPy and PyCUDA implementation of subKmeans
clustering cuda kdd kmeans numpy pycuda python subspace-clustering
Last synced: 09 Feb 2025
https://github.com/quantum-integrated-technologies/deepforge
DeepForge : framework for working with machine learning.
ai artificial-intelligence cuda library machine-learning ml neural-network
Last synced: 10 Feb 2025
https://github.com/sartajbhuvaji/cuda
Developed CUDA kernel functions to load and train a Convolutional Neural Network from scratch.
cuda cuda-programming gpu-programming neural-network nvidia-cuda
Last synced: 05 Feb 2025
https://github.com/david-palma/cuda-programming
Educational CUDA C/C++ programming repository with commented examples on GPU parallel computing, matrix operations, and performance profiling. Requires a CUDA-enabled NVIDIA GPU.
c-cpp cpp cuda cuda-toolkit education gpu gpu-programming kernel matrix-operations nvcc nvidia parallel-computing parallel-programming practice profiling threads
Last synced: 31 Jan 2025
https://github.com/whutao/artificial-art
Image approximation with triangles using evolutionary algorithm.
cuda evolutionary-algorithm python3
Last synced: 16 Jan 2025
https://github.com/nolmoonen/cuda-sdf
CUDA-accelerated path traced Menger sponge using ray marching.
cuda menger path-tracer ray-marching sdf
Last synced: 05 Feb 2025
https://github.com/ruturaj4/cuda_nvidia_tutorial
CUDA projects
cuda cuda-vector-addition nvidia nvidia-cuda parallel
Last synced: 16 Jan 2025
https://github.com/weiyu0824/flash-attention-lite
Basic FlashAttention implementation
Last synced: 05 Feb 2025
https://github.com/tyler-hilbert/cuda-linearregression
Linear Regression written from scratch in CUDA
ai cublas cuda gpu linear-regression nsight
Last synced: 05 Feb 2025
https://github.com/adamczykpiotr/cudamatrixlibrary
Matrix operation library using a single thread, n threads, or a CUDA-capable GPU
agh agh-ust cpp cuda cuda-library matrix matrix-computations matrix-functions matrix-multiplication
Last synced: 19 Jan 2025
https://github.com/rhysdg/whisper-onnx-python
A low-footprint GPU accelerated Speech to Text Python package for the Jetpack 5 era bolstered by an optimized graph
ai chatbot cuda machine-learning onnxruntime speech-to-text whisper
Last synced: 08 Feb 2025
https://github.com/alegau03/parallel-k-means
Implementation of C programs for the K-Means algorithm for parallel computing.
c c-programming cuda parallel parallel-programming
Last synced: 05 Feb 2025
https://github.com/mala13f/statistical-learning-in-finance
This repository contains all the code, papers, and related data for assignments done during the course.
cuda gpu-acceleration jupyter-notebook machine-learning python statistical-learning
Last synced: 31 Jan 2025
https://github.com/nickolasrm/gpuvscpumatrixmultiplication
CPU and GPU optimized matrix multiplication (AVX, transposition, CUDA and other)
avx comparison cuda hpc matrix multiplication
Last synced: 28 Dec 2024
https://github.com/fynv/cudainline
A CUDA interface for Python. A distillation of the engine part of ThrustRTC.
Last synced: 05 Feb 2025
https://github.com/speedcell4/torchdevice
Setup CUDA_VISIBLE_DEVICES
cuda deep-learning gpu machine-learning pytorch
Last synced: 08 Feb 2025
https://github.com/andih/cuda-fortran-stream
Variant of STREAM Benchmark in CUDA Fortran
cuda cuda-fortran gpu stream-benchmarks variants
Last synced: 12 Jan 2025
https://github.com/fandreuz/parallel-programming-for-hpc
Scientific codes in C/C++ with CUDA, OpenACC, FFTW, (cu)BLAS
Last synced: 21 Jan 2025
https://github.com/mayukhdeb/patrick
Tiny neural net library written from scratch with cupy :warning: under construction :warning:
cuda deep-learning gpu-computing machine-learning neural-network regression
Last synced: 12 Feb 2025
https://github.com/sbstndb/grayscott_k
A simple 3D GrayScott simulation using Kokkos enabling CUDA or OpenMP backend
cuda finite-difference grayscott grid kokkos laplacian openmp simulation visualisation
Last synced: 05 Feb 2025
https://github.com/inventwithdean/cuda_mlp
Implementation of a simple Multilayer Perceptron in pure CUDA
cuda cuda-programming deep-learning neural-networks
Last synced: 05 Feb 2025
https://github.com/dvhh/masscorrelation
An exercise in writing an efficient correlation calculator
calculations correlation-calculation cuda matrix multi-threading openmp
Last synced: 25 Jan 2025
https://github.com/franciscoda/psvm
R package and C++ library that allows training SVM models in a GPU using CUDA and predicting out-of-sample data. A support vector machine (SVM) is a type of machine learning model that is trained using supervised data to classify samples.
cpp cpp17 cuda machine-learning r svm-classifier svm-training
Last synced: 28 Jan 2025
https://github.com/dotblueshoes/robertscross
The Roberts cross operator is used in image processing and computer vision for edge detection.
cuda edge-detection image-processing
Last synced: 05 Feb 2025
https://github.com/antonioberna/nn-gpu-logic-gates
Neural Network implementation on GPU using CUDA C++ to learn logic gates operations
cpp cuda gpu logic-gates neural-networks nvidia
Last synced: 05 Feb 2025
https://github.com/5had3z/torch-discounted-cumsum-nd
PyTorch Discounted Cumsum with Autograd (CPU + CUDA)
Last synced: 05 Feb 2025
https://github.com/ergonomech/comfyui-windows-installer
Automated setup for ComfyUI on Windows with CUDA, custom plugins, and optimized PyTorch settings. Made to run as a server and correct errors automatically. Easy installation and launch using Miniconda.
automation comfy conda conda-environment cuda hosting-deployment setup windows
Last synced: 06 Feb 2025
https://github.com/poodarchu/vision-lab
Computer Vision Experiments in all.
computer-vision cuda object-detection
Last synced: 28 Jan 2025
https://github.com/dansolombrino/gphungarian
A GPU-accelerated implementation of the Hungarian Algorithm, written in CUDA
Last synced: 07 Feb 2025
https://github.com/wallneradam/docker-ccminer
CCMiner (tpruvot version) Docker Builder
ccminer cuda docker gpu litecoin miner monero nvidia nvidia-docker
Last synced: 01 Feb 2025
https://github.com/anne-andresen/multi-modal-cuda-c-gan
Raw C/CUDA implementation of a 3D GAN
3d 3d-models attention-mechanism c cross-attention cross-attention-c cuda gan gan-models low-level-programming medical-imaging multimodal-deep-learning pytorch transformer-pytorch transformers transformers-c
Last synced: 05 Nov 2024
https://github.com/kilamper/matrix-multiplication
AC - Matrix multiplication using OpenMP, MPI and CUDA
Last synced: 26 Jan 2025
https://github.com/brosnanyuen/raybnn_sparse
Sparse Matrix Library for GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI
arrayfire cpu cuda gpu gpu-computing opencl parallel parallel-computing parallel-programming raybnn rust sparse sparse-coding sparse-matrix sparse-neural-networks
Last synced: 13 Nov 2024
https://github.com/enriquebdel/clases-cuda-programacion-paralela-en-c-
This repository contains several lessons I created about the CUDA library in C. The program I use for programming is MobaXterm.
c cuda cuda-programming gnu-linux googlecolab mobaxterm nvidia parallel-programming ubuntu university
Last synced: 26 Jan 2025
https://github.com/ssoehdata/cuda_fortran_sci_eng
Working through examples from the book CUDA Fortran for Scientists and Engineers, 2nd Edition
cuda cuda-fortran fortran hpc nvfortran
Last synced: 10 Dec 2024
https://github.com/meirbek-dev/face-mask_detector
Real-time face mask detection
artificial-intelligence covid-19 cuda cudnn deep-learning face-mask graduation-project jupyter-notebook keras machine-learning mask-detection mobilnet-v2 object-detection object-recognition object-tracking opencv4-python python real-time supervised-learning tensorflow2-gpu
Last synced: 11 Jan 2025
https://github.com/pharmcat/metidacu.jl
CUDA solver for Metida.jl
cuda julia-language metida mixed-models
Last synced: 09 Feb 2025
https://github.com/pjueon/cuda_intellisense
A simple Python script to fix CUDA C++ IntelliSense for Visual Studio.
Last synced: 23 Oct 2024
https://github.com/patrickm663/localglmnet.jl
This is a WIP implementation of Richman & Wüthrich (2022) using Julia's Flux.jl + CUDA.jl
cuda deep-learning flux julia neural-networks symbolic-regression xai
Last synced: 17 Jan 2025
https://github.com/abhinavsharma07/streamlit
Stable Diffusion
clip cuda denoising diffusers generative-models latent-diffusion latent-space lms-scheduler unet
Last synced: 05 Feb 2025
https://github.com/dafadey/GPGPU_OpenCL_vs_CUDA
This is a repository with sample code for testing memory bandwidth, arithmetic latency hiding, and shared/local memory performance on AMD and NVIDIA devices
cuda gpgpu gpgpu-computing opencl
Last synced: 19 Nov 2024
https://github.com/alextmjugador/rust-cuda-quickstart
Bring the Rust-CUDA project back to life under modern Linux environments.
cuda cuda-programming cuda-rust cuda-support docker rust
Last synced: 26 Jan 2025
https://github.com/duskvirkus/ofxarrayfire
An openFrameworks addon with pre-compiled binaries of ArrayFire.
arrayfire cuda ofxaddon openframeworks openframeworks-addon
Last synced: 25 Jan 2025
https://github.com/miniex/maidenx
Rust-based CUDA library designed for learning purposes and building my AI engines named Maiden Engine
Last synced: 28 Oct 2024
https://github.com/le-ander/msc_bioinfo-experimental_design
Using information theory to inform experimental design with GPU acceleration. Computing group project as part of the MSc in Bioinformatics and Theoretical Systems Biology at Imperial College London 2016/2017.
cuda experimental-design gpu-computing information-theory pycuda systems-biology
Last synced: 31 Jan 2025
https://github.com/aliyoussef97/triton-hub
A container of various PyTorch neural network modules written in Triton.
cuda deep-learning openai pytorch triton triton-lang
Last synced: 05 Feb 2025
https://github.com/gordonkoerner1/gordo_cuda
Library of Cython Wrappers for the NVIDIA API
cuda python sparse-linear-algebra sparse-linear-solver sparse-linear-systems
Last synced: 05 Feb 2025
https://github.com/headless-start/data-augmentation-impact
This repository examines the effect of data augmentation of the training set during model training.
augmented-images cuda data gpu keras matplotlib mnist opencv-python python3 tensorflow training-data
Last synced: 08 Feb 2025
https://github.com/microo8/micronn
Simple neural network library with backpropagation using CUDA
Last synced: 26 Jan 2025
https://github.com/brendanbignell/cuda_montecarlooptionpricer
CUDA Monte Carlo Barrier Option Pricing Demo & JupyterLab ML models
cuda deep-learning ml pytorch quantitative-finance xgboost-regression
Last synced: 05 Feb 2025
https://github.com/hyunjinno/multicore_computing
A repository of multicore programming in Java and C.
c cpp cuda java multithreading openmp thread thrust
Last synced: 25 Jan 2025
https://github.com/emmanuelmess/firstcollisiontimesteprarefiedgassimulator
This simulator computes all possible intersections for a very small timestep for a particle model
Last synced: 15 Jan 2025
https://github.com/neoblizz/cupti-plus-plus
CUPTI++ is a C++ interface to the CUDA Profiling Tools Interface (CUPTI).
cpp cuda cuda-profiler cupti profiler
Last synced: 09 Feb 2025
https://github.com/malolm/jupyter-ml-with-gpu-support
Jupyter with GPU acceleration for Windows 10/11
cuda cudnn jupternotebook jupyter jupyterlab nvidia-gpu windows-10 windows-11
Last synced: 06 Feb 2025