Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
CUDA
![CUDA logo](https://explore-feed.github.com/topics/cuda/cuda.png)
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
- GitHub: https://github.com/topics/cuda
- Wikipedia: https://en.wikipedia.org/wiki/CUDA
- Created by: NVIDIA
- Released: June 23, 2007
- Related Topics: nvcc
- Last updated: 2025-02-13 00:07:16 UTC
https://github.com/potato3d/grid-rt
GPU-accelerated ray tracing using GLSL and CUDA
cuda glsl gpu ray-tracing real-time-rendering
Last synced: 10 Jan 2025
https://github.com/pfcclab/open3d
Open3D: A Modern Library for 3D Data Processing
3d 3d-perception arm computer-graphics cpp cuda gpu gui machine-learning mesh-processing odometry opengl paddle pointcloud python reconstruction registration rendering tensorflow visualization
Last synced: 21 Jan 2025
https://github.com/navdeep-g/dimreduce4gpu
Dimensionality reduction ("dimreduce") on GPUs ("4gpu")
cplusplus cuda dimensionality-reduction gpu linear-algebra pca python svd unsupervised-learning
Last synced: 24 Dec 2024
https://github.com/rogerallen/jmandelbrotr
Java CUDA Mandelbrot explorer
cuda cuda-opengl java jcuda joml lwjgl3 mandelbrot-viewer opengl
Last synced: 25 Jan 2025
https://github.com/bensuperpc/easyai
Make your own AI easily!
ai cuda python python3 tensorflow
Last synced: 17 Jan 2025
https://github.com/pd2871/high-performance-computing
This repo contains the logs of the High Performance Computing module's final assignment
blurred-images c cuda gaussian-blur matrix-multiplication multi-threading parallel-computing pthreads pthreads-api
Last synced: 25 Jan 2025
https://github.com/lmlsna/install-scripts
Ubuntu install scripts
cuda do-release-upgrade eol nvidia tailscale ubuntu
Last synced: 25 Nov 2024
https://github.com/aiday-mar/mpi-cuda-project
Using MPI and CUDA to accelerate execution of the conjugate gradient algorithm in C++
c-plus-plus cuda gpu mpi university-project
Last synced: 05 Jan 2025
https://github.com/matthewfeickert/cuda-tf-torch
An Ubuntu 18.04 NVIDIA Docker image with CUDA 10.1, cuDNN 7, TensorFlow, and PyTorch
cuda cuda-101 cudnn cudnn-v7 docker docker-image gpu nvidia-docker nvidia-gpu pytorch tensorflow torch
Last synced: 01 Feb 2025
https://github.com/webis-de/pytorch-window-matmul
A custom CUDA kernel for windowed matrix multiplication
Last synced: 24 Dec 2024
https://github.com/l30nardosv/reproduce-parcosi-moleculardocking
Reproducing paper: "Benchmarking the Performance of Irregular Computations in AutoDock-GPU Molecular Docking"
autodock-gpu cpu cuda gpu molecular-docking molecular-docking-scripts opencl paper reproducible-research
Last synced: 05 Feb 2025
https://github.com/dark-art108/artistic-style-transfer-cnn
cnn-architecture colab-notebooks cuda pil vgg19
Last synced: 11 Jan 2025
https://github.com/sagdrip/cudarrows
CUDA port of Logic Arrows
cellular-automata cuda gpu-acceleration logic-gates
Last synced: 21 Jan 2025
https://github.com/betarixm/cuecc
POSTECH: Heterogeneous Parallel Computing (Fall 2023)
cryptography ctypes cuda ecc postech secp256k1
Last synced: 18 Nov 2024
https://github.com/hanzhi713/bitonic-sort
In-place GPU sort with bitonic sort
bitonic-sort cuda gpu in-place sorting
Last synced: 26 Jan 2025
https://github.com/qin-yu/julia-svm-gpu-cuda
2019 [Julia] GPU CUDAnative SVM: a stochastic decomposition implementation of support-vector machine training
cpp cuda cuda-programming gpu gpu-computing gpu-programming julia julia-language julia-package machine-learning machine-learning-algorithms machine-learning-library online-learning supervised-learning svm svm-classifier svm-learning svm-library svm-model svm-training
Last synced: 22 Jan 2025
https://github.com/tawssie/zmpy3d_cp
Python implementation of 3D Zernike moments with CuPy
3d-zernike cuda cupy gpu protein-structure python structural-bioinformatics superposition zernike-moments
Last synced: 08 Nov 2024
https://github.com/ilyasmoutawwakil/optimum-whisper-autobenchmark
A set of benchmarks on OpenAI's Whisper model, using AutoBenchmark and Optimum's OnnxRuntime Optimizations.
Last synced: 30 Jan 2025
https://github.com/tthebc01/cudaconda3
Lightweight container environment with CUDA, Miniconda3, and JupyterLab.
cuda docker gpu jupyterlab marimo-notebook miniconda3 reverse-proxy-application
Last synced: 03 Jan 2025
https://github.com/andreasholt/cusmc
A CUDA-accelerated Statistical Model Checker for Stochastic Timed Automata
Last synced: 02 Jan 2025
https://github.com/infotrend-inc/ctpo-demo_projects
Jupyter Notebook examples using CTPO as their source container.
cuda opencv pytroch tensorflow2
Last synced: 05 Feb 2025
https://github.com/trick-17/backends
Interchangeable backends in C++, OpenMP, CUDA, OpenCL, OpenACC
c-plus-plus cross-platform cuda cuda-backend header-only openacc openacc-backend opencl opencl-backend openmp openmp-backend
Last synced: 13 Jan 2025
https://github.com/shikha-code36/cuda-programming-beginner-guide
A beginner's guide to CUDA programming
cuda cuda-basic cuda-basics cuda-cpp cuda-demo cuda-kernel cuda-kernels cuda-library cuda-programming cuda-support cuda-toolkit
Last synced: 13 Jan 2025
https://github.com/dotblueshoes/robertscross
The Roberts cross operator is used in image processing and computer vision for edge detection.
cuda edge-detection image-processing
Last synced: 05 Feb 2025
https://github.com/brosnanyuen/raybnn_graph
Graph Manipulation Library For GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI
cuda gpu graph graph-algorithms neural-network neural-networks opencl raybnn rust
Last synced: 13 Nov 2024
https://github.com/mayukhdeb/patrick
Tiny neural net library written from scratch with CuPy (under construction)
cuda deep-learning gpu-computing machine-learning neural-network regression
Last synced: 12 Feb 2025
https://github.com/dvhh/masscorrelation
An exercise in writing an efficient correlation calculator
calculations correlation-calculation cuda matrix multi-threading openmp
Last synced: 25 Jan 2025
https://github.com/makischristou/mandelbrot
Mandelbrot set visualizer using CUDA.
cpp cuda gpu mandelbrot nvidia renderer rust
Last synced: 20 Jan 2025
https://github.com/brosnanyuen/raybnn_dataloader
Data Loader for RayBNN
arrayfire cpu csv csv-parser cuda data-structures gpu-computing oneapi opencl parallel parallel-computing rust
Last synced: 13 Jan 2025
https://github.com/ergonomech/comfyui-windows-installer
Automated setup for ComfyUI on Windows with CUDA, custom plugins, and optimized PyTorch settings. Made to run as a server and error-correct. Easy installation and launch using Miniconda.
automation comfy conda conda-environment cuda hosting-deployment setup windows
Last synced: 06 Feb 2025
https://github.com/quantum-integrated-technologies/deepforge
DeepForge: a framework for working with machine learning.
ai artificial-intelligence cuda library machine-learning ml neural-network
Last synced: 10 Feb 2025
https://github.com/ssoehdata/cuda_fortran_sci_eng
Working through examples from the book CUDA Fortran for Scientists and Engineers, 2nd Edition
cuda cuda-fortran fortran hpc nvfortran
Last synced: 10 Dec 2024
https://github.com/dafadey/GPGPU_OpenCL_vs_CUDA
A repository of sample code for testing memory bandwidth, arithmetic latency hiding, and shared/local memory performance on AMD and NVIDIA devices
cuda gpgpu gpgpu-computing opencl
Last synced: 19 Nov 2024
https://github.com/tlabaltoh/tlab-sharescreen-server-win
Software frame encoder that uses CUDA and casts encoded frames over UDP. An attempt to implement a custom streaming protocol and a shader-based frame encoder/decoder for screencasting.
cuda desktop-capture screensharing unity unity3d windows-graphics-capture
Last synced: 28 Jan 2025
https://github.com/dhruvsrikanth/cudann
A distributed implementation of a deep learning framework in CUDA.
cpp cuda deep-learning deep-learning-framework gpu-programming high-performance-computing hpc parallel-programming
Last synced: 25 Dec 2024
https://github.com/headless-start/data-augmentation-impact
This repository examines the effect of data augmentation of the training set during model training.
augmented-images cuda data gpu keras matplotlib mnist opencv-python python3 tensorflow training-data
Last synced: 08 Feb 2025
https://github.com/ezroot/gacc
GIACC - Generate Images, Art, Code and Conversations
ai codegen cuda huggingface image imagegeneration python rust stablediffusion
Last synced: 18 Jan 2025
https://github.com/ophoperhpo/dcgan-lentach-logo-generator
The Lentach logo generator. #MachineLearningFun
cuda dcgan dcgan-tensorflow keras lentach machinelearning ml
Last synced: 04 Jan 2025
https://github.com/ruturaj4/cuda_nvidia_tutorial
CUDA projects
cuda cuda-vector-addition nvidia nvidia-cuda parallel
Last synced: 16 Jan 2025
https://github.com/abhisheknair10/occupancy.nn
A multi-step pipeline to train and run inference with Occupancy Networks
Last synced: 13 Jan 2025
https://github.com/poodarchu/vision-lab
Computer Vision Experiments in all.
computer-vision cuda object-detection
Last synced: 28 Jan 2025
https://github.com/liuyuweitarek/pytorch-docker-builder
Automate PyTorch Docker image builds with compatible Python, CUDA, and Poetry versions, including CI/CD for testing.
cicd containerd cuda docker docker-image poetry-python python python3 pytorch pytorch-docker
Last synced: 24 Jan 2025
https://github.com/programmer-rd-ai/detectx
A Pythonic approach to object detection using Detectron2, a clean, modular framework for training and deploying computer vision models. DetectX simplifies the complexity of object detection while maintaining high performance and extensibility.
coco-dataset computer-vision computer-vision-library cuda deep-learning detectron2 faster-rcnn gpu-accelerated machine-learning ml-framework object-detection object-recognition python3 pytorch retinanet
Last synced: 12 Jan 2025
https://github.com/programmer-rd-ai/digivis
A PyTorch-based deep learning implementation for MNIST digit recognition featuring CNNs, GPU acceleration, experiment tracking, and comprehensive testing capabilities.
cnn computer-vision cuda data-science deep-learning digit-recognition image-classification machine-learning mnist neural-networks python pytorch wandb
Last synced: 12 Jan 2025
https://github.com/anne-andresen/multi-modal-cuda-c-gan
Raw C/CUDA implementation of a 3D GAN
3d 3d-models attention-mechanism c cross-attention cross-attention-c cuda gan gan-models low-level-programming medical-imaging multimodal-deep-learning pytorch transformer-pytorch transformers transformers-c
Last synced: 05 Nov 2024
https://github.com/kaierikniermann/hpc-uzh-notes
These are some notes for the High Performance Computing course taught at UZH
cuda high-performance-computing mpi openacc openmp
Last synced: 12 Jan 2025
https://github.com/ehsanmok/cs-521
UBC CS 521: Parallel Computing and Architectures
cuda erlang parallel-algorithm parallel-computing
Last synced: 10 Jan 2025
https://github.com/assem-elqersh/tensorflow-gpu-setup
This guide provides the essential steps to get TensorFlow running with GPU support on your Windows system.
anaconda conda cuda cudnn deep-learning gpu machine-learning tensorflow
Last synced: 03 Jan 2025
https://github.com/jonasricker/autocvd
Tool to automatically set CUDA_VISIBLE_DEVICES based on GPU utilization. Usable from command line and code.
cuda cuda-visible-devices gpu keras machine-learning nvidia python pytorch tensorflow
Last synced: 03 Jan 2025
https://github.com/bl33h/productoftwovectors
This code uses CUDA for parallel vector multiplication on a GPU, demonstrating the GPU's acceleration capabilities.
cuda gpu kernel paralelism parallel-programming product vector
Last synced: 21 Jan 2025
https://github.com/aliyoussef97/triton-hub
A container of various PyTorch neural network modules written in Triton.
cuda deep-learning openai pytorch triton triton-lang
Last synced: 05 Feb 2025
https://github.com/rkv0id/automata-vtk
Multi-dimensional Cellular Automata visualization using Python's VTK bindings on top of CUDA-parallel grid updates.
cellular-automata cuda game-of-life python vtk
Last synced: 03 Jan 2025
https://github.com/abhinavsharma07/streamlit
Stable Diffusion
clip cuda denoising diffusers generative-models latent-diffusion latent-space lms-scheduler unet
Last synced: 05 Feb 2025
https://github.com/whutao/artificial-art
Image approximation with triangles using an evolutionary algorithm.
cuda evolutionary-algorithm python3
Last synced: 16 Jan 2025
https://github.com/speedcell4/torchdevice
Set up CUDA_VISIBLE_DEVICES
cuda deep-learning gpu machine-learning pytorch
Last synced: 08 Feb 2025
https://github.com/fblupi/grado_informatica-ppr
Lab assignments for the Parallel Programming (Programación Paralela) course at UGR
cuda mpi openmp parallel-computing
Last synced: 30 Jan 2025
https://github.com/jakubriegel/game_of_life_3d
3D Game of Life implemented in CUDA
concurency cuda gameoflife nvidia put-poznan
Last synced: 01 Feb 2025
https://github.com/adamczykpiotr/cudamatrixlibrary
Matrix operation library that runs on a single thread, n threads, or a CUDA-capable GPU
agh agh-ust cpp cuda cuda-library matrix matrix-computations matrix-functions matrix-multiplication
Last synced: 19 Jan 2025
https://github.com/lcsb-biocore/cufluxsampler.jl
GPU-accelerated algorithms for flux sampling in CUDA.jl
cobra cuda gpu julia metabolic-network metabolism sampling
Last synced: 30 Jan 2025
https://github.com/mala13f/statistical-learning-in-finance
This repository contains all the code, papers, and related data for assignments completed during the course.
cuda gpu-acceleration jupyter-notebook machine-learning python statistical-learning
Last synced: 31 Jan 2025
https://github.com/galaxies99/inception-cuda
CUDA Implementation of Inception
Last synced: 07 Nov 2024
https://github.com/microo8/micronn
Simple neural network library with backpropagation using CUDA
Last synced: 26 Jan 2025
https://github.com/emmanuelmess/firstcollisiontimesteprarefiedgassimulator
This simulator computes all possible intersections for a very small timestep for a particle model
Last synced: 15 Jan 2025
https://github.com/alegau03/parallel-k-means
Implementation of C programs for the K-Means algorithm for parallel computing.
c c-programming cuda parallel parallel-programming
Last synced: 05 Feb 2025
https://github.com/andih/cuda-fortran-stream
Variant of STREAM Benchmark in CUDA Fortran
cuda cuda-fortran gpu stream-benchmarks variants
Last synced: 12 Jan 2025
https://github.com/piyush26c/cuda-programming
c cuda ipynb-jupyter-notebook mathematics sppu-computer-engineering
Last synced: 12 Jan 2025
https://github.com/bjornmelin/deep-learning-evolution
🧠Deep-Learning Evolution: Unified collection of TensorFlow & PyTorch projects, featuring custom CUDA kernels, distributed training, memory‑efficient methods, and production‑ready pipelines. Showcases advanced GPU optimizations, from foundational models to cutting‑edge architectures. 🚀
ai-research cuda data-science deep-learning distributed-training gan gpu-acceleration machine-learning model-optimization neural-networks python pytorch tensorflow training-pipeline transformers
Last synced: 05 Feb 2025
https://github.com/kar-dim/fidelityfx-cas-cuda
Implementation of the AMD FidelityFX CAS (Contrast Adaptive Sharpening) algorithm on CUDA, for sharpening static images.
cpp cuda dll fidelityfx gpu image-processing parallel-computing sharpen
Last synced: 26 Dec 2024
https://github.com/dansolombrino/gphungarian
A GPU-accelerated implementation of the Hungarian Algorithm, written in CUDA
Last synced: 07 Feb 2025
https://github.com/malolm/jupyter-ml-with-gpu-support
Jupyter with GPU acceleration for Windows 10/11
cuda cudnn jupternotebook jupyter jupyterlab nvidia-gpu windows-10 windows-11
Last synced: 06 Feb 2025
https://github.com/sanaeprj/matrix-for-cpp
This repository has types that handle matrices.
cpp14 cpp14-library cuda matrix-library
Last synced: 19 Nov 2024
https://github.com/gordonkoerner1/gordo_cuda
Library of Cython Wrappers for the NVIDIA API
cuda python sparse-linear-algebra sparse-linear-solver sparse-linear-systems
Last synced: 05 Feb 2025
https://github.com/giorgiogamba/parallel_programming
Experimenting with parallel programming
cuda cuda-kernels cuda-programming cuda-toolkit parallel parallel-computing parallel-processing parallel-programming visual-studio
Last synced: 30 Dec 2024
https://github.com/enp1s0/curand_fp16
FP16 pseudo random number generator on GPU
cuda gpu half-precision random-number-generators
Last synced: 26 Dec 2024
https://github.com/duskvirkus/ofxarrayfire
An openFrameworks addon with pre-compiled binaries of ArrayFire.
arrayfire cuda ofxaddon openframeworks openframeworks-addon
Last synced: 25 Jan 2025
https://github.com/zeloe/juce_cuda_convolution
Linear realtime convolution using CUDA
audio audio-processing convolution cuda dsp juce
Last synced: 25 Dec 2024
https://github.com/le-ander/msc_bioinfo-experimental_design
Using information theory to inform experimental design with GPU acceleration. Computing group project as part of the MSc in Bioinformatics and Theoretical Systems Biology at Imperial College London 2016/2017.
cuda experimental-design gpu-computing information-theory pycuda systems-biology
Last synced: 31 Jan 2025
https://github.com/franciscoda/psvm
R package and C++ library for training SVM models on a GPU using CUDA and predicting out-of-sample data. A support vector machine (SVM) is a type of machine learning model that is trained using supervised data to classify samples.
cpp cpp17 cuda machine-learning r svm-classifier svm-training
Last synced: 28 Jan 2025
https://github.com/stanczakdominik/cuda_poisson
A 2D Poisson solver via CUDA
Last synced: 04 Feb 2025
https://github.com/5had3z/torch-discounted-cumsum-nd
PyTorch Discounted Cumsum with Autograd (CPU + CUDA)
Last synced: 05 Feb 2025
https://github.com/antonioberna/nn-gpu-logic-gates
Neural Network implementation on GPU using CUDA C++ to learn logic gate operations
cpp cuda gpu logic-gates neural-networks nvidia
Last synced: 05 Feb 2025
https://github.com/hyunjinno/multicore_computing
A repository of multicore programming in Java and C.
c cpp cuda java multithreading openmp thread thrust
Last synced: 25 Jan 2025
https://github.com/pratikvn/nla4hpc-exercises-framework
The exercises framework for the Numerical Linear Algebra for HPC course at Karlsruhe Institute of Technology.
cuda ginkgo homeworks hpc-course teaching
Last synced: 26 Jan 2025
https://github.com/sohhamseal/scalable-systems-programs
A little less effort to learn parallel programming...
Last synced: 13 Jan 2025