Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
CUDA
![](https://explore-feed.github.com/topics/cuda/cuda.png)
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
- GitHub: https://github.com/topics/cuda
- Wikipedia: https://en.wikipedia.org/wiki/CUDA
- Created by: Nvidia
- Released: June 23, 2007
- Related Topics: nvcc
- Last updated: 2025-02-13 00:07:16 UTC
https://github.com/lcsb-biocore/cufluxsampler.jl
GPU-accelerated algorithms for flux sampling in CUDA.jl
cobra cuda gpu julia metabolic-network metabolism sampling
Last synced: 30 Jan 2025
https://github.com/microo8/micronn
Simple neural network library with backpropagation using CUDA
Last synced: 26 Jan 2025
https://github.com/patrickm663/localglmnet.jl
This is a WIP implementation of Richman & Wüthrich (2022) using Julia's Flux.jl + CUDA.jl
cuda deep-learning flux julia neural-networks symbolic-regression xai
Last synced: 17 Jan 2025
https://github.com/donpablonows/coin
🪙 Crypto Optimization Interface Network (aka COIN) is a high-performance Bitcoin address generator using CUDA acceleration and multi-threading. It optimizes GPU and CPU resources for fast address generation, ensures secure private key creation, and includes real-time monitoring and automatic system optimizations.
bitcoin blockchain cryptography cuda gpu-acceleration
Last synced: 07 Jan 2025
https://github.com/kilamper/matrix-multiplication
AC - Matrix multiplication using OpenMP, MPI and CUDA
Last synced: 26 Jan 2025
https://github.com/jakubriegel/game_of_life_3d
3D game of life implemented in CUDA
concurency cuda gameoflife nvidia put-poznan
Last synced: 01 Feb 2025
https://github.com/pkestene/mandelbrot_kokkos
cuda gpu gpu-computing kokkos mandelbrot openmp performance-portability
Last synced: 10 Feb 2025
https://github.com/nellogan/makefileexamples
Makefile examples of how to automate testing and building of applications/systems that use multiple languages, compilers, and testing tools.
automated-testing c cuda makefile python valgrind
Last synced: 21 Jan 2025
https://github.com/daelsepara/hipslm
CPU and GPU (using HIP) implementations of phase pattern generators for use with spatial light modulators
computer-generated-holography cuda gpu hip hologram holography phase phase-pattern slm spatial-light-modulator
Last synced: 29 Dec 2024
https://github.com/tensorbfs/cutropicalgemm.jl
The fastest Tropical number matrix multiplication on GPU
Last synced: 13 Feb 2025
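For readers unfamiliar with the term, tropical matrix multiplication replaces ordinary multiplication with addition and ordinary addition with max (or min). The following is a minimal CPU sketch of the max-plus variant in plain Python; the function name `tropical_matmul` is hypothetical and is not taken from the repository above, which targets the GPU.

```python
def tropical_matmul(a, b):
    """Multiply two matrices over the max-plus (tropical) semiring.

    Each output entry is max over k of (a[i][k] + b[k][j]), i.e. the usual
    matrix product with (+, *) replaced by (max, +).
    """
    if not a or not b or len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [max(a[i][k] + b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


# Small illustrative inputs (made up for this sketch).
A = [[0, 3], [2, 1]]
B = [[1, 4], [0, 2]]
C = tropical_matmul(A, B)
print(C)
```

The triple loop is the same shape as a standard GEMM, which is why this operation can reuse GEMM-style tiling on a GPU.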
https://github.com/quantum-integrated-technologies/deepforge
DeepForge: a framework for working with machine learning.
ai artificial-intelligence cuda library machine-learning ml neural-network
Last synced: 10 Feb 2025
https://github.com/sunsided/rust-arrayfire-experiments
Toying around with ArrayFire in Rust
arrayfire conways-game-of-life cuda gpgpu gpu-acceleration gpu-computing opencl rust
Last synced: 13 Feb 2025
https://github.com/di-hal/vision-pro-max
A Raspberry Pi-based object detection system for assisting visually impaired individuals. This project utilizes YOLO object detection and a Hailo 8L TPU to identify obstacles like manholes, potholes, and bumps, providing real-time audio feedback to aid navigation.
bash computer-vision cuda fine-tuning gtts jupyter-notebook object-detection opencv python pytorch raspberry-pi rpi-camera ssh text-to-speech ultralytics yolo yolov8
Last synced: 26 Jan 2025
https://github.com/nexusgpu/tensor-fusion-site
TensorFusion landing page and product docs
ai cuda gpu gpu-acceleration gpu-management gpu-monitoring gpu-pooling gpu-sharing gpu-usage gpu-virtualization nvidia nvidia-cuda pytorch rcuda tensorflow
Last synced: 26 Jan 2025
https://github.com/tudasc/cusan-tests
A test suite for CUDA-aware MPI race detection
Last synced: 13 Feb 2025
https://github.com/mre/talks
...mostly Computer Science related.
computer-science cuda talks tech-talks
Last synced: 06 Feb 2025
https://github.com/brosnanyuen/raybnn_sparse
Sparse Matrix Library for GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI
arrayfire cpu cuda gpu gpu-computing opencl parallel parallel-computing parallel-programming raybnn rust sparse sparse-coding sparse-matrix sparse-neural-networks
Last synced: 13 Feb 2025
https://github.com/malolm/jupyter-ml-with-gpu-support
Jupyter with GPU acceleration for Windows 10/11
cuda cudnn jupternotebook jupyter jupyterlab nvidia-gpu windows-10 windows-11
Last synced: 06 Feb 2025
https://github.com/tlabaltoh/tlab-sharescreen-server-win
Software frame encoder that uses CUDA and casts encoded frames over UDP. An attempt to implement a custom streaming protocol and a shader-based frame encoder/decoder for screencasting.
cuda desktop-capture screensharing unity unity3d windows-graphics-capture
Last synced: 28 Jan 2025
https://github.com/headless-start/data-augmentation-impact
This repository examines the effect of data augmentation of the training set during model training.
augmented-images cuda data gpu keras matplotlib mnist opencv-python python3 tensorflow training-data
Last synced: 08 Feb 2025
https://github.com/galaxies99/inception-cuda
CUDA Implementation of Inception
Last synced: 07 Nov 2024
https://github.com/thisalmandula/gpu_accelerated_lpt_cfd_code
This repository contains a GPU-accelerated version of the particle tracking model developed by Merel Kooi for biofouled microplastic particles (available at: https://pubs.acs.org/doi/10.1021/acs.est.6b04702), written in CUDA Fortran and CUDA Python. This repository is intended as a learning tool for GPU programming.
biofouling computational-fluid-dynamics cuda fortran lagrangian-particle-tracking microplastics python
Last synced: 02 Feb 2025
https://github.com/ergonomech/comfyui-windows-installer
Automated setup for ComfyUI on Windows with CUDA, custom plugins, and optimized PyTorch settings. Made to run as a server and to error-correct. Easy installation and launch using Miniconda.
automation comfy conda conda-environment cuda hosting-deployment setup windows
Last synced: 06 Feb 2025
https://github.com/dotblueshoes/robertscross
The Roberts cross operator is used in image processing and computer vision for edge detection.
cuda edge-detection image-processing
Last synced: 05 Feb 2025
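As a rough illustration of the Roberts cross operator mentioned above: it estimates the gradient at each pixel from the two diagonal differences in a 2x2 neighbourhood and takes the magnitude. The sketch below is a plain-Python CPU version, not the repository's CUDA code; the function name `roberts_cross` and the sample image are assumptions made for this example.

```python
import math


def roberts_cross(image):
    """Return the Roberts cross gradient magnitude of a 2D grayscale image.

    `image` is a list of equal-length rows. The output is one row and one
    column smaller, because each value needs a full 2x2 neighbourhood.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * (width - 1) for _ in range(height - 1)]
    for y in range(height - 1):
        for x in range(width - 1):
            gx = image[y][x] - image[y + 1][x + 1]   # main-diagonal difference
            gy = image[y][x + 1] - image[y + 1][x]   # anti-diagonal difference
            out[y][x] = math.hypot(gx, gy)
    return out


# Made-up image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
edges = roberts_cross(image)
print(edges)
```

Each output pixel depends only on its own 2x2 window, so on a GPU the two loops map naturally to one thread per pixel.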
https://github.com/inventwithdean/cuda_mlp
Implementation of a simple Multilayer Perceptron in pure CUDA
cuda cuda-programming deep-learning neural-networks
Last synced: 05 Feb 2025
https://github.com/tyler-hilbert/cuda-linearregression
Linear Regression written from scratch in CUDA
ai cublas cuda gpu linear-regression nsight
Last synced: 05 Feb 2025
https://github.com/weiyu0824/flash-attention-lite
Basic FlashAttention implementation
Last synced: 05 Feb 2025
https://github.com/sartajbhuvaji/cuda
Developed CUDA kernel functions to load and train a Convolutional Neural Network from scratch.
cuda cuda-programming gpu-programming neural-network nvidia-cuda
Last synced: 05 Feb 2025
https://github.com/abdulfatir/subkmeans
Numpy and pyCUDA implementation of subKmeans
clustering cuda kdd kmeans numpy pycuda python subspace-clustering
Last synced: 09 Feb 2025
https://github.com/matx64/rs-netbot
Old School RuneScape (MMORPG) bot created using a Convolutional Neural Network for object identification
Last synced: 09 Feb 2025
https://github.com/pvdberg1998/cufft_rust
A safe Rust wrapper around a subset of cuFFT.
Last synced: 12 Dec 2024
https://github.com/gunrock/template
Template repository for essentials applications to get you started asap!
cpp cuda essentials gpu graph-algorithms graph-analytics gunrock
Last synced: 10 Jan 2025
https://github.com/andygeiss/machine-learning-golang
This repository provides a basic setup to do Machine Learning with Golang and Python, TensorFlow 1.15 and CUDA 10.0.
benchmark cuda docker go golang machine-learning python tensorflow
Last synced: 06 Feb 2025
https://github.com/xza85hrf/ml-framework_checker
ML Framework and CUDA Checker is a Python-based GUI application for checking PyTorch, TensorFlow, and CUDA installations. It provides detailed system specs, compatibility checks, advanced GPU management, and offers options to view instructions, export logs, and update machine learning frameworks.
compatibility cuda gpu-management gui-application machine-learning python pytorch system-checker system-specs tensorflow
Last synced: 30 Jan 2025
https://github.com/bhattbhavesh91/rapids-cudf-cuml-example
Running KNN algorithm much faster on GPU for free using RAPIDS packages like cuML and cuDF
cuda cuml deep-learning nvidia-gpu rapids rapidsai
Last synced: 17 Jan 2025
https://github.com/chintak/theano-lasagne-docker
Dockerfile for Lasagne with Cuda support. Look at the branches for relevant Dockerfiles - ``cpu`` and ``gpu``.
caffe cuda docker dockerfile install-script lasagne machine-learning machine-learning-library theano
Last synced: 23 Dec 2024
https://github.com/abhisheknair10/occupancy.nn
A multi-step pipeline to train and run inference with Occupancy Networks
Last synced: 13 Jan 2025
https://github.com/vietdoo/seam-carving-cuda
CUDA Seam Carving: Accelerating Image Resizing with GPU Computing
cc cuda cuda-programming gpu-computing parrallel-computing seam-carving
Last synced: 07 Feb 2025
https://github.com/zeloe/juce_cuda_convolution
Linear realtime convolution using CUDA
audio audio-processing convolution cuda dsp juce
Last synced: 25 Dec 2024
https://github.com/enp1s0/curand_fp16
FP16 pseudo random number generator on GPU
cuda gpu half-precision random-number-generators
Last synced: 26 Dec 2024
https://github.com/kar-dim/fidelityfx-cas-cuda
Implementation of the AMD FidelityFX CAS (Contrast Adaptive Sharpening) algorithm on CUDA, for sharpening static images.
cpp cuda dll fidelityfx gpu image-processing parallel-computing sharpen
Last synced: 26 Dec 2024
https://github.com/fblupi/grado_informatica-ppr
Coursework for the Parallel Programming course at UGR
cuda mpi openmp parallel-computing
Last synced: 30 Jan 2025
https://github.com/rkv0id/automata-vtk
Multi-dimensional Cellular Automata visualization using Python's VTK bindings on top of CUDA-parallel grid updates.
cellular-automata cuda game-of-life python vtk
Last synced: 03 Jan 2025
https://github.com/jonasricker/autocvd
Tool to automatically set CUDA_VISIBLE_DEVICES based on GPU utilization. Usable from command line and code.
cuda cuda-visible-devices gpu keras machine-learning nvidia python pytorch tensorflow
Last synced: 03 Jan 2025
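The general idea of choosing devices by utilization can be sketched as follows. This is not the tool's own implementation: the function name `pick_idle_gpus`, the 10% threshold, and the sample readings are assumptions for illustration. The input is text in the shape printed by `nvidia-smi --query-gpu=index,utilization.gpu --format=csv,noheader,nounits`.

```python
import os


def pick_idle_gpus(smi_csv, count=1, max_utilization=10):
    """Return up to `count` GPU indices at or below `max_utilization` percent.

    `smi_csv` holds one "index, utilization" pair per line. The least busy
    GPUs come first in the result.
    """
    gpus = []
    for line in smi_csv.strip().splitlines():
        index, utilization = (field.strip() for field in line.split(","))
        gpus.append((int(utilization), int(index)))
    idle = sorted(gpu for gpu in gpus if gpu[0] <= max_utilization)
    return [index for _, index in idle[:count]]


# Hypothetical readings for three GPUs; a real script would capture the
# nvidia-smi output instead (for example with subprocess.run).
sample = "0, 97\n1, 3\n2, 0\n"
chosen = pick_idle_gpus(sample, count=2)

# The variable must be set before the process initializes CUDA.
os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in chosen)
print(os.environ["CUDA_VISIBLE_DEVICES"])
```

Parsing is kept separate from the `nvidia-smi` call so the selection logic can be tried on a machine without a GPU.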
https://github.com/assem-elqersh/tensorflow-gpu-setup
This guide provides the essential steps to get TensorFlow running with GPU support on your Windows system.
anaconda conda cuda cudnn deep-learning gpu machine-learning tensorflow
Last synced: 03 Jan 2025
https://github.com/ehsanmok/cs-521
UBC CS 521: Parallel Computing and Architectures
cuda erlang parallel-algorithm parallel-computing
Last synced: 10 Jan 2025
https://github.com/matteogianferrari/qr-decomposition
This project implements different methods of exploiting cache usage, the multicore CPU, and the GPU architecture in the Gram-Schmidt QR decomposition algorithm, and measures the performance of the different implementations.
cuda openmp parallel-computing
Last synced: 10 Feb 2025
https://github.com/ophoperhpo/dcgan-lentach-logo-generator
The Lentach logo generator. #MachineLearningFun
cuda dcgan dcgan-tensorflow keras lentach machinelearning ml
Last synced: 04 Jan 2025
https://github.com/ezroot/gacc
GIACC - Generate Images, Art, Code and Conversations
ai codegen cuda huggingface image imagegeneration python rust stablediffusion
Last synced: 18 Jan 2025
https://github.com/liuyuweitarek/pytorch-docker-builder
Automate PyTorch Docker image builds with compatible Python, CUDA, and Poetry versions, including CI/CD for testing.
cicd containerd cuda docker docker-image poetry-python python python3 pytorch pytorch-docker
Last synced: 24 Jan 2025
https://github.com/programmer-rd-ai/detectx
A Pythonic approach to object detection using Detectron2, a clean, modular framework for training and deploying computer vision models. DetectX simplifies the complexity of object detection while maintaining high performance and extensibility.
coco-dataset computer-vision computer-vision-library cuda deep-learning detectron2 faster-rcnn gpu-accelerated machine-learning ml-framework object-detection object-recognition python3 pytorch retinanet
Last synced: 12 Jan 2025
https://github.com/programmer-rd-ai/digivis
A PyTorch-based deep learning implementation for MNIST digit recognition featuring CNNs, GPU acceleration, experiment tracking, and comprehensive testing capabilities.
cnn computer-vision cuda data-science deep-learning digit-recognition image-classification machine-learning mnist neural-networks python pytorch wandb
Last synced: 12 Jan 2025
https://github.com/kaierikniermann/hpc-uzh-notes
These are some notes for the High Performance Computing course taught at UZH
cuda high-performance-computing mpi openacc openmp
Last synced: 12 Jan 2025
https://github.com/miniex/maidenx
Rust-based CUDA library designed for learning purposes and building my AI engines named Maiden Engine
Last synced: 28 Oct 2024
https://github.com/adamczykpiotr/cudamatrixlibrary
Matrix operation library using a single thread, n threads, or a CUDA-supported GPU
agh agh-ust cpp cuda cuda-library matrix matrix-computations matrix-functions matrix-multiplication
Last synced: 19 Jan 2025
https://github.com/mala13f/statistical-learning-in-finance
This repository contains all the code, papers, and related data for assignments done during the course.
cuda gpu-acceleration jupyter-notebook machine-learning python statistical-learning
Last synced: 31 Jan 2025
https://github.com/alextmjugador/rust-cuda-quickstart
Bring the Rust-CUDA project back to life under modern Linux environments.
cuda cuda-programming cuda-rust cuda-support docker rust
Last synced: 26 Jan 2025
https://github.com/dansolombrino/gphungarian
A GPU-accelerated implementation of the Hungarian Algorithm, written in CUDA
Last synced: 07 Feb 2025
https://github.com/duskvirkus/ofxarrayfire
An openFrameworks addon with pre-compiled binaries of ArrayFire.
arrayfire cuda ofxaddon openframeworks openframeworks-addon
Last synced: 25 Jan 2025
https://github.com/le-ander/msc_bioinfo-experimental_design
Using information theory to inform experimental design with GPU acceleration. Computing group project as part of the MSc in Bioinformatics and Theoretical Systems Biology at Imperial College London 2016/2017.
cuda experimental-design gpu-computing information-theory pycuda systems-biology
Last synced: 31 Jan 2025
https://github.com/brosnanyuen/raybnn_graph
Graph Manipulation Library For GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI
cuda gpu graph graph-algorithms neural-network neural-networks opencl raybnn rust
Last synced: 13 Feb 2025
https://github.com/hyunjinno/multicore_computing
A repository of multicore programming in Java and C.
c cpp cuda java multithreading openmp thread thrust
Last synced: 25 Jan 2025
https://github.com/alekseyscorpi/vacancies_server
This is a server for vacancies generation using LLM (Saiga3)
code cuda cuda-toolkit docker dockerfile flask llama3 llamacpp llm ngrok pydantic saiga
Last synced: 01 Feb 2025
https://github.com/xavierjiezou/gpu-compute-capability
An application for querying the compute capability of each GPU released by NVIDIA.
Last synced: 01 Feb 2025
https://github.com/hartorn/docker-python
Repository to build python image, based on ubuntu and CUDA
cuda docker mkl-dnn onednn python3 ubuntu ubuntu1804
Last synced: 12 Jan 2025
https://github.com/speedcell4/torchdevice
Setup CUDA_VISIBLE_DEVICES
cuda deep-learning gpu machine-learning pytorch
Last synced: 08 Feb 2025
https://github.com/mortafix/quickshift
A working implementation of Quickshift algorithm in CUDA, GPU-compatible.
Last synced: 13 Jan 2025
https://github.com/dhruvsrikanth/cudann
A distributed implementation of a deep learning framework in CUDA.
cpp cuda deep-learning deep-learning-framework gpu-programming high-performance-computing hpc parallel-programming
Last synced: 25 Dec 2024
https://github.com/bolner/totally-diffused
Debian/NVIDIA Docker image for AUTOMATIC1111's Stable Diffusion application.
automatic1111 cuda debian docker-image nvidia stable-diffusion xformers
Last synced: 08 Feb 2025
https://github.com/pratikvn/nla4hpc-exercises-framework
The exercises framework for the Numerical Linear Algebra for HPC course at Karlsruhe Institute of Technology.
cuda ginkgo homeworks hpc-course teaching
Last synced: 26 Jan 2025
https://github.com/rhysdg/whisper-onnx-python
A low-footprint, GPU-accelerated speech-to-text Python package for the JetPack 5 era, bolstered by an optimized graph
ai chatbot cuda machine-learning onnxruntime speech-to-text whisper
Last synced: 08 Feb 2025
https://github.com/brosnanyuen/raybnn_dataloader
Data Loader for RayBNN
arrayfire cpu csv csv-parser cuda data-structures gpu-computing oneapi opencl parallel parallel-computing rust
Last synced: 13 Jan 2025
https://github.com/mhaseeb123/gcb
GCB includes a suite of benchmarks and basic tests for CUDA-aware MPI and C++ compilers.
cpp cpp23 cuda mpi partitioned-communication st-mpi
Last synced: 24 Jan 2025
https://github.com/pjueon/cuda_intellisense
A simple Python script to fix CUDA C++ IntelliSense for Visual Studio.
Last synced: 23 Oct 2024
https://github.com/nickolasrm/gpuvscpumatrixmultiplication
CPU and GPU optimized matrix multiplication (AVX, transposition, CUDA and other)
avx comparison cuda hpc matrix multiplication
Last synced: 28 Dec 2024
https://github.com/whutao/artificial-art
Image approximation with triangles using evolutionary algorithm.
cuda evolutionary-algorithm python3
Last synced: 16 Jan 2025
https://github.com/meirbek-dev/face-mask_detector
Real-time face mask detection
artificial-intelligence covid-19 cuda cudnn deep-learning face-mask graduation-project jupyter-notebook keras machine-learning mask-detection mobilnet-v2 object-detection object-recognition object-tracking opencv4-python python real-time supervised-learning tensorflow2-gpu
Last synced: 11 Jan 2025
https://github.com/makischristou/mandelbrot
Mandelbrot set visualizer using CUDA.
cpp cuda gpu mandelbrot nvidia renderer rust
Last synced: 20 Jan 2025
https://github.com/ginkgo-project/cudaarchitectureselector
A CMake module simplifying the specification of CUDA architectures
Last synced: 27 Dec 2024
https://github.com/michaelfranzl/image_debian-gpgpu
Dockerfile for a Debian base image with AMD and Nvidia GPGPU support
amd container container-image cuda debian docker gpgpu nvidia opencl
Last synced: 21 Jan 2025
https://github.com/enapiuz/logic-circuit-simulator
Logic circuit (based on NAND gates) simulator using OpenCL
c circuit-simulator cuda digital-logic gpgpu logic-gates opencl simulator
Last synced: 06 Feb 2025
https://github.com/sferez/sspp_sparse_matrix_cuda
Small Scale Parallel Programming, Sparse Matrix multiplication with CUDA
cpp cuda omp omp-parallel parallel-computing small-scale-parallel-programming sparse-matrix
Last synced: 13 Jan 2025
https://github.com/r00tens/text-classifier
Naive Bayes classifier for text classification with CPU and GPU (CUDA)
classification classifier cpp cuda machine-learning naive-bayes
Last synced: 05 Feb 2025
https://github.com/vectorworksreal/sd-forge-docker
Docker image for the SD Forge web UI.
ai-art artificial-intelligence containerization cuda docker docker-image forge image-to-image machine-learning sd-forge stable-diffusion stable-diffusion-webui text-to-image ubuntu webui
Last synced: 10 Feb 2025
https://github.com/vectorworksreal/ooba-text-docker
Docker image for the ooba text-generation web UI.
artificial-intelligence containerization cuda docker docker-image large-language-model llm machine-learning python python3 text-generation text-generation-webui ubuntu webui
Last synced: 10 Feb 2025
https://github.com/h1me01/cuda-neural-network
CUDA version of my previous AVX-512 based Neural Network. (Still in development)
chess cuda cuda-programming neural-network neural-networks-from-scratch
Last synced: 05 Feb 2025
https://github.com/fatlipp/toyslam
SLAM implementation from scratch w/o external graph optimization libs
cuda gpu lidar-slam mapping odometry robotics slam
Last synced: 05 Feb 2025
https://github.com/emanuelemessina/cuda-benchmark
Compares matrix calculation times between CPU and GPU (CUDA)
benchmark cuda matrix-calculations
Last synced: 10 Feb 2025
https://github.com/efecaliskannn/pneumonia-detection-with-cnn--vgg16--and-resnet50-deep-learning-models
This project aims to detect pneumonia using deep learning, a subset of artificial intelligence. It examines the performance of deep learning models, including CNN, VGG16, and ResNet50, in detecting pneumonia.
artificial-intelligence convolutional-neural-networks cuda deep-learning keras-tensorflow nvidia-cuda pyhton transfer-learning
Last synced: 05 Feb 2025
https://github.com/mateuszk098/parallel-programming-examples
Simple parallel programming examples with CUDA, MPI and OpenMP.
cpp cuda mpi openmp parallel-programming
Last synced: 28 Dec 2024
https://github.com/chrisdalvit/gpu-matrix-transpose
Implementation and benchmarking of different matrix transpose kernels with CUDA
c cpp cuda cuda-kernels cuda-programming gpu-acceleration gpu-computing gpu-programming matrix-transpose nvidia-gpu
Last synced: 13 Feb 2025
https://github.com/rog0d/gpuss_watchers
"The GPU Watchers swore upon their shared memory hierarchy, from L1 to global memory, which also served as their mandate as lords of parallel computation."
cuda gpu-acceleration gpu-monitoring gpu-profiling
Last synced: 13 Feb 2025
https://github.com/m-torhan/advent-of-code
🎄 Solutions for the Advent of Code
advent-of-code advent-of-code-2024 cuda
Last synced: 13 Feb 2025