Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphical processing units (GPUs). With CUDA, developers are able to dramatically speed up computing applications by harnessing the power of GPUs.

https://github.com/ilyasmoutawwakil/optimum-whisper-autobenchmark

A set of benchmarks on OpenAI's Whisper model, using AutoBenchmark and Optimum's OnnxRuntime Optimizations.

benchmark cuda deep-learning

Last synced: 30 Jan 2025

https://github.com/yomi4486/zundamon_v3

Bartender, a shot of ice water, please.

cuda discord-bot discord-py docker docker-compose python tts voicevox zundamon

Last synced: 27 Nov 2024

https://github.com/teodutu/asc

Computer Systems Architecture (Arhitectura Sistemelor de Calcul) - UPB 2020

cache-optimization cuda parallel-programming profiling python-threading

Last synced: 30 Jan 2025

https://github.com/deftruth/hgemm-tensorcores-mma

⚡️Write HGEMM from scratch using Tensor Cores with WMMA, MMA PTX and CuTe API. 🎉🎉

cuda hgemm tensor-cores

Last synced: 04 Dec 2024

https://github.com/infotrend-inc/ctpo-demo_projects

Jupyter Notebook examples using CTPO as their source container.

cuda opencv pytroch tensorflow2

Last synced: 05 Feb 2025

https://github.com/bensuperpc/easyai

Make your own AI easily!

ai cuda python python3 tensorflow

Last synced: 17 Jan 2025

https://github.com/abaksy/cuda-examples

A repository of examples coded in CUDA C/C++

cuda

Last synced: 17 Jan 2025

https://github.com/elftausend/sliced

Array operations with automatic differentiation on CPU and GPU

autograd automatic-differentiation cuda custos matrix opencl

Last synced: 14 Feb 2025

https://github.com/potato3d/grid-rt

GPU-accelerated ray tracing using GLSL and CUDA

cuda glsl gpu ray-tracing real-time-rendering

Last synced: 10 Jan 2025

https://github.com/dqbd/cuda-btree

Implementation of B-Trees on NVIDIA CUDA

b-tree cuda nvidia

Last synced: 13 Feb 2025

https://github.com/kagof/julia-image-processing

Image processing programs written in Julia

cuda image-processing julia

Last synced: 12 Feb 2025

https://github.com/denzp/current

CUDA high-level Rust framework

cuda rust

Last synced: 24 Dec 2024

https://github.com/zeloe/rtconvolver

A realtime convolution VST3

c convolution cplusplus cuda juce

Last synced: 25 Dec 2024

https://github.com/dujonwalker/nixos-config-x86_64-cuda

This repository contains my NixOS configuration optimized for 64-bit x86 systems with NVIDIA CUDA support, featuring a Plasma 6 desktop environment and a variety of essential applications for development, multimedia, and productivity. It serves as a backup for easy restoration and setup on new installations.

cuda flatpak nix nixos nixos-configuration ollama

Last synced: 26 Dec 2024

https://github.com/andreasholt/cusmc

A CUDA-accelerated Statistical Model Checker for Stochastic Timed Automata

cuda smc

Last synced: 02 Jan 2025

https://github.com/tthebc01/cudaconda3

Lightweight container environment with CUDA, Miniconda3, and Jupyter Lab.

cuda docker gpu jupyterlab marimo-notebook miniconda3 reverse-proxy-application

Last synced: 03 Jan 2025

https://github.com/archibate/cuda_aero_lbm

A toy 3D LBM solver in CUDA

cuda graphics simulation

Last synced: 03 Jan 2025

https://github.com/kar-dim/watermarking-gpu

Code for my Diploma thesis at Information and Communication Systems Engineering (University of the Aegean, School of Engineering) with title "Efficient implementation of watermark and watermark detection algorithms for image and video using the graphics processing unit". Part 2 / GPU

arrayfire cpp cuda gpu image-processing opencl parallel-computing video-processing watermark-image watermarking

Last synced: 04 Jan 2025

https://github.com/szymon423/tsp-cpu-vs-gpu

Simple brute force approach to solve travelling salesman problem with CPU and GPU

cuda tsp

Last synced: 18 Jan 2025

https://github.com/dito97/gol

High-performance Computing (90535) final project at UniGe

cuda mpi openmp

Last synced: 14 Feb 2025

https://github.com/nellogan/distributed_compy

Distributed_compy is a distributed computing library that offers multi-threading, heterogeneous (CPU + multi-GPU), and multi-node support.

cluster cuda heterogeneous-parallel-programming multi-threading multigpu openmp openmpi

Last synced: 12 Jan 2025

https://github.com/pd2871/high-performance-computing

This repo contains the logs of the High Performance Computing module's final assignment.

blurred-images c cuda gaussian-blur matrix-multiplication multi-threading parallel-computing pthreads pthreads-api

Last synced: 25 Jan 2025

https://github.com/orlandopalmeira/trabalho-cp-2023-2024

Repository for the practical assignment of the Parallel Computing (CP) course unit - Master's in Informatics Engineering (MEI/MIEI) - Universidade do Minho (UMinho)

computacao-paralela cp cuda cuda-programming mei miei nvidia nvidia-cuda openmp optimization optimization-problem parallelism performance uminho uminho-mei uminho-miei

Last synced: 25 Jan 2025

https://github.com/xmas7/cudampi

A large hybrid CPU/GPU sorting network using CUDA and MPI. The sorting network uses a standard Quicksort for CPUs and a custom Bitonic Sort for GPUs. These two algorithms were the fastest in a number of prior benchmarks.

cpu cuda gpu hybrid mpi network

Last synced: 01 Feb 2025

https://github.com/raad-labs/raad-video

A high-performance video loading library for machine learning, designed for efficient training data preparation.

cuda machine-learning training-data

Last synced: 09 Feb 2025

https://github.com/kilamper/matrix-multiplication

AC - Matrix multiplication using OpenMP, MPI and CUDA

cuda ms-mpi openmp

Last synced: 26 Jan 2025

https://github.com/brosnanyuen/raybnn_graph

Graph Manipulation Library For GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI

cuda gpu graph graph-algorithms neural-network neural-networks opencl raybnn rust

Last synced: 13 Feb 2025

https://github.com/liuyuweitarek/pytorch-docker-builder

Automate PyTorch Docker image builds with compatible Python, CUDA, and Poetry versions, including CI/CD for testing.

cicd containerd cuda docker docker-image poetry-python python python3 pytorch pytorch-docker

Last synced: 24 Jan 2025

https://github.com/programmer-rd-ai/detectx

A Pythonic approach to object detection using Detectron2, a clean, modular framework for training and deploying computer vision models. DetectX simplifies the complexity of object detection while maintaining high performance and extensibility.

coco-dataset computer-vision computer-vision-library cuda deep-learning detectron2 faster-rcnn gpu-accelerated machine-learning ml-framework object-detection object-recognition python3 pytorch retinanet

Last synced: 12 Jan 2025

https://github.com/programmer-rd-ai/digivis

A PyTorch-based deep learning implementation for MNIST digit recognition featuring CNNs, GPU acceleration, experiment tracking, and comprehensive testing capabilities.

cnn computer-vision cuda data-science deep-learning digit-recognition image-classification machine-learning mnist neural-networks python pytorch wandb

Last synced: 12 Jan 2025

https://github.com/michaelfranzl/image_debian-gpgpu

Dockerfile for a Debian base image with AMD and Nvidia GPGPU support

amd container container-image cuda debian docker gpgpu nvidia opencl

Last synced: 21 Jan 2025

https://github.com/kaierikniermann/hpc-uzh-notes

These are some notes for the High Performance Computing course taught at UZH

cuda high-performance-computing mpi openacc openmp

Last synced: 12 Jan 2025

https://github.com/fandreuz/parallel-programming-for-hpc

Scientific codes in C/C++ with CUDA, OpenACC, FFTW, (cu)BLAS

cpp cuda hpc mpi

Last synced: 21 Jan 2025

https://github.com/willigarneau/object-detection-cuda

🕺 Putting my knowledge of OpenCV and CUDA into practice to create an object detection system. 💻

camera cplusplus cuda detector filter opencv

Last synced: 23 Jan 2025

https://github.com/maelstrom6/mandelpy

A Mandelbrot and Buddhabrot viewer with GPU acceleration

buddhabrot cuda gpu mandelbrot python3

Last synced: 05 Feb 2025

https://github.com/ergonomech/comfyui-windows-installer

Automated setup for ComfyUI on Windows with CUDA, custom plugins, and optimized PyTorch settings. Made to run as a server and self-correct errors. Easy installation and launch using Miniconda.

automation comfy conda conda-environment cuda hosting-deployment setup windows

Last synced: 06 Feb 2025

https://github.com/whutao/artificial-art

Image approximation with triangles using evolutionary algorithm.

cuda evolutionary-algorithm python3

Last synced: 16 Jan 2025

https://github.com/speedcell4/torchdevice

Set up CUDA_VISIBLE_DEVICES

cuda deep-learning gpu machine-learning pytorch

Last synced: 08 Feb 2025

https://github.com/alpha74/hungarianalgocuda

Hungarian Algorithm for Linear Assignment Problem implemented using CUDA.

cuda nvcc parallel-computing parallel-programming

Last synced: 16 Jan 2025

https://github.com/mortafix/quickshift

A working implementation of Quickshift algorithm in CUDA, GPU-compatible.

cuda gpu-computing quickshift

Last synced: 13 Jan 2025

https://github.com/dhruvsrikanth/cudann

A distributed implementation of a deep learning framework in CUDA.

cpp cuda deep-learning deep-learning-framework gpu-programming high-performance-computing hpc parallel-programming

Last synced: 25 Dec 2024

https://github.com/jonathanraiman/mini_cuda_rtc

Miniature CUDA Array library with Runtime Compilation

cpp11 cuda jit runtime-compilation

Last synced: 22 Jan 2025

https://github.com/m-torhan/cuda-stl-renderer

CUDA C++ implementation of STL file renderer using ray tracing method

cuda

Last synced: 31 Dec 2024

https://github.com/aliyoussef97/triton-hub

A container of various PyTorch neural network modules written in Triton.

cuda deep-learning openai pytorch triton triton-lang

Last synced: 05 Feb 2025

https://github.com/larygwil/ffmpeg-static-cuda

ffmpeg static binaries for Linux that work on some old NVIDIA GPUs (not tested)

avc cuda cuvid ffmpeg h264 h265 hevc nvdec nvenc

Last synced: 02 Feb 2025

https://github.com/dolongbien/cuda

CUDA and Caffe/Caffe2 installation on Ubuntu 16.04

c3d-intel-caffe caffe caffe2 cuda cudnn deep-learning ubuntu

Last synced: 21 Jan 2025

https://github.com/vietdoo/seam-carving-cuda

CUDA Seam Carving: Accelerating Image Resizing with GPU Computing

cc cuda cuda-programming gpu-computing parrallel-computing seam-carving

Last synced: 07 Feb 2025

https://github.com/adamczykpiotr/cudamatrixlibrary

Matrix operation library using a single thread, n threads, or a CUDA-supported GPU

agh agh-ust cpp cuda cuda-library matrix matrix-computations matrix-functions matrix-multiplication

Last synced: 19 Jan 2025

https://github.com/jessetg/cuda-practice

Working through the chapters of CUDA by Example

c cpp cuda cuda-by-example gpgpu

Last synced: 14 Jan 2025

https://github.com/wallneradam/docker-ccminer

CCMiner (tpruvot version) Docker Builder

ccminer cuda docker gpu litecoin miner monero nvidia nvidia-docker

Last synced: 01 Feb 2025

https://github.com/mala13f/statistical-learning-in-finance

This repository contains all the code, papers, and related data for assignments done during the course.

cuda gpu-acceleration jupyter-notebook machine-learning python statistical-learning

Last synced: 31 Jan 2025

https://github.com/dotblueshoes/robertscross

The Roberts cross operator is used in image processing and computer vision for edge detection.

cuda edge-detection image-processing

Last synced: 05 Feb 2025

https://github.com/mhaseeb123/gcb

GCB includes a suite of benchmarks and basic tests for CUDA-aware MPI and C++ compilers.

cpp cpp23 cuda mpi partitioned-communication st-mpi

Last synced: 24 Jan 2025

https://github.com/inventwithdean/cuda_mlp

Implementation of a simple Multilayer Perceptron in pure CUDA

cuda cuda-programming deep-learning neural-networks

Last synced: 05 Feb 2025

https://github.com/nickolasrm/gpuvscpumatrixmultiplication

CPU and GPU optimized matrix multiplication (AVX, transposition, CUDA and other)

avx comparison cuda hpc matrix multiplication

Last synced: 28 Dec 2024

https://github.com/poodarchu/vision-lab

Computer Vision Experiments in all.

computer-vision cuda object-detection

Last synced: 28 Jan 2025

https://github.com/neoblizz/cupti-plus-plus

CUPTI++ is a C++ interface to the CUDA Profiling Tools Interface (CUPTI).

cpp cuda cuda-profiler cupti profiler

Last synced: 09 Feb 2025

https://github.com/kayuii/ironfish-miner

Docker image of hpool-dev/ironfish-miner for NVIDIA/AMD GPUs

amdgpu cuda docker gpu nvidia rocm

Last synced: 31 Jan 2025

https://github.com/bjornmelin/deep-learning-evolution

🧠 Deep-Learning Evolution: Unified collection of TensorFlow & PyTorch projects, featuring custom CUDA kernels, distributed training, memory‑efficient methods, and production‑ready pipelines. Showcases advanced GPU optimizations, from foundational models to cutting‑edge architectures. 🚀

ai-research cuda data-science deep-learning distributed-training gan gpu-acceleration machine-learning model-optimization neural-networks python pytorch tensorflow training-pipeline transformers

Last synced: 05 Feb 2025

https://github.com/maawad/ptx_bcht

Bucketed Cuckoo hash set written in PTX and JIT-compiled.

cuckoo cuda gpu hash hashset ptx

Last synced: 09 Feb 2025

https://github.com/tlabaltoh/tlab-sharescreen-server-win

Software frame encoder that uses CUDA and casts encoded frames over UDP. An attempt to implement a custom streaming protocol and a shader-based frame encoder/decoder for screencasting.

cuda desktop-capture screensharing unity unity3d windows-graphics-capture

Last synced: 28 Jan 2025

https://github.com/alextmjugador/rust-cuda-quickstart

Bring the Rust-CUDA project back to life under modern Linux environments.

cuda cuda-programming cuda-rust cuda-support docker rust

Last synced: 26 Jan 2025

https://github.com/zeloe/juce_cuda_convolution

Linear realtime convolution using CUDA

audio audio-processing convolution cuda dsp juce

Last synced: 25 Dec 2024

https://github.com/lcsb-biocore/cufluxsampler.jl

GPU-accelerated algorithms for flux sampling in CUDA.jl

cobra cuda gpu julia metabolic-network metabolism sampling

Last synced: 30 Jan 2025

https://github.com/enp1s0/curand_fp16

FP16 pseudo random number generator on GPU

cuda gpu half-precision random-number-generators

Last synced: 26 Dec 2024

https://github.com/garciparedes/cuda-examples

CUDA examples that I developed to learn GPU-based HPC

c c-plus-plus cuda examples gpgpu gpu hpc

Last synced: 16 Jan 2025

https://github.com/dansolombrino/gphungarian

A GPU-accelerated implementation of the Hungarian Algorithm, written in CUDA

cuda gpu hpc opencl

Last synced: 07 Feb 2025

https://github.com/romaingrx/ml-nix-flake

A simple nix flake to start ML env with uv and cuda out of the box

cuda ml nix nix-flake uv

Last synced: 28 Jan 2025

https://github.com/tyler-hilbert/cuda-linearregression

Linear Regression written from scratch in CUDA

ai cublas cuda gpu linear-regression nsight

Last synced: 05 Feb 2025

https://github.com/orgh0/highperformancecnn

Implementation of a High Performance CNN for MNIST dataset

cnn cpp cuda

Last synced: 22 Jan 2025

https://github.com/kar-dim/fidelityfx-cas-cuda

Implementation of the AMD FidelityFX CAS (Contrast Adaptive Sharpening) algorithm on CUDA, for sharpening static images.

cpp cuda dll fidelityfx gpu image-processing parallel-computing sharpen

Last synced: 26 Dec 2024

https://github.com/weiyu0824/flash-attention-lite

Basic Flash Attention implementation

attention cuda torch

Last synced: 05 Feb 2025

https://github.com/pjueon/cuda_intellisense

A simple Python script to fix CUDA C++ IntelliSense for Visual Studio.

cuda visual-studio

Last synced: 23 Oct 2024

https://github.com/galaxies99/inception-cuda

CUDA Implementation of Inception

cuda inception-v3

Last synced: 07 Nov 2024

https://github.com/rhysdg/whisper-onnx-python

A low-footprint GPU accelerated Speech to Text Python package for the Jetpack 5 era bolstered by an optimized graph

ai chatbot cuda machine-learning onnxruntime speech-to-text whisper

Last synced: 08 Feb 2025

https://github.com/brendanbignell/cuda_montecarlooptionpricer

CUDA Monte Carlo Barrier Option Pricing Demo & JupyterLab ML models

cuda deep-learning ml pytorch quantitative-finance xgboost-regression

Last synced: 05 Feb 2025

https://github.com/bokutotu/cudnn_graph_api_example

cuDNN graph API example

cuda cudnn cudnn-v8

Last synced: 14 Feb 2025

https://github.com/pratikvn/nla4hpc-exercises-framework

The exercises framework for the Numerical Linear Algebra for HPC course at Karlsruhe Institute of Technology.

cuda ginkgo homeworks hpc-course teaching

Last synced: 26 Jan 2025

https://github.com/mayukhdeb/patrick

Tiny neural net library written from scratch with CuPy. ⚠️ Under construction ⚠️

cuda deep-learning gpu-computing machine-learning neural-network regression

Last synced: 12 Feb 2025

https://github.com/duskvirkus/ofxarrayfire

An openFrameworks addon with pre-compiled binaries of ArrayFire.

arrayfire cuda ofxaddon openframeworks openframeworks-addon

Last synced: 25 Jan 2025

https://github.com/miniex/maidenx

Rust-based CUDA library designed for learning purposes and building my AI engines named Maiden Engine

ai cuda rust

Last synced: 28 Oct 2024

https://github.com/le-ander/msc_bioinfo-experimental_design

Using information theory to inform experimental design with GPU acceleration. Computing group project as part of the MSc in Bioinformatics and Theoretical Systems Biology at Imperial College London 2016/2017.

cuda experimental-design gpu-computing information-theory pycuda systems-biology

Last synced: 31 Jan 2025

https://github.com/anras5/parallel-computing

Comparing CPU and GPU

cuda gpu openmp

Last synced: 21 Jan 2025

https://github.com/sartajbhuvaji/cuda

Developed CUDA kernel functions to load and train a Convolutional Neural Network from scratch.

cuda cuda-programming gpu-programming neural-network nvidia-cuda

Last synced: 05 Feb 2025

https://github.com/pharmcat/metidacu.jl

CUDA solver for Metida.jl

cuda julia-language metida mixed-models

Last synced: 09 Feb 2025

https://github.com/stanczakdominik/cuda_poisson

A 2D Poisson solver via CUDA

cuda electromagnetism pde

Last synced: 04 Feb 2025

https://github.com/thisalmandula/gpu_accelerated_lpt_cfd_code

This repository contains a GPU-accelerated version of the particle tracking model developed by Merel Kooi for biofouled microplastic particles (available at: https://pubs.acs.org/doi/10.1021/acs.est.6b04702), written in CUDA Fortran and CUDA Python. This repository is intended as a learning tool for GPU programming.

biofouling computational-fluid-dynamics cuda fortran lagrangian-particle-tracking microplastics python

Last synced: 02 Feb 2025

https://github.com/thomasonzhou/minitorch

Rebuilding PyTorch: from autograd to convolutions in CUDA

cuda numba numpy

Last synced: 30 Dec 2024

https://github.com/ashwanirathee/imagesgpu.jl

Image Processing on GPU in Julia

cuda gpu image image-processing julia

Last synced: 08 Jan 2025