Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.

https://github.com/lchsk/ney

A header-only parallel functions library for Intel Xeon/Xeon Phi/GPUs

cuda gpu linux parallel phi scientific xeon xeonphi

Last synced: 08 Jan 2025

https://github.com/dpbm/qml-course

Mini-course on quantum machine learning

cuda cuda-q cuquantum docker ml python qml quantum quantum-computing tensorflow

Last synced: 21 Dec 2024

https://github.com/dhruvsrikanth/cudann

A distributed implementation of a deep learning framework in CUDA.

cpp cuda deep-learning deep-learning-framework gpu-programming high-performance-computing hpc parallel-programming

Last synced: 16 Feb 2025

https://github.com/mulx10/firefly

Enhancing object detection using thermal imaging for hard-to-identify objects with thin cross-sections (e.g., cyclists, pedestrians).

autonomous-cars autonomous-navigation autonomous-vehicles c cuda object-detection thermal-camera yolov3

Last synced: 30 Dec 2024

https://github.com/superlinear-ai/scipy-notebook-gpu

jupyter/scipy-notebook with CUDA Toolkit, cuDNN, NCCL, and TensorRT

cuda cudnn docker nccl scipy-notebook tensorflow tensorrt

Last synced: 09 Jan 2025

https://github.com/elftausend/sliced

Array operations with automatic differentiation on CPU and GPU

autograd automatic-differentiation cuda custos matrix opencl

Last synced: 14 Feb 2025

https://github.com/pothosware/pothosgpu

Pothos toolkit for ArrayFire API support

arrayfire cuda dataflow dataflow-programming gpu opencl pothos

Last synced: 15 Jan 2025

https://github.com/andreasholt/cusmc

A CUDA-accelerated Statistical Model Checker for Stochastic Timed Automata

cuda smc

Last synced: 02 Jan 2025

https://github.com/denzp/current

CUDA high-level Rust framework

cuda rust

Last synced: 16 Feb 2025

https://github.com/huwzpf/parallel-processing-cpu-and-gpu-env-and-lib-with-powercap

(2024/2025) A library and environment for parallel processing in a power-limited CPU+GPU cluster environment.

c cpu cuda gpu mpi openmp parallel powercap

Last synced: 19 Feb 2025

https://github.com/teodutu/asc

Computer Systems Architecture (Arhitectura Sistemelor de Calcul) - UPB 2020

cache-optimization cuda parallel-programming profiling python-threading

Last synced: 30 Jan 2025

https://github.com/tthebc01/cudaconda3

Lightweight container environment with CUDA, Miniconda3, and JupyterLab.

cuda docker gpu jupyterlab marimo-notebook miniconda3 reverse-proxy-application

Last synced: 03 Jan 2025

https://github.com/pd2871/high-performance-computing

This repo contains the logs of the High Performance Computing module's final assignment.

blurred-images c cuda gaussian-blur matrix-multiplication multi-threading parallel-computing pthreads pthreads-api

Last synced: 25 Jan 2025

https://github.com/sean-bradley/cudalookupsha256

SHA256 lookup using parallel processing on an NVIDIA CUDA-compatible graphics card

cuda parallel-processing sha256

Last synced: 13 Nov 2024

https://github.com/markdtw/parallel-programming

Basic Pthread, OpenMP, CUDA examples

cuda openmp parallel-programming pthreads

Last synced: 12 Jan 2025

https://github.com/nachovizzo/saxpy_openacc_cpp

My way of thinking about OpenACC, C++, and Parallel computing in general

cpp cuda gpu openacc

Last synced: 30 Jan 2025

https://github.com/dito97/gol

High-performance Computing (90535) final project at UniGe

cuda mpi openmp

Last synced: 14 Feb 2025

https://github.com/brosnanyuen/raybnn_neural

Neural Networks with Sparse Weights in Rust using GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI

cpu cuda deep-learning gpu machine-learning machine-learning-algorithms neural-network neural-networks opencl parallel raybnn rust sparse-network sparse-neural-networks

Last synced: 13 Nov 2024

https://github.com/babak2/optimizedsum

Optimized Parallel Sum program demonstrating CPU vs GPU performance

cuda cuda-programming gpu-acceleration gpu-computing gpu-parallelism visual-studio

Last synced: 01 Feb 2025

https://github.com/l30nardosv/reproduce-parcosi-moleculardocking

Reproducing paper: "Benchmarking the Performance of Irregular Computations in AutoDock-GPU Molecular Docking"

autodock-gpu cpu cuda gpu molecular-docking molecular-docking-scripts opencl paper reproducible-research

Last synced: 05 Feb 2025

https://github.com/kagof/julia-image-processing

Image processing programs written in Julia

cuda image-processing julia

Last synced: 12 Feb 2025

https://github.com/acrlakshman/gradient-augmented-levelset-cuda

Implementation of Gradient Augmented Levelset method for CPU and GPU

cfd cuda levelset

Last synced: 13 Feb 2025

https://github.com/isazi/aoflagger

AOFlagger Radio Frequency Interference mitigation algorithm.

cuda gpu many-core rfi

Last synced: 30 Jan 2025

https://github.com/brocbyte/realtime-deformations

Snow simulation (Material Point Method)

cuda glm material-point-method opengl

Last synced: 09 Nov 2024

https://github.com/orlandopalmeira/trabalho-cp-2023-2024

Repository for the practical assignment of the Parallel Computing (Computação Paralela, CP) course unit - Master's in Informatics Engineering (MEI/MIEI) - Universidade do Minho (UMinho)

computacao-paralela cp cuda cuda-programming mei miei nvidia nvidia-cuda openmp optimization optimization-problem parallelism performance uminho uminho-mei uminho-miei

Last synced: 25 Jan 2025

https://github.com/nellogan/distributed_compy

Distributed_compy is a distributed computing library that offers multi-threading, heterogeneous (CPU + multi-GPU), and multi-node support

cluster cuda heterogeneous-parallel-programming multi-threading multigpu openmp openmpi

Last synced: 12 Jan 2025

https://github.com/navdeep-g/dimreduce4gpu

Dimensionality reduction ("dimreduce") on GPUs ("4gpu")

cplusplus cuda dimensionality-reduction gpu linear-algebra pca python svd unsupervised-learning

Last synced: 24 Dec 2024

https://github.com/biodasturchi/gmx

🔬 Molecular modeling with GROMACS

cuda gpu gromacs mdp topology tpr trr

Last synced: 21 Jan 2025

https://github.com/arminms/p2rng

A modern header-only C++ library for parallel algorithmic (pseudo) random number generation supporting OpenMP, CUDA, ROCm and oneAPI

cpp cuda cxx header-only heterogeneous-computing library linux macos multiplatorm oneapi openmp parallel pcg-random prng pseudorandom-number-generator random-number-distributions random-number-generation rocm stl-algorithms windows

Last synced: 05 Nov 2024

https://github.com/neoblizz/spmv

Efficient Sparse Matrix-Vector Multiplication (SpMV) using ModernGPU (MTX + CSR formats).

csr cuda gpgpu load-balancing mtx spmv

Last synced: 09 Feb 2025

https://github.com/tlabaltoh/tlab-sharescreen-server-win

Software frame encoder that uses CUDA and casts encoded frames over UDP. An attempt to implement a custom streaming protocol and a shader-based frame encoder/decoder for screencasting.

cuda desktop-capture screensharing unity unity3d windows-graphics-capture

Last synced: 28 Jan 2025

https://github.com/makischristou/mandelbrot

Mandelbrot set visualizer using CUDA.

cpp cuda gpu mandelbrot nvidia renderer rust

Last synced: 20 Jan 2025

https://github.com/ashwanirathee/imagesgpu.jl

Image Processing on GPU in Julia

cuda gpu image image-processing julia

Last synced: 08 Jan 2025

https://github.com/shivendrra/axgrad

Lightweight tensor library that contains its own auto-diff engine, like PyTorch

autograd cuda pytorch scratch-implementation tinygrad

Last synced: 06 Feb 2025

https://github.com/abdulfatir/subkmeans

NumPy and PyCUDA implementation of subKmeans

clustering cuda kdd kmeans numpy pycuda python subspace-clustering

Last synced: 09 Feb 2025

https://github.com/lightshade12/kittlespt

A hobby CUDA pathtracing renderer.

3d-graphics computer-graphics cuda gpu path-tracing ray-tracing

Last synced: 24 Jan 2025

https://github.com/matx64/rs-netbot

Old School Runescape (MMORPG) Bot created using a Convolutional Neural Network for object identification

cuda numpy python pytorch

Last synced: 09 Feb 2025

https://github.com/bolner/totally-diffused

Debian/NVIDIA Docker image for AUTOMATIC1111's Stable Diffusion application.

automatic1111 cuda debian docker-image nvidia stable-diffusion xformers

Last synced: 08 Feb 2025

https://github.com/snoopy3476/t-espresso

A CUDA Library for Low-overhead Host-to-Device Transmission of Patterned Profile Data

cuda profiler

Last synced: 07 Nov 2024

https://github.com/alextmjugador/rust-cuda-quickstart

Bring the Rust-CUDA project back to life under modern Linux environments.

cuda cuda-programming cuda-rust cuda-support docker rust

Last synced: 26 Jan 2025

https://github.com/rhysdg/whisper-onnx-python

A low-footprint, GPU-accelerated speech-to-text Python package for the JetPack 5 era, bolstered by an optimized graph

ai chatbot cuda machine-learning onnxruntime speech-to-text whisper

Last synced: 08 Feb 2025

https://github.com/pvdberg1998/cufft_rust

A safe Rust wrapper around a subset of cuFFT.

cuda cufft fft rust

Last synced: 12 Dec 2024

https://github.com/alpha74/hungarianalgocuda

Hungarian Algorithm for Linear Assignment Problem implemented using CUDA.

cuda nvcc parallel-computing parallel-programming

Last synced: 16 Jan 2025

https://github.com/ssoehdata/cuda_fortran_sci_eng

Working through examples from the book CUDA Fortran for Scientists and Engineers, 2nd Edition

cuda cuda-fortran fortran hpc nvfortran

Last synced: 10 Dec 2024

https://github.com/pharmcat/metidacu.jl

CUDA solver for Metida.jl

cuda julia-language metida mixed-models

Last synced: 09 Feb 2025

https://github.com/sleeepyjack/multisplit

Simple multisplit for CUDA accelerators

cpp cuda gpu nvidia parallel-programming primitive split

Last synced: 22 Jan 2025

https://github.com/tensorbfs/cutropicalgemm.jl

The fastest Tropical number matrix multiplication on GPU

cuda gemm tropical-algebra

Last synced: 13 Feb 2025

https://github.com/qervas/cn_chess_ai

Chinese chess (Xiangqi) AI

ai cpp cuda dqn qt6

Last synced: 23 Oct 2024

https://github.com/brendanbignell/cuda_montecarlooptionpricer

CUDA Monte Carlo Barrier Option Pricing Demo & JupyterLab ML models

cuda deep-learning ml pytorch quantitative-finance xgboost-regression

Last synced: 05 Feb 2025

https://github.com/abhisheknair10/occupancy.nn

A multi-step pipeline to train and run inference with Occupancy Networks

3d-reconstruction cuda vision

Last synced: 13 Jan 2025

https://github.com/jakubriegel/game_of_life_3d

3D game of life implemented in CUDA

concurency cuda gameoflife nvidia put-poznan

Last synced: 01 Feb 2025

https://github.com/jonathanraiman/mini_cuda_rtc

Miniature CUDA Array library with Runtime Compilation

cpp11 cuda jit runtime-compilation

Last synced: 22 Jan 2025

https://github.com/sbstndb/grayscott_k

A simple 3D GrayScott simulation using Kokkos enabling CUDA or OpenMP backend

cuda finite-difference grayscott grid kokkos laplacian openmp simulation visualisation

Last synced: 05 Feb 2025

https://github.com/galaxies99/inception-cuda

CUDA Implementation of Inception

cuda inception-v3

Last synced: 07 Nov 2024

https://github.com/gunrock/template

Template repository for essentials applications to get you started asap!

cpp cuda essentials gpu graph-algorithms graph-analytics gunrock

Last synced: 10 Jan 2025

https://github.com/dhruvsrikanth/monte-carlo-ray-tracing

In this repository, you will find a serial and distributed GPU-based implementation of the ray tracing simulation.

c cpp cuda gpu-computing gpu-programming high-performance-computing parallel-programming raytracing unified-memory-parallelism

Last synced: 16 Feb 2025

https://github.com/dhruvsrikanth/fastconv

Distributed and serial implementations of the 2D convolution operation in C++ and CUDA.

convolution-filters cpp cuda gpu-programming high-performance-computing hpc image-editor image-processing nvidia parallel-programming

Last synced: 16 Feb 2025

https://github.com/daelsepara/hipmandelbrot

GPU Implementation of Mandelbrot Fractal Generator with Benchmarking

amd cuda fractal gpu gpu-compute gpu-computing hip mandelbrot parallel-computing rocm sdk

Last synced: 07 Nov 2024

https://github.com/andygeiss/machine-learning-golang

This repository provides a basic setup to do Machine Learning with Golang and Python, TensorFlow 1.15 and CUDA 10.0.

benchmark cuda docker go golang machine-learning python tensorflow

Last synced: 06 Feb 2025

https://github.com/bl33h/productoftwovectors

This code uses CUDA for parallel vector multiplication on a GPU, demonstrating the GPU's acceleration capabilities.

cuda gpu kernel paralelism parallel-programming product vector

Last synced: 21 Jan 2025

https://github.com/fynv/cudainline

A CUDA interface for Python. A distillation of the engine part of ThrustRTC.

cuda gpu nvrtc pyhton

Last synced: 05 Feb 2025

https://github.com/xza85hrf/ml-framework_checker

ML Framework and CUDA Checker is a Python-based GUI application for checking PyTorch, TensorFlow, and CUDA installations. It provides detailed system specs, compatibility checks, and advanced GPU management, and offers options to view instructions, export logs, and update machine learning frameworks.

compatibility cuda gpu-management gui-application machine-learning python pytorch system-checker system-specs tensorflow

Last synced: 30 Jan 2025

https://github.com/bhattbhavesh91/rapids-cudf-cuml-example

Running the KNN algorithm much faster on a GPU for free using RAPIDS packages such as cuML and cuDF

cuda cuml deep-learning nvidia-gpu rapids rapidsai

Last synced: 17 Jan 2025

https://github.com/jessetg/cuda-practice

Working through the chapters of CUDA by Example

c cpp cuda cuda-by-example gpgpu

Last synced: 14 Jan 2025

https://github.com/bl33h/pythagoreantheorem

A program that applies the Pythagorean theorem to a large number of elements using GPU parallel processing.

arrays cuda kernel parallel-programming pythagoras pythagorean-theorem

Last synced: 21 Jan 2025

https://github.com/enkerewpo/talaria

AI Voice Assistant for Dialogue and IoT Control Powered by GPT-4o

cuda gpt-4 python3 pytorch stt tts

Last synced: 05 Feb 2025

https://github.com/wallneradam/docker-ccminer

CCMiner (tpruvot version) Docker Builder

ccminer cuda docker gpu litecoin miner monero nvidia nvidia-docker

Last synced: 01 Feb 2025

https://github.com/chintak/theano-lasagne-docker

Dockerfile for Lasagne with CUDA support. See the ``cpu`` and ``gpu`` branches for the relevant Dockerfiles.

caffe cuda docker dockerfile install-script lasagne machine-learning machine-learning-library theano

Last synced: 15 Feb 2025

https://github.com/komorra/blackmagicengine

Nextgen, Classic/VR/AR Game Engine

core cuda dx12 game-development gameengine gpu net nvidia vulcan

Last synced: 20 Feb 2025

https://github.com/ashwani-rathee/imagesgpu.jl

Image Processing on GPU in Julia

cuda gpu image image-processing julia

Last synced: 21 Nov 2024

https://github.com/matteogianferrari/qr-decomposition

This project implements different methods that exploit cache usage, multicore CPU, and GPU architectures for the Gram-Schmidt QR decomposition algorithm, and measures the performance of the different implementations.

cuda openmp parallel-computing

Last synced: 10 Feb 2025

https://github.com/bjornmelin/deep-learning-evolution

🧠 Deep-Learning Evolution: Unified collection of TensorFlow & PyTorch projects, featuring custom CUDA kernels, distributed training, memory‑efficient methods, and production‑ready pipelines. Showcases advanced GPU optimizations, from foundational models to cutting‑edge architectures. 🚀

ai-research cuda data-science deep-learning distributed-training gan gpu-acceleration machine-learning model-optimization neural-networks python pytorch tensorflow training-pipeline transformers

Last synced: 05 Feb 2025

https://github.com/m-torhan/cuda-stl-renderer

CUDA C++ implementation of STL file renderer using ray tracing method

cuda

Last synced: 20 Feb 2025

https://github.com/sohhamseal/scalable-systems-programs

A little less effort to learn parallel programming...

cuda mpi openmp

Last synced: 13 Jan 2025

https://github.com/alegau03/parallel-k-means

C implementations of the K-Means algorithm for parallel computing.

c c-programming cuda parallel parallel-programming

Last synced: 05 Feb 2025

https://github.com/poodarchu/vision-lab

Computer Vision Experiments in all.

computer-vision cuda object-detection

Last synced: 28 Jan 2025

https://github.com/fandreuz/parallel-programming-for-hpc

Scientific codes in C/C++ with CUDA, OpenACC, FFTW, (cu)BLAS

cpp cuda hpc mpi

Last synced: 21 Jan 2025

https://github.com/nolmoonen/cuda-sdf

CUDA-accelerated path traced Menger sponge using ray marching.

cuda menger path-tracer ray-marching sdf

Last synced: 05 Feb 2025

https://github.com/garciparedes/cuda-examples

CUDA examples that I developed to learn GPU-based HPC

c c-plus-plus cuda examples gpgpu gpu hpc

Last synced: 16 Jan 2025

https://github.com/aliyoussef97/triton-hub

A container of various PyTorch neural network modules written in Triton.

cuda deep-learning openai pytorch triton triton-lang

Last synced: 05 Feb 2025

https://github.com/dolongbien/cuda

CUDA and Caffe/Caffe2 installation on Ubuntu 16.04

c3d-intel-caffe caffe caffe2 cuda cudnn deep-learning ubuntu

Last synced: 21 Jan 2025

https://github.com/romaingrx/ml-nix-flake

A simple Nix flake to start an ML environment with uv and CUDA out of the box

cuda ml nix nix-flake uv

Last synced: 28 Jan 2025

https://github.com/orgh0/highperformancecnn

Implementation of a High Performance CNN for MNIST dataset

cnn cpp cuda

Last synced: 22 Jan 2025

https://github.com/mayukhdeb/patrick

Tiny neural net library written from scratch with CuPy (under construction)

cuda deep-learning gpu-computing machine-learning neural-network regression

Last synced: 12 Feb 2025