Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphical processing units (GPUs). With CUDA, developers are able to dramatically speed up computing applications by harnessing the power of GPUs.

https://github.com/tmrob2/cuda2rust_sandpit

Minimal examples to get CUDA linear algebra programs working with Rust using CC & FFI.

cc clang cublas cuda cusparse rust

Last synced: 19 Nov 2024

https://github.com/mindstudioofficial/fl_cuda_mandelbrot

Flutter example for visualizing the Mandelbrot Set using CUDA

cuda flutter-examples fractal-rendering

Last synced: 11 Jan 2025

https://github.com/szymon423/tsp-cpu-vs-gpu

Simple brute force approach to solve travelling salesman problem with CPU and GPU

cuda tsp

Last synced: 18 Jan 2025

https://github.com/zeloe/rtconvolver

A realtime convolution VST3

c convolution cplusplus cuda juce

Last synced: 25 Dec 2024

https://github.com/triod315/cudabasics

Simple calculation with NVIDIA CUDA

cpp cuda

Last synced: 09 Jan 2025

https://github.com/l30nardosv/reproduce-parcosi-moleculardocking

Reproducing paper: "Benchmarking the Performance of Irregular Computations in AutoDock-GPU Molecular Docking"

autodock-gpu cpu cuda gpu molecular-docking molecular-docking-scripts opencl paper reproducible-research

Last synced: 05 Feb 2025

https://github.com/mrglaster/cuda-acfcalc

Calculation of the smallest ACF for signals of length N using CUDA technology.

acf c calculations cpp cuda google-colaboratory google-colaboratory-notebooks isu

Last synced: 15 Jan 2025

https://github.com/nellogan/distributed_compy

Distributed_compy is a distributed computing library that offers multi-threading, heterogeneous (CPU + mult-GPU), and multi-node support

cluster cuda heterogeneous-parallel-programming multi-threading multigpu openmp openmpi

Last synced: 12 Jan 2025

https://github.com/lintenn/cudaaddvectors-explicit-vs-unified-memory

Performance comparison of two different forms of memory management in CUDA

c cuda explicit memory memory-management performance unified-memory

Last synced: 06 Jan 2025

https://github.com/bensuperpc/easyai

Make your own AI easily !

ai cuda python python3 tensorflow

Last synced: 17 Jan 2025

https://github.com/betarixm/cuecc

POSTECH: Heterogeneous Parallel Computing (Fall 2023)

cryptography ctypes cuda ecc postech secp256k1

Last synced: 18 Nov 2024

https://github.com/peri044/cuda

GPU implementations of algorithms

cuda gauss-jordan parallel-programming

Last synced: 08 Feb 2025

https://github.com/infotrend-inc/ctpo-demo_projects

Jupyter Notebook examples using CTPO as their source container.

cuda opencv pytroch tensorflow2

Last synced: 05 Feb 2025

https://github.com/ilyasmoutawwakil/optimum-whisper-autobenchmark

A set of benchmarks on OpenAI's Whisper model, using AutoBenchmark and Optimum's OnnxRuntime Optimizations.

benchmark cuda deep-learning

Last synced: 30 Jan 2025

https://github.com/teodutu/asc

Arhitectura Sistemelor de Calcul - UPB 2020

cache-optimization cuda parallel-programming profiling python-threading

Last synced: 30 Jan 2025

https://github.com/juntyr/necsim-rust

Spatially explicit biodiversity simulations using a parallel library written in Rust

biodiversity cuda mpi necsim rust simulation

Last synced: 28 Oct 2024

https://github.com/xusworld/tars

Tars is a cool deep learning framework.

avx2 avx512 cuda deep-learning

Last synced: 05 Feb 2025

https://github.com/cppalliance/crypt

A C++20 module of cryptographic utilities for CPU and GPU

cpp20 cuda security

Last synced: 09 Jan 2025

https://github.com/hansalemaos/nvidiacheck

Monitors NVIDIA GPU information and log the data into a pandas DataFrame - Windows only.

cuda log logging nvidia torch

Last synced: 05 Feb 2025

https://github.com/markdtw/parallel-programming

Basic Pthread, OpenMP, CUDA examples

cuda openmp parallel-programming pthreads

Last synced: 12 Jan 2025

https://github.com/sean-bradley/cudalookupsha256

SHA256 Lookup using parallel processing on a NVidia CUDA Compatible Graphics card

cuda parallel-processing sha256

Last synced: 13 Nov 2024

https://github.com/samuraibupt/cuda_code

Codes for learning cuda. Implementation of multiple kernels.

cuda cuda-programming

Last synced: 23 Oct 2024

https://github.com/romaingrx/ml-nix-flake

A simple nix flake to start ML env with uv and cuda out of the box

cuda ml nix nix-flake uv

Last synced: 28 Jan 2025

https://github.com/hatamiarash7/cuda-python

GPU programming using CUDA & Python

cuda gpu gpu-computing gpu-programming python

Last synced: 03 Feb 2025

https://github.com/himeyama/cuda-nmf

NMF calculations are performed on NVIDIA GPUs using the Cuda API. (GEM released)

cublas cuda gem nmf ruby

Last synced: 29 Dec 2024

https://github.com/thomasonzhou/minitorch

rebuilding pytorch: from autograd to convolutions in CUDA

cuda numba numpy

Last synced: 30 Dec 2024

https://github.com/crcrpar/dev-chainer

Dockerfile for Chainer Development in VSCode

chainer cuda docker nvidia-docker vscode

Last synced: 09 Feb 2025

https://github.com/rjected/cuda-timelock

Solving a large number of timelock puzzles in parallel using GPU acceleration

c cgbn concurrent cpp cuda gmp graphics nvidia parallel puzzle timelock

Last synced: 09 Feb 2025

https://github.com/dolongbien/cuda

CUDA and Caffe/Caffe2 installation Ubuntu 16.04

c3d-intel-caffe caffe caffe2 cuda cudnn deep-learning ubuntu

Last synced: 21 Jan 2025

https://github.com/donpablonows/coin

🪙 Crypto Optimization Interface Network (aka COIN) is a high-performance Bitcoin address generator using CUDA acceleration and multi-threading. It optimizes GPU and CPU resources for fast address generation, ensures secure private key creation, and includes real-time monitoring and automatic system optimizations.

bitcoin blockchain cryptography cuda gpu-acceleration

Last synced: 07 Jan 2025

https://github.com/fandreuz/parallel-programming-for-hpc

Scientific codes in C/C++ with CUDA, OpenACC, FFTW, (cu)BLAS

cpp cuda hpc mpi

Last synced: 21 Jan 2025

https://github.com/bl33h/pythagoreantheorem

A program that calculates the Pythagorean theorem for a large number of elements using GPU parallel processing.

arrays cuda kernel parallel-programming pythagoras pythagorean-theorem

Last synced: 21 Jan 2025

https://github.com/bl33h/productoftwovectors

This code utilizes CUDA for parallel vector multiplication on a GPU, demonstrating GPU's acceleration capabilities.

cuda gpu kernel paralelism parallel-programming product vector

Last synced: 21 Jan 2025

https://github.com/daelsepara/hipmandelbrot

GPU Implementation of Mandelbrot Fractal Generator with Benchmarking

amd cuda fractal gpu gpu-compute gpu-computing hip mandelbrot parallel-computing rocm sdk

Last synced: 07 Nov 2024

https://github.com/nellogan/makefileexamples

Makefile examples of how to automate testing and building of applications/systems that use multiple: languages, compilers, and testing tools.

automated-testing c cuda makefile python valgrind

Last synced: 21 Jan 2025

https://github.com/lightshade12/kittlespt

A hobby CUDA pathtracing renderer.

3d-graphics computer-graphics cuda gpu path-tracing ray-tracing

Last synced: 24 Jan 2025

https://github.com/ashwanirathee/imagesgpu.jl

Image Processing on GPU in Julia

cuda gpu image image-processing julia

Last synced: 08 Jan 2025

https://github.com/daelsepara/hipslm

CPU and GPU (using HIP) implementations of phase pattern generators for use with spatial light modulators

computer-generated-holography cuda gpu hip hologram holography phase phase-pattern slm spatial-light-modulator

Last synced: 29 Dec 2024

https://github.com/stanczakdominik/cuda_poisson

A 2D poisson solver via CUDA

cuda electromagnetism pde

Last synced: 04 Feb 2025

https://github.com/quantum-integrated-technologies/deepforge

DeepForge : framework for working with machine learning.

ai artificial-intelligence cuda library machine-learning ml neural-network

Last synced: 10 Feb 2025

https://github.com/anras5/parallel-computing

Comparing CPU and GPU

cuda gpu openmp

Last synced: 21 Jan 2025

https://github.com/whutao/artificial-art

Image approximation with triangles using evolutionary algorithm.

cuda evolutionary-algorithm python3

Last synced: 16 Jan 2025

https://github.com/tlabaltoh/tlab-sharescreen-server-win

Software frame encoder using CUDA and cast encoded frames over UDP. Trying to implement a custom streaming protocol and shader based frame encoder/decoder for screencast.

cuda desktop-capture screensharing unity unity3d windows-graphics-capture

Last synced: 28 Jan 2025

https://github.com/michaelfranzl/image_debian-gpgpu

Dockerfile for a Debian base image with AMD and Nvidia GPGPU support

amd container container-image cuda debian docker gpgpu nvidia opencl

Last synced: 21 Jan 2025

https://github.com/sohhamseal/scalable-systems-programs

A little less effort to learn parallel programming...

cuda mpi openmp

Last synced: 13 Jan 2025

https://github.com/nikolaydubina/basic-openai-pytorch-server

Minimal HTTP inference server in OpenAI API with Pytorch and CUDA

cuda docker llm openai pytorch server

Last synced: 04 Feb 2025

https://github.com/andih/cuda-fortran-stream

Variant of STREAM Benchmark in CUDA Fortran

cuda cuda-fortran gpu stream-benchmarks variants

Last synced: 12 Jan 2025

https://github.com/abhisheknair10/occupancy.nn

An multi-step pipeline to train and inference Occupancy Networks

3d-reconstruction cuda vision

Last synced: 13 Jan 2025

https://github.com/sanaeprj/matrix-for-cpp

This repository has types that handle matrices.

cpp14 cpp14-library cuda matrix-library

Last synced: 19 Nov 2024

https://github.com/franciscoda/psvm

R package and C++ library that allows training SVM models in a GPU using CUDA and predicting out-of-sample data. A support vector machine (SVM) is a type of machine learning model that is trained using supervised data to classify samples.

cpp cpp17 cuda machine-learning r svm-classifier svm-training

Last synced: 28 Jan 2025

https://github.com/orgh0/highperformancecnn

Implementation of a High Performance CNN for MNIST dataset

cnn cpp cuda

Last synced: 22 Jan 2025

https://github.com/garciparedes/cuda-examples

Cuda examples who I develop to learn HPC based on GPU

c c-plus-plus cuda examples gpgpu gpu hpc

Last synced: 16 Jan 2025

https://github.com/wallneradam/docker-ccminer

CCMiner (tpruvot version) Docker Builder

ccminer cuda docker gpu litecoin miner monero nvidia nvidia-docker

Last synced: 01 Feb 2025

https://github.com/jonathanraiman/mini_cuda_rtc

Miniature CUDA Array library with Runtime Compilation

cpp11 cuda jit runtime-compilation

Last synced: 22 Jan 2025

https://github.com/alpha74/hungarianalgocuda

Hungarian Algorithm for Linear Assignment Problem implemented using CUDA.

cuda nvcc parallel-computing parallel-programming

Last synced: 16 Jan 2025

https://github.com/matteogianferrari/qr-decomposition

Tthis project implements different methods to exploit caches usage, the multicore CPU and the GPU architectures, on the Gram-Schmidt QR Decomposition algorithm and measure the performance of the different implementations.

cuda openmp parallel-computing

Last synced: 10 Feb 2025

https://github.com/david-palma/cuda-programming

Educational CUDA C/C++ programming repository with commented examples on GPU parallel computing, matrix operations, and performance profiling. Requires a CUDA-enabled NVIDIA GPU.

c-cpp cpp cuda cuda-toolkit education gpu gpu-programming kernel matrix-operations nvcc nvidia parallel-computing parallel-programming practice profiling threads

Last synced: 31 Jan 2025

https://github.com/kchristin22/ising_model

Implementation of a cellular automaton on GPU using different features of CUDA

cellular-automaton cuda gpu-programming hpc ising-model parallel-computing

Last synced: 22 Jan 2025

https://github.com/sleeepyjack/multisplit

Simple multisplit for CUDA accelerators

cpp cuda gpu nvidia parallel-programming primitive split

Last synced: 22 Jan 2025

https://github.com/dafadey/GPGPU_OpenCL_vs_CUDA

This is a repository with sample codes for testing memory bandwidth, arithmetic latency hiding and shared/local memory performance on AMD and nVidia devices

cuda gpgpu gpgpu-computing opencl

Last synced: 19 Nov 2024

https://github.com/jessetg/cuda-practice

Working through the chapters of Cuda by Example

c cpp cuda cuda-by-example gpgpu

Last synced: 14 Jan 2025

https://github.com/bjornmelin/deep-learning-evolution

🧠 Deep-Learning Evolution: Unified collection of TensorFlow & PyTorch projects, featuring custom CUDA kernels, distributed training, memory‑efficient methods, and production‑ready pipelines. Showcases advanced GPU optimizations, from foundational models to cutting‑edge architectures. 🚀

ai-research cuda data-science deep-learning distributed-training gan gpu-acceleration machine-learning model-optimization neural-networks python pytorch tensorflow training-pipeline transformers

Last synced: 05 Feb 2025

https://github.com/alegau03/parallel-k-means

Implementation of C programs for the K-Means algorithm for parallel computing.

c c-programming cuda parallel parallel-programming

Last synced: 05 Feb 2025

https://github.com/emmanuelmess/firstcollisiontimesteprarefiedgassimulator

This simulator computes all possible intersections for a very small timestep for a particle model

cpp20 cuda simulator

Last synced: 15 Jan 2025

https://github.com/brosnanyuen/raybnn_graph

Graph Manipulation Library For GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI

cuda gpu graph graph-algorithms neural-network neural-networks opencl raybnn rust

Last synced: 13 Nov 2024

https://github.com/poodarchu/vision-lab

Computer Vision Experiments in all.

computer-vision cuda object-detection

Last synced: 28 Jan 2025

https://github.com/aliyoussef97/triton-hub

A container of various PyTorch neural network modules written in Triton.

cuda deep-learning openai pytorch triton triton-lang

Last synced: 05 Feb 2025

https://github.com/han-minhee/sgemm_hip

SGEMM implementations in HIP for NVIDIA / AMD GPUs

cuda gpgpu gpu hip rocm

Last synced: 05 Feb 2025

https://github.com/5had3z/torch-discounted-cumsum-nd

PyTorch Discounted Cumsum with Autograd (CPU + CUDA)

cuda machine-learning pytorch

Last synced: 05 Feb 2025

https://github.com/antonioberna/nn-gpu-logic-gates

Neural Network implementation on GPU using CUDA C++ to learn logic gates operations

cpp cuda gpu logic-gates neural-networks nvidia

Last synced: 05 Feb 2025

https://github.com/sbstndb/grayscott_k

A simple 3D GrayScott simulation using Kokkos enabling CUDA or OpenMP backend

cuda finite-difference grayscott grid kokkos laplacian openmp simulation visualisation

Last synced: 05 Feb 2025

https://github.com/fynv/cudainline

A CUDA interface for Python. A distillation of the engine part of ThrustRTC.

cuda gpu nvrtc pyhton

Last synced: 05 Feb 2025

https://github.com/enkerewpo/talaria

AI Voice Assistant for Dialogue and IoT Control Powered by GPT4o

cuda gpt-4 python3 pytorch stt tts

Last synced: 05 Feb 2025

https://github.com/nolmoonen/cuda-sdf

CUDA-accelerated path traced Menger sponge using ray marching.

cuda menger path-tracer ray-marching sdf

Last synced: 05 Feb 2025

https://github.com/pabvald/parallel-computing

Parallel computing practise with OpenMP, MPICH and CUDA

cuda mpich openmp parallel-computing

Last synced: 29 Jan 2025

https://github.com/speedcell4/torchdevice

Setup CUDA_VISIBLE_DEVICES

cuda deep-learning gpu machine-learning pytorch

Last synced: 08 Feb 2025

https://github.com/mortafix/quickshift

A working implementation of Quickshift algorithm in CUDA, GPU-compatible.

cuda gpu-computing quickshift

Last synced: 13 Jan 2025

https://github.com/tensorbfs/cutropicalgemm.jl

The fastest Tropical number matrix multiplication on GPU

cuda gemm tropical-algebra

Last synced: 20 Dec 2024