Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.

https://github.com/yashkathe/image-noise-reduction-with-cuda

This project analyzes the median blur image-denoising technique, comparing GPU-accelerated (Numba) and CPU-based (OpenCV) processing speeds.

cuda cuda-programming gpu-programming hardware-speed-analysis image-analysis image-processing numba nvidia nvidia-cuda nvidia-gpu opencv parallel-programming

Last synced: 25 Dec 2024

https://github.com/hahnjo/cgxx

Object-Oriented Implementation of the Conjugate Gradients Method

cg cuda hpc openacc opencl openmp

Last synced: 29 Jan 2025

https://github.com/nodef/nvgraph.sh

CLI for nvGraph, which is a GPU-based graph analytics library written by NVIDIA, using CUDA.

analytics cli console cuda gpu graph nvgraph nvidia pagerank terminal

Last synced: 24 Oct 2024

https://github.com/shalithasuranga/cudaperformance

Compare the performance of matrix multiplication among GPU shared memory, GPU global memory and CPU

cuda cuda-demo matrix-multiplication nvidia

Last synced: 12 Feb 2025

https://github.com/krzemienski/ffmpeg-nvenc-bento4

Container to transcode and package HLS and DASH assets, using GPU-accelerated transcoding

bento4 cuda dash docker encoding ffmpeg hls mp4 nvenc nvidia video

Last synced: 08 Jan 2025

https://github.com/pnnl/cuvite

Multi-GPU Graph Community Detection using CUDA

community-detection cuda graph-clustering mpi

Last synced: 25 Nov 2024

https://github.com/egororachyov/spbench

Benchmark for sparse linear algebra libraries for CPU and GPU platforms.

benchmark cpp cpu cuda gpu-computing graphblas opencl sparse-matrices

Last synced: 19 Nov 2024

https://github.com/tristanpenman/cuda-examples

A collection of CUDA example code

cuda

Last synced: 08 Dec 2024

https://github.com/gurbaaz27/cs433a-design-exercises

Solutions of design exercises in CS433A: Parallel Programming, Spring Semester 2021-22

barriers cuda gpu-programming locks openmp parallel-programming posix-threads semaphores

Last synced: 14 Nov 2024

https://github.com/alejandrogallo/atrip

High Performance library for the CCSD(T) algorithm in quantum chemistry

asynchronous-programming coupled-cluster cuda literate-programming mpi quantum-chemistry

Last synced: 26 Nov 2024

https://github.com/ydrmaster/cuda-driver

A CUDA runtime environment based on the CUDA Driver API

cuda nvidia

Last synced: 10 Feb 2025

https://github.com/akira4o4/trtplus

tensorrt-plus framework

cpp cpu cuda gpu tensorrt yolo

Last synced: 27 Nov 2024

https://github.com/pothosware/pothosgpu

Pothos toolkit for ArrayFire API support

arrayfire cuda dataflow dataflow-programming gpu opencl pothos

Last synced: 15 Jan 2025

https://github.com/neoblizz/spmv

Efficient Sparse Matrix-Vector Multiplication (SpMV) using ModernGPU (MTX + CSR formats).

csr cuda gpgpu load-balancing mtx spmv

Last synced: 09 Feb 2025

https://github.com/romnn/nvbit-rs

Rust bindings to the NVIDIA NVBIT binary instrumentation API

cuda ffi gpgpu instrumentation nvbit nvidia profiling ptx rust sass tracing

Last synced: 23 Oct 2024

https://github.com/xusworld/tars

Tars is a cool deep learning framework.

avx2 avx512 cuda deep-learning

Last synced: 05 Feb 2025

https://github.com/kohulan/tensorflow-2.0-installation-with-cuda-support

A detailed step-by-step guide to installing Tensorflow-2.0-gpu with CUDA drivers on Ubuntu Server/Desktop LTS

cuda gpu nvidia ubuntu

Last synced: 30 Nov 2024

https://github.com/sthysel/jtx2-tools

nvidia jtx/xavier GPU monitor tool

cuda nvidia txt2 xavier

Last synced: 20 Jan 2025

https://github.com/xmas7/cudampi

A large hybrid CPU/GPU sorting network using CUDA and MPI. The sorting network uses a standard Quicksort for CPUs and a custom Bitonic Sort for GPUs. These two algorithms were the fastest in a number of prior benchmarks.

cpu cuda gpu hybrid mpi network

Last synced: 01 Feb 2025

https://github.com/alpha74/cuda_basics

NVIDIA NVCC CUDA programs for beginners.

c cpp cuda cuda-programs nvcc nvidia parallel-computing parallel-programming

Last synced: 16 Jan 2025

https://github.com/babak2/optimizedsum

Optimized Parallel Sum program demonstrating CPU vs GPU performance

cuda cuda-programming gpu-acceleration gpu-computing gpu-parallelism visual-studio

Last synced: 01 Feb 2025

https://github.com/nachovizzo/saxpy_openacc_cpp

My way of thinking about OpenACC, C++, and Parallel computing in general

cpp cuda gpu openacc

Last synced: 30 Jan 2025

https://github.com/maj0rrr/parallel-processing-cpu-and-gpu-env-and-lib-with-powercap

(2024/2025) A library and environment for parallel processing in a power-limited CPU+GPU cluster environment.

c cpu cuda gpu mpi openmp parallel powercap

Last synced: 01 Nov 2024

https://github.com/lchsk/ney

A header-only parallel functions library for Intel Xeon/Xeon Phi/GPUs

cuda gpu linux parallel phi scientific xeon xeonphi

Last synced: 08 Jan 2025

https://github.com/stdogpkg/cukuramoto

A Python/CUDA package that numerically solves the Kuramoto model using Heun's method

complex-networks cuda kuramoto-model

Last synced: 29 Jan 2025

https://github.com/brosnanyuen/raybnn_neural

Neural Networks with Sparse Weights in Rust using GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI

cpu cuda deep-learning gpu machine-learning machine-learning-algorithms neural-network neural-networks opencl parallel raybnn rust sparse-network sparse-neural-networks

Last synced: 13 Nov 2024

https://github.com/orlandopalmeira/trabalho-cp-2023-2024

Repository for the practical assignment of the Parallel Computing (Computação Paralela, CP) course unit - Master's in Informatics Engineering (MEI/MIEI) - University of Minho (UMinho)

computacao-paralela cp cuda cuda-programming mei miei nvidia nvidia-cuda openmp optimization optimization-problem parallelism performance uminho uminho-mei uminho-miei

Last synced: 25 Jan 2025

https://github.com/raad-labs/raad-video

A high-performance video loading library for machine learning, designed for efficient training data preparation.

cuda machine-learning training-data

Last synced: 09 Feb 2025

https://github.com/DefTruth/hgemm-tensorcores-mma

⚡️Write HGEMM from scratch using Tensor Cores with WMMA, MMA PTX and CuTe API. 🎉🎉

cuda hgemm tensor-cores

Last synced: 06 Dec 2024

https://github.com/muhac/jupyter-pytorch-docker

JupyterLab for AI in Docker! Anaconda and PyTorch GPU supported.

conda-environment cuda docker jupyterlab pytorch

Last synced: 21 Jan 2025

https://github.com/B1-663R/docker-mining

Dockerfiles to build docker images to start mining with an NVIDIA Docker architecture

cryptocurrency cuda docker-image docker-nvidia mining

Last synced: 31 Oct 2024

https://github.com/sean-bradley/cudalookupsha256

SHA256 lookup using parallel processing on an NVIDIA CUDA-compatible graphics card

cuda parallel-processing sha256

Last synced: 13 Nov 2024

https://github.com/zeloe/rtconvolver

A realtime convolution VST3

c convolution cplusplus cuda juce

Last synced: 25 Dec 2024

https://github.com/elftausend/sliced

Array operations with automatic differentiation on CPU and GPU

autograd automatic-differentiation cuda custos matrix opencl

Last synced: 08 Dec 2024

https://github.com/cppalliance/crypt

A C++20 module of cryptographic utilities for CPU and GPU

cpp20 cuda security

Last synced: 09 Jan 2025

https://github.com/yomi4486/zundamon_v3

Master, a shot of cold water, please.

cuda discord-bot discord-py docker docker-compose python tts voicevox zundamon

Last synced: 27 Nov 2024

https://github.com/denzp/current

CUDA high-level Rust framework

cuda rust

Last synced: 24 Dec 2024

https://github.com/arminms/p2rng

A modern header-only C++ library for parallel algorithmic (pseudo) random number generation supporting OpenMP, CUDA, ROCm and oneAPI

cpp cuda cxx header-only heterogeneous-computing library linux macos multiplatorm oneapi openmp parallel pcg-random prng pseudorandom-number-generator random-number-distributions random-number-generation rocm stl-algorithms windows

Last synced: 05 Nov 2024

https://github.com/mindstudioofficial/fl_cuda_mandelbrot

Flutter example for visualizing the Mandelbrot Set using CUDA

cuda flutter-examples fractal-rendering

Last synced: 11 Jan 2025

https://github.com/soumyasen1809/cuda_basics

Basic concepts on CUDA

c cpp cuda nvidia-gpu

Last synced: 06 Jan 2025

https://github.com/andreabak/whispersubs

Generate subtitles for your video or audio files using the power of AI

ai cuda deep-learning gpu-acceleration machine-learning srt subtitles transcribe transcription translate whisper

Last synced: 16 Nov 2024

https://github.com/liny18/image-to-ascii

🖼️ A command-line tool for converting images to ASCII art

ascii ascii-art cli command-line cpp cuda docker image-processing image-to-ascii mpi opencv terminal

Last synced: 21 Nov 2024

https://github.com/superlinear-ai/scipy-notebook-gpu

jupyter/scipy-notebook with CUDA Toolkit, cuDNN, NCCL, and TensorRT

cuda cudnn docker nccl scipy-notebook tensorflow tensorrt

Last synced: 09 Jan 2025

https://github.com/acrlakshman/gradient-augmented-levelset-cuda

Implementation of Gradient Augmented Levelset method for CPU and GPU

cfd cuda levelset

Last synced: 13 Feb 2025

https://github.com/kagof/julia-image-processing

Image processing programs written in Julia

cuda image-processing julia

Last synced: 12 Feb 2025

https://github.com/vorticity-inc/vtensor

VTensor, a C++ library, facilitates tensor manipulation on GPUs, emulating the python-numpy style for ease of use. It leverages RMM (RAPIDS Memory Manager) for efficient device memory management. It also supports xtensor for host memory operations.

cublas cuda curand cusolver gpu numpy rmm tensor xarray xtensor

Last synced: 10 Dec 2024

https://github.com/xkevio/cuda-raytracer

A simple ray tracer written with CUDA that saves its output in a .ppm file, CPU version included for reference.

cpu cuda cuda-raytracer gpu

Last synced: 12 Feb 2025

https://github.com/mrglaster/cuda-acfcalc

Calculation of the smallest ACF for signals of length N using CUDA technology.

acf c calculations cpp cuda google-colaboratory google-colaboratory-notebooks isu

Last synced: 15 Jan 2025

https://github.com/cfries/javagpuexperiments

Repository used to demo OpenCL, JOCL, JCuda.

cuda

Last synced: 27 Dec 2024

https://github.com/kim-hwiwon/t-espresso

A CUDA Library for Low-overhead Host-to-Device Transmission of Patterned Profile Data

cuda profiler

Last synced: 27 Dec 2024

https://github.com/gjbex/gpu-programming

Material for a training on portable GPU programming

cuda gpu kokkos openmp openmp-off stl thrust

Last synced: 22 Nov 2024

https://github.com/dujonwalker/nixos-config-x86_64-cuda

This repository contains my NixOS configuration optimized for 64-bit x86 systems with NVIDIA CUDA support, featuring a Plasma 6 desktop environment and a variety of essential applications for development, multimedia, and productivity. It serves as a backup for easy restoration and setup on new installations.

cuda flatpak nix nixos nixos-configuration ollama

Last synced: 26 Dec 2024

https://github.com/kishore-narendran/eecs221-highperformancecomputing

Assignments done during the graduate course EECS 221 - Introduction to HPC that I took in the Spring Quarter of 2016 at University of California, Irvine. Involves assignments that use OpenMP, MPI and CUDA.

cuda hpc mpi openmp

Last synced: 05 Jan 2025

https://github.com/dereklstinson/nccl

Go wrapper for NCCL

cuda deep-learning go nccl parallel-computing

Last synced: 15 Jan 2025

https://github.com/dqbd/cuda-btree

Implementation of B-Trees on NVIDIA CUDA

b-tree cuda nvidia

Last synced: 13 Feb 2025

https://github.com/samuraibupt/cuda_code

Code for learning CUDA. Implementations of multiple kernels.

cuda cuda-programming

Last synced: 23 Oct 2024

https://github.com/teodutu/asc

Computer Systems Architecture (Arhitectura Sistemelor de Calcul) - UPB 2020

cache-optimization cuda parallel-programming profiling python-threading

Last synced: 30 Jan 2025

https://github.com/lintenn/cudaaddvectors-explicit-vs-unified-memory

Performance comparison of two different forms of memory management in CUDA

c cuda explicit memory memory-management performance unified-memory

Last synced: 06 Jan 2025

https://github.com/brosnanyuen/raybnn_optimizer

Gradient Descent Optimizers and Genetic Algorithms using GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI

arrayfire cuda genetic-algorithm genetic-algorithms gpu gpu-computing gradient gradient-descent parallel parallel-computing raybnn rust

Last synced: 23 Oct 2024

https://github.com/tawssie/zmpy3d_pt

Python implementation of 3D Zernike moments with PyTorch

3d-zernike cuda gpu protein-structure python pytorch structural-bioinformatics superposition zernike-moments

Last synced: 09 Feb 2025

https://github.com/abaksy/cuda-examples

A repository of examples coded in CUDA C/C++

cuda

Last synced: 17 Jan 2025

https://github.com/yingding/applyllm

A python package for applying LLM with LangChain and Hugging Face on local CUDA/MPS host

accelerator batch cuda framework inference kubeflow langchain llm mps pipeline slurm transformers

Last synced: 22 Dec 2024

https://github.com/droduit/multiprocessor-architecture

Introduction to Multiprocessor Architecture @ EPFL

cuda multiprocessor multithreading openmp-parallelization

Last synced: 02 Jan 2025

https://github.com/dpbm/qml-course

Mini-course on quantum machine learning

cuda cuda-q cuquantum docker ml python qml quantum quantum-computing tensorflow

Last synced: 21 Dec 2024

https://github.com/coreylowman/tenten

A tiny tensor library in rust with fused JIT operations.

cuda jit rust tensor

Last synced: 07 Jan 2025

https://github.com/triod315/cudabasics

Simple calculation with NVIDIA CUDA

cpp cuda

Last synced: 09 Jan 2025

https://github.com/isazi/aoflagger

AOFlagger Radio Frequency Interference mitigation algorithm.

cuda gpu many-core rfi

Last synced: 30 Jan 2025

https://github.com/peri044/cuda

GPU implementations of algorithms

cuda gauss-jordan parallel-programming

Last synced: 08 Feb 2025

https://github.com/mulx10/firefly

Enhancing object detection using thermal imaging for thin cross-section, unidentifiable objects (e.g., cyclists, pedestrians).

autonomous-cars autonomous-navigation autonomous-vehicles c cuda object-detection thermal-camera yolov3

Last synced: 30 Dec 2024

https://github.com/szymon423/tsp-cpu-vs-gpu

Simple brute-force approach to solving the travelling salesman problem on CPU and GPU

cuda tsp

Last synced: 18 Jan 2025

https://github.com/markdtw/parallel-programming

Basic Pthread, OpenMP, CUDA examples

cuda openmp parallel-programming pthreads

Last synced: 12 Jan 2025

https://github.com/hansalemaos/nvidiacheck

Monitors NVIDIA GPU information and logs the data into a pandas DataFrame - Windows only.

cuda log logging nvidia torch

Last synced: 05 Feb 2025

https://github.com/csvancea/gpu-hashtable

GPU-backed linear-probing hash table implemented in CUDA. Supports batch operations such as insert and retrieval.

cuda hashtable

Last synced: 19 Jan 2025

https://github.com/lawmurray/gpu-gemm

CUDA kernel for matrix-matrix multiplication on Nvidia GPUs, using a Hilbert curve to improve L2 cache utilization.

cplusplus cuda cuda-kernels cuda-programming gpu gpu-computing gpu-programming matrix-multiplication numerical-methods scientific-computing

Last synced: 01 Nov 2024

https://github.com/akhuntsaria/canny-edge-detection

Canny edge detector implemented in CUDA C/C++

cuda image-processing video-processing

Last synced: 23 Oct 2024

https://github.com/brocbyte/realtime-deformations

Snow simulation (Material Point Method)

cuda glm material-point-method opengl

Last synced: 09 Nov 2024

https://github.com/dito97/gol

High-performance Computing (90535) final project at UniGe

cuda mpi openmp

Last synced: 22 Dec 2024

https://github.com/grakshith/parallel-k-means

K-Means clustering for Image Colour Quantization and Image Compression

cuda image-color-quantization image-compression k-means mpi opencv openmp

Last synced: 06 Jan 2025

https://github.com/kar-dim/watermarking-gpu

Code for my Diploma thesis in Information and Communication Systems Engineering (University of the Aegean, School of Engineering), titled "Efficient implementation of watermark and watermark detection algorithms for image and video using the graphics processing unit". Part 2 / GPU

arrayfire cpp cuda gpu image-processing opencl parallel-computing video-processing watermark-image watermarking

Last synced: 04 Jan 2025

https://github.com/nellogan/distributed_compy

Distributed_compy is a distributed computing library that offers multi-threading, heterogeneous (CPU + multi-GPU), and multi-node support

cluster cuda heterogeneous-parallel-programming multi-threading multigpu openmp openmpi

Last synced: 12 Jan 2025

https://github.com/juntyr/necsim-rust

Spatially explicit biodiversity simulations using a parallel library written in Rust

biodiversity cuda mpi necsim rust simulation

Last synced: 28 Oct 2024