Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.

https://github.com/muhac/jupyter-pytorch-docker

JupyterLab for AI in Docker! Anaconda and PyTorch GPU supported.

conda-environment cuda docker jupyterlab pytorch

Last synced: 21 Jan 2025

https://github.com/ilyasmoutawwakil/optimum-whisper-autobenchmark

A set of benchmarks on OpenAI's Whisper model, using AutoBenchmark and Optimum's OnnxRuntime Optimizations.

benchmark cuda deep-learning

Last synced: 30 Jan 2025

https://github.com/coreylowman/tenten

A tiny tensor library in Rust with fused JIT operations.

cuda jit rust tensor

Last synced: 07 Jan 2025

https://github.com/lintenn/cudaaddvectors-explicit-vs-unified-memory

Performance comparison of two different forms of memory management in CUDA

c cuda explicit memory memory-management performance unified-memory

Last synced: 06 Jan 2025

https://github.com/yomi4486/zundamon_v3

Bartender, a shot of ice water, please.

cuda discord-bot discord-py docker docker-compose python tts voicevox zundamon

Last synced: 27 Nov 2024

https://github.com/cfries/javagpuexperiments

Repository used to demo OpenCL, JOCL, JCuda.

cuda

Last synced: 27 Dec 2024

https://github.com/DefTruth/hgemm-tensorcores-mma

⚡️Write HGEMM from scratch using Tensor Cores with WMMA, MMA PTX and CuTe API. 🎉🎉

cuda hgemm tensor-cores

Last synced: 06 Dec 2024

https://github.com/pd2871/high-performance-computing

This repo contains the logs of the High Performance Computing module's final assignment.

blurred-images c cuda gaussian-blur matrix-multiplication multi-threading parallel-computing pthreads pthreads-api

Last synced: 25 Jan 2025

https://github.com/biodasturchi/gmx

🔬 Molecular modeling with GROMACS

cuda gpu gromacs mdp topology tpr trr

Last synced: 21 Jan 2025

https://github.com/triod315/cudabasics

Simple calculation with NVIDIA CUDA

cpp cuda

Last synced: 09 Jan 2025

https://github.com/navdeep-g/dimreduce4gpu

Dimensionality reduction ("dimreduce") on GPUs ("4gpu")

cplusplus cuda dimensionality-reduction gpu linear-algebra pca python svd unsupervised-learning

Last synced: 24 Dec 2024

https://github.com/matthewfeickert/cuda-tf-torch

An Ubuntu 18.04 NVIDIA Docker image with CUDA 10.1 and cuDNN 7, plus TensorFlow and PyTorch

cuda cuda-101 cudnn cudnn-v7 docker docker-image gpu nvidia-docker nvidia-gpu pytorch tensorflow torch

Last synced: 01 Feb 2025

https://github.com/demoriarty/doksparse

sparse DOK tensors on GPU, pytorch

cuda pytorch sparse

Last synced: 05 Jan 2025

https://github.com/markdtw/parallel-programming

Basic Pthread, OpenMP, CUDA examples

cuda openmp parallel-programming pthreads

Last synced: 12 Jan 2025

https://github.com/sean-bradley/cudalookupsha256

SHA256 lookup using parallel processing on an NVIDIA CUDA-compatible graphics card

cuda parallel-processing sha256

Last synced: 13 Nov 2024

https://github.com/hansalemaos/nvidiacheck

Monitors NVIDIA GPU information and logs the data into a pandas DataFrame. Windows only.

cuda log logging nvidia torch

Last synced: 05 Feb 2025

https://github.com/nachovizzo/saxpy_openacc_cpp

My way of thinking about OpenACC, C++, and Parallel computing in general

cpp cuda gpu openacc

Last synced: 30 Jan 2025

https://github.com/kilamper/matrix-multiplication

AC - Matrix multiplication using OpenMP, MPI and CUDA

cuda ms-mpi openmp

Last synced: 26 Jan 2025

https://github.com/skillfulelectro/integral-solver

Simple integral solver

c cpp cuda math mathematics

Last synced: 01 Feb 2025

https://github.com/hartorn/docker-python

Repository to build a Python image based on Ubuntu and CUDA

cuda docker mkl-dnn onednn python3 ubuntu ubuntu1804

Last synced: 12 Jan 2025

https://github.com/alextmjugador/rust-cuda-quickstart

Bring the Rust-CUDA project back to life under modern Linux environments.

cuda cuda-programming cuda-rust cuda-support docker rust

Last synced: 26 Jan 2025

https://github.com/galaxies99/inception-cuda

CUDA Implementation of Inception

cuda inception-v3

Last synced: 07 Nov 2024

https://github.com/jakubriegel/game_of_life_3d

3D game of life implemented in CUDA

concurency cuda gameoflife nvidia put-poznan

Last synced: 01 Feb 2025

https://github.com/brendanbignell/cuda_montecarlooptionpricer

CUDA Monte Carlo Barrier Option Pricing Demo & JupyterLab ML models

cuda deep-learning ml pytorch quantitative-finance xgboost-regression

Last synced: 05 Feb 2025

https://github.com/qervas/cn_chess_ai

Chinese chess (Xiangqi) AI

ai cpp cuda dqn qt6

Last synced: 23 Oct 2024

https://github.com/ssoehdata/cuda_fortran_sci_eng

Working through examples from the book CUDA Fortran for Scientists and Engineers, 2nd Edition

cuda cuda-fortran fortran hpc nvfortran

Last synced: 10 Dec 2024

https://github.com/snoopy3476/t-espresso

A CUDA Library for Low-overhead Host-to-Device Transmission of Patterned Profile Data

cuda profiler

Last synced: 07 Nov 2024

https://github.com/shivendrra/axgrad

Lightweight tensor library that contains its own auto-diff engine, like PyTorch

autograd cuda pytorch scratch-implementation tinygrad

Last synced: 06 Feb 2025

https://github.com/makischristou/mandelbrot

Mandelbrot set visualizer using CUDA.

cpp cuda gpu mandelbrot nvidia renderer rust

Last synced: 20 Jan 2025

https://github.com/ginkgo-project/cudaarchitectureselector

A CMake module simplifying the specification of CUDA architectures

cmake cmake-modules cuda

Last synced: 27 Dec 2024

https://github.com/giovaneiwamoto/cuda-shortest-paths

🧩 Cuda Shortest Paths - Parallel Dijkstra and Floyd algorithms using Nvidia CUDA to calculate All-Pairs Shortest Path (APSP) in a given graph represented by its adjacency matrix.

all-pairs-shortest-path cuda nvidia

Last synced: 11 Nov 2024

https://github.com/xlisp/learn-vllm

vllm learning

cuda nvidia pytorch vllm

Last synced: 02 Feb 2025

https://github.com/enriquebdel/clases-cuda-programacion-paralela-en-c-

In this repository you will find several lessons I created on the CUDA library in C. The program I use for programming is MobaXterm.

c cuda cuda-programming gnu-linux googlecolab mobaxterm nvidia parallel-programming ubuntu university

Last synced: 26 Jan 2025

https://github.com/microo8/micronn

Simple neural network library with backpropagation using CUDA

c cuda neural-network

Last synced: 26 Jan 2025

https://github.com/di-hal/vision-pro-max

A Raspberry Pi-based object detection system for assisting visually impaired individuals. This project utilizes YOLO object detection and a Hailo 8L TPU to identify obstacles like manholes, potholes, and bumps, providing real-time audio feedback to aid navigation.

bash computer-vision cuda fine-tuning gtts jupyter-notebook object-detection opencv python pytorch raspberry-pi rpi-camera ssh text-to-speech ultralytics yolo yolov8

Last synced: 26 Jan 2025

https://github.com/thisalmandula/gpu_accelerated_lpt_cfd_code

This repository contains a GPU-accelerated version of the particle tracking model developed by Merel Kooi for biofouled microplastic particles (available at: https://pubs.acs.org/doi/10.1021/acs.est.6b04702), written in CUDA Fortran and CUDA Python. This repository is intended as a learning tool for GPU programming.

biofouling computational-fluid-dynamics cuda fortran lagrangian-particle-tracking microplastics python

Last synced: 02 Feb 2025

https://github.com/miniex/maidenx

Rust-based CUDA library designed for learning purposes and building my AI engines named Maiden Engine

ai cuda rust

Last synced: 28 Oct 2024

https://github.com/pratikvn/nla4hpc-exercises-framework

The exercises framework for the Numerical Linear Algebra for HPC course at Karlsruhe Institute of Technology.

cuda ginkgo homeworks hpc-course teaching

Last synced: 26 Jan 2025

https://github.com/pjueon/cuda_intellisense

A simple Python script to fix CUDA C++ IntelliSense for Visual Studio.

cuda visual-studio

Last synced: 23 Oct 2024

https://github.com/maawad/ptx_bcht

Bucketed Cuckoo hash set written in PTX and JIT-compiled.

cuckoo cuda gpu hash hashset ptx

Last synced: 09 Feb 2025

https://github.com/neoblizz/cupti-plus-plus

CUPTI++ is a C++ interface to the CUDA Profiling Tools Interface (CUPTI).

cpp cuda cuda-profiler cupti profiler

Last synced: 09 Feb 2025

https://github.com/nickolasrm/gpuvscpumatrixmultiplication

CPU and GPU optimized matrix multiplication (AVX, transposition, CUDA and other)

avx comparison cuda hpc matrix multiplication

Last synced: 28 Dec 2024

https://github.com/mhaseeb123/gcb

GCB includes a suite of benchmarks and basic tests for CUDA-aware MPI and C++ compilers.

cpp cpp23 cuda mpi partitioned-communication st-mpi

Last synced: 24 Jan 2025

https://github.com/larygwil/ffmpeg-static-cuda

ffmpeg static binaries for Linux that work on some old NVIDIA GPUs (not tested)

avc cuda cuvid ffmpeg h264 h265 hevc nvdec nvenc

Last synced: 02 Feb 2025

https://github.com/dhruvsrikanth/cudann

A distributed implementation of a deep learning framework in CUDA.

cpp cuda deep-learning deep-learning-framework gpu-programming high-performance-computing hpc parallel-programming

Last synced: 25 Dec 2024

https://github.com/tensorbfs/cutropicalgemm.jl

The fastest Tropical number matrix multiplication on GPU

cuda gemm tropical-algebra

Last synced: 20 Dec 2024

https://github.com/mortafix/quickshift

A working implementation of the Quickshift algorithm in CUDA, GPU-compatible.

cuda gpu-computing quickshift

Last synced: 13 Jan 2025

https://github.com/speedcell4/torchdevice

Setup CUDA_VISIBLE_DEVICES

cuda deep-learning gpu machine-learning pytorch

Last synced: 08 Feb 2025

https://github.com/brosnanyuen/raybnn_graph

Graph Manipulation Library For GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI

cuda gpu graph graph-algorithms neural-network neural-networks opencl raybnn rust

Last synced: 13 Nov 2024

https://github.com/matteogianferrari/qr-decomposition

This project implements different methods of exploiting cache usage, the multicore CPU, and the GPU architecture in the Gram-Schmidt QR decomposition algorithm, and measures the performance of the different implementations.

cuda openmp parallel-computing

Last synced: 10 Feb 2025

https://github.com/abhisheknair10/occupancy.nn

A multi-step pipeline for training and running inference with Occupancy Networks

3d-reconstruction cuda vision

Last synced: 13 Jan 2025

https://github.com/tlabaltoh/tlab-sharescreen-server-win

Software frame encoder that uses CUDA and casts encoded frames over UDP. Trying to implement a custom streaming protocol and a shader-based frame encoder/decoder for screencasting.

cuda desktop-capture screensharing unity unity3d windows-graphics-capture

Last synced: 28 Jan 2025

https://github.com/quantum-integrated-technologies/deepforge

DeepForge: a framework for working with machine learning.

ai artificial-intelligence cuda library machine-learning ml neural-network

Last synced: 10 Feb 2025

https://github.com/daelsepara/hipslm

CPU and GPU (using HIP) implementations of phase pattern generators for use with spatial light modulators

computer-generated-holography cuda gpu hip hologram holography phase phase-pattern slm spatial-light-modulator

Last synced: 29 Dec 2024

https://github.com/nellogan/makefileexamples

Makefile examples of how to automate testing and building of applications/systems that use multiple languages, compilers, and testing tools.

automated-testing c cuda makefile python valgrind

Last synced: 21 Jan 2025

https://github.com/donpablonows/coin

🪙 Crypto Optimization Interface Network (aka COIN) is a high-performance Bitcoin address generator using CUDA acceleration and multi-threading. It optimizes GPU and CPU resources for fast address generation, ensures secure private key creation, and includes real-time monitoring and automatic system optimizations.

bitcoin blockchain cryptography cuda gpu-acceleration

Last synced: 07 Jan 2025

https://github.com/rjected/cuda-timelock

Solving a large number of timelock puzzles in parallel using GPU acceleration

c cgbn concurrent cpp cuda gmp graphics nvidia parallel puzzle timelock

Last synced: 09 Feb 2025

https://github.com/crcrpar/dev-chainer

Dockerfile for Chainer Development in VSCode

chainer cuda docker nvidia-docker vscode

Last synced: 09 Feb 2025

https://github.com/himeyama/cuda-nmf

NMF calculations are performed on NVIDIA GPUs using the CUDA API. (Gem released)

cublas cuda gem nmf ruby

Last synced: 29 Dec 2024

https://github.com/hatamiarash7/cuda-python

GPU programming using CUDA & Python

cuda gpu gpu-computing gpu-programming python

Last synced: 03 Feb 2025

https://github.com/thomasonzhou/minitorch

Rebuilding PyTorch: from autograd to convolutions in CUDA

cuda numba numpy

Last synced: 30 Dec 2024

https://github.com/dolongbien/cuda

CUDA and Caffe/Caffe2 installation on Ubuntu 16.04

c3d-intel-caffe caffe caffe2 cuda cudnn deep-learning ubuntu

Last synced: 21 Jan 2025

https://github.com/fandreuz/parallel-programming-for-hpc

Scientific codes in C/C++ with CUDA, OpenACC, FFTW, (cu)BLAS

cpp cuda hpc mpi

Last synced: 21 Jan 2025

https://github.com/bl33h/pythagoreantheorem

A program that calculates the Pythagorean theorem for a large number of elements using GPU parallel processing.

arrays cuda kernel parallel-programming pythagoras pythagorean-theorem

Last synced: 21 Jan 2025

https://github.com/bl33h/productoftwovectors

This code uses CUDA for parallel vector multiplication on a GPU, demonstrating the GPU's acceleration capabilities.

cuda gpu kernel paralelism parallel-programming product vector

Last synced: 21 Jan 2025

https://github.com/daelsepara/hipmandelbrot

GPU Implementation of Mandelbrot Fractal Generator with Benchmarking

amd cuda fractal gpu gpu-compute gpu-computing hip mandelbrot parallel-computing rocm sdk

Last synced: 07 Nov 2024

https://github.com/lightshade12/kittlespt

A hobby CUDA pathtracing renderer.

3d-graphics computer-graphics cuda gpu path-tracing ray-tracing

Last synced: 24 Jan 2025

https://github.com/ashwanirathee/imagesgpu.jl

Image Processing on GPU in Julia

cuda gpu image image-processing julia

Last synced: 08 Jan 2025

https://github.com/stanczakdominik/cuda_poisson

A 2D Poisson solver via CUDA

cuda electromagnetism pde

Last synced: 04 Feb 2025

https://github.com/anras5/parallel-computing

Comparing CPU and GPU

cuda gpu openmp

Last synced: 21 Jan 2025

https://github.com/whutao/artificial-art

Image approximation with triangles using evolutionary algorithm.

cuda evolutionary-algorithm python3

Last synced: 16 Jan 2025

https://github.com/michaelfranzl/image_debian-gpgpu

Dockerfile for a Debian base image with AMD and Nvidia GPGPU support

amd container container-image cuda debian docker gpgpu nvidia opencl

Last synced: 21 Jan 2025

https://github.com/sohhamseal/scalable-systems-programs

A little less effort to learn parallel programming...

cuda mpi openmp

Last synced: 13 Jan 2025

https://github.com/nikolaydubina/basic-openai-pytorch-server

Minimal HTTP inference server with an OpenAI API, using PyTorch and CUDA

cuda docker llm openai pytorch server

Last synced: 04 Feb 2025

https://github.com/andih/cuda-fortran-stream

Variant of STREAM Benchmark in CUDA Fortran

cuda cuda-fortran gpu stream-benchmarks variants

Last synced: 12 Jan 2025

https://github.com/sanaeprj/matrix-for-cpp

This repository has types that handle matrices.

cpp14 cpp14-library cuda matrix-library

Last synced: 19 Nov 2024

https://github.com/franciscoda/psvm

R package and C++ library that allows training SVM models in a GPU using CUDA and predicting out-of-sample data. A support vector machine (SVM) is a type of machine learning model that is trained using supervised data to classify samples.

cpp cpp17 cuda machine-learning r svm-classifier svm-training

Last synced: 28 Jan 2025