Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.

https://github.com/sthysel/jtx2-tools

nvidia jtx/xavier GPU monitor tool

cuda nvidia txt2 xavier

Last synced: 20 Jan 2025

https://github.com/kishore-narendran/eecs221-highperformancecomputing

Assignments from the graduate course EECS 221 - Introduction to HPC, which I took in the Spring Quarter of 2016 at the University of California, Irvine. The assignments use OpenMP, MPI and CUDA.

cuda hpc mpi openmp

Last synced: 05 Jan 2025

https://github.com/deftruth/hgemm-tensorcores-mma

⚡️Write HGEMM from scratch using Tensor Cores with WMMA, MMA PTX and CuTe API. 🎉🎉

cuda hgemm tensor-cores

Last synced: 04 Dec 2024

https://github.com/tmrob2/cuda2rust_sandpit

Minimal examples to get CUDA linear algebra programs working with Rust using CC & FFI.

cc clang cublas cuda cusparse rust

Last synced: 19 Nov 2024

https://github.com/yingding/applyllm

A Python package for applying LLMs with LangChain and Hugging Face on a local CUDA/MPS host

accelerator batch cuda framework inference kubeflow langchain llm mps pipeline slurm transformers

Last synced: 22 Dec 2024

https://github.com/isazi/aoflagger

AOFlagger Radio Frequency Interference mitigation algorithm.

cuda gpu many-core rfi

Last synced: 30 Jan 2025

https://github.com/peri044/cuda

GPU implementations of algorithms

cuda gauss-jordan parallel-programming

Last synced: 08 Feb 2025

https://github.com/dereklstinson/nccl

golang wrapper for nccl

cuda deep-learning go nccl parallel-computing

Last synced: 15 Jan 2025

https://github.com/teodutu/asc

Computer Systems Architecture (Arhitectura Sistemelor de Calcul) - UPB 2020

cache-optimization cuda parallel-programming profiling python-threading

Last synced: 30 Jan 2025

https://github.com/droduit/multiprocessor-architecture

Introduction to Multiprocessor Architecture @ EPFL

cuda multiprocessor multithreading openmp-parallelization

Last synced: 02 Jan 2025

https://github.com/mrglaster/cuda-acfcalc

Calculation of the smallest ACF for signals of length N using CUDA technology.

acf c calculations cpp cuda google-colaboratory google-colaboratory-notebooks isu

Last synced: 15 Jan 2025

https://github.com/stdogpkg/cukuramoto

A Python/CUDA package that numerically solves the Kuramoto model using Heun's method

complex-networks cuda kuramoto-model

Last synced: 29 Jan 2025

https://github.com/aiday-mar/mpi-cuda-project

Using MPI and CUDA to accelerate the conjugate gradient algorithm in C++

c-plus-plus cuda gpu mpi university-project

Last synced: 05 Jan 2025

https://github.com/cppalliance/crypt

A C++20 module of cryptographic utilities for CPU and GPU

cpp20 cuda security

Last synced: 09 Jan 2025

https://github.com/arminms/p2rng

A modern header-only C++ library for parallel algorithmic (pseudo) random number generation supporting OpenMP, CUDA, ROCm and oneAPI

cpp cuda cxx header-only heterogeneous-computing library linux macos multiplatorm oneapi openmp parallel pcg-random prng pseudorandom-number-generator random-number-distributions random-number-generation rocm stl-algorithms windows

Last synced: 05 Nov 2024

https://github.com/demoriarty/doksparse

sparse DOK tensors on GPU, pytorch

cuda pytorch sparse

Last synced: 05 Jan 2025

https://github.com/xmas7/cudampi

A large hybrid CPU/GPU sorting network using CUDA and MPI. The sorting network uses a standard Quicksort for CPUs and a custom Bitonic Sort for GPUs. These two algorithms were the fastest in a number of prior benchmarks.

cpu cuda gpu hybrid mpi network

Last synced: 01 Feb 2025

https://github.com/matthewfeickert/cuda-tf-torch

An Ubuntu 18.04 NVIDIA Docker image with CUDA 10.1 CuDNN 7 with TensorFlow and PyTorch

cuda cuda-101 cudnn cudnn-v7 docker docker-image gpu nvidia-docker nvidia-gpu pytorch tensorflow torch

Last synced: 01 Feb 2025

https://github.com/orlandopalmeira/trabalho-cp-2023-2024

Repository for the practical assignment of the Parallel Computing (Computação Paralela, CP) course - Master's in Informatics Engineering (MEI/MIEI) - University of Minho (UMinho)

computacao-paralela cp cuda cuda-programming mei miei nvidia nvidia-cuda openmp optimization optimization-problem parallelism performance uminho uminho-mei uminho-miei

Last synced: 25 Jan 2025

https://github.com/nachovizzo/saxpy_openacc_cpp

My way of thinking about OpenACC, C++, and Parallel computing in general

cpp cuda gpu openacc

Last synced: 30 Jan 2025

https://github.com/cfries/javagpuexperiments

Repository used to demo OpenCL, JOCL, JCuda.

cuda

Last synced: 27 Dec 2024

https://github.com/vietdoo/seam-carving-cuda

CUDA Seam Carving: Accelerating Image Resizing with GPU Computing

cc cuda cuda-programming gpu-computing parrallel-computing seam-carving

Last synced: 07 Feb 2025

https://github.com/pabvald/parallel-computing

Parallel computing practice with OpenMP, MPICH and CUDA

cuda mpich openmp parallel-computing

Last synced: 29 Jan 2025

https://github.com/speedcell4/torchdevice

Setup CUDA_VISIBLE_DEVICES

cuda deep-learning gpu machine-learning pytorch

Last synced: 08 Feb 2025

https://github.com/garciparedes/cuda-examples

CUDA examples that I developed to learn GPU-based HPC

c c-plus-plus cuda examples gpgpu gpu hpc

Last synced: 16 Jan 2025

https://github.com/wallneradam/docker-ccminer

CCMiner (tpruvot version) Docker Builder

ccminer cuda docker gpu litecoin miner monero nvidia nvidia-docker

Last synced: 01 Feb 2025

https://github.com/jonathanraiman/mini_cuda_rtc

Miniature CUDA Array library with Runtime Compilation

cpp11 cuda jit runtime-compilation

Last synced: 22 Jan 2025

https://github.com/ergonomech/comfyui-windows-installer

Automated setup for ComfyUI on Windows with CUDA, custom plugins, and optimized PyTorch settings. Made to run as a server with error correction. Easy installation and launch using Miniconda.

automation comfy conda conda-environment cuda hosting-deployment setup windows

Last synced: 06 Feb 2025

https://github.com/rhysdg/whisper-onnx-python

A low-footprint GPU accelerated Speech to Text Python package for the Jetpack 5 era bolstered by an optimized graph

ai chatbot cuda machine-learning onnxruntime speech-to-text whisper

Last synced: 08 Feb 2025

https://github.com/michaelfranzl/image_debian-gpgpu

Dockerfile for a Debian base image with AMD and Nvidia GPGPU support

amd container container-image cuda debian docker gpgpu nvidia opencl

Last synced: 21 Jan 2025

https://github.com/matteogianferrari/qr-decomposition

This project implements different methods that exploit cache usage, multicore CPUs, and GPU architectures for the Gram-Schmidt QR decomposition algorithm, and measures the performance of the different implementations.

cuda openmp parallel-computing

Last synced: 10 Feb 2025

https://github.com/liuyuweitarek/pytorch-docker-builder

Automate PyTorch Docker image builds with compatible Python, CUDA, and Poetry versions, including CI/CD for testing.

cicd containerd cuda docker docker-image poetry-python python python3 pytorch pytorch-docker

Last synced: 24 Jan 2025

https://github.com/programmer-rd-ai/detectx

A Pythonic approach to object detection using Detectron2, a clean, modular framework for training and deploying computer vision models. DetectX simplifies the complexity of object detection while maintaining high performance and extensibility.

coco-dataset computer-vision computer-vision-library cuda deep-learning detectron2 faster-rcnn gpu-accelerated machine-learning ml-framework object-detection object-recognition python3 pytorch retinanet

Last synced: 12 Jan 2025

https://github.com/giovaneiwamoto/cuda-shortest-paths

🧩 Cuda Shortest Paths - Parallel Dijkstra and Floyd algorithms using Nvidia CUDA to calculate All-Pairs Shortest Path (APSP) in a given graph represented by its adjacency matrix.

all-pairs-shortest-path cuda nvidia

Last synced: 11 Nov 2024

https://github.com/jessetg/cuda-practice

Working through the chapters of CUDA by Example

c cpp cuda cuda-by-example gpgpu

Last synced: 14 Jan 2025

https://github.com/galaxies99/inception-cuda

CUDA Implementation of Inception

cuda inception-v3

Last synced: 07 Nov 2024

https://github.com/dhruvsrikanth/cudann

A distributed implementation of a deep learning framework in CUDA.

cpp cuda deep-learning deep-learning-framework gpu-programming high-performance-computing hpc parallel-programming

Last synced: 25 Dec 2024

https://github.com/nickolasrm/gpuvscpumatrixmultiplication

CPU and GPU optimized matrix multiplication (AVX, transposition, CUDA and other)

avx comparison cuda hpc matrix multiplication

Last synced: 28 Dec 2024

https://github.com/tensorbfs/cutropicalgemm.jl

The fastest Tropical number matrix multiplication on GPU

cuda gemm tropical-algebra

Last synced: 20 Dec 2024

https://github.com/le-ander/msc_bioinfo-experimental_design

Using information theory to inform experimental design with GPU acceleration. Computing group project as part of the MSc in Bioinformatics and Theoretical Systems Biology at Imperial College London 2016/2017.

cuda experimental-design gpu-computing information-theory pycuda systems-biology

Last synced: 31 Jan 2025

https://github.com/programmer-rd-ai/digivis

A PyTorch-based deep learning implementation for MNIST digit recognition featuring CNNs, GPU acceleration, experiment tracking, and comprehensive testing capabilities.

cnn computer-vision cuda data-science deep-learning digit-recognition image-classification machine-learning mnist neural-networks python pytorch wandb

Last synced: 12 Jan 2025

https://github.com/thisalmandula/gpu_accelerated_lpt_cfd_code

This repository contains a GPU-accelerated version of the particle tracking model developed by Merel Kooi for biofouled microplastic particles (available at: https://pubs.acs.org/doi/10.1021/acs.est.6b04702), written in CUDA Fortran and CUDA Python. This repository is intended as a learning tool for GPU programming.

biofouling computational-fluid-dynamics cuda fortran lagrangian-particle-tracking microplastics python

Last synced: 02 Feb 2025

https://github.com/dansolombrino/gphungarian

A GPU-accelerated implementation of the Hungarian Algorithm, written in CUDA

cuda gpu hpc opencl

Last synced: 07 Feb 2025

https://github.com/neoblizz/cupti-plus-plus

CUPTI++ is a C++ interface to the CUDA Profiling Tools Interface (CUPTI).

cpp cuda cuda-profiler cupti profiler

Last synced: 09 Feb 2025

https://github.com/hyunjinno/multicore_computing

A repository of multicore programming in Java and C.

c cpp cuda java multithreading openmp thread thrust

Last synced: 25 Jan 2025

https://github.com/mayukhdeb/patrick

Tiny neural net library written from scratch with cupy ⚠️ under construction ⚠️

cuda deep-learning gpu-computing machine-learning neural-network regression

Last synced: 20 Dec 2024

https://github.com/kaierikniermann/hpc-uzh-notes

These are some notes for the High Performance Computing course taught at UZH

cuda high-performance-computing mpi openacc openmp

Last synced: 12 Jan 2025

https://github.com/willigarneau/sobel-filter-cuda

🖼️ Assignment 1 in Intelligent Industrial System at Cégep Lévis-Lauzon. Learning Cuda and OpenCV by creating a sobel filter. 💻

cplusplus cuda filter opencv sobel

Last synced: 23 Jan 2025

https://github.com/chintak/theano-lasagne-docker

Dockerfile for Lasagne with Cuda support. Look at the branches for relevant Dockerfiles - ``cpu`` and ``gpu``.

caffe cuda docker dockerfile install-script lasagne machine-learning machine-learning-library theano

Last synced: 23 Dec 2024

https://github.com/dolongbien/cuda

CUDA and Caffe/Caffe2 installation on Ubuntu 16.04

c3d-intel-caffe caffe caffe2 cuda cudnn deep-learning ubuntu

Last synced: 21 Jan 2025

https://github.com/maelstrom6/mandelpy

A Mandelbrot and Buddhabrot viewer with GPU acceleration

buddhabrot cuda gpu mandelbrot python3

Last synced: 05 Feb 2025

https://github.com/alpha74/hungarianalgocuda

Hungarian Algorithm for Linear Assignment Problem implemented using CUDA.

cuda nvcc parallel-computing parallel-programming

Last synced: 16 Jan 2025

https://github.com/maawad/ptx_bcht

Bucketed Cuckoo hash set written in PTX and JIT-compiled.

cuckoo cuda gpu hash hashset ptx

Last synced: 09 Feb 2025

https://github.com/anras5/parallel-computing

Comparing CPU and GPU

cuda gpu openmp

Last synced: 21 Jan 2025

https://github.com/bhattbhavesh91/rapids-cudf-cuml-example

Running KNN algorithm much faster on GPU for free using RAPIDS packages like cuML and cuDF

cuda cuml deep-learning nvidia-gpu rapids rapidsai

Last synced: 17 Jan 2025

https://github.com/skillfulelectro/integral-solver

Simple integral solver

c cpp cuda math mathematics

Last synced: 01 Feb 2025

https://github.com/xza85hrf/ml-framework_checker

ML Framework and CUDA Checker is a Python-based GUI application for checking PyTorch, TensorFlow, and CUDA installations. It provides detailed system specs, compatibility checks, advanced GPU management, and offers options to view instructions, export logs, and update machine learning frameworks.

compatibility cuda gpu-management gui-application machine-learning python pytorch system-checker system-specs tensorflow

Last synced: 30 Jan 2025

https://github.com/willigarneau/object-detection-cuda

🕺 Put my knowledge of OpenCV and Cuda into practice to create an object detection system. 💻

camera cplusplus cuda detector filter opencv

Last synced: 23 Jan 2025

https://github.com/headless-start/data-augmentation-impact

This repository demonstrates the effect of data augmentation of the training set during model training.

augmented-images cuda data gpu keras matplotlib mnist opencv-python python3 tensorflow training-data

Last synced: 08 Feb 2025

https://github.com/alekseyscorpi/vacancies_server

This is a server for generating vacancies using an LLM (Saiga3)

code cuda cuda-toolkit docker dockerfile flask llama3 llamacpp llm ngrok pydantic saiga

Last synced: 01 Feb 2025

https://github.com/komorra/blackmagicengine

Nextgen, Classic/VR/AR Game Engine

core cuda dx12 game-development gameengine gpu net nvidia vulcan

Last synced: 31 Dec 2024

https://github.com/adamczykpiotr/cudamatrixlibrary

Matrix operation library using a single thread, n threads, or a CUDA-supported GPU

agh agh-ust cpp cuda cuda-library matrix matrix-computations matrix-functions matrix-multiplication

Last synced: 19 Jan 2025

https://github.com/m-torhan/cuda-stl-renderer

CUDA C++ implementation of an STL file renderer using ray tracing

cuda

Last synced: 31 Dec 2024

https://github.com/andygeiss/machine-learning-golang

This repository provides a basic setup to do Machine Learning with Golang and Python, TensorFlow 1.15 and CUDA 10.0.

benchmark cuda docker go golang machine-learning python tensorflow

Last synced: 06 Feb 2025

https://github.com/dafadey/GPGPU_OpenCL_vs_CUDA

This is a repository with sample code for testing memory bandwidth, arithmetic latency hiding and shared/local memory performance on AMD and NVIDIA devices

cuda gpgpu gpgpu-computing opencl

Last synced: 19 Nov 2024

https://github.com/mala13f/statistical-learning-in-finance

This repository contains all the code, papers and related data for assignments done during the course.

cuda gpu-acceleration jupyter-notebook machine-learning python statistical-learning

Last synced: 31 Jan 2025

https://github.com/gunrock/template

Template repository for essentials applications to get you started asap!

cpp cuda essentials gpu graph-algorithms graph-analytics gunrock

Last synced: 10 Jan 2025

https://github.com/pharmcat/metidacu.jl

CUDA solver for Metida.jl

cuda julia-language metida mixed-models

Last synced: 09 Feb 2025

https://github.com/xavierjiezou/gpu-compute-capability

An application for querying the compute capability of each GPU released by NVIDIA.

cuda gpu nvidia

Last synced: 01 Feb 2025

https://github.com/xlisp/learn-vllm

vllm learning

cuda nvidia pytorch vllm

Last synced: 02 Feb 2025

https://github.com/kayuii/ironfish-miner

docker nvidia/amd Gpu hpool-dev/ironfish-miner ironfish-miner

amdgpu cuda docker gpu nvidia rocm

Last synced: 31 Jan 2025

https://github.com/hartorn/docker-python

Repository to build a Python image, based on Ubuntu and CUDA

cuda docker mkl-dnn onednn python3 ubuntu ubuntu1804

Last synced: 12 Jan 2025

https://github.com/kchristin22/ising_model

Implementation of a cellular automaton on GPU using different features of CUDA

cellular-automaton cuda gpu-programming hpc ising-model parallel-computing

Last synced: 22 Jan 2025

https://github.com/bolner/totally-diffused

Debian/NVIDIA Docker image for AUTOMATIC1111's Stable Diffusion application.

automatic1111 cuda debian docker-image nvidia stable-diffusion xformers

Last synced: 08 Feb 2025

https://github.com/di-hal/vision-pro-max

A Raspberry Pi-based object detection system for assisting visually impaired individuals. This project utilizes YOLO object detection and a Hailo 8L TPU to identify obstacles like manholes, potholes, and bumps, providing real-time audio feedback to aid navigation.

bash computer-vision cuda fine-tuning gtts jupyter-notebook object-detection opencv python pytorch raspberry-pi rpi-camera ssh text-to-speech ultralytics yolo yolov8

Last synced: 26 Jan 2025

https://github.com/sleeepyjack/multisplit

Simple multisplit for CUDA accelerators

cpp cuda gpu nvidia parallel-programming primitive split

Last synced: 22 Jan 2025

https://github.com/pvdberg1998/cufft_rust

A safe Rust wrapper around a subset of cuFFT.

cuda cufft fft rust

Last synced: 12 Dec 2024

https://github.com/thomasonzhou/minitorch

rebuilding pytorch: from autograd to convolutions in CUDA

cuda numba numpy

Last synced: 30 Dec 2024

https://github.com/rjected/cuda-timelock

Solving a large number of timelock puzzles in parallel using GPU acceleration

c cgbn concurrent cpp cuda gmp graphics nvidia parallel puzzle timelock

Last synced: 09 Feb 2025

https://github.com/whutao/artificial-art

Image approximation with triangles using evolutionary algorithm.

cuda evolutionary-algorithm python3

Last synced: 16 Jan 2025

https://github.com/matx64/rs-netbot

Old School Runescape (MMORPG) Bot created using a Convolutional Neural Network for object identification

cuda numpy python pytorch

Last synced: 09 Feb 2025

https://github.com/mortafix/quickshift

A working implementation of the Quickshift algorithm in CUDA, GPU-compatible.

cuda gpu-computing quickshift

Last synced: 13 Jan 2025

https://github.com/lcsb-biocore/cufluxsampler.jl

GPU-accelerated algorithms for flux sampling in CUDA.jl

cobra cuda gpu julia metabolic-network metabolism sampling

Last synced: 30 Jan 2025

https://github.com/abdulfatir/subkmeans

Numpy and pyCUDA implementation of subKmeans

clustering cuda kdd kmeans numpy pycuda python subspace-clustering

Last synced: 09 Feb 2025

https://github.com/quantum-integrated-technologies/deepforge

DeepForge : framework for working with machine learning.

ai artificial-intelligence cuda library machine-learning ml neural-network

Last synced: 10 Feb 2025

https://github.com/sartajbhuvaji/cuda

Developed CUDA kernel functions to load and train a Convolutional Neural Network from scratch.

cuda cuda-programming gpu-programming neural-network nvidia-cuda

Last synced: 05 Feb 2025

https://github.com/sohhamseal/scalable-systems-programs

A little less effort to learn parallel programming...

cuda mpi openmp

Last synced: 13 Jan 2025

https://github.com/brendanbignell/cuda_montecarlooptionpricer

CUDA Monte Carlo Barrier Option Pricing Demo & JupyterLab ML models

cuda deep-learning ml pytorch quantitative-finance xgboost-regression

Last synced: 05 Feb 2025