An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with matmul

A curated list of projects in awesome lists tagged with matmul.

https://github.com/eth-cscs/cosma

Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm

communication-optimal cuda gpu-acceleration linear-algebra matmul matrix-multiplication mpi pdgemm rocm scalapack

Last synced: 04 Apr 2025

https://github.com/eth-cscs/tiled-mm

Matrix multiplication on GPUs for matrices stored in CPU (host) memory. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs.

amd cublas cublasxt cuda gpu matmul matrix-multiplication nvidia rocblas rocblasxt rocm

Last synced: 04 Apr 2025

https://github.com/gha3mi/formatmul

ForMatmul - A Fortran library that overloads the matmul function to enable efficient matrix multiplication with or without coarrays.

coarray fortran fortran-package-manager matmul

Last synced: 30 Mar 2025

https://github.com/laserborg/circuitpython_benchmark

Raspberry Pi Pico (RP2040) and Adafruit Metro M7 (NXP IMXRT10XX) benchmark

adafruit adafruit-metro-m7 benchmark circuitpython float32 matmul mcu python3 raspberry-pi-pico

Last synced: 16 Mar 2025

https://github.com/awrsha/cuda-gpus-and-triton-adcanced-review

This repository provides a comprehensive guide to optimizing GPU kernels for performance, with a focus on NVIDIA GPUs. It covers key tools and techniques such as CUDA, PyTorch, and Triton, aimed at improving computational efficiency for deep learning and scientific computing tasks.

cuda-programming gpu-programming jit kernels matmul mojo-language multiprocessing multithreading torchquantum triton

Last synced: 12 Jan 2025

https://github.com/jmaczan/tinyconvnet

Convolutional Neural Network from scratch in CuPy and tinygrad

cnn convnet convolution convolutional-neural-networks cs231n cupy matmul pooling tinygrad

Last synced: 18 Feb 2025