
https://github.com/rocm/tensile

Stretching GPU performance for GEMMs and tensor contractions.

amd assembly auto-tuning blas dnn gemm gpu gpu-acceleration gpu-computing hip machine-learning matrix-multiplication neural-networks opencl python radeon tensor-contraction tensors


Tensile is a tool for creating benchmark-driven backend libraries for GEMMs, GEMM-like problems (such as batched GEMM), and general N-dimensional tensor contractions on a GPU.
The Tensile library is mainly used as a backend library for rocBLAS.
Tensile acts as the performance backbone for a wide variety of 'compute' applications running on AMD GPUs.
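To make the three problem classes concrete, here is a minimal pure-Python sketch of what each one computes. It is only a reference for the math, not Tensile's API or how Tensile implements anything: the function names `gemm`, `batched_gemm`, and `contract_two_indices` are hypothetical, the `alpha`/`beta` scaling follows the usual BLAS GEMM convention, and the nested-list layout is an assumption chosen for readability.

```python
def gemm(alpha, a, b, beta, c):
    """Reference GEMM on nested lists: alpha * (a @ b) + beta * c.

    a is M x K, b is K x N, c is M x N. Hypothetical helper, not a Tensile call.
    """
    m, k, n = len(a), len(b), len(b[0])
    return [
        [
            alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
            for j in range(n)
        ]
        for i in range(m)
    ]


def batched_gemm(alpha, a_batch, b_batch, beta, c_batch):
    """Batched GEMM: the same GEMM applied independently to each batch entry."""
    return [
        gemm(alpha, a, b, beta, c)
        for a, b, c in zip(a_batch, b_batch, c_batch)
    ]


def contract_two_indices(t, u):
    """A small tensor contraction: d[i][j] = sum over k, l of t[i][k][l] * u[k][l][j].

    t has shape I x K x L and u has shape K x L x J. GEMM is the special case
    with a single summed index.
    """
    i_dim, k_dim, l_dim = len(t), len(t[0]), len(t[0][0])
    j_dim = len(u[0][0])
    return [
        [
            sum(
                t[i][k][l] * u[k][l][j]
                for k in range(k_dim)
                for l in range(l_dim)
            )
            for j in range(j_dim)
        ]
        for i in range(i_dim)
    ]


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    c = [[1, 1], [1, 1]]
    print(gemm(1, a, b, 0, c))  # [[19, 22], [43, 50]]
    print(batched_gemm(2, [a, a], [b, b], 1, [c, c])[0])  # [[39, 45], [87, 101]]
```

All three functions share the same pattern of multiplying and summing over shared indices. That shared structure is why a single tool can target GEMMs, batched GEMMs, and more general contractions.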

> [!NOTE]
> The published documentation is available at [Tensile](https://rocm.docs.amd.com/projects/Tensile/en/latest/index.html) in an organized, easy-to-read format, with search and a table of contents. The documentation source files reside in the `Tensile/docs/src` folder of this repository. As with all ROCm projects, the documentation is open source. For more information on contributing to the documentation, see [Contribute to ROCm documentation](https://rocm.docs.amd.com/en/latest/contribute/contributing.html).