Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/rocm/tensile
Stretching GPU performance for GEMMs and tensor contractions.
Last synced: 3 days ago
- Host: GitHub
- URL: https://github.com/rocm/tensile
- Owner: ROCm
- License: MIT
- Created: 2015-11-05T13:58:49.000Z (about 9 years ago)
- Default Branch: develop
- Last Pushed: 2024-06-04T19:03:45.000Z (8 months ago)
- Last Synced: 2024-06-04T19:05:17.825Z (8 months ago)
- Topics: amd, assembly, auto-tuning, blas, dnn, gemm, gpu, gpu-acceleration, gpu-computing, hip, machine-learning, matrix-multiplication, neural-networks, opencl, python, radeon, tensor-contraction, tensors
- Language: Python
- Homepage:
- Size: 92.8 MB
- Stars: 196
- Watchers: 55
- Forks: 136
- Open Issues: 25
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE.md
- Codeowners: .github/CODEOWNERS
README
Tensile is a tool for creating benchmark-driven backend libraries for GEMMs, GEMM-like problems (such as batched GEMM), and general N-dimensional tensor contractions on a GPU.
The Tensile library is mainly used as a backend library for rocBLAS.
Tensile acts as the performance backbone for a wide variety of 'compute' applications running on AMD GPUs.

> [!NOTE]
> The published documentation is available at [Tensile](https://rocm.docs.amd.com/projects/Tensile/en/latest/index.html) in an organized, easy-to-read format, with search and a table of contents. The documentation source files reside in the `Tensile/docs/src` folder of this repository. As with all ROCm projects, the documentation is open source. For more information on contributing to the documentation, see [Contribute to ROCm documentation](https://rocm.docs.amd.com/en/latest/contribute/contributing.html).
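
For readers unfamiliar with the problem types named in the introduction, the following is a minimal pure-Python sketch of what a GEMM, a batched GEMM, and a simple tensor contraction compute. It assumes the standard BLAS definition of GEMM (`alpha * A @ B + beta * C`). The function names `gemm`, `batched_gemm`, and `contract` are hypothetical and chosen for illustration. They are not part of Tensile's API and say nothing about how Tensile generates or tunes its GPU kernels.

```python
# Reference (unoptimized) definitions of the problem types named above.
# All helper names here are hypothetical; none of them come from Tensile.

def gemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for matrices stored as nested lists."""
    m, k, n = len(a), len(b), len(b[0])
    return [
        [alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
         for j in range(n)]
        for i in range(m)
    ]


def batched_gemm(alpha, a_batch, b_batch, beta, c_batch):
    """Apply the same GEMM independently to each (A, B, C) triple in a batch."""
    return [
        gemm(alpha, a, b, beta, c)
        for a, b, c in zip(a_batch, b_batch, c_batch)
    ]


def contract(a, b):
    """Contract a 3-D tensor a[i][j][k] with a matrix b[k][l] over index k."""
    k, n = len(b), len(b[0])
    return [
        [[sum(row[p] * b[p][l] for p in range(k)) for l in range(n)]
         for row in plane]
        for plane in a
    ]


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    c = [[1, 1], [1, 1]]
    print(gemm(2, a, b, 1, c))
    print(batched_gemm(1, [a, a], [b, b], 0, [c, c]))
    print(contract([a], b))
```

These triple loops only pin down the arithmetic. A tuned backend library exists precisely because the same results can be computed far faster on a GPU with hardware-specific kernels.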