https://github.com/xrq-phys/blis_apple
BLIS fork with kernels for Apple M1. (Perhaps) The first open-source BLAS with Apple Matrix Coprocessor support.
- Host: GitHub
- URL: https://github.com/xrq-phys/blis_apple
- Owner: xrq-phys
- License: other
- Created: 2021-07-17T08:13:18.000Z (over 3 years ago)
- Default Branch: amx-dev
- Last Pushed: 2023-01-07T14:09:32.000Z (almost 2 years ago)
- Last Synced: 2024-11-05T00:36:07.750Z (about 1 month ago)
- Language: C
- Homepage:
- Size: 44.2 MB
- Stars: 32
- Watchers: 5
- Forks: 5
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG
- Contributing: CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project
- awesome-gemm - blis_apple: BLIS optimized for Apple M1 - 3-Clause) (Libraries 🗂️ / CPU Libraries 💻)
README
This README describes this port of BLIS to Apple's matrix coprocessor.
- Here is the original BLIS [README](README_BLIS.md).
As of July 2021, the coprocessor is undocumented but also unprotected: any user or program is allowed to invoke it, and doing so is supposed to be safe. This work builds on Dougall Johnson's analysis of the related instructions.
Known issues:
- Generic-strided storage is not supported by our microkernels for the destination matrix. The program will `assert(false)` if it encounters this case.
- TRSM might fail. If you need TRSM to work, try commenting out the call to `bli_cntx_set_packm_kers` in `config/aaplmx/bli_cntx_init_aaplmx.c`.

Performance:
![](docs/graphs/aaplmx/output_st_dgemm_asm_blis.png)
![](docs/graphs/aaplmx/output_st_sgemm_asm_blis.png)
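
The generic-stride limitation listed under the known issues can be checked on the caller's side before invoking GEMM. The sketch below is an illustration only, not code from this repository: `classify_storage` and `c_is_supported` are hypothetical helper names. It assumes the usual BLIS convention that a matrix is described by a row stride `rs` and a column stride `cs`, with `rs == 1` meaning column storage, `cs == 1` meaning row storage, and anything else being general stride.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helpers, not part of BLIS or this fork. */
typedef enum {
    STORAGE_COLUMN, /* rs == 1: elements of a column are contiguous */
    STORAGE_ROW,    /* cs == 1: elements of a row are contiguous    */
    STORAGE_GENERAL /* neither stride is 1: generic-strided storage */
} storage_t;

/* Classify a matrix from its row stride (rs) and column stride (cs). */
static storage_t classify_storage(ptrdiff_t rs, ptrdiff_t cs)
{
    if (rs == 1) return STORAGE_COLUMN;
    if (cs == 1) return STORAGE_ROW;
    return STORAGE_GENERAL;
}

/* Returns true when the destination matrix C avoids the generic-strided
   case that the microkernels described above do not handle. */
static bool c_is_supported(ptrdiff_t rs, ptrdiff_t cs)
{
    return classify_storage(rs, cs) != STORAGE_GENERAL;
}
```

A caller could test `c_is_supported(rs_c, cs_c)` and, if it returns false, copy the destination into a row- or column-stored buffer first instead of running into the assertion.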