https://github.com/xrq-phys/blis_apple
BLIS fork with kernels for Apple M1. Perhaps the first open-source BLAS with Apple Matrix Coprocessor support.
- Host: GitHub
- URL: https://github.com/xrq-phys/blis_apple
- Owner: xrq-phys
- License: other
- Created: 2021-07-17T08:13:18.000Z (over 3 years ago)
- Default Branch: amx-dev
- Last Pushed: 2023-01-07T14:09:32.000Z (over 2 years ago)
- Last Synced: 2025-03-26T03:51:15.405Z (21 days ago)
- Language: C
- Homepage:
- Size: 44.2 MB
- Stars: 33
- Watchers: 4
- Forks: 5
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG
- Contributing: CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project
- awesome-gemm - blis_apple: BLIS optimized for Apple M1 - 3-Clause) (Libraries 🗂️ / CPU Libraries 💻)
README
This Markdown file describes this port of BLIS to Apple's matrix coprocessor.
- Here is the original BLIS [README](README_BLIS.md).
As of July 2021, the coprocessor is undocumented but not protected either: any user or program is allowed to invoke it, and doing so is supposed to be safe. This work is based on Dougall Johnson's analysis of the related instructions.
Known issues:
- Our microkernels do not support a generic-strided destination matrix. The program will hit `assert(false)` if it encounters one.
- TRSM might fail. Try commenting out the call to `bli_cntx_set_packm_kers` in `config/aaplmx/bli_cntx_init_aaplmx.c` if you need TRSM to work.

Performance:

