https://github.com/DefTruth/ffpa-attn-mma

📚[WIP] FFPA: Yet another Faster Flash Prefill Attention with O(1)⚡️GPU SRAM complexity for headdim > 256, 1.8x~3x↑🎉faster vs SDPA EA.

attention cuda flash-attention mlsys sdpa tensor-cores

README

# Notes 👇👇

This project has moved to [xlite-dev/ffpa-attn-mma](https://github.com/xlite-dev/ffpa-attn-mma). Please check the new repository for the latest updates! 👏👋

---