https://github.com/DefTruth/ffpa-attn-mma

📚[WIP] FFPA: Yet antother Faster Flash Prefill Attention with O(1)⚡️GPU SRAM complexity for headdim > 256, 1.8x~3x↑🎉faster vs SDPA EA.
https://github.com/DefTruth/ffpa-attn-mma

attention cuda flash-attention mlsys sdpa tensor-cores

Last synced: about 1 month ago
JSON representation

📚[WIP] FFPA: Yet antother Faster Flash Prefill Attention with O(1)⚡️GPU SRAM complexity for headdim > 256, 1.8x~3x↑🎉faster vs SDPA EA.

Host: GitHub
URL: https://github.com/DefTruth/ffpa-attn-mma
Owner: DefTruth
License: gpl-3.0
Created: 2024-11-29T11:47:23.000Z (12 months ago)
Default Branch: main
Last Pushed: 2025-01-20T07:20:27.000Z (10 months ago)
Last Synced: 2025-01-20T08:26:05.808Z (10 months ago)
Topics: attention, cuda, flash-attention, mlsys, sdpa, tensor-cores
Language: Cuda
Homepage:
Size: 4.08 MB
Stars: 53
Watchers: 2
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

Awesome-LLM-Inference - **FFPA** - attn-mma]](https://github.com/DefTruth/ffpa-attn-mma) ![](https://img.shields.io/github/stars/DefTruth/ffpa-attn-mma)|⭐️⭐️ | (📖Contents / 📖IO/FLOPs-Aware/Sparse Attention ([©️back👆🏻](#paperlist)))

README

          # Notes 👇👇

This project has been moved to [xlite-dev/ffpa-attn-mma](https://github.com/xlite-dev/ffpa-attn-mma). Please check [xlite-dev/ffpa-attn-mma](https://github.com/xlite-dev/ffpa-attn-mma) for latest updates! 👏👋

---

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/DefTruth/ffpa-attn-mma

Awesome Lists containing this project

README