Projects in Awesome Lists tagged with flashmla | Ecosyste.ms: Awesome

A curated list of projects in awesome lists tagged with flashmla.

https://github.com/bruce-lee-ly/decoding_attention

Decoding Attention is specially optimized for MHA, MQA, GQA, and MLA, using CUDA cores for the decoding stage of LLM inference.

cuda cuda-core decoding-attention flash-attention flashinfer flashmla gpu gqa inference large-language-model llm mha mla mqa multi-head-attention nvidia

Last synced: 19 Aug 2025

https://github.com/cat-gawr/deepseek-flashmla

DeepSeek Flash MLA - DeepSeek - copy manual

deepseek flashmla nvidia-cuda windows

Last synced: 02 May 2026