An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with paged-attention

A curated list of projects in awesome lists tagged with paged-attention.

https://github.com/deftruth/awesome-llm-inference

📖 A curated list of Awesome LLM/VLM Inference Papers with code: WINT8/4, FlashAttention, PagedAttention, MLA, Parallelism, etc. 🎉🎉

awesome-llm deepseek deepseek-r1 deepseek-v3 flash-attention flash-attention-3 flash-mla llm-inference minimax-01 mla paged-attention tensorrt-llm vllm

Last synced: 04 Apr 2025

https://github.com/nptt9/illama

A fast, lightweight, parallel inference server for Llama LLMs.

exllama exllamav2 flash-attention-2 inference llama llama2 llama3 llm-inference paged-attention server

Last synced: 22 Mar 2025

https://github.com/nickpotafiy/illama

A fast, lightweight, parallel inference server for Llama LLMs.

exllama exllamav2 flash-attention-2 inference llama llama2 llama3 llm-inference paged-attention server

Last synced: 26 Mar 2025