Projects in Awesome Lists by FMInference
A curated list of projects in awesome lists by FMInference.
https://github.com/FMInference/FlexLLMGen
Running large language models on a single GPU for throughput-oriented scenarios.
deep-learning gpt-3 high-throughput large-language-models machine-learning offloading opt
Last synced: 14 Mar 2025
https://github.com/fminference/h2o
[NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.
gpt-3 heavy-hitters high-throughput kv-cache large-language-models sparsity
Last synced: 05 Apr 2025