An open API service indexing awesome lists of open source software.

Projects in Awesome Lists by FMInference

A curated list of projects in awesome lists by FMInference.

https://github.com/FMInference/FlexLLMGen

Running large language models on a single GPU for throughput-oriented scenarios.

deep-learning gpt-3 high-throughput large-language-models machine-learning offloading opt

Last synced: 14 Mar 2025

https://github.com/fminference/flexllmgen

Running large language models on a single GPU for throughput-oriented scenarios.

deep-learning gpt-3 high-throughput large-language-models machine-learning offloading opt

Last synced: 27 Jul 2025

https://github.com/fminference/h2o

[NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.

gpt-3 heavy-hitters high-throughput kv-cache large-language-models sparsity

Last synced: 05 Apr 2025

https://github.com/FMInference/H2O

[NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.

gpt-3 heavy-hitters high-throughput kv-cache large-language-models sparsity

Last synced: 09 May 2025

https://github.com/fminference/dejavu

Last synced: 25 Jul 2025