Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/FMInference/H2O
[NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.
- Host: GitHub
- URL: https://github.com/FMInference/H2O
- Owner: FMInference
- Created: 2023-06-12T06:03:19.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-08-01T19:41:25.000Z (5 months ago)
- Last Synced: 2024-11-10T07:21:08.821Z (about 1 month ago)
- Topics: gpt-3, heavy-hitters, high-throughput, kv-cache, large-language-models, sparsity
- Language: Python
- Size: 39.1 MB
- Stars: 387
- Watchers: 5
- Forks: 39
- Open Issues: 31
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
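- StarryDivineSky - FMInference/H2O - NeoX validated the algorithm's accuracy across a variety of tasks. On OPT-6.7B and OPT-30B, our implementation of H2O with 20% heavy hitters improves throughput by 29×, 29×, and 3× over three leading inference systems: DeepSpeed Zero-Inference, Hugging Face Accelerate, and FlexGen, respectively. At the same batch size, H2O reduces latency by up to 1.9×. (A01_Text Generation_Text Dialogue / Large Language Dialogue Models and Data)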