https://github.com/HKUDS/SepLLM
SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator
inference-speed large-language-models llms
- Host: GitHub
- URL: https://github.com/HKUDS/SepLLM
- Owner: HKUDS
- Created: 2024-12-11T15:16:27.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2024-12-20T09:41:03.000Z (4 months ago)
- Last Synced: 2025-01-09T15:43:39.730Z (4 months ago)
- Topics: inference-speed, large-language-models, llms
- Language: Python
- Homepage: https://arxiv.org/abs/2412.12094
- Size: 170 MB
- Stars: 38
- Watchers: 2
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- StarryDivineSky - HKUDS/SepLLM - Validated on models such as …2 and Mistral. Experimental results show that SepLLM delivers significant inference speedups without noticeably degrading model performance. The project provides detailed implementation notes and experimental results, making it easy to reproduce and apply. SepLLM's strength lies in its simplicity and effectiveness: it requires no complex training or fine-tuning and can be applied directly to existing LLMs. It offers a new approach to accelerating large language models, which is especially valuable in resource-constrained environments. The project also ships code and documentation for further development and customization. SepLLM's goal is to let more people use large language models more efficiently. (A01_Text Generation_Text Dialogue / large language dialogue models and data)
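The core idea named in the title (compressing each segment into its trailing separator) can be sketched as a KV-cache selection policy. The snippet below is a minimal illustration, not the project's implementation: it assumes a hypothetical `kept_indices` helper, an assumed separator set, and assumed window sizes, and shows which past tokens would retain their key/value entries (initial "sink" tokens, separator tokens, and a recent local window), with all other segment tokens dropped.

```python
# Hedged sketch of separator-based KV-cache pruning (NOT the official SepLLM code).
# Assumption: past tokens are kept only if they are among the first n_initial
# tokens, are separator tokens, or fall inside the most recent n_recent window.
SEPARATORS = {".", ",", ";", "!", "?", "\n"}  # assumed separator vocabulary


def kept_indices(tokens, n_initial=2, n_recent=4):
    """Return sorted indices of past tokens whose KV entries are retained."""
    keep = set(range(min(n_initial, len(tokens))))                 # initial sink tokens
    keep |= {i for i, t in enumerate(tokens) if t in SEPARATORS}   # separators summarize segments
    keep |= set(range(max(0, len(tokens) - n_recent), len(tokens)))  # recent local window
    return sorted(keep)


tokens = ["The", "cat", "sat", ".", "It", "slept", ",", "then", "woke", ".",
          "Later", "it", "ate", "food", "quietly", "and", "then", "napped", "again"]
print(kept_indices(tokens))  # indices of retained tokens
```

With these toy settings, only 9 of 19 cache entries survive; the non-separator interior of each finished segment is discarded, which is where the claimed memory and speed savings come from.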