https://github.com/HKUNLP/ChunkLlama
[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"
- Host: GitHub
- URL: https://github.com/HKUNLP/ChunkLlama
- Owner: HKUNLP
- License: apache-2.0
- Created: 2024-02-19T05:53:08.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-10-16T10:02:39.000Z (over 1 year ago)
- Last Synced: 2024-10-18T17:25:13.155Z (over 1 year ago)
- Language: Python
- Homepage:
- Size: 51.7 MB
- Stars: 343
- Watchers: 7
- Forks: 18
- Open Issues: 11
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- StarryDivineSky - HKUNLP/ChunkLlama - Data and code for the paper "Training-Free Long-Context Scaling of Large Language Models" (ICML'24). Using a novel chunking method, it lets LLMs handle input sequences longer than their original training length. The core idea is to split long text into chunks, have the LLM process those chunks in parallel, and then integrate the results. ChunkLlama requires no fine-tuning or retraining, so the context window of an existing LLM can be extended quickly and cheaply. The project provides the data and code used in the paper, making it easy for researchers to reproduce and build on the work. The method is especially suited to applications that handle long documents, books, or other lengthy texts. Its strengths are simplicity, efficiency, and generality, which make it an attractive option for extending LLM context length. The project's goal is an easy-to-use, training-free solution for long-text processing with LLMs. (A01_Text Generation_Dialogue / large language dialogue models and data)
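
The chunking idea described above can be sketched in a few lines. This is a minimal illustration only, not code from the ChunkLlama repository: `split_into_chunks` and `chunked_position_ids` are hypothetical helpers showing how a long sequence can be divided into fixed-size chunks whose position indices never exceed the range the model saw during pretraining.

```python
def split_into_chunks(tokens: list, chunk_size: int) -> list[list]:
    """Split a token sequence into consecutive chunks of at most
    chunk_size tokens (hypothetical helper for illustration)."""
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


def chunked_position_ids(seq_len: int, chunk_size: int) -> list[int]:
    """Remap absolute positions so every token's position id stays in
    [0, chunk_size), i.e. within the positions covered by the model's
    original training length (illustrative sketch, not the paper's
    exact position scheme)."""
    return [i % chunk_size for i in range(seq_len)]


# A 10-token sequence with a chunk size of 4: three chunks, and no
# position id ever reaches 4, regardless of total sequence length.
chunks = split_into_chunks(list(range(10)), 4)      # [[0,1,2,3], [4,5,6,7], [8,9]]
positions = chunked_position_ids(10, 4)             # [0,1,2,3, 0,1,2,3, 0,1]
```

The actual method additionally defines how tokens in different chunks attend to each other; the sketch only conveys why no retraining is needed: the model never sees a position index outside its trained range.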