Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/XiaoMi/subllm
This repository is the official implementation of the ECAI 2024 conference paper SUBLLM: A Novel Efficient Architecture with Token Sequence Subsampling for LLM
- Host: GitHub
- URL: https://github.com/XiaoMi/subllm
- Owner: XiaoMi
- License: apache-2.0
- Created: 2024-08-13T07:19:31.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2024-08-13T14:39:24.000Z (5 months ago)
- Last Synced: 2024-12-24T17:12:30.831Z (22 days ago)
- Language: Python
- Size: 248 KB
- Stars: 68
- Watchers: 4
- Forks: 4
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- StarryDivineSky - XiaoMi/subllm - …few-shot evaluation code; test results show better performance than LLaMA at the 1.3B model scale. The project provides a structure diagram with detailed module descriptions and supports streaming inference and few-shot evaluation. (A01_Text Generation_Text Dialogue / Large language dialogue models and data)
README
# SUBLLM
This repository is the official implementation of the ECAI 2024 conference paper [**SUBLLM: A Novel Efficient Architecture with Token Sequence Subsampling for LLM**](https://arxiv.org/abs/2406.06571)
![](./assets/subllm_structure.jpg)
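As shown in the structure figure, SUBLLM inserts subsampling, upsampling, and bypass modules between decoder blocks so that most of the computation runs on a shortened token sequence. The PyTorch sketch below is only a conceptual illustration of that pattern, not the released implementation: the scoring function, the inner block, and the bypass gating here are simplified assumptions, and the actual module designs are described in the paper and in this repository's code.

```python
# Conceptual sketch (NOT the official SUBLLM implementation): score-based token
# subsampling, cheaper computation on the shortened sequence, upsampling back to
# the full length, and a bypass (residual) mix. All names below are illustrative.
import torch
import torch.nn as nn


class TokenSubsampleBlock(nn.Module):
    def __init__(self, d_model: int, keep_ratio: float = 0.5):
        super().__init__()
        self.scorer = nn.Linear(d_model, 1)       # per-token importance score (assumed form)
        self.inner = nn.TransformerEncoderLayer(  # stand-in for the blocks that run
            d_model, nhead=8, batch_first=True)   # on the shortened sequence
        self.bypass_gate = nn.Parameter(torch.tensor(0.5))  # learnable mix with the bypass path
        self.keep_ratio = keep_ratio

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        b, t, d = x.shape
        k = max(1, int(t * self.keep_ratio))

        # 1) Subsample: keep the k highest-scoring tokens, preserving their order.
        scores = self.scorer(x).squeeze(-1)                      # (b, t)
        idx = scores.topk(k, dim=1).indices.sort(dim=1).values   # (b, k)
        sub = x.gather(1, idx.unsqueeze(-1).expand(-1, -1, d))   # (b, k, d)

        # 2) Run the inner computation on the shortened sequence.
        sub = self.inner(sub)

        # 3) Upsample: scatter processed tokens back to their original positions;
        #    untouched positions fall back to the input representation.
        up = x.clone()
        up.scatter_(1, idx.unsqueeze(-1).expand(-1, -1, d), sub)

        # 4) Bypass: blend with the pre-subsampling representation.
        g = torch.sigmoid(self.bypass_gate)
        return g * up + (1 - g) * x


if __name__ == "__main__":
    block = TokenSubsampleBlock(d_model=256)
    out = block(torch.randn(2, 128, 256))
    print(out.shape)  # torch.Size([2, 128, 256])
```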
## News and Updates
* 2024.8.13 We release the model inference code, including the streaming inference and few-shot evaluation code, as well as the model structure of SUBLLM to help better understand its module details.

## Evaluation
Benchmark results for a 1.3B model trained with a 4k training window length:
| Model  | MMLU (5-shot) | BBH (3-shot) | AGIEval (5-shot) |
|:-------|:-------------:|:------------:|:----------------:|
| LLaMA  | 26.23         | 23.70        | 16.76            |
| SUBLLM | **26.41**     | **24.17**    | **17.64**        |

## Stream Inference
```shell
cd inference
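# infer.sh launches the streaming inference demo; any model/tokenizer paths it
# references may need to be adapted to your local checkpoint (see the script itself).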
sh infer.sh
```

## Few-shot Evaluation
```shell
# data preparation
cd fewshot_eval
python download_data.py
# run fewshot task
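# positional arguments: model checkpoint, model config, tokenizer, results directory,
# max sequence length, task name, and number of shots. A purely illustrative example
# (all paths and values below are placeholders, not files shipped with this repo):
#   sh fewshot.sh ./ckpt/subllm_1.3b.pt ./config.yaml ./tokenizer.model ./results 4096 mmlu 5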
sh fewshot.sh $MODEL_PATH $CONFIG_PATH $TOKENIZER_PATH $RSLT_PATH $MAX_LEN $TASK $N_SHOT
```

## Citations
Please cite the paper if this repository is useful to you.

```bibtex
@article{wang2024subllm,
  title={SUBLLM: A Novel Efficient Architecture with Token Sequence Subsampling for LLM},
  author={Quandong Wang and Yuxuan Yuan and Xiaoyu Yang and Ruike Zhang and Kang Zhao and Wei Liu and Jian Luan and Daniel Povey and Bin Wang},
  journal={arXiv preprint arXiv:2406.06571},
  year={2024},
}
```