https://github.com/XMUDeepLIT/QGC
Code for "Retaining Key Information under High Compression Rates: Query-Guided Compressor for LLMs" (ACL 2024)
- Host: GitHub
- URL: https://github.com/XMUDeepLIT/QGC
- Owner: XMUDeepLIT
- Created: 2024-05-17T05:46:21.000Z
- Default Branch: main
- Last Pushed: 2024-06-12T09:00:56.000Z
- Last Synced: 2025-06-01T07:27:03.914Z
- Language: Python
- Size: 71.3 KB
- Stars: 17
- Watchers: 4
- Forks: 0
- Open Issues: 4
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-context-engineering - QGC - Query-Guided Compressor for LLMs (✂️ Compress Context / RAG compression)
README
# Query-Guided Compressor (QGC)
Code for "Retaining Key Information under High Compression Rates: Query-Guided Compressor for LLMs" (ACL 2024)
## Requirements
```
datasets==2.15.0
flash-attn==2.3.3
jsonlines==4.0.0
torch==2.0.0
torchvision==0.15.0
transformers==4.35.0
```
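Because the versions above are pinned exactly, it can help to confirm the active environment matches them before training. The following is a minimal sketch, not part of the repository: the helper name `check_pins` is hypothetical, and it assumes each package is installed under the distribution name shown in the list.

```python
from importlib import metadata

# Pinned versions copied from the requirements list above.
PINNED = {
    "datasets": "2.15.0",
    "flash-attn": "2.3.3",
    "jsonlines": "4.0.0",
    "torch": "2.0.0",
    "torchvision": "0.15.0",
    "transformers": "4.35.0",
}


def check_pins(pinned=PINNED):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Local build tags such as "+cu118" are ignored for the comparison.
        if installed is None or installed.split("+")[0] != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_pins().items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty result means every pinned package is present at the listed version.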
## Instructions
The following example shows how to use our code.
### LLMs and Datasets
We use [LongChat-13B](https://huggingface.co/lmsys/longchat-13b-16k) as the target LLM and Llama-2-7B to initialize the compressor parameters. To train and evaluate the compressor, we use open-source QA datasets (NaturalQuestions, TriviaQA, HotpotQA). All datasets can be downloaded from [this Google Drive folder](https://drive.google.com/drive/folders/1HhwPP6iZUBbAjWeWRkbEPtgXVIRZUz6V?usp=drive_link).
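After downloading, it is worth inspecting a few records before pointing the training script at the data. The sketch below assumes the files are in JSON Lines format (one JSON object per line), which the `jsonlines` requirement suggests but the README does not state. The helper `peek_jsonl` is hypothetical and uses only the standard library.

```python
import json
import sys


def peek_jsonl(path, n=1):
    """Return the first n records of a JSON Lines file, skipping blank lines."""
    records = []
    if n <= 0:
        return records
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            records.append(json.loads(line))
            if len(records) >= n:
                break
    return records


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the path of a downloaded file as the first argument.
    for record in peek_jsonl(sys.argv[1], n=1):
        if isinstance(record, dict):
            print(sorted(record.keys()))
        else:
            print(type(record).__name__)
```

Printing the field names of the first record shows what the training and inference scripts will read, without loading the whole file.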
### QGC Training and Inference
```
# train compressor
bash train.sh
# evaluate compressor
bash infer.sh
```
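The two scripts are run in order: training first, then evaluation. If you prefer to drive both steps from Python, a minimal sketch follows. The names `run_steps` and `QGC_STEPS` are hypothetical, and the sketch assumes the scripts sit in the current working directory and take no arguments, as in the commands above.

```python
import subprocess

# The two commands from the block above, in the order they should run.
QGC_STEPS = [
    ["bash", "train.sh"],
    ["bash", "infer.sh"],
]


def run_steps(steps):
    """Run each command in order and stop at the first failure."""
    for cmd in steps:
        # check=True raises CalledProcessError on a non-zero exit status,
        # so evaluation never starts after a failed training run.
        subprocess.run(cmd, check=True)
```

Calling `run_steps(QGC_STEPS)` from the repository root runs training and then evaluation, and raises an exception if either script exits with an error.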