Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.

CAMEL: Context-Aware Modifier for Efficient Language model
https://github.com/cswellessun/camel
- Host: GitHub
- URL: https://github.com/cswellessun/camel
- Owner: CSWellesSun
- License: apache-2.0
- Created: 2024-05-14T06:26:45.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-06-05T04:21:32.000Z (5 months ago)
- Last Synced: 2024-10-12T01:27:08.616Z (about 1 month ago)
- Language: Python
- Size: 224 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# CAMEL
## Introduction
CAMEL (Context-Aware Modifier for Efficient Language model) is a speculative decoding method inspired by [EAGLE](https://github.com/SafeAILab/EAGLE). It compresses the hidden states of earlier input tokens according to a configurable window size and then uses the compressed context to draft speculative tokens.
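The README does not spell out the compression step, so the following is a minimal sketch of the general idea, assuming the modifier mean-pools the base model's hidden states over non-overlapping windows of size `w` before drafting. The class name, pooling choice, and projection are illustrative assumptions, not CAMEL's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WindowCompressor(nn.Module):
    """Illustrative sketch: compress past hidden states by mean-pooling
    non-overlapping windows of size w, then project to the modifier's
    (smaller) hidden size. Not CAMEL's actual code."""

    def __init__(self, base_hidden: int, modifier_hidden: int, window: int):
        super().__init__()
        self.window = window
        self.proj = nn.Linear(base_hidden, modifier_hidden)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, base_hidden)
        b, t, d = hidden_states.shape
        pad = (-t) % self.window  # right-pad so windows divide evenly
        if pad:
            hidden_states = F.pad(hidden_states, (0, 0, 0, pad))
        # (batch, seq_len // w, w, d) -> mean over each window of w positions
        pooled = hidden_states.view(b, -1, self.window, d).mean(dim=2)
        return self.proj(pooled)  # compressed context for the draft head
```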
## Installation
```bash
pip install modifier
```

## Quick Start
CAMEL currently supports only `meta-llama/Llama-2-7b-chat-hf`.
```python
import torch
from camel import CamelModel

prompt = "What is artificial intelligence?"

# Load the base model together with a pretrained modifier from the Hub.
model = CamelModel.from_pretrained(
    base_model_path="meta-llama/Llama-2-7b-chat-hf",
    modifier_path="0xWe11es/camel-llama2-h1024-w1",
    torch_dtype=torch.float16,
    device_map="auto"
)
tokenizer = model.get_tokenizer()

# Tokenize the prompt, generate with speculative decoding, and decode.
input_ids = tokenizer(prompt).input_ids
output_ids = model.generate(input_ids)
output = tokenizer.decode(output_ids)
print(output)
```
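The performance table below reports both greedy (temperature 0.0) and sampled (temperature 1.0) decoding. If `CamelModel.generate` forwards standard Hugging Face generation keyword arguments, sampling could be requested as follows; this is an assumption, since the README does not document the `generate` signature:

```python
# Assumption: generate() accepts Hugging Face-style sampling kwargs.
output_ids = model.generate(
    input_ids,
    do_sample=True,      # enable sampling; omit for greedy decoding
    temperature=1.0,     # matches the sampled rows in the table below
    max_new_tokens=256,
)
```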
CAMEL provides the following modifiers based on Llama 2 (`h` denotes the modifier hidden size, `w` the window size); a small helper for composing these IDs follows the list:
- [0xWe11es/camel-llama2-h256-w1](https://huggingface.co/0xWe11es/camel-llama2-h256-w1)
- [0xWe11es/camel-llama2-h256-w4](https://huggingface.co/0xWe11es/camel-llama2-h256-w4)
- [0xWe11es/camel-llama2-h256-w16](https://huggingface.co/0xWe11es/camel-llama2-h256-w16)
- [0xWe11es/camel-llama2-h256-w64](https://huggingface.co/0xWe11es/camel-llama2-h256-w64)
- [0xWe11es/camel-llama2-h1024-w1](https://huggingface.co/0xWe11es/camel-llama2-h1024-w1)
- [0xWe11es/camel-llama2-h1024-w4](https://huggingface.co/0xWe11es/camel-llama2-h1024-w4)
- [0xWe11es/camel-llama2-h1024-w16](https://huggingface.co/0xWe11es/camel-llama2-h1024-w16)
- [0xWe11es/camel-llama2-h1024-w64](https://huggingface.co/0xWe11es/camel-llama2-h1024-w64)
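The repository IDs above follow one naming pattern, so a small helper can compose them; this helper is an illustrative convenience, not part of the `modifier` package:

```python
def camel_modifier_id(hidden_size: int, window_size: int) -> str:
    """Compose a modifier repo ID from the h/w naming scheme above."""
    # Only these (h, w) combinations are published on the Hugging Face Hub.
    assert hidden_size in (256, 1024) and window_size in (1, 4, 16, 64)
    return f"0xWe11es/camel-llama2-h{hidden_size}-w{window_size}"

# camel_modifier_id(1024, 1) -> "0xWe11es/camel-llama2-h1024-w1"
```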
## Performance

We tested the modifier `0xWe11es/camel-llama2-h1024-w4` on several datasets and obtained the following results relative to the vanilla model (Hugging Face Transformers implementation).
| Dataset  | Model      | Temperature | Speed (tokens/s) | Speedup |
|----------|------------|-------------|------------------|---------|
| MT-Bench | Llama 2 7B | 0.0         | 71.85            | 1.92x   |
| MT-Bench | Llama 2 7B | 1.0         | 57.54            | 1.62x   |
| GSM8K    | Llama 2 7B | 0.0         | 73.51            | 2.20x   |
| GSM8K    | Llama 2 7B | 1.0         | 57.15            | 1.77x   |
| Alpaca   | Llama 2 7B | 0.0         | 68.92            | 1.88x   |
| Alpaca   | Llama 2 7B | 1.0         | 55.38            | 1.56x   |
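Since the table lists CAMEL's absolute speed together with its speedup, the vanilla baseline throughput can be recovered as speed / speedup (e.g. 71.85 / 1.92 ≈ 37.4 tokens/s). A quick check over all rows:

```python
# Recover the implied vanilla throughput (tokens/s) from each table row.
rows = [
    ("MT-Bench", 0.0, 71.85, 1.92), ("MT-Bench", 1.0, 57.54, 1.62),
    ("GSM8K",    0.0, 73.51, 2.20), ("GSM8K",    1.0, 57.15, 1.77),
    ("Alpaca",   0.0, 68.92, 1.88), ("Alpaca",   1.0, 55.38, 1.56),
]
for dataset, temperature, speed, speedup in rows:
    print(f"{dataset} (T={temperature}): baseline ≈ {speed / speedup:.1f} tokens/s")
```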
## Reference

- [Medusa](https://github.com/FasterDecoding/Medusa)
- [EAGLE](https://github.com/SafeAILab/EAGLE)