CAMEL: Context-Aware Modifier for Efficient Language model
https://github.com/cswellessun/camel


Host: GitHub
URL: https://github.com/cswellessun/camel
Owner: CSWellesSun
License: apache-2.0
Created: 2024-05-14T06:26:45.000Z (6 months ago)
Default Branch: main
Last Pushed: 2024-06-05T04:21:32.000Z (5 months ago)
Last Synced: 2024-10-12T01:27:08.616Z (about 1 month ago)
Language: Python
Size: 224 KB
Stars: 2
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE


# CAMEL

## Introduction

CAMEL (Context-Aware Modifier for Efficient Language model) is a speculative decoding method inspired by [EAGLE](https://github.com/SafeAILab/EAGLE). It compresses the preceding input hidden states according to a window size and then makes speculations from the compressed states.
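
To give a rough feel for what "compressing hidden states according to a window size" could mean, here is a minimal sketch in plain Python. It is an illustration only: the mean pooling is an assumption chosen for simplicity, and the name `compress_by_window` is hypothetical. This README does not state which compression operator CAMEL actually uses.

```python
from typing import List

Vector = List[float]


def compress_by_window(hidden_states: List[Vector], window: int) -> List[Vector]:
    """Mean-pool consecutive hidden states in groups of `window`.

    ASSUMPTION: mean pooling is a stand-in; CAMEL's real operator may differ.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    compressed: List[Vector] = []
    for start in range(0, len(hidden_states), window):
        chunk = hidden_states[start:start + window]
        dim = len(chunk[0])
        # Average each dimension over the chunk; the last chunk may be shorter.
        compressed.append(
            [sum(vec[d] for vec in chunk) / len(chunk) for d in range(dim)]
        )
    return compressed


# Five toy "hidden states" of dimension 2, compressed with window size 2.
states = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]
print(compress_by_window(states, window=2))
# prints [[2.0, 3.0], [6.0, 7.0], [9.0, 10.0]]
```

With `window=1` the sequence is returned unchanged, while a larger window yields fewer states to speculate from.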



    



## Installation

```bash
pip install modifier
```

## Quick Start

CAMEL currently supports only `meta-llama/Llama-2-7b-chat-hf`.

```python
import torch
from camel import CamelModel

prompt = "What is artificial intelligence?"

# Load the base model together with a CAMEL modifier (h1024, w1).
model = CamelModel.from_pretrained(
    base_model_path="meta-llama/Llama-2-7b-chat-hf",
    modifier_path="0xWe11es/camel-llama2-h1024-w1",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Tokenize the prompt, generate, and decode the result.
tokenizer = model.get_tokenizer()
input_ids = tokenizer(prompt).input_ids
output_ids = model.generate(input_ids)
output = tokenizer.decode(output_ids)
print(output)
```
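
For intuition about what a speculative `generate` call does in general, the following toy sketch shows one greedy draft-and-verify step. It is not CAMEL's implementation: `speculative_step`, `draft_next`, and `target_next` are hypothetical names, the "models" are plain functions over token lists, and a real system would verify all drafted tokens in one batched forward pass instead of one call per token.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]


def speculative_step(
    prefix: List[int],
    draft_next: NextToken,
    target_next: NextToken,
    num_draft: int,
) -> List[int]:
    """Return the tokens produced by one greedy draft-and-verify step."""
    # 1. The cheap draft model proposes `num_draft` tokens autoregressively.
    draft: List[int] = []
    for _ in range(num_draft):
        draft.append(draft_next(prefix + draft))

    # 2. The target model checks each drafted token against its own greedy choice.
    accepted: List[int] = []
    for token in draft:
        expected = target_next(prefix + accepted)
        if token != expected:
            # First mismatch: keep the target's token and discard the rest.
            accepted.append(expected)
            return accepted
        accepted.append(token)

    # 3. Every drafted token matched, so the target contributes one extra token.
    accepted.append(target_next(prefix + accepted))
    return accepted


# Toy models: the target counts upward modulo 10; the draft errs right after a 3.
def target_next(seq: List[int]) -> int:
    return (seq[-1] + 1) % 10


def draft_next(seq: List[int]) -> int:
    return 0 if seq[-1] == 3 else (seq[-1] + 1) % 10


print(speculative_step([1], draft_next, target_next, num_draft=4))
# prints [2, 3, 4]
```

The output is always what the target model alone would have produced greedily; the speedup comes from accepting several tokens per target pass when the draft is accurate.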

CAMEL provides the following modifiers based on Llama2 (`h` is the hidden size, `w` is the window size):

- [0xWe11es/camel-llama2-h256-w1](https://huggingface.co/0xWe11es/camel-llama2-h256-w1)

- [0xWe11es/camel-llama2-h256-w4](https://huggingface.co/0xWe11es/camel-llama2-h256-w4)

- [0xWe11es/camel-llama2-h256-w16](https://huggingface.co/0xWe11es/camel-llama2-h256-w16)

- [0xWe11es/camel-llama2-h256-w64](https://huggingface.co/0xWe11es/camel-llama2-h256-w64)

- [0xWe11es/camel-llama2-h1024-w1](https://huggingface.co/0xWe11es/camel-llama2-h1024-w1)

- [0xWe11es/camel-llama2-h1024-w4](https://huggingface.co/0xWe11es/camel-llama2-h1024-w4)

- [0xWe11es/camel-llama2-h1024-w16](https://huggingface.co/0xWe11es/camel-llama2-h1024-w16)

- [0xWe11es/camel-llama2-h1024-w64](https://huggingface.co/0xWe11es/camel-llama2-h1024-w64)
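
Because the names in the list above encode the hidden size and window size, a small helper can recover both values from a repository ID. This is a convenience sketch only: `ModifierSpec` and `parse_modifier_name` are hypothetical and not part of the `camel` package.

```python
import re
from typing import NamedTuple


class ModifierSpec(NamedTuple):
    hidden_size: int
    window_size: int


# Matches the naming scheme of the modifiers listed above, e.g. "...-h1024-w4".
_PATTERN = re.compile(r"camel-llama2-h(\d+)-w(\d+)$")


def parse_modifier_name(repo_id: str) -> ModifierSpec:
    """Extract (hidden_size, window_size) from a modifier repository ID."""
    match = _PATTERN.search(repo_id)
    if match is None:
        raise ValueError(f"not a recognised CAMEL modifier name: {repo_id!r}")
    return ModifierSpec(int(match.group(1)), int(match.group(2)))


print(parse_modifier_name("0xWe11es/camel-llama2-h1024-w4"))
# prints ModifierSpec(hidden_size=1024, window_size=4)
```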

## Performance

We tested the modifier `0xWe11es/camel-llama2-h1024-w4` on several datasets and obtained the following results relative to the vanilla model (HF version).

| Dataset  | Model     | Temperature | Speed (tokens/s) | Speedup |
|----------|-----------|-------------|------------------|---------|
| MT-Bench | Llama2 7B | 0.0         | 71.85            | 1.92x   |
| MT-Bench | Llama2 7B | 1.0         | 57.54            | 1.62x   |
| GSM8K    | Llama2 7B | 0.0         | 73.51            | 2.20x   |
| GSM8K    | Llama2 7B | 1.0         | 57.15            | 1.77x   |
| Alpaca   | Llama2 7B | 0.0         | 68.92            | 1.88x   |
| Alpaca   | Llama2 7B | 1.0         | 55.38            | 1.56x   |
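
The table does not list the vanilla model's own throughput, but it can be estimated from the two reported columns. The sketch below assumes that the speed column is CAMEL's throughput and that the speedup is CAMEL's speed divided by the vanilla speed. That reading is an assumption, not something the table states, and `implied_baseline` is a hypothetical helper.

```python
# (dataset, temperature, reported speed in tokens/s, reported speedup)
results = [
    ("MT-Bench", 0.0, 71.85, 1.92),
    ("MT-Bench", 1.0, 57.54, 1.62),
    ("GSM8K", 0.0, 73.51, 2.20),
    ("GSM8K", 1.0, 57.15, 1.77),
    ("Alpaca", 0.0, 68.92, 1.88),
    ("Alpaca", 1.0, 55.38, 1.56),
]


def implied_baseline(speed: float, speedup: float) -> float:
    """Vanilla throughput implied by speedup = speed / baseline (assumed)."""
    return speed / speedup


for dataset, temperature, speed, speedup in results:
    baseline = implied_baseline(speed, speedup)
    print(f"{dataset:<8} T={temperature:.1f}  implied baseline ~ {baseline:.2f} tokens/s")
```

Under that assumption, the first row implies a vanilla speed of roughly 37.4 tokens/s.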

## Reference

- [Medusa](https://github.com/FasterDecoding/Medusa)

- [EAGLE](https://github.com/SafeAILab/EAGLE)