https://github.com/mohsenhariri/llm
- Host: GitHub
- URL: https://github.com/mohsenhariri/llm
- Owner: mohsenhariri
- License: gpl-3.0
- Created: 2024-07-21T05:19:22.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2024-08-29T20:33:43.000Z (9 months ago)
- Last Synced: 2024-08-29T22:59:00.414Z (9 months ago)
- Language: Python
- Size: 1.1 MB
- Stars: 0
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
README
# LLM Exploration
## GSM8K Evaluation
- `llm/evaluate/gsm8k.py` runs the evaluation on a single GPU.
- `llm/evaluate/gsm8k_gpus` runs the evaluation on multiple GPUs (a minimal sketch of the evaluation loop follows this list).
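For reference, the sketch below shows what such a single-GPU evaluation loop can look like. It assumes a Hugging Face causal LM and the `gsm8k` dataset from the Hub; the prompt format, decoding settings, and answer extraction are illustrative rather than the repo's exact code.

```python
# A minimal GSM8K evaluation loop, assuming a Hugging Face causal LM and the
# `gsm8k` dataset on the Hub. Prompt format, decoding settings, and answer
# extraction are illustrative and may differ from what gsm8k.py actually does.
import re

import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"  # any model from the table below

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto"
)


def last_number(text: str) -> str | None:
    """Return the last number in a string; GSM8K references end with '#### <answer>'."""
    matches = re.findall(r"-?\d+\.?\d*", text.replace(",", ""))
    return matches[-1] if matches else None


dataset = load_dataset("gsm8k", "main", split="test")
correct = 0
for example in dataset:
    prompt = f"Question: {example['question']}\nAnswer: Let's think step by step."
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
    completion = tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    # The reference answer's last number is the final GSM8K answer after "####".
    if last_number(completion) == last_number(example["answer"]):
        correct += 1

print(f"accuracy: {correct / len(dataset):.4f}")
```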

## KV cache

| Model                    | FP16                | KIVI                |
| ------------------------ | ------------------- | ------------------- |
| Meta-Llama-3-8B | 0.49683544303797467 | |
| Meta-Llama-3-8B-Instruct | 0.7554179566563467 | |
| Llama-2-7b-hf | 0.1342925659472422 | 0.10454908220271349 |
| Llama-2-7b-chat-hf | 0.21674418604651163 | 0.1759927797833935 |
| Mistral-7B-v0.1 | 0.43967611336032386 | 0.4080971659919028 |
| Mistral-7B-Instruct-v0.2 | 0.45616883116883117 | 0.41804635761589404 |
| OLMo-1.7-7B-hf           | 0.2793950075512405  |                     |
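The table compares a full-precision (FP16) KV cache with a KIVI-quantized one. As a rough illustration of the underlying idea (storing cached keys and values at low bit width), the sketch below uses the quantized-cache option built into Hugging Face `transformers` with the quanto backend. This is not the KIVI implementation behind the numbers above, and the model choice and bit width are assumptions.

```python
# Illustration of generating with a low-bit KV cache via transformers' built-in
# quantized cache (requires the `optimum-quanto` package). This is NOT the KIVI
# code used for the table; it only shows the FP16-vs-quantized-cache contrast.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "meta-llama/Llama-2-7b-hf"  # illustrative pick from the table

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto"
)

prompt = "Question: A robe takes 2 bolts of blue fiber and half that much white fiber. How many bolts does it take in total?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Baseline: the KV cache is kept in FP16.
fp16_out = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Quantized cache: keys/values are stored at 2-bit precision to cut cache memory.
quant_out = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=False,
    cache_implementation="quantized",
    cache_config={"backend": "quanto", "nbits": 2},
)

print(tokenizer.decode(quant_out[0], skip_special_tokens=True))
```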

## Llama models

- Original implementation: `llm/models/llama/meta/model.py`
- Single node implementation: `llm/models/llama/meta/model_single_node.py` (an illustrative contrast with the original model-parallel layers follows this list).
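Meta's reference `model.py` builds its projection layers with fairscale's model-parallel modules, which assume a distributed process group; a single-node variant typically swaps those for plain `nn.Linear` layers. The snippet below is only an illustrative contrast under that assumption, not the repo's actual `model_single_node.py`.

```python
# Illustrative contrast only; the repo's model_single_node.py may differ.
# Meta's reference model.py creates attention projections with fairscale's
# model-parallel layers, e.g.:
#   self.wq = ColumnParallelLinear(args.dim, args.n_heads * self.head_dim, bias=False, ...)
#   self.wo = RowParallelLinear(args.n_heads * self.head_dim, args.dim, bias=False, ...)
# A single-node port can replace them with ordinary PyTorch linears:
import torch.nn as nn


class AttentionProjections(nn.Module):
    def __init__(self, dim: int, n_heads: int) -> None:
        super().__init__()
        head_dim = dim // n_heads
        self.wq = nn.Linear(dim, n_heads * head_dim, bias=False)
        self.wk = nn.Linear(dim, n_heads * head_dim, bias=False)
        self.wv = nn.Linear(dim, n_heads * head_dim, bias=False)
        self.wo = nn.Linear(n_heads * head_dim, dim, bias=False)
```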

## References

The chain-of-thought (CoT) prompt template is taken from:
[Chain of Thought Prompting Elicits Reasoning in Large Language Models](https://arxiv.org/abs/2201.11903)