https://github.com/oelin/tinyllama
A pedagogical implementation of TinyLlama.
- Host: GitHub
- URL: https://github.com/oelin/tinyllama
- Owner: oelin
- License: MIT
- Created: 2024-01-09T12:57:14.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-01-09T13:26:14.000Z (over 1 year ago)
- Last Synced: 2025-03-07T02:35:42.309Z (3 months ago)
- Topics: deep-learning, education, educational, language-model, llama, llama2, machine-learning, natural-language-processing
- Language: Python
- Homepage:
- Size: 85 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# TinyLlama
A pedagogical implementation of TinyLlama from [TinyLlama: An Open-Source Small Language Model](https://arxiv.org/abs/2401.02385), in PyTorch.
## Usage
```python
import torch

from tinyllama import TinyLlama, TinyLlamaConfiguration

# As specified in the paper.
configuration = TinyLlamaConfiguration(
    embedding_dimension=2048,
    intermediate_dimension=5632,  # 2.75x the embedding dimension.
    number_of_heads=16,
    number_of_layers=22,
    vocabulary_size=32_000,
    context_length=2048,
)

model = TinyLlama(configuration=configuration)
tokens = torch.tensor([[1, 2, 3, 4]])  # A toy batch of token IDs.
logits = model(tokens, mask=None)
```
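
The forward pass returns next-token logits, presumably of shape `(1, 4, 32_000)` for the toy batch above. As a quick usage sketch (not part of this repository's API, just standard PyTorch indexing), the greedy continuation can be read off the last position:

```python
next_token = logits[:, -1, :].argmax(dim=-1)  # greedy pick over the vocabulary
```

## TODO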
- [ ] Implement caching (RoPE, KV); a rough sketch of the idea appears after this list.
- [ ] Switch to Flash Attention 2 for GQA; see the grouped-query sketch below.
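
The first item refers to caching attention keys and values (and rotary embeddings) across decoding steps, so earlier positions are not recomputed for every new token. A minimal sketch of the KV-caching idea, assuming nothing about this repository's internals (the `KVCache` class and all shapes below are illustrative; `head_dim = 2048 / 16 = 128` follows from the configuration above):

```python
import torch

class KVCache:
    """Illustrative cache that accumulates keys/values across decode steps."""

    def __init__(self):
        self.keys = None    # (batch, heads, tokens_seen, head_dim)
        self.values = None

    def update(self, k, v):
        # Append this step's keys/values so past positions need not be recomputed.
        if self.keys is None:
            self.keys, self.values = k, v
        else:
            self.keys = torch.cat([self.keys, k], dim=2)
            self.values = torch.cat([self.values, v], dim=2)
        return self.keys, self.values

cache = KVCache()
for _ in range(4):  # one decode step per iteration
    k_new = torch.randn(1, 16, 1, 128)  # keys for the single newest token
    v_new = torch.randn(1, 16, 1, 128)
    k_all, v_all = cache.update(k_new, v_new)  # attend over all cached positions
```

For the second item, grouped-query attention (GQA) shares each key/value head across several query heads. A rough sketch of the grouping itself in plain PyTorch (`scaled_dot_product_attention` is standard PyTorch; the choice of 4 KV heads is illustrative, not taken from this repository):

```python
import torch
import torch.nn.functional as F

q = torch.randn(1, 16, 4, 128)  # (batch, query_heads, sequence, head_dim)
k = torch.randn(1, 4, 4, 128)   # 4 KV heads, each shared by 16 // 4 = 4 query heads
v = torch.randn(1, 4, 4, 128)

# Expand the KV heads to match the query heads, then run standard attention.
k = k.repeat_interleave(16 // 4, dim=1)
v = v.repeat_interleave(16 // 4, dim=1)
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)  # (1, 16, 4, 128)
```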