https://github.com/en10/babyllama
Train and run a small Llama 2 model from scratch on the TinyStories dataset.
gpt llama2 llm tinystories
- Host: GitHub
- URL: https://github.com/en10/babyllama
- Owner: EN10
- License: mit
- Created: 2023-12-10T20:34:05.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-08-01T16:04:28.000Z (almost 2 years ago)
- Last Synced: 2025-08-11T06:45:01.449Z (10 months ago)
- Topics: gpt, llama2, llm, tinystories
- Language: Jupyter Notebook
- Size: 19.8 MB
- Stars: 5
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Baby Llama
Train and run a small [Llama 2](https://ai.meta.com/llama/) model from scratch on the [TinyStories](https://huggingface.co/datasets/roneneldan/TinyStories) dataset.
* [Based on karpathy/llama2.c](https://github.com/karpathy/llama2.c)
* [Based on eniompw/DisneyGPT](https://github.com/eniompw/DisneyGPT)
## Baby Llama Code Example:
### [Baby Llama 105 Tokens on Colab](https://github.com/EN10/BabyLlama/blob/main/Baby_Llama_105.ipynb)
* [Iters vs Val Loss](https://github.com/EN10/BabyLlama/blob/main/tok105/iters-vs-val-loss.md): learning words and grammar, visualised
* [105 Token Vocab](https://github.com/EN10/BabyLlama/blob/main/tok105/tok105.vocab)
```
!cd llama2.c && python tinystories.py train_vocab --vocab_size=256
trainer_interface.cc(558) LOG(INFO) Alphabet size=102
Vocabulary size is smaller than required_chars. 256 vs 361.
```
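One way to read the log above, offered as an assumption rather than something this README states: the trainer needs room for every character in the alphabet (102 here), plus 256 byte-fallback pieces, plus 3 special pieces (unknown, start and end of sequence). That sums to the 361 in the error, and without byte fallback the same count gives 105, which matches the vocabulary size used in the notebook. The helper below is hypothetical and only illustrates that arithmetic; it is not part of llama2.c or SentencePiece.

```python
# Hypothetical helper for illustration only; not part of llama2.c or SentencePiece.
# Assumed accounting: alphabet characters + byte-fallback pieces + special pieces.

BYTE_FALLBACK_PIECES = 256  # one piece per possible byte value
SPECIAL_PIECES = 3          # assumed: unknown, start-of-sequence, end-of-sequence


def min_vocab_size(alphabet_size: int, byte_fallback: bool = True) -> int:
    """Return the smallest vocab size that fits every required piece."""
    extra = BYTE_FALLBACK_PIECES if byte_fallback else 0
    return alphabet_size + extra + SPECIAL_PIECES


# Alphabet size taken from the log line "Alphabet size=102".
print(min_vocab_size(102))                       # 361
print(min_vocab_size(102, byte_fallback=False))  # 105
```

Under this reading, `--vocab_size=256` fails because it is below the 361 pieces the trainer must reserve before it can learn any merges.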
### [More Tokens & Larger Models](https://github.com/EN10/BabyLlama/blob/main/Model-Sizes.md)
### Ref:
* [Training](https://github.com/karpathy/llama2.c#training)
* [Models](https://github.com/karpathy/llama2.c#models)
* [Pretokenized TinyStories](https://huggingface.co/datasets/enio/TinyStories)