https://github.com/en10/babyllama

Train and run a small Llama 2 model from scratch on the TinyStories dataset.

gpt llama2 llm tinystories


Host: GitHub
URL: https://github.com/en10/babyllama
Owner: EN10
License: mit
Created: 2023-12-10T20:34:05.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2024-08-01T16:04:28.000Z (almost 2 years ago)
Last Synced: 2025-08-11T06:45:01.449Z (12 months ago)
Topics: gpt, llama2, llm, tinystories
Language: Jupyter Notebook
Homepage:
Size: 19.8 MB
Stars: 5
Watchers: 3
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# Baby Llama

Train and run a small [Llama 2](https://ai.meta.com/llama/) model from scratch on the [TinyStories](https://huggingface.co/datasets/roneneldan/TinyStories) dataset.

* [Based on karpathy/llama2.c](https://github.com/karpathy/llama2.c)

* [Based on eniompw/DisneyGPT](https://github.com/eniompw/DisneyGPT)

## Baby Llama Code Example:

### [Baby Llama 105 Tokens on Colab](https://github.com/EN10/BabyLlama/blob/main/Baby_Llama_105.ipynb)   

* [Iters vs Val Loss](https://github.com/EN10/BabyLlama/blob/main/tok105/iters-vs-val-loss.md): learning words and grammar, visualised

* [105 Token Vocab](https://github.com/EN10/BabyLlama/blob/main/tok105/tok105.vocab)

```
!cd llama2.c && python tinystories.py train_vocab --vocab_size=256
trainer_interface.cc(558) LOG(INFO) Alphabet size=102
Vocabulary size is smaller than required_chars. 256 vs 361.
```
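One reading of the numbers in that log, offered as a sketch rather than a confirmed explanation: the trainer appears to need one piece per alphabet character (102), plus a few special tokens, plus 256 byte-fallback pieces, so a request for 256 falls short of 361. The helper name `required_pieces`, the count of 3 special tokens, and the assumption that byte fallback is enabled are not stated in this README; they are chosen because they reproduce both 361 and the 105-token vocabulary.

```python
# Hypothetical helper: estimates the minimum vocabulary size a tokenizer
# trainer would accept, under the assumptions stated above.
def required_pieces(alphabet_size, n_special=3, byte_fallback=True):
    # Assumption: byte fallback reserves one piece per possible byte value.
    byte_pieces = 256 if byte_fallback else 0
    # Assumption: n_special covers tokens such as unknown, begin and end of sequence.
    return alphabet_size + n_special + byte_pieces


if __name__ == "__main__":
    alphabet = 102  # "Alphabet size=102" from the log above

    # With byte fallback: matches the "256 vs 361" message.
    print(required_pieces(alphabet))

    # Without byte fallback: matches the 105-token vocabulary.
    print(required_pieces(alphabet, byte_fallback=False))
```

Under these assumptions, 105 is the smallest vocabulary that covers every character in the dataset plus the special tokens, which would explain the choice of 105 over a rounder number such as 256.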

### [More Tokens & Larger Models](https://github.com/EN10/BabyLlama/blob/main/Model-Sizes.md)

### Ref:

* [Training](https://github.com/karpathy/llama2.c#training)

* [Models](https://github.com/karpathy/llama2.c#models)

* [Pretokenized TinyStories](https://huggingface.co/datasets/enio/TinyStories)
