https://github.com/en10/babyllama

Train and run a small Llama 2 model from scratch on the TinyStories dataset.
gpt llama2 llm tinystories

# Baby Llama

Train and run a small [Llama 2](https://ai.meta.com/llama/) model from scratch on the [TinyStories](https://huggingface.co/datasets/roneneldan/TinyStories) dataset.
* [Based on karpathy/llama2.c](https://github.com/karpathy/llama2.c)
* [Based on eniompw/DisneyGPT](https://github.com/eniompw/DisneyGPT)

## Baby Llama Code Example:

### [Baby Llama 105 Tokens on Colab](https://github.com/EN10/BabyLlama/blob/main/Baby_Llama_105.ipynb)
* [Iters vs Val Loss](https://github.com/EN10/BabyLlama/blob/main/tok105/iters-vs-val-loss.md): how the model learns words and grammar, visualised
* [105 Token Vocab](https://github.com/EN10/BabyLlama/blob/main/tok105/tok105.vocab)

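Training a 256-token vocabulary with `tinystories.py` fails: SentencePiece finds an alphabet of 102 characters and reports that the requested vocabulary size is smaller than the required character count.

```
!cd llama2.c && python tinystories.py train_vocab --vocab_size=256
trainer_interface.cc(558) LOG(INFO) Alphabet size=102
Vocabulary size is smaller than required_chars. 256 vs 361.
```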
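
The numbers in this log can be reconciled with a little arithmetic. The sketch below assumes, and the log does not confirm, that the required count is the alphabet plus 256 byte-fallback tokens plus 3 special tokens (unknown, beginning-of-sequence, end-of-sequence). The constant names and the `min_vocab_size` helper are hypothetical and exist only for illustration.

```python
# Values copied from the log output above.
ALPHABET_SIZE = 102    # "Alphabet size=102"
REQUESTED_VOCAB = 256  # --vocab_size=256

# Assumptions, not stated in the log: byte fallback reserves one token per
# byte value, and three special tokens (<unk>, <s>, </s>) are reserved.
BYTE_FALLBACK_TOKENS = 256
SPECIAL_TOKENS = 3


def min_vocab_size(alphabet_size: int, byte_fallback: bool) -> int:
    """Smallest vocabulary that holds every required token (hypothetical helper)."""
    extra = BYTE_FALLBACK_TOKENS if byte_fallback else 0
    return alphabet_size + extra + SPECIAL_TOKENS


with_fallback = min_vocab_size(ALPHABET_SIZE, byte_fallback=True)
without_fallback = min_vocab_size(ALPHABET_SIZE, byte_fallback=False)

print(with_fallback)                     # 361, the figure reported in the log
print(without_fallback)                  # 105, the size of the small vocabulary
print(REQUESTED_VOCAB >= with_fallback)  # False, so a 256-token request is rejected
```

Under these assumptions, the 105-token vocabulary would correspond to the 102-character alphabet plus the special tokens, with byte fallback switched off.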

### [More Tokens & Larger Models](https://github.com/EN10/BabyLlama/blob/main/Model-Sizes.md)

### Ref:
* [Training](https://github.com/karpathy/llama2.c#training)
* [Models](https://github.com/karpathy/llama2.c#models)
* [Pretokenized TinyStories](https://huggingface.co/datasets/enio/TinyStories)