https://github.com/shreyansh26/llm-training-puzzles-solutions

The LLM Training Puzzles by Sasha Rush

Host: GitHub
URL: https://github.com/shreyansh26/llm-training-puzzles-solutions
Owner: shreyansh26
License: mit
Created: 2023-07-08T15:54:29.000Z (over 2 years ago)
Default Branch: master
Last Pushed: 2023-07-08T15:55:21.000Z (over 2 years ago)
Last Synced: 2025-03-03T15:47:59.662Z (8 months ago)
Language: Jupyter Notebook
Homepage:
Size: 295 KB
Stars: 1
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# LLM Training Puzzles
- by [Sasha Rush](http://rush-nlp.com) - [srush_nlp](https://twitter.com/srush_nlp)

![image](https://github.com/srush/LLM-Training-Puzzles/assets/35882/0c46911f-ad1c-4e7a-a42b-2bc2537cccc3)

This is a collection of 8 challenging puzzles about training large language models (or really any NN) on many, many GPUs.
Very few people actually get a chance to train on thousands of computers, but it is an interesting challenge and one that
is critically important for modern AI. The goal of these puzzles is to get hands-on experience with the key primitives and to understand
the goals of memory efficiency and compute pipelining.

I recommend running in Colab. Click here and copy the notebook to get started.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/srush/LLM-Training-Puzzles/blob/main/puzzles.ipynb)

![image](https://github.com/srush/LLM-Training-Puzzles/assets/35882/6d16fc9e-3d14-4bd0-b7c7-d056e49854ac)

If you are into this kind of thing, this is the 6th in a series of these puzzles.

* https://github.com/srush/gpu-puzzles
* https://github.com/srush/tensor-puzzles
* https://github.com/srush/autodiff-puzzles
* https://github.com/srush/transformer-puzzles
* https://github.com/srush/GPTworld
