Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/srush/llm-training-puzzles
What would you do with 1000 H100s...
Last synced: about 7 hours ago
- Host: GitHub
- URL: https://github.com/srush/llm-training-puzzles
- Owner: srush
- License: mit
- Created: 2023-06-26T02:15:48.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-01-10T17:11:43.000Z (10 months ago)
- Last Synced: 2024-11-12T10:46:43.633Z (about 18 hours ago)
- Topics: llm, puzzles
- Language: Jupyter Notebook
- Homepage:
- Size: 1.59 MB
- Stars: 896
- Watchers: 12
- Forks: 53
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# LLM Training Puzzles
- by [Sasha Rush](http://rush-nlp.com) - [srush_nlp](https://twitter.com/srush_nlp)

![image](https://github.com/srush/LLM-Training-Puzzles/assets/35882/0c46911f-ad1c-4e7a-a42b-2bc2537cccc3)

This is a collection of 8 challenging puzzles about training large language models (or really any NN) on many, many GPUs.
Very few people actually get a chance to train on thousands of computers, but it is an interesting challenge and one that
is critically important for modern AI. The goal of these puzzles is to get hands-on experience with the key primitives and to understand
the goals of memory efficiency and compute pipelining.

I recommend running in Colab. Click the badge below and copy the notebook to get started.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/srush/LLM-Training-Puzzles/blob/main/puzzles.ipynb)
![image](https://github.com/srush/LLM-Training-Puzzles/assets/35882/6d16fc9e-3d14-4bd0-b7c7-d056e49854ac)
If you are into this kind of thing, this is the 6th in a series of these puzzles.
* https://github.com/srush/gpu-puzzles
* https://github.com/srush/tensor-puzzles
* https://github.com/srush/autodiff-puzzles
* https://github.com/srush/transformer-puzzles
* https://github.com/srush/GPTworld