{"id":13687564,"url":"https://github.com/srush/LLM-Training-Puzzles","last_synced_at":"2025-05-01T12:34:34.357Z","repository":{"id":177349279,"uuid":"658536279","full_name":"srush/LLM-Training-Puzzles","owner":"srush","description":"What would you do with 1000 H100s...","archived":false,"fork":false,"pushed_at":"2024-01-10T17:11:43.000Z","size":1663,"stargazers_count":890,"open_issues_count":1,"forks_count":53,"subscribers_count":11,"default_branch":"main","last_synced_at":"2024-10-30T02:37:21.484Z","etag":null,"topics":["llm","puzzles"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/srush.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-06-26T02:15:48.000Z","updated_at":"2024-10-28T04:33:21.000Z","dependencies_parsed_at":"2024-10-30T01:24:43.472Z","dependency_job_id":null,"html_url":"https://github.com/srush/LLM-Training-Puzzles","commit_stats":null,"previous_names":["srush/llm-training-puzzles"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srush%2FLLM-Training-Puzzles","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srush%2FLLM-Training-Puzzles/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srush%2FLLM-Training-Puzzles/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srush%2FLLM-Training-Puzzles/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/srush
","download_url":"https://codeload.github.com/srush/LLM-Training-Puzzles/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224257797,"owners_count":17281774,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["llm","puzzles"],"created_at":"2024-08-02T15:00:56.654Z","updated_at":"2025-05-01T12:34:34.350Z","avatar_url":"https://github.com/srush.png","language":"Jupyter Notebook","funding_links":[],"categories":["Jupyter Notebook","**Miscellanea**"],"sub_categories":["**Linear Attention / State Space Models / Recurrent Language Models / etc.**"],"readme":"# LLM Training Puzzles\n- by [Sasha Rush](http://rush-nlp.com) - [srush_nlp](https://twitter.com/srush_nlp) \n\n![image](https://github.com/srush/LLM-Training-Puzzles/assets/35882/0c46911f-ad1c-4e7a-a42b-2bc2537cccc3)\n\n\nThis is a collection of 8 challenging puzzles about training large language models (or really any NN) on many, many GPUs. \nVery few people actually get a chance to train on thousands of computers, but it is an interesting challenge and one that \nis critically important for modern AI. The goal of these puzzles is to get hands-on experience with the key primitives and to understand \nthe goals of memory efficiency and compute pipelining. \n\n\nI recommend running in Colab. 
Click here and copy the notebook to get started.\n\n[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/srush/LLM-Training-Puzzles/blob/main/puzzles.ipynb)\n\n![image](https://github.com/srush/LLM-Training-Puzzles/assets/35882/6d16fc9e-3d14-4bd0-b7c7-d056e49854ac)\n\n\n\nIf you are into this kind of thing, this is the 6th in a series of these puzzles.\n\n* https://github.com/srush/gpu-puzzles\n* https://github.com/srush/tensor-puzzles\n* https://github.com/srush/autodiff-puzzles\n* https://github.com/srush/transformer-puzzles\n* https://github.com/srush/GPTworld\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsrush%2FLLM-Training-Puzzles","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsrush%2FLLM-Training-Puzzles","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsrush%2FLLM-Training-Puzzles/lists"}