https://github.com/hundredblocks/large-model-parallelism
Functional local implementations of main model parallelism approaches
Last synced: about 20 hours ago
- Host: GitHub
- URL: https://github.com/hundredblocks/large-model-parallelism
- Owner: hundredblocks
- License: gpl-3.0
- Created: 2023-02-21T18:15:17.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2023-02-21T18:44:13.000Z (about 2 years ago)
- Last Synced: 2025-03-28T11:51:07.905Z (18 days ago)
- Language: Jupyter Notebook
- Size: 823 KB
- Stars: 95
- Watchers: 11
- Forks: 5
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-ChatGPT-repositories - large-model-parallelism - Functional local implementations of main model parallelism approaches (Reimplementations)
README

# Model parallelism 101
Learn, in fewer than 300 lines of code, how model parallelism enables training models like Stable Diffusion and ChatGPT. This [notebook](https://github.com/hundredblocks/large-model-parallelism/blob/main/large-model-parallelism.ipynb) provides practical local implementations of the main model parallelism methods. It explores three approaches (data parallelism, tensor parallelism, and pipeline parallelism) using a 2-layer MLP example that extends naturally to more complex models.
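To give a rough feel for the three approaches before opening the notebook, here is a minimal pure-Python sketch of the forward pass of a 2-layer MLP under each scheme. It is not taken from the notebook: the function names (`data_parallel`, `tensor_parallel`, `pipeline_parallel`) are hypothetical, biases are omitted, only the forward pass is shown, and the "workers" are simulated sequentially on one machine rather than running on separate devices.

```python
# Illustrative sketch only: "workers" are simulated by slicing lists,
# and all names below are assumptions, not the notebook's API.

def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def relu(m):
    """Elementwise ReLU on a matrix."""
    return [[max(0.0, v) for v in row] for row in m]

def mlp(x, w1, w2):
    """Reference 2-layer MLP without biases: relu(x @ w1) @ w2."""
    return matmul(relu(matmul(x, w1)), w2)

def data_parallel(x, w1, w2, n_workers=2):
    """Every worker keeps a full copy of the weights and processes a slice of the batch."""
    shard = (len(x) + n_workers - 1) // n_workers
    out = []
    for i in range(0, len(x), shard):
        out.extend(mlp(x[i:i + shard], w1, w2))  # one worker's share of the batch
    return out

def tensor_parallel(x, w1, w2, n_workers=2):
    """Every worker keeps a column slice of w1 and the matching row slice of w2."""
    hidden = len(w2)
    shard = (hidden + n_workers - 1) // n_workers
    partials = []
    for i in range(0, hidden, shard):
        w1_cols = [row[i:i + shard] for row in w1]  # slice of the hidden units
        w2_rows = w2[i:i + shard]                   # rows that consume those units
        partials.append(mlp(x, w1_cols, w2_rows))
    # Summing the partial outputs stands in for an all-reduce across workers.
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*partials)]

def pipeline_parallel(x, w1, w2, n_microbatches=2):
    """Stage 1 owns layer 1, stage 2 owns layer 2; micro-batches pass through in order."""
    size = (len(x) + n_microbatches - 1) // n_microbatches
    out = []
    for i in range(0, len(x), size):
        hidden = relu(matmul(x[i:i + size], w1))  # stage 1
        out.extend(matmul(hidden, w2))            # stage 2
    return out

# Small integer-valued inputs so every scheme gives exactly the same result.
x = [[1.0, 2.0], [3.0, -1.0], [0.0, 1.0], [-2.0, 2.0]]  # batch of 4, 2 features
w1 = [[1.0, -1.0, 2.0, 0.0], [0.0, 1.0, -1.0, 1.0]]     # 2 x 4
w2 = [[1.0, 0.0], [2.0, 1.0], [0.0, -1.0], [1.0, 1.0]]  # 4 x 2

reference = mlp(x, w1, w2)
print(data_parallel(x, w1, w2) == reference)
print(tensor_parallel(x, w1, w2) == reference)
print(pipeline_parallel(x, w1, w2) == reference)
```

All three variants reproduce the single-device output; they differ only in what is split: the batch (data parallelism), the weights inside each layer (tensor parallelism), or the layers themselves (pipeline parallelism). In a real setup each slice would live on its own device, and the gains come from running those pieces concurrently.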
Reading this notebook will give you a solid overview of model parallelism techniques and an intuition for how to implement them.
Pull requests welcome. Illustration above generated with Lexica's Aperture model.