https://github.com/siboehm/shallowspeed
Small-scale distributed training of sequential deep learning models, built on NumPy and MPI.
- Host: GitHub
- URL: https://github.com/siboehm/shallowspeed
- Owner: siboehm
- Created: 2022-08-31T11:09:17.000Z (almost 4 years ago)
- Default Branch: master
- Last Pushed: 2023-10-19T22:47:31.000Z (over 2 years ago)
- Last Synced: 2025-06-21T02:57:54.809Z (about 1 year ago)
- Topics: deep-learning, distributed-computing, pipelines
- Language: Python
- Homepage: https://siboehm.com/articles/22/pipeline-parallel-training
- Size: 21.2 MB
- Stars: 134
- Watchers: 3
- Forks: 6
- Open Issues: 1
Metadata Files:
- Readme: README.md
# Shallowspeed

A tiny proof-of-concept (POC) implementation of distributed training for sequential deep learning models.
Implemented using plain NumPy & mpi4py.

Currently implements:
- Sequential models / deep MLPs, training using SGD.
- Data parallel training with interleaved communication & computation, similar to PyTorch's [DistributedDataParallel](https://arxiv.org/abs/2006.15704).
- Pipeline parallel training:
  - Naive schedule without interleaved stages.
  - [GPipe](https://arxiv.org/abs/1811.06965) schedule with interleaved FWD & interleaved BWD.
  - (soon) [PipeDream Flush](https://arxiv.org/abs/2006.09503) schedule with additional inter-FWD & BWD interleaving.
- Any combination of DP & PP algorithms.
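
To make the difference between the naive and GPipe schedules listed above more concrete, here is a minimal sketch of the order in which a single pipeline stage might process its microbatches. It is not taken from the shallowspeed code: all function names are hypothetical. The naive variant assumes each microbatch finishes its forward and backward pass before the next one starts, and the GPipe variant assumes backward passes run in reverse microbatch order. The idle-time estimates further assume that every step costs the same on every stage.

```python
# Illustrative sketch only: these helpers are hypothetical and are not
# part of shallowspeed's API.

def naive_schedule(num_microbatches):
    """Per-stage step order, assuming each microbatch completes its
    forward and backward pass before the next one starts."""
    steps = []
    for mb in range(num_microbatches):
        steps += [("fwd", mb), ("bwd", mb)]
    return steps


def gpipe_schedule(num_microbatches):
    """Per-stage step order for a GPipe-style schedule: all forward
    passes first, then all backward passes (reverse order assumed)."""
    fwd = [("fwd", mb) for mb in range(num_microbatches)]
    bwd = [("bwd", mb) for mb in reversed(range(num_microbatches))]
    return fwd + bwd


def naive_idle_fraction(num_stages):
    """If only one stage works at a time, each stage idles (p - 1) / p."""
    return (num_stages - 1) / num_stages


def gpipe_idle_fraction(num_stages, num_microbatches):
    """Each phase spans m + p - 1 time slots; a stage is busy for m of them."""
    return (num_stages - 1) / (num_microbatches + num_stages - 1)


if __name__ == "__main__":
    print(gpipe_schedule(3))
    print(naive_idle_fraction(4), gpipe_idle_fraction(4, 8))
```

Under these assumptions, adding more microbatches shrinks the idle fraction of the GPipe-style schedule, while the naive schedule's idle fraction depends only on the number of stages.
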
## Setup
```bash
conda env create
pip install -e .
# M1 Macs: conda install "libblas=*=*accelerate"
python download_dataset.py
pytest
```
## Usage
```bash
# Sequential training
python train.py
# Data parallel distributed training
mpirun -n 4 python train.py --dp 4
# Pipeline parallel distributed training
mpirun -n 4 python train.py --pp 4 --schedule naive
# Data & pipeline parallel distributed training
mpirun -n 8 python train.py --dp 2 --pp 4 --schedule gpipe
```
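
In the commands above, the process count passed to `mpirun -n` matches the product of the parallelism degrees (for example, `-n 8` with `--dp 2 --pp 4`). One way to picture this is as a grid of ranks. The sketch below is an assumption about how such a grid could be laid out, not a description of how shallowspeed assigns ranks: the row-major ordering and the names `rank_to_coords`, `pipeline_groups`, and `data_parallel_groups` are all hypothetical.

```python
# Illustrative sketch only: the row-major layout and all names below are
# assumptions, not shallowspeed's actual rank assignment.

def rank_to_coords(rank, dp, pp):
    """Map a global rank to (dp_replica, pp_stage) in a dp x pp grid."""
    if not 0 <= rank < dp * pp:
        raise ValueError(f"rank {rank} is outside a {dp} x {pp} grid")
    return divmod(rank, pp)


def pipeline_groups(dp, pp):
    """Ranks that together hold one full copy of the model, stage by stage."""
    return [[replica * pp + stage for stage in range(pp)] for replica in range(dp)]


def data_parallel_groups(dp, pp):
    """Ranks that hold the same stage and would average gradients together."""
    return [[replica * pp + stage for replica in range(dp)] for stage in range(pp)]


if __name__ == "__main__":
    dp, pp = 2, 4
    for rank in range(dp * pp):
        print(rank, rank_to_coords(rank, dp, pp))
    print(pipeline_groups(dp, pp))
    print(data_parallel_groups(dp, pp))
```

With mpi4py, sub-communicators for groups like these are commonly created with `MPI.COMM_WORLD.Split(color, key)`, using the replica index or the stage index as the color.
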
## Internals
