awesome-Auto-Parallelism
A baseline repository of Auto-Parallelism in Training Neural Networks
https://github.com/MachineLearningSystem/awesome-Auto-Parallelism
Pipeline Parallelism or Inter-layer Model Parallelism only:
Data Parallelism + Pipeline Parallelism (or Inter-layer Model Parallelism):
- PipeDream - 2BW | Microsoft Fiddle | [arxiv](https://arxiv.org/pdf/1806.03377.pdf) | PyTorch | 2018 on arxiv, SOSP 2019 | Dynamic programming with profiling
- DNN-partitioning - of-concept implementation | NeurIPS 2020 | Dynamic programming and integer programming
- DAPPLE
- PipeTransformer
- Chimera - scale neural networks with bidirectional pipelines | Department of Computer Science, ETH Zurich, Switzerland | [dl.acm](https://dl.acm.org/doi/pdf/10.1145/3458817.3476145) | PyTorch | SC 2021 | Performance model with brute force
- FTPipe - GPU one. | Technion - Israel Institute of Technology | [usenix](https://usenix.org/system/files/atc21-eliad.pdf) | PyTorch | 2021 | Multiprocessor scheduling problem with profiling
- mlr.press
- nips
- arxiv
- usenix
- RaNNC - scale neural networks. | DIRECT and University of Tokyo | [arxiv](http://arxiv.org/abs/2103.16063) | PyTorch | IPDPS 2021 | dynamic programming
- HeterPS
- REGAL
- arxiv
Data Parallelism + Intra-layer Model Parallelism (or Tensor Parallelism):
- AccPar
- ROC - Paper.pdf) | On top of FlexFlow | MLSys 2020 | Uses a novel online linear regression model to achieve efficient graph partitioning, and introduces a dynamic programming algorithm to minimize data transfer cost
- PaSE - research/PaSE/raw/master/docs/PaSE_ipdps2021.pdf) | prototype | IPDPS 2021 | Dynamic Programming
- TensorOpt - Parallelism | CUHK & Huawei | [arxiv](https://arxiv.org/pdf/2004.10856.pdf) | MindSpore | 2020 on arxiv | Dynamic-programming-based graph search algorithm
- Double Recursive - 3-030-85665-6_13) | MindSpore | Euro-Par 2021 | Double Recursive
- arxiv
- arxiv
- FlexFlow
- arxiv
- arxiv
Data Parallelism + Model Parallelism (or Tensor Parallelism) + Pipeline Parallelism:
- Piper - of-concept implementation) and input files (profiled DNN models / workloads) from the paper "Piper: Multidimensional Planner for DNN Parallelization" published at NeurIPS 2021. An extension of DNN partitioning | Microsoft Fiddle | [link](https://www.microsoft.com/en-us/research/publication/piper-multidimensional-planner-for-dnn-parallelization/) | proof-of-concept implementation | NeurIPS 2021 | Two-level dynamic programming
- DistIR - Search Simulator
- Alpa - and Intra-Operator Parallelism for Distributed Deep Learning | UC Berkeley, Google, etc. | [arxiv](https://arxiv.org/pdf/2201.12023.pdf) | JAX, XLA | 2022 | Integer linear programming for intra-operator parallelism, dynamic programming for inter-operator parallelism
- arxiv
- GSPMD
- arxiv - Karp Algorithm
- arxiv - given.
- arxiv
- arxiv - given.
Other Interesting automatic work