https://github.com/zuston/horovod-yarn
Horovod on Yarn test case
- Host: GitHub
- URL: https://github.com/zuston/horovod-yarn
- Owner: zuston
- Created: 2021-04-04T02:49:18.000Z (almost 4 years ago)
- Default Branch: master
- Last Pushed: 2021-04-10T13:17:28.000Z (almost 4 years ago)
- Last Synced: 2024-10-17T05:55:56.621Z (3 months ago)
- Language: Python
- Size: 12.7 KB
- Stars: 2
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
Linked TonY (TensorFlow/PyTorch on Yarn) PR: https://github.com/linkedin/TonY/pull/524
## Horovod on Yarn
This repository aims to provide a local test program for Horovod on Yarn that can run multi-process tests on a single machine. It works in two steps:

1. The driver starts the rendezvous server.
2. Horovod environment variables are injected before each training worker starts.

__Driver__
```
python3 driver.py -w localhost:2
```

__Task__

```
python3 tensorflow2_minist.py --port=34824 --rank=0 --size=2 --local_rank=0 --local_size=2 --cross_rank=0 --cross_size=1
```

```
python3 tensorflow2_minist.py --port=34824 --rank=1 --size=2 --local_rank=1 --local_size=2 --cross_rank=0 --cross_size=1
```
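
The rank flags in the task commands above follow from the `host:slots` value passed to the driver with `-w`. The sketch below is not code from this repository; it only illustrates one way such values could be derived and turned into task commands and environment variables. The helper names (`parse_hosts`, `assign_slots`, `task_command`, `horovod_env`) are hypothetical. The `HOROVOD_*` variable names are the ones Horovod's Gloo launcher is understood to use, but how this repository's scripts map flags to variables is an assumption, as is treating `34824` as the rendezvous server port.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """Placement of one training worker (hypothetical helper type)."""
    hostname: str
    rank: int
    size: int
    local_rank: int
    local_size: int
    cross_rank: int
    cross_size: int


def parse_hosts(spec):
    """Parse 'host1:2,host2:2' into [('host1', 2), ('host2', 2)]."""
    hosts = []
    for item in spec.split(","):
        name, _, slots = item.strip().partition(":")
        hosts.append((name, int(slots)))
    return hosts


def assign_slots(hosts):
    """Assign global, local, and cross ranks to every slot, host by host."""
    size = sum(count for _, count in hosts)
    slots = []
    rank = 0
    for host_idx, (name, local_size) in enumerate(hosts):
        for local_rank in range(local_size):
            # Hosts that also have a worker with this local rank.
            peers = [i for i, (_, n) in enumerate(hosts) if n > local_rank]
            slots.append(Slot(name, rank, size, local_rank, local_size,
                              peers.index(host_idx), len(peers)))
            rank += 1
    return slots


def task_command(slot, port, script="tensorflow2_minist.py"):
    """Build the argument list for one task, using the flags shown above."""
    return ["python3", script, f"--port={port}",
            f"--rank={slot.rank}", f"--size={slot.size}",
            f"--local_rank={slot.local_rank}",
            f"--local_size={slot.local_size}",
            f"--cross_rank={slot.cross_rank}",
            f"--cross_size={slot.cross_size}"]


def horovod_env(slot, addr, port):
    """Environment variables a worker might receive (assumed mapping)."""
    return {
        "HOROVOD_CONTROLLER": "gloo",
        "HOROVOD_CPU_OPERATIONS": "gloo",
        "HOROVOD_GLOO_RENDEZVOUS_ADDR": addr,
        "HOROVOD_GLOO_RENDEZVOUS_PORT": str(port),
        "HOROVOD_HOSTNAME": slot.hostname,
        "HOROVOD_RANK": str(slot.rank),
        "HOROVOD_SIZE": str(slot.size),
        "HOROVOD_LOCAL_RANK": str(slot.local_rank),
        "HOROVOD_LOCAL_SIZE": str(slot.local_size),
        "HOROVOD_CROSS_RANK": str(slot.cross_rank),
        "HOROVOD_CROSS_SIZE": str(slot.cross_size),
    }


# Print one command line per worker for the single-host, two-slot case.
for slot in assign_slots(parse_hosts("localhost:2")):
    print(" ".join(task_command(slot, port=34824)))
```

For `localhost:2` this prints the two task commands listed above. With several hosts, `cross_rank` is the position of a worker's host among the hosts that have a worker with the same local rank.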