Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/All-less/mxnet-speculative-synchronization
A new parallel scheme implemented on MXNet.
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/All-less/mxnet-speculative-synchronization
- Owner: All-less
- License: mit
- Created: 2017-12-09T06:24:02.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2019-11-19T15:00:30.000Z (about 5 years ago)
- Last Synced: 2024-08-01T22:42:22.273Z (5 months ago)
- Language: C++
- Size: 172 KB
- Stars: 6
- Watchers: 4
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- Awesome-MXNet - speculative-synchronization
README
# mxnet-speculative-synchronization
A new parallel scheme implemented on MXNet.
## Prerequisites
- Python 3.5+ (for the instrumentation script)
- Python 2.7+ (for starting MXNet)
- make
- gcc
## Installation
Run the following commands to get the source code.
```bash
git clone --recursive https://github.com/All-less/mxnet-speculative-synchronization.git
cd mxnet-speculative-synchronization
```
Roll back MXNet to commit `7fcaf15a`.
```bash
cd mxnet
git checkout 7fcaf15a3a597cc72a342d1bdb00273dec00e78c
git submodule update --recursive
```
Our implementation is based on MXNet, so we need to insert some instrumentation into the MXNet sources. We will elaborate on the `extra-dir` option in the [next section](#Run).
```bash
python instrument_source.py --extra-dir
```
After instrumenting, follow the instructions [here](scripts/install-mxnet.sh) to build MXNet.
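Because the instrumentation targets one specific MXNet revision, it may be worth confirming the checkout before building. The sketch below is not part of the repository: `sha_matches` and `check_pinned_commit` are hypothetical helper names, and the example call assumes the MXNet submodule lives in a directory named `mxnet` under the repository root.

```shell
# Hypothetical helpers (not shipped with this repository).

# Succeeds when the full SHA in $1 starts with the (possibly abbreviated) SHA in $2.
sha_matches() {
  case "$1" in
    "$2"*) [ -n "$2" ] ;;   # an empty expected SHA never counts as a match
    *) return 1 ;;
  esac
}

# Compares HEAD of the git repository in directory $1 against the SHA in $2.
check_pinned_commit() {
  actual=$(git -C "$1" rev-parse HEAD) || return 2
  if sha_matches "$actual" "$2"; then
    echo "OK: $1 is at $actual"
  else
    echo "MISMATCH: $1 is at $actual, expected $2" >&2
    return 1
  fi
}

# Example call, assuming the current directory is the repository root:
# check_pinned_commit mxnet 7fcaf15a
```

A prefix comparison is used so that either the abbreviated or the full commit hash can be passed in.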
## Get Started
The training process is the same as in the [original tutorial](http://34.201.8.176/versions/0.11.0/tutorials/vision/large_scale_classification.html), except that you need to set some environment variables to activate speculative synchronization. We provide two different modes of synchronization.
### Fixed Waiting
In *Fixed Waiting* mode, you need to specify how long each worker will wait and how many fresh updates are needed to trigger synchronization.
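As a rough illustration of the waiting time, the sketch below multiplies a batch time by a wait ratio. It assumes that "batch time" means the wall-clock time a worker spends on one mini-batch, which this README does not define. `wait_window` is a hypothetical helper and the 2.0-second batch time is an invented figure.

```shell
# Hypothetical helper (not part of the repository): prints the waiting time
# in seconds for a given batch time ($1, seconds) and wait ratio ($2).
wait_window() {
  awk -v batch="$1" -v ratio="$2" 'BEGIN { printf "%.3f\n", batch * ratio }'
}

# With an assumed batch time of 2.0 s and a ratio of 0.10:
wait_window 2.0 0.10   # prints 0.200
```

Under that assumption, a ratio of 0.10 gives each worker a 0.2-second window per 2-second batch in which fresh updates can arrive.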
```bash
export MXNET_ENABLE_CANCEL=1 # enable speculative synchronization
export MXNET_WAIT_RATIO=0.10 # wait 10% of batch time
export MXNET_CANCEL_THRESHOLD=5 # synchronize when getting more than 5 fresh updates.
```
### Freshness Tuning
In *Freshness Tuning* mode, you only need to turn on the switch.
```bash
export MXNET_ENABLE_CANCEL=1
```
## Caveats
1. Only CPU training is supported.