https://github.com/mitmul/chainermn-on-azure
- Host: GitHub
- URL: https://github.com/mitmul/chainermn-on-azure
- Owner: mitmul
- Created: 2017-07-20T08:20:17.000Z (almost 9 years ago)
- Default Branch: aug2018
- Last Pushed: 2018-10-03T20:50:17.000Z (over 7 years ago)
- Last Synced: 2025-03-24T11:08:00.291Z (about 1 year ago)
- Language: Shell
- Size: 27.1 MB
- Stars: 2
- Watchers: 1
- Forks: 6
- Open Issues: 0
# ChainerMN on Azure
## Deploy from ARM Template
Note that deploying from the ARM template takes a long time.
### 1. Install Azure CLI and azure package
```
$ pip install azure-cli
$ pip install azure
```
### 2. Login to Azure using Azure CLI
```
$ az login
```
### 3. Select a subscription
```
$ az account list --all
```
Pick the subscription ID you want to use from the list above.
```
$ az account set --subscription [YOUR SUBSCRIPTION ID]
```
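To confirm that the intended subscription is active before deploying, a check along these lines may help. This is only a sketch: `confirm_subscription` is a hypothetical helper that is not part of this repository, and it relies on nothing beyond `az account show`.

```shell
# Hypothetical helper (not part of this repository).
# Succeeds only if the active subscription ID matches the one passed in.
confirm_subscription() {
    local expected="$1"
    local active
    # `az account show` reports the subscription that later commands will use.
    active="$(az account show --query id -o tsv)" || return 1
    if [ "$active" != "$expected" ]; then
        echo "Active subscription is $active, expected $expected" >&2
        return 1
    fi
    echo "Using subscription $active"
}
```

Call it as `confirm_subscription [YOUR SUBSCRIPTION ID]` right after `az account set`.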
### 4. Deploy
```
$ ./deploy.py \
--resource-group chainermn \
--location westus2 \
--public-key-file ~/.ssh/id_rsa.pub
```
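Because the deployment runs for a long time, it may be worth checking the key file before starting. The following is a minimal sketch, assuming the key is an RSA key as the file name `id_rsa.pub` suggests. `check_public_key` is a hypothetical helper, not something `deploy.py` provides.

```shell
# Hypothetical pre-flight check (not part of this repository).
# Returns 0 if the file is readable and looks like an OpenSSH RSA public key.
check_public_key() {
    local key_file="$1"
    if [ ! -r "$key_file" ]; then
        echo "Cannot read public key file: $key_file" >&2
        return 1
    fi
    # An OpenSSH RSA public key is a single line starting with "ssh-rsa ".
    if ! grep -q '^ssh-rsa ' "$key_file"; then
        echo "$key_file does not look like an RSA public key" >&2
        return 1
    fi
}
```

For example, run `check_public_key ~/.ssh/id_rsa.pub` and start the deployment only if it succeeds.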
## Create images (Faster)
### 1. Create jumpbox
```
python deploy.py \
-k ~/.ssh/id_rsa.pub \
-g chainermn-images \
-s chainermnscripts \
--jumpbox-only
```
### 2. Create VMSS image
```
az vm create \
-n vmss-image \
-g chainermn-images \
--image Canonical:UbuntuServer:16.04-LTS:latest \
-l eastus \
--size Standard_NC24r \
--admin-username ubuntu \
--authentication-type ssh \
--ssh-key-value $HOME/.ssh/id_rsa.pub
```
Log in to the VM and run `scripts/setup_vmss_cuda92.sh`.
Reboot the VM once, then run:
```
sudo waagent -deprovision+user -force
```
Log out and run these commands on your local machine:
```
az vm deallocate --resource-group chainermn-images --name vmss-image && \
az vm generalize --resource-group chainermn-images --name vmss-image && \
az image create --resource-group chainermn-images --name vmss-image-cuda92 --source vmss-image && \
python utils.py -g chainermn-images delete-vm vmss-image
```
### 3. Create jumpbox image
Log in to the jumpbox server and run:
```
sudo waagent -deprovision+user -force
```
Log out, then run these commands on your local machine:
```
az vm deallocate --resource-group chainermn-images --name jumpbox && \
az vm generalize --resource-group chainermn-images --name jumpbox && \
az image create --resource-group chainermn-images --name jumpbox-image --source jumpbox && \
python utils.py -g chainermn-images delete-vm jumpbox
```
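The VMSS image and the jumpbox image are captured with the same three `az` steps, so they could be wrapped in one function. This is a sketch only: `capture_image` is a hypothetical name, and it leaves out the final `utils.py delete-vm` call, which still has to be run separately.

```shell
# Hypothetical helper (not part of this repository).
# Usage: capture_image <resource-group> <vm-name> <image-name>
capture_image() {
    local group="$1" vm="$2" image="$3"
    # Stop the VM and release its compute resources.
    az vm deallocate --resource-group "$group" --name "$vm" || return 1
    # Mark the VM as generalized so that an image can be created from it.
    az vm generalize --resource-group "$group" --name "$vm" || return 1
    # Create a managed image from the generalized VM.
    az image create --resource-group "$group" --name "$image" --source "$vm"
}
```

With this in place, the two captures would read `capture_image chainermn-images vmss-image vmss-image-cuda92` and `capture_image chainermn-images jumpbox jumpbox-image`.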
## Deploy using images
First, create a resource group.
```
az group create -g chainermn-v100 -l eastus
```
### 1. Deploy jumpbox
```
image_id=$(az image show -g chainermn-images -n jumpbox-image --query "id" -o tsv) && \
az vm create \
--image ${image_id} \
--name jumpbox \
--resource-group chainermn-v100 \
--size Standard_DS3_v2 \
--admin-username ubuntu \
--ssh-key-value $HOME/.ssh/id_rsa.pub \
--vnet-name chainer-vnet
```
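Once the jumpbox is up, you will need its public IP address to connect. One way to look it up, sketched here with a hypothetical `vm_public_ip` helper, is `az vm show` with the `-d` (show details) flag:

```shell
# Hypothetical helper (not part of this repository).
# Prints the public IP address of a VM.
# -d asks `az vm show` to include instance details such as IP addresses.
vm_public_ip() {
    local group="$1" vm="$2"
    az vm show -d --resource-group "$group" --name "$vm" --query publicIps -o tsv
}
```

For example: `ssh ubuntu@"$(vm_public_ip chainermn-v100 jumpbox)"`.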
### 2. Deploy VMSS
```
image_id=$(az image show -g chainermn-images -n vmss-image-cuda92 --query "id" -o tsv) && \
az vmss create \
--image ${image_id} \
--vm-sku Standard_NC24rs_v3 \
--instance-count 64 \
--lb '' \
--name vmss \
--resource-group chainermn-v100 \
--admin-username ubuntu \
--public-ip-address '' \
--ssh-key-value $HOME/.ssh/id_rsa.pub \
--vnet-name chainer-vnet \
--subnet jumpboxSubnet
```
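With 64 instances requested, it can be useful to check how many have finished provisioning. The sketch below counts them with `az vmss list-instances`. `count_ready_instances` is a hypothetical helper, and it assumes a finished instance reports `Succeeded` as its `provisioningState`.

```shell
# Hypothetical helper (not part of this repository).
# Prints how many instances of a scale set report provisioningState "Succeeded".
count_ready_instances() {
    local group="$1" vmss="$2"
    # With -o tsv the query prints one state per line; grep -c counts the matches.
    # `|| true` keeps the exit status at 0 when the count is 0.
    az vmss list-instances --resource-group "$group" --name "$vmss" \
        --query "[].provisioningState" -o tsv | grep -c '^Succeeded$' || true
}
```

For example, `count_ready_instances chainermn-v100 vmss` should eventually print the value passed to `--instance-count`.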