https://github.com/mit-han-lab/once-for-all

[ICLR 2020] Once for All: Train One Network and Specialize it for Efficient Deployment
https://github.com/mit-han-lab/once-for-all

acceleration automl edge-ai efficient-model nas tinyml


# Once-for-All: Train One Network and Specialize it for Efficient Deployment [[arXiv]](https://arxiv.org/abs/1908.09791) [[Slides]](https://file.lzhu.me/projects/OnceForAll/OFA%20Slides.pdf) [[Video]](https://youtu.be/a_OeT8MXzWI)
```BibTex
@inproceedings{
cai2020once,
title={Once for All: Train One Network and Specialize it for Efficient Deployment},
author={Han Cai and Chuang Gan and Tianzhe Wang and Zhekai Zhang and Song Han},
booktitle={International Conference on Learning Representations},
year={2020},
url={https://arxiv.org/pdf/1908.09791.pdf}
}
```
**[News]** Once-for-All is available at [PyTorch Hub](https://pytorch.org/hub/pytorch_vision_once_for_all) now!

**[News]** Once-for-All (OFA) Network is adopted by [SONY Neural Architecture Search Library](https://github.com/sony/nnabla-nas).

**[News]** Once-for-All (OFA) Network is adopted by [ADI MAX78000/MAX78002 Model Training and Synthesis Tool](https://github.com/MaximIntegratedAI/ai8x-training).

**[News]** Once-for-All (OFA) Network is adopted by Alibaba and ranked 1st in the open division of the MLPerf Inference Benchmark ([Datacenter](https://mlcommons.org/en/inference-datacenter-10/) and [Edge](https://mlcommons.org/en/inference-edge-10/)).

**[News]** First place in the [CVPR 2020 Low-Power Computer Vision Challenge](https://lpcv.ai/2020CVPR/introduction), CPU detection and FPGA track.

**[News]** OFA-ResNet50 is released.

**[News]** The [hands-on tutorial](https://hangzhang.org/CVPR2020/) of OFA is released! [![](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mit-han-lab/once-for-all/blob/master/tutorial/ofa.ipynb)

**[News]** OFA is available via pip! Run **```pip install ofa```** to install the whole OFA codebase.

**[News]** First place in the 4th [Low-Power Computer Vision Challenge](https://lpcv.ai/competitions/2019), both classification and detection tracks.

**[News]** First place in the 3rd [Low-Power Computer Vision Challenge](https://lpcv.ai/competitions/2019), DSP track at ICCV’19 using the Once-for-All Network.

## Train once, specialize for many deployment scenarios
![](figures/overview.png)

## 80% top1 ImageNet accuracy under mobile setting
![](figures/cnn_imagenet_new.png)

![](figures/imagenet_80_acc.png)

## Consistently outperforms MobileNetV3 on diverse hardware platforms
![](figures/diverse_hardware.png)

## OFA-ResNet50 [[How to use]](https://github.com/mit-han-lab/once-for-all/blob/master/tutorial/ofa_resnet50_example.ipynb)

## How to use / evaluate **OFA Networks**
### Use
```python
""" OFA Networks.
Example: ofa_network = ofa_net('ofa_mbv3_d234_e346_k357_w1.0', pretrained=True)
"""
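from ofa.model_zoo import ofa_net

# Pick any net id from the "OFA Network" column of the table below
net_id = 'ofa_mbv3_d234_e346_k357_w1.0'
ofa_network = ofa_net(net_id, pretrained=True)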

# Randomly sample sub-networks from OFA network
ofa_network.sample_active_subnet()
random_subnet = ofa_network.get_active_subnet(preserve_weight=True)

# Manually set the sub-network
ofa_network.set_active_subnet(ks=7, e=6, d=4)
manual_subnet = ofa_network.get_active_subnet(preserve_weight=True)
```
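To make the sampling step more concrete, here is a minimal pure-Python sketch of what drawing a random sub-network configuration could look like. It is not the `ofa` library's implementation: the function name `sample_subnet_config`, the constant names, and the assumption of five stages with at most four block slots each are illustrative only. The candidate values mirror the `k357`, `e346` and `d234` parts of the net id.

```python
import random

# Candidate values, read off the net id 'ofa_mbv3_d234_e346_k357_w1.0'.
KERNEL_SIZES = [3, 5, 7]
EXPAND_RATIOS = [3, 4, 6]
DEPTHS = [2, 3, 4]

# Assumption for illustration only: 5 stages, each with max(DEPTHS) block slots.
NUM_STAGES = 5


def sample_subnet_config(rng):
    """Draw one random configuration: a depth per stage, plus a kernel
    size and an expand ratio for every block slot."""
    num_blocks = NUM_STAGES * max(DEPTHS)
    return {
        "ks": [rng.choice(KERNEL_SIZES) for _ in range(num_blocks)],
        "e": [rng.choice(EXPAND_RATIOS) for _ in range(num_blocks)],
        "d": [rng.choice(DEPTHS) for _ in range(NUM_STAGES)],
    }


# A seeded generator keeps the draw reproducible.
config = sample_subnet_config(random.Random(0))
print(config)
```

The manual setting shown above (`ks=7, e=6, d=4`) corresponds to picking the largest value for every entry of such a configuration.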

### Evaluate

`python eval_ofa_net.py --path 'Your path to imagenet' --net ofa_mbv3_d234_e346_k357_w1.0`

| OFA Network | Design Space | Resolution | Width Multiplier | Depth | Expand Ratio | Kernel Size |
|----------------------|:----------:|:----------:|:---------:|:------------:|:---------:|:------------:|
| ofa_resnet50 | ResNet50D | 128 - 224 | 0.65, 0.8, 1.0 | 0, 1, 2 | 0.2, 0.25, 0.35 | 3 |
| ofa_mbv3_d234_e346_k357_w1.0 | MobileNetV3 | 128 - 224 | 1.0 | 2, 3, 4 | 3, 4, 6 | 3, 5, 7 |
| ofa_mbv3_d234_e346_k357_w1.2 | MobileNetV3 | 160 - 224 | 1.2 | 2, 3, 4 | 3, 4, 6 | 3, 5, 7 |
| ofa_proxyless_d234_e346_k357_w1.3 | ProxylessNAS | 128 - 224 | 1.3 | 2, 3, 4 | 3, 4, 6 | 3, 5, 7 |
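As a rough illustration of how large these design spaces are, the sketch below counts architectural configurations for one row of the table. It assumes that every block chooses its kernel size and expand ratio independently, that each stage chooses its own depth, and that the network has five such stages. The stage count and the helper name `count_subnets` are assumptions for this example, not values taken from the table. Input resolution is ignored.

```python
def count_subnets(kernel_sizes, expand_ratios, depths, num_stages):
    """Count configurations when each block picks (kernel, expand ratio)
    independently and each stage picks its own depth."""
    per_block = len(kernel_sizes) * len(expand_ratios)
    # A stage of depth d contains d blocks, so it has per_block ** d variants.
    per_stage = sum(per_block ** d for d in depths)
    return per_stage ** num_stages


# Values from the 'ofa_mbv3_d234_e346_k357_w1.0' row; num_stages=5 is assumed.
mbv3_total = count_subnets(
    kernel_sizes=[3, 5, 7],
    expand_ratios=[3, 4, 6],
    depths=[2, 3, 4],
    num_stages=5,
)
print(f"{mbv3_total:.2e} sub-networks")
```

Under these assumptions the count is on the order of 10^19, which is why sub-networks are sampled rather than enumerated.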

## How to use / evaluate **OFA Specialized Networks**
### Use
```python
""" OFA Specialized Networks.
Example: net, image_size = ofa_specialized('flops@595M_top1@80.0_finetune@75', pretrained=True)
"""
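from ofa.model_zoo import ofa_specialized

# Pick any id from the "Details" column of the table below
net_id = 'flops@595M_top1@80.0_finetune@75'
net, image_size = ofa_specialized(net_id, pretrained=True)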
```

### Evaluate

`python eval_specialized_net.py --path 'Your path to imagenet' --net flops@595M_top1@80.0_finetune@75`

| Model Name | Details | Top-1 (%) | Top-5 (%) | #Params | #MACs |
|----------------------|:----------:|:----------:|:----------:|:---------:|:------------:|
| **ResNet50 Design Space** |
| ofa-resnet50D-41 | resnet50D_MAC@4.1B_top1@79.8 | 79.8 | 94.7 | 30.9M | 4.1B |
| ofa-resnet50D-37 | resnet50D_MAC@3.7B_top1@79.7 | 79.7 | 94.7 | 26.5M | 3.7B |
| ofa-resnet50D-30 | resnet50D_MAC@3.0B_top1@79.3 | 79.3 | 94.5 | 28.7M | 3.0B |
| ofa-resnet50D-24 | resnet50D_MAC@2.4B_top1@79.0 | 79.0 | 94.2 | 29.0M | 2.4B |
| ofa-resnet50D-18 | resnet50D_MAC@1.8B_top1@78.3 | 78.3 | 94.0 | 20.7M | 1.8B |
| ofa-resnet50D-12 | resnet50D_MAC@1.2B_top1@77.1_finetune@25 | 77.1 | 93.3 | 19.3M | 1.2B |
| ofa-resnet50D-09 | resnet50D_MAC@0.9B_top1@76.3_finetune@25 | 76.3 | 92.9 | 14.5M | 0.9B |
| ofa-resnet50D-06 | resnet50D_MAC@0.6B_top1@75.0_finetune@25 | 75.0 | 92.1 | 9.6M | 0.6B |
| **FLOPs** |
| ofa-595M | flops@595M_top1@80.0_finetune@75 | 80.0 | 94.9 | 9.1M | 595M |
| ofa-482M | flops@482M_top1@79.6_finetune@75 | 79.6 | 94.8 | 9.1M | 482M |
| ofa-389M | flops@389M_top1@79.1_finetune@75 | 79.1 | 94.5 | 8.4M | 389M |
| **LG G8** |
| ofa-lg-24 | LG-G8_lat@24ms_top1@76.4_finetune@25 | 76.4 | 93.0 | 5.8M | 230M |
| ofa-lg-16 | LG-G8_lat@16ms_top1@74.7_finetune@25 | 74.7 | 92.0 | 5.8M | 151M |
| ofa-lg-11 | LG-G8_lat@11ms_top1@73.0_finetune@25 | 73.0 | 91.1 | 5.0M | 103M |
| ofa-lg-8 | LG-G8_lat@8ms_top1@71.1_finetune@25 | 71.1 | 89.7 | 4.1M | 74M |
| **Samsung S7 Edge** |
| ofa-s7edge-88 | s7edge_lat@88ms_top1@76.3_finetune@25 | 76.3 | 92.9 | 6.4M | 219M |
| ofa-s7edge-58 | s7edge_lat@58ms_top1@74.7_finetune@25 | 74.7 | 92.0 | 4.6M | 145M |
| ofa-s7edge-41 | s7edge_lat@41ms_top1@73.1_finetune@25 | 73.1 | 91.0 | 4.7M | 96M |
| ofa-s7edge-29 | s7edge_lat@29ms_top1@70.5_finetune@25 | 70.5 | 89.5 | 3.8M | 66M |
| **Samsung Note8** |
| ofa-note8-65 | note8_lat@65ms_top1@76.1_finetune@25 | 76.1 | 92.7 | 5.3M | 220M |
| ofa-note8-49 | note8_lat@49ms_top1@74.9_finetune@25 | 74.9 | 92.1 | 6.0M | 164M |
| ofa-note8-31 | note8_lat@31ms_top1@72.8_finetune@25 | 72.8 | 90.8 | 4.6M | 101M |
| ofa-note8-22 | note8_lat@22ms_top1@70.4_finetune@25 | 70.4 | 89.3 | 4.3M | 67M |
| **Samsung Note10** |
| ofa-note10-64 | note10_lat@64ms_top1@80.2_finetune@75 | 80.2 | 95.1 | 9.1M | 743M |
| ofa-note10-50 | note10_lat@50ms_top1@79.7_finetune@75 | 79.7 | 94.9 | 9.1M | 554M |
| ofa-note10-41 | note10_lat@41ms_top1@79.3_finetune@75 | 79.3 | 94.5 | 9.0M | 457M |
| ofa-note10-30 | note10_lat@30ms_top1@78.4_finetune@75 | 78.4 | 94.2 | 7.5M | 339M |
| ofa-note10-22 | note10_lat@22ms_top1@76.6_finetune@25 | 76.6 | 93.1 | 5.9M | 237M |
| ofa-note10-16 | note10_lat@16ms_top1@75.5_finetune@25 | 75.5 | 92.3 | 4.9M | 163M |
| ofa-note10-11 | note10_lat@11ms_top1@73.6_finetune@25 | 73.6 | 91.2 | 4.3M | 110M |
| ofa-note10-08 | note10_lat@8ms_top1@71.4_finetune@25 | 71.4 | 89.8 | 3.8M | 79M |
| **Google Pixel1** |
| ofa-pixel1-143 | pixel1_lat@143ms_top1@80.1_finetune@75 | 80.1 | 95.0 | 9.2M | 642M |
| ofa-pixel1-132 | pixel1_lat@132ms_top1@79.8_finetune@75 | 79.8 | 94.9 | 9.2M | 593M |
| ofa-pixel1-79 | pixel1_lat@79ms_top1@78.7_finetune@75 | 78.7 | 94.2 | 8.2M | 356M |
| ofa-pixel1-58 | pixel1_lat@58ms_top1@76.9_finetune@75 | 76.9 | 93.3 | 5.8M | 230M |
| ofa-pixel1-40 | pixel1_lat@40ms_top1@74.9_finetune@25 | 74.9 | 92.1 | 6.0M | 162M |
| ofa-pixel1-28 | pixel1_lat@28ms_top1@73.3_finetune@25 | 73.3 | 91.0 | 5.2M | 109M |
| ofa-pixel1-20 | pixel1_lat@20ms_top1@71.4_finetune@25 | 71.4 | 89.8 | 4.3M | 77M |
| **Google Pixel2** |
| ofa-pixel2-62 | pixel2_lat@62ms_top1@75.8_finetune@25 | 75.8 | 92.7 | 5.8M | 208M |
| ofa-pixel2-50 | pixel2_lat@50ms_top1@74.7_finetune@25 | 74.7 | 91.9 | 4.7M | 166M |
| ofa-pixel2-35 | pixel2_lat@35ms_top1@73.4_finetune@25 | 73.4 | 91.1 | 5.1M | 113M |
| ofa-pixel2-25 | pixel2_lat@25ms_top1@71.5_finetune@25 | 71.5 | 90.1 | 4.1M | 79M |
| **1080ti GPU (Batch Size 64)** |
| ofa-1080ti-27 | 1080ti_gpu64@27ms_top1@76.4_finetune@25 | 76.4 | 93.0 | 6.5M | 397M |
| ofa-1080ti-22 | 1080ti_gpu64@22ms_top1@75.3_finetune@25 | 75.3 | 92.4 | 5.2M | 313M |
| ofa-1080ti-15 | 1080ti_gpu64@15ms_top1@73.8_finetune@25 | 73.8 | 91.3 | 6.0M | 226M |
| ofa-1080ti-12 | 1080ti_gpu64@12ms_top1@72.6_finetune@25 | 72.6 | 90.9 | 5.9M | 165M |
| **V100 GPU (Batch Size 64)** |
| ofa-v100-11 | v100_gpu64@11ms_top1@76.1_finetune@25 | 76.1 | 92.7 | 6.2M | 352M |
| ofa-v100-09 | v100_gpu64@9ms_top1@75.3_finetune@25 | 75.3 | 92.4 | 5.2M | 313M |
| ofa-v100-06 | v100_gpu64@6ms_top1@73.0_finetune@25 | 73.0 | 91.1 | 4.9M | 179M |
| ofa-v100-05 | v100_gpu64@5ms_top1@71.6_finetune@25 | 71.6 | 90.3 | 5.2M | 141M |
| **Jetson TX2 GPU (Batch Size 16)** |
| ofa-tx2-96 | tx2_gpu16@96ms_top1@75.8_finetune@25 | 75.8 | 92.7 | 6.2M | 349M |
| ofa-tx2-80 | tx2_gpu16@80ms_top1@75.4_finetune@25 | 75.4 | 92.4 | 5.2M | 313M |
| ofa-tx2-47 | tx2_gpu16@47ms_top1@72.9_finetune@25 | 72.9 | 91.1 | 4.9M | 179M |
| ofa-tx2-35 | tx2_gpu16@35ms_top1@70.3_finetune@25 | 70.3 | 89.4 | 4.3M | 121M |
| **Intel Xeon CPU with MKL-DNN (Batch Size 1)** |
| ofa-cpu-17 | cpu_lat@17ms_top1@75.7_finetune@25 | 75.7 | 92.6 | 4.9M | 365M |
| ofa-cpu-15 | cpu_lat@15ms_top1@74.6_finetune@25 | 74.6 | 92.0 | 4.9M | 301M |
| ofa-cpu-11 | cpu_lat@11ms_top1@72.0_finetune@25 | 72.0 | 90.4 | 4.4M | 160M |
| ofa-cpu-10 | cpu_lat@10ms_top1@71.1_finetune@25 | 71.1 | 89.9 | 4.2M | 143M |
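One way to read the table is as a lookup: given a compute budget, pick the most accurate specialized network that fits. The sketch below does this for the Samsung Note10 rows, using the Top-1 and #MACs columns as listed. The names `ModelRow`, `NOTE10_ROWS` and `best_under_budget` are hypothetical helpers written for this example and are not part of the `ofa` package.

```python
from typing import List, NamedTuple, Optional


class ModelRow(NamedTuple):
    name: str             # "Model Name" column
    top1: float           # "Top-1 (%)" column
    macs_millions: float  # "#MACs" column, in millions


# Samsung Note10 rows copied from the table above.
NOTE10_ROWS = [
    ModelRow("ofa-note10-64", 80.2, 743),
    ModelRow("ofa-note10-50", 79.7, 554),
    ModelRow("ofa-note10-41", 79.3, 457),
    ModelRow("ofa-note10-30", 78.4, 339),
    ModelRow("ofa-note10-22", 76.6, 237),
    ModelRow("ofa-note10-16", 75.5, 163),
    ModelRow("ofa-note10-11", 73.6, 110),
    ModelRow("ofa-note10-08", 71.4, 79),
]


def best_under_budget(rows: List[ModelRow],
                      max_macs_millions: float) -> Optional[ModelRow]:
    """Return the highest Top-1 row whose MACs fit the budget, or None."""
    candidates = [r for r in rows if r.macs_millions <= max_macs_millions]
    return max(candidates, key=lambda r: r.top1) if candidates else None


choice = best_under_budget(NOTE10_ROWS, max_macs_millions=400)
print(choice)
```

The same pattern applies to any other hardware group in the table; only the row list changes.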

## How to train **OFA Networks**
```bash
mpirun -np 32 -H <server1_ip>:8,<server2_ip>:8,<server3_ip>:8,<server4_ip>:8 \
-bind-to none -map-by slot \
-x NCCL_DEBUG=INFO -x LD_LIBRARY_PATH -x PATH \
python train_ofa_net.py
```
or
```bash
horovodrun -np 32 -H <server1_ip>:8,<server2_ip>:8,<server3_ip>:8,<server4_ip>:8 \
python train_ofa_net.py
```
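Both commands launch 32 processes spread over four hosts with eight slots each, so `-np` has to equal the number of hosts times the slots per host. The sketch below builds the `horovodrun` argument list from a host list so that the two values stay consistent. The helper name `build_horovodrun_command` and the host names are placeholders for illustration. Only the `-np` and `-H` flags shown above are used.

```python
import shlex


def build_horovodrun_command(hosts, slots_per_host, script="train_ofa_net.py"):
    """Build a horovodrun argument list whose -np matches the -H host spec."""
    host_spec = ",".join(f"{host}:{slots_per_host}" for host in hosts)
    total_processes = len(hosts) * slots_per_host
    return ["horovodrun", "-np", str(total_processes),
            "-H", host_spec, "python", script]


# Hypothetical host names; replace them with the addresses of your servers.
hosts = ["server1", "server2", "server3", "server4"]
command = build_horovodrun_command(hosts, slots_per_host=8)

# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(part) for part in command))
```

The resulting list can be passed to `subprocess.run` once the host names point at real machines.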

## Introduction Video

[![Watch the video](figures/video_figure.png)](https://www.youtube.com/watch?v=a_OeT8MXzWI&feature=youtu.be)

## Hands-on Tutorial Video

[![Watch the video](figures/ofa-tutorial.jpg)](https://www.youtube.com/watch?v=wrsid5tvuSM)

## Requirement
* Python 3.6+
* PyTorch 1.4.0+
* ImageNet Dataset
* Horovod

## Related work on automated and efficient deep learning:
[ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware](https://arxiv.org/pdf/1812.00332.pdf) (ICLR’19)

[AutoML for Architecting Efficient and Specialized Neural Networks](https://ieeexplore.ieee.org/abstract/document/8897011) (IEEE Micro)

[AMC: AutoML for Model Compression and Acceleration on Mobile Devices](https://arxiv.org/pdf/1802.03494.pdf) (ECCV’18)

[HAQ: Hardware-Aware Automated Quantization](https://arxiv.org/pdf/1811.08886.pdf) (CVPR’19, oral)