https://github.com/jzz24/pytorch_quantization

# DoReFa-Net
A PyTorch implementation of [DoReFa](https://arxiv.org/abs/1606.06160). The code is inspired by [LaVieEnRoseSMZ](https://github.com/LaVieEnRoseSMZ/AutoBNN) and [zzzxxxttt](https://github.com/kuangliu/pytorch-cifar).

## Requirements
* python > 3.5
* torch >= 1.1.0
* torchvision >= 0.4.0
* tb-nightly, future (for tensorboard)
* nvidia-dali >= 0.12 (faster [dataloader](https://docs.nvidia.com/deeplearning/sdk/dali-developer-guide/docs/index.html#))

## Cifar-10 Accuracy

Quantized models are trained from scratch.

| Model | W_bit | A_bit | Acc |
| :-: | :-: | :-: |:-: |
| resnet-18 | 32 | 32 | 94.71% |
| resnet-18 | 4 | 4 | 94.36% |
| resnet-18 | 1 | 4 | 93.87% |

## ImageNet Accuracy

Quantized models are trained from scratch.

| Model | W_bit | A_bit | Top1 |Top5 |
| :-: | :-: | :-: |:-: |:-: |
| resnet-18 | 32 | 32 | 69.80% |89.32% |
| resnet-18 | 4 | 4 | 66.60% |87.15% |
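
In the tables above, W_bit and A_bit are the bit-widths used for weights and activations. As a rough illustration of what those bit-widths control, here is a minimal sketch of DoReFa-style quantizers following the formulas in the paper. It is not necessarily the code used in this repository: the names `RoundSTE`, `quantize_k`, `quantize_weight` and `quantize_activation` are made up for this example, and the 1-bit weight case is left out.

```python
import torch


class RoundSTE(torch.autograd.Function):
    """Round in the forward pass; pass the gradient through unchanged
    in the backward pass (straight-through estimator)."""

    @staticmethod
    def forward(ctx, x):
        return torch.round(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output


def quantize_k(x, k):
    """Quantize x (assumed to lie in [0, 1]) to one of 2**k uniform levels."""
    n = float(2 ** k - 1)
    return RoundSTE.apply(x * n) / n


def quantize_weight(w, w_bit):
    """Multi-bit weight quantization (sketch, assumes w_bit >= 2 and w != 0)."""
    t = torch.tanh(w)
    t = t / (2 * t.abs().max()) + 0.5      # squash into [0, 1]
    return 2 * quantize_k(t, w_bit) - 1    # quantize, then map back to [-1, 1]


def quantize_activation(a, a_bit):
    """Clip activations to [0, 1] and quantize them to a_bit bits."""
    return quantize_k(torch.clamp(a, 0, 1), a_bit)
```

With `w_bit = 4`, each quantized weight takes one of 16 evenly spaced values in [-1, 1], while the straight-through estimator lets gradients reach the underlying float weights during training.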

## Usage
Download the ImageNet dataset and move the validation images into labeled subfolders. To do this, you can use the following [script](https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh).
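
If you would rather do this step from Python, the idea can be sketched as below. This is only an illustration, not part of the repository: `sort_val_images` is a hypothetical helper, and the `labels` mapping from image file name to class folder name is assumed to come from your own copy of the validation ground truth.

```python
import shutil
from pathlib import Path


def sort_val_images(val_dir, labels):
    """Move each image in val_dir into a subfolder named after its class.

    labels: dict mapping image file name -> class folder name
            (hypothetical input, built from the validation ground truth).
    Returns the number of files moved.
    """
    val_dir = Path(val_dir)
    moved = 0
    for filename, class_name in labels.items():
        src = val_dir / filename
        if not src.is_file():
            continue  # already moved or missing: skip it
        class_dir = val_dir / class_name
        class_dir.mkdir(exist_ok=True)
        shutil.move(str(src), str(class_dir / filename))
        moved += 1
    return moved
```

The resulting layout, with one subfolder per class, is what `torchvision.datasets.ImageFolder` expects.
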
- To train the model
```
python3 cifar_train_eval.py
# ImageNet: use either the PyTorch loader or the DALI loader
python3 imagenet_torch_loader.py --multiprocessing-distributed
python3 imagenet_dali_loader.py
```
- To check the tensorboard log
```
tensorboard --logdir='your_log_dir'
```

Then navigate to http://localhost:6006.

- To test the quantized model and BN fusion
- Convert to the quantized model for inference
```
python3 test_fused_quant_model.py
```
- Test BN fusion on the float model
```
python3 bn_fuse.py
```
This fusion method is not suitable for quantized models. We plan to change the BN fusion to follow section 3.2.2 of the [paper](https://arxiv.org/pdf/1806.08342.pdf).

The BN fusion timings below are not a rigorous benchmark, but they are enough to show the effect qualitatively.


| Model on CPU | before fuse | after fuse |
| :-: | :-: | :-: |
| resnet-18 | 0.74 s | 0.51 s |
| resnet-34 | 1.41 s | 0.92 s |
| resnet-50 | 1.96 s | 1.02 s |
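
For reference, folding a BN layer into the preceding convolution on a float model can be sketched as follows. This is a minimal illustration of the standard technique, assuming eval mode, zero padding and an affine BN layer; `fuse_conv_bn` is a name chosen for this example and is not necessarily how `bn_fuse.py` is written.

```python
import torch
import torch.nn as nn


def fuse_conv_bn(conv, bn):
    """Return a new Conv2d that computes bn(conv(x)) for a BN layer in eval mode."""
    fused = nn.Conv2d(
        conv.in_channels, conv.out_channels, conv.kernel_size,
        stride=conv.stride, padding=conv.padding, dilation=conv.dilation,
        groups=conv.groups, bias=True,
    )
    with torch.no_grad():
        # Per-output-channel scale: gamma / sqrt(running_var + eps)
        scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
        fused.weight.copy_(conv.weight * scale.reshape(-1, 1, 1, 1))
        # Treat a missing conv bias as zero
        if conv.bias is not None:
            conv_bias = conv.bias
        else:
            conv_bias = torch.zeros_like(bn.running_mean)
        # Folded bias: (b - running_mean) * scale + beta
        fused.bias.copy_((conv_bias - bn.running_mean) * scale + bn.bias)
    return fused
```

After fusion the BN layer can be dropped at inference time, which removes one pass over the activations per layer.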

## To do
- [x] Train on imagenet2012
- [x] Fold bn
- [x] Test speedup from quantization and bn fold
- [ ] Deploy models to embedded devices
- [ ] ...