https://github.com/jzz24/pytorch_quantization
A pytorch implementation of dorefa quantization
bn-fold dorefa imagenet nvidia-dali quantization resnet
- Host: GitHub
- URL: https://github.com/jzz24/pytorch_quantization
- Owner: Jzz24
- License: MIT
- Created: 2019-11-27T13:08:44.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2019-12-30T11:57:31.000Z (almost 6 years ago)
- Last Synced: 2025-03-31T13:28:16.300Z (6 months ago)
- Topics: bn-fold, dorefa, imagenet, nvidia-dali, quantization, resnet
- Language: Python
- Homepage:
- Size: 885 KB
- Stars: 113
- Watchers: 2
- Forks: 11
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Dorefa-net
A PyTorch implementation of [DoReFa](https://arxiv.org/abs/1606.06160) quantization. The code is inspired by [LaVieEnRoseSMZ](https://github.com/LaVieEnRoseSMZ/AutoBNN) and [zzzxxxttt](https://github.com/kuangliu/pytorch-cifar). A minimal sketch of the DoReFa quantizers is given after the requirements list.

## Requirements
* python > 3.5
* torch >= 1.1.0
* torchvision >= 0.4.0
* tb-nightly, future (for tensorboard)
* nvidia-dali >= 0.12 (faster [dataloader](https://docs.nvidia.com/deeplearning/sdk/dali-developer-guide/docs/index.html#))
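
For reference, here is a minimal sketch of the DoReFa k-bit quantizers from the paper: weights are tanh-normalized into [0, 1], quantized, and rescaled to [-1, 1]; activations are clipped to [0, 1] and quantized; the rounding uses a straight-through estimator (the paper's special sign-based rule for 1-bit weights is omitted). The names `_QuantizeK`, `quantize_weight`, and `quantize_activation` are illustrative, not necessarily this repo's actual modules.

```python
import torch

class _QuantizeK(torch.autograd.Function):
    """Round x in [0, 1] to k-bit fixed point; straight-through estimator on backward."""

    @staticmethod
    def forward(ctx, x, k):
        n = float(2 ** k - 1)
        return torch.round(x * n) / n

    @staticmethod
    def backward(ctx, grad_output):
        # STE: pass the gradient straight through the non-differentiable round
        return grad_output, None


def quantize_weight(w, k):
    """DoReFa weight quantizer: tanh-normalize into [0, 1], quantize, rescale to [-1, 1]."""
    if k == 32:
        return w
    w = torch.tanh(w)
    w = w / (2 * torch.max(torch.abs(w))) + 0.5
    return 2 * _QuantizeK.apply(w, k) - 1


def quantize_activation(a, k):
    """DoReFa activation quantizer: clip to [0, 1], then quantize to k bits."""
    if k == 32:
        return a
    return _QuantizeK.apply(torch.clamp(a, 0.0, 1.0), k)
```

A quantized layer then simply applies `quantize_weight` to its weight tensor and `quantize_activation` to its input inside `forward`.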
## Cifar-10 Accuracy
Quantized models are trained from scratch.
| Model | W_bit | A_bit | Acc |
| :-: | :-: | :-: |:-: |
| resnet-18 | 32 | 32 | 94.71% |
| resnet-18 | 4 | 4 | 94.36% |
| resnet-18 | 1 | 4 | 93.87% |

## ImageNet Accuracy
Quantized models are trained from scratch.
| Model | W_bit | A_bit | Top1 | Top5 |
| :-: | :-: | :-: | :-: | :-: |
| resnet-18 | 32 | 32 | 69.80% | 89.32% |
| resnet-18 | 4 | 4 | 66.60% | 87.15% |

## Usage
Download the ImageNet dataset and move the validation images into labeled subfolders. To do this, you can use the following [script](https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh).
- To train the model
```
python3 cifar_train_eval.py
python3 imagenet_torch_loader.py --multiprocessing-distributed
# or
python3 imagenet_dali_loader.py
```
- To check the tensorboard log
```
tensorboard --logdir='your_log_dir'
```
Then navigate to http://localhost:6006.
- To test the quantized model and BN fusing
- Convert to a quantized model for inference
```
python3 test_fused_quant_model.py
```
- Test BN fusing on the float model
```
python3 bn_fuse.py
```
Note that this conventional fusion method is not suitable for quantized models; the BN fusing will be reworked in the future to follow section 3.2.2 of this [paper](https://arxiv.org/pdf/1806.08342.pdf).
The CPU timings at the end of this section are not a rigorous benchmark, but they illustrate the speedup from folding qualitatively.
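
For context, below is a minimal sketch of conventional conv + BN folding, i.e. absorbing the BatchNorm scale and shift into the convolution's weight and bias using the running statistics (eval-mode behaviour). The helper name `fuse_conv_bn` is illustrative, not necessarily the repo's actual API in `bn_fuse.py`.

```python
import torch
import torch.nn as nn

def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Fold a BatchNorm2d into the preceding Conv2d:

    W_fused = gamma / sqrt(var + eps) * W
    b_fused = gamma / sqrt(var + eps) * (b - mean) + beta
    """
    fused = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                      conv.stride, conv.padding, conv.dilation, conv.groups,
                      bias=True)
    # Per-output-channel scale from the BN running statistics
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    fused.weight.data = conv.weight.data * scale.reshape(-1, 1, 1, 1)
    bias = conv.bias.data if conv.bias is not None else torch.zeros_like(bn.running_mean)
    fused.bias.data = (bias - bn.running_mean) * scale + bn.bias.data
    return fused
```

After folding, the BN layer can be replaced by `nn.Identity()`, so inference runs one layer instead of two; that is where the CPU speedup in the table below comes from.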
| Model on CPU | before fuse | after fuse |
| :-: | :-: | :-: |
| resnet-18 | 0.74 s | 0.51 s |
| resnet-34 | 1.41 s | 0.92 s |
| resnet-50 | 1.96 s | 1.02 s |

## To do
- [x] Train on imagenet2012
- [x] Fold bn
- [x] Test speedup from quantization and bn fold
- [ ] Deploy models to embedded devices
- [ ] ...