https://github.com/aovoc/nnieqat-pytorch

A nnie quantization aware training tool on pytorch.
nnie nnieqat-pytorch pytorch quantized-training


Host: GitHub
URL: https://github.com/aovoc/nnieqat-pytorch
Owner: aovoc
License: mit
Created: 2020-08-16T11:11:33.000Z (almost 6 years ago)
Default Branch: master
Last Pushed: 2020-12-10T16:16:19.000Z (over 5 years ago)
Last Synced: 2026-01-03T08:07:24.639Z (6 months ago)
Topics: nnie, nnieqat-pytorch, pytorch, quantized-training
Language: Python
Homepage:
Size: 415 KB
Stars: 238
Watchers: 3
Forks: 38
Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE.txt

# nnieqat-pytorch

Nnieqat is a quantization-aware training package for the Neural Network Inference Engine (NNIE) on PyTorch. It uses the HiSilicon quantization library to quantize a module's weights and activations into a fake-quantized fp32 format.

## Table of Contents

- [nnieqat-pytorch](#nnieqat-pytorch)

  - [Table of Contents](#table-of-contents)

  - [Installation](#installation)

  - [Usage](#usage)

  - [Code Examples](#code-examples)

  - [Results](#results)

  - [Todo](#todo)

  - [Reference](#reference)

  

## Installation

* Supported Platforms: Linux

* Accelerators and GPUs: NVIDIA GPUs via CUDA driver ***10.1*** or ***10.2***.

* Dependencies:

  * python >= 3.5, < 4

  * llvmlite >= 0.31.0

  * pytorch >= 1.5

  * numba >= 0.42.0

  * numpy >= 1.18.1

* Install nnieqat from PyPI:

  ```shell
  $ pip install nnieqat
  ```

* Install nnieqat in Docker (an easy way to avoid environment problems):

  ```shell
  $ cd docker
  $ docker build -t nnieqat-image .
  ```

* Install nnieqat from the repository:

  ```shell
  $ git clone https://github.com/aovoc/nnieqat-pytorch
  $ cd nnieqat-pytorch
  $ make install
  ```



## Usage

* Add the quantization hook.

  Weights and activations are quantized and dequantized with the HiSVP GFPQ library during the forward() pass.

  ```python
  from nnieqat import quant_dequant_weight, unquant_weight, merge_freeze_bn, register_quantization_hook
  ...
  ...
    register_quantization_hook(model)
  ...
  ```

* Merge BN weights into the conv layers and freeze BN.

  Fine-tuning from a well-trained model is recommended; in that case, call merge_freeze_bn at the beginning of training. Otherwise, call it after a few epochs of training.

  ```python
  from nnieqat import quant_dequant_weight, unquant_weight, merge_freeze_bn, register_quantization_hook
  ...
  ...
      model.train()
      model = merge_freeze_bn(model)  # switches BN layers to eval() mode during training
  ...
  ```

* Restore the unquantized weights before updating them.

  ```python
  from nnieqat import quant_dequant_weight, unquant_weight, merge_freeze_bn, register_quantization_hook
  ...
  ...
      model.apply(unquant_weight)  # use the original weights for the update
      optimizer.step()
  ...
  ```

* Dump the weight-optimized model.

  ```python
  from nnieqat import quant_dequant_weight, unquant_weight, merge_freeze_bn, register_quantization_hook
  ...
  ...
      model.apply(quant_dequant_weight)
      save_checkpoint(...)
      model.apply(unquant_weight)
  ...
  ```

* Use EMA with caution (not recommended).
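To make the quantization hook step more concrete: fake quantization snaps each value to a low-precision grid and immediately converts it back to floating point, so the network keeps working with fp32 tensors that only take quantized values. The sketch below uses a symmetric uniform 8-bit grid purely for illustration. It is not the GFPQ algorithm, and `fake_quantize` is a hypothetical helper, not part of nnieqat.

```python
def fake_quantize(values, num_bits=8):
    """Quantize-dequantize a list of floats on a symmetric uniform grid.

    Illustrative only: a generic scheme, not the HiSVP GFPQ algorithm.
    """
    qmax = 2 ** (num_bits - 1) - 1           # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)                  # nothing to scale
    scale = max_abs / qmax                   # width of one quantization step
    result = []
    for v in values:
        q = round(v / scale)                 # nearest integer level
        q = max(-qmax, min(qmax, q))         # clamp to the representable range
        result.append(q * scale)             # back to floating point
    return result


weights = [0.5, -1.27, 0.003, 1.0]
fake_weights = fake_quantize(weights)
for original, fake in zip(weights, fake_weights):
    print(original, "->", fake)
```

Each output stays within half a quantization step of its input, and values much smaller than one step collapse to zero.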

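The BN merging step relies on the standard folding identity: a batch norm layer with fixed statistics is an affine map, so it can be absorbed into the weights and bias of the layer before it. The sketch below checks that identity on a toy per-pixel 1x1 convolution in plain Python. The helper names (`fold_bn`, `linear`, `batch_norm`) are hypothetical, and this is not necessarily how `merge_freeze_bn` is implemented internally.

```python
import math


def fold_bn(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel BN parameters into per-channel weights and bias."""
    folded_w, folded_b = [], []
    for w_c, b_c, g, bt, m, v in zip(weight, bias, gamma, beta, mean, var):
        s = g / math.sqrt(v + eps)               # per-channel BN scale
        folded_w.append([w * s for w in w_c])    # W' = W * s
        folded_b.append((b_c - m) * s + bt)      # b' = (b - mean) * s + beta
    return folded_w, folded_b


def linear(x, weight, bias):
    """Stand-in for a 1x1 conv at one pixel: one dot product per output channel."""
    return [sum(w * xi for w, xi in zip(w_c, x)) + b
            for w_c, b in zip(weight, bias)]


def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode batch norm with fixed running statistics."""
    return [g * (yi - m) / math.sqrt(v + eps) + bt
            for yi, g, bt, m, v in zip(y, gamma, beta, mean, var)]


weight = [[0.2, -0.5], [1.0, 0.3]]
bias = [0.1, -0.2]
gamma, beta = [1.5, 0.8], [0.0, 0.4]
mean, var = [0.05, -0.1], [0.9, 0.25]
x = [1.0, 2.0]

reference = batch_norm(linear(x, weight, bias), gamma, beta, mean, var)
folded_weight, folded_bias = fold_bn(weight, bias, gamma, beta, mean, var)
folded = linear(x, folded_weight, folded_bias)
print(reference, folded)  # the two outputs agree up to rounding error
```

Folding only holds while the BN statistics stay fixed, which is consistent with the note above that BN is switched to eval() mode after merging.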

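The "restore before update" and "dump" steps above follow a shadow-weight pattern: the forward and backward passes see quantized weights, the optimizer updates the full-precision originals, and a checkpoint stores the quantized values. The toy sketch below mirrors that ordering in plain Python. `ToyLayer`, `quant_dequant`, `unquant` and `sgd_step` are hypothetical stand-ins with a simple rounding grid; they are assumptions for illustration, not nnieqat's implementation.

```python
class ToyLayer:
    """Hypothetical stand-in for a module that owns one weight vector."""

    def __init__(self, weight):
        self.weight = list(weight)
        self.fp32_backup = None      # holds the originals while quantized


def quant_dequant(layer, step=0.1):
    """Stash the original weights, then snap the active weights to a grid."""
    if layer.fp32_backup is None:
        layer.fp32_backup = list(layer.weight)
    layer.weight = [round(w / step) * step for w in layer.fp32_backup]


def unquant(layer):
    """Restore the full-precision weights stashed by quant_dequant."""
    if layer.fp32_backup is not None:
        layer.weight = layer.fp32_backup
        layer.fp32_backup = None


def sgd_step(layer, grads, lr=0.01):
    """Plain SGD update on whatever weights are currently active."""
    layer.weight = [w - lr * g for w, g in zip(layer.weight, grads)]


layer = ToyLayer([0.26, -0.14])

# Forward/backward: gradients are computed at the quantized weights.
quant_dequant(layer)
grads = [2.0 * w for w in layer.weight]   # gradient of sum(w ** 2)

# Update: restore the originals first, so the small step is not lost.
unquant(layer)
sgd_step(layer, grads)

# Checkpoint: store quantized weights, then continue with the originals.
quant_dequant(layer)
checkpoint = list(layer.weight)
unquant(layer)

print("fp32 weights:", layer.weight)
print("checkpoint:  ", checkpoint)
```

In this toy run the updates are smaller than half a grid step, so they would disappear if they were applied to the snapped weights and re-quantized. Accumulating them in the full-precision copy is a common motivation for restoring the original weights before `optimizer.step()`.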

## Code Examples

* [Cifar10 quantization aware training example][cifar10_qat]  (add nnieqat into [pytorch_cifar10_tutorial][cifar10_example])

  ```python test/test_cifar10.py```

* [ImageNet quantization finetuning example][imagenet_qat]  (add nnieqat into [pytorch_imagenet_main.py][imagenet_example])

  ```python test/test_imagenet.py  --pretrained  path_to_imagenet_dataset```



## Results  

* ImageNet

  ```
  python test/test_imagenet.py /data/imgnet/ --arch squeezenet1_1  --lr 0.001 --pretrained --epoch 10   # nnie_lr_e-3_ft
  python pytorch_imagenet_main.py /data/imgnet/ --arch squeezenet1_1  --lr 0.0001 --pretrained --epoch 10  # lr_e-4_ft
  python test/test_imagenet.py /data/imgnet/ --arch squeezenet1_1  --lr 0.0001 --pretrained --epoch 10  # nnie_lr_e-4_ft
  ```

  Fine-tuning results:

  | Model | trt_fp32 | trt_int8 | nnie |
  | -------- | -------- | -------- | -------- |
  | torchvision | 0.56992 | 0.56424 | 0.56026 |
  | nnie_lr_e-3_ft | 0.56600 | 0.56328 | 0.56612 |
  | lr_e-4_ft | 0.57884 | 0.57502 | 0.57542 |
  | nnie_lr_e-4_ft | 0.57834 | 0.57524 | 0.57730 |

* COCO

Network: simplified YOLOv5s.

Trained for 300 epochs, Hi3559 test results:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.338   

 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.540   

 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.357   

 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.187   

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.377   

 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.445   

 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.284   

 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.484   

 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.542   

 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.357   

 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.595   

 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.679   

Fine-tuned for 20 epochs, Hi3559 test results:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.339   

 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.539   

 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.360   

 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.191   

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.378   

 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.446   

 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.285   

 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.485   

 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.544   

 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.361   

 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.596   

 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.683   



## Todo

* Generate quantized model directly.

  

## Reference

HiSVP Quantization Library User Guide (HiSVP 量化库使用指南)

[Quantizing deep convolutional networks for efficient inference: A whitepaper][quant_whitepaper]

[8-bit Inference with TensorRT][trt_quant]

[Distilling the Knowledge in a Neural Network][distillingNN]

[cifar10_qat]: https://github.com/aovoc/nnieqat-pytorch/blob/master/test/test_cifar10.py

[imagenet_qat]: https://github.com/aovoc/nnieqat-pytorch/blob/master/test/test_imagenet.py

[imagenet_example]: https://github.com/pytorch/examples/blob/master/imagenet/main.py

[cifar10_example]: https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html

[quant_whitepaper]: https://arxiv.org/abs/1806.08342

[trt_quant]: https://on-demand.gputechconf.com/gtc/2017/presentation/s7310-8-bit-inference-with-tensorrt.pdf

[distillingNN]: https://arxiv.org/abs/1503.02531
