# Model Compression and Acceleration Progress
Repository to track the progress in model compression and acceleration.

## Low-rank approximation
- T-Net: Parametrizing Fully Convolutional Nets with a Single High-Order Tensor (CVPR 2019)
[paper](http://openaccess.thecvf.com/content_CVPR_2019/papers/Kossaifi_T-Net_Parametrizing_Fully_Convolutional_Nets_With_a_Single_High-Order_Tensor_CVPR_2019_paper.pdf)
- MUSCO: Multi-Stage COmpression of neural networks (ICCVW 2019)
[paper](http://openaccess.thecvf.com/content_ICCVW_2019/papers/LPCV/Gusak_Automated_Multi-Stage_Compression_of_Neural_Networks_ICCVW_2019_paper.pdf) | [code (PyTorch)](https://github.com/juliagusak/musco)
- Efficient Neural Network Compression (CVPR 2019)
[paper](https://arxiv.org/abs/1811.12781) | [code (Caffe)](https://github.com/Hyeji-Kim/ENC)
- Adaptive Mixture of Low-Rank Factorizations for Compact Neural Modeling (ICLR 2019)
[paper](https://openreview.net/pdf?id=B1eHgu-Fim) | [code (PyTorch)](https://github.com/zuenko/ALRF)
- Extreme Network Compression via Filter Group Approximation (ECCV 2018)
[paper](https://arxiv.org/abs/1807.11254)
- Ultimate tensorization: compressing convolutional and FC layers alike (NIPS 2016 workshop)
[paper](https://arxiv.org/abs/1611.03214) | [code (TensorFlow)](https://github.com/timgaripov/TensorNet-TF) | [code (MATLAB, Theano + Lasagne)](https://github.com/Bihaqo/TensorNet)
- Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications (ICLR 2016)
[paper](https://arxiv.org/abs/1511.06530)
- Accelerating Very Deep Convolutional Networks for Classification and Detection (IEEE TPAMI 2016)
[paper](https://arxiv.org/abs/1505.06798)
- Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition (ICLR 2015)
[paper](https://arxiv.org/abs/1412.6553) | [code (Caffe)](https://github.com/vadim-v-lebedev/cp-decomposition)
- Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation (NIPS 2014)
[paper](https://arxiv.org/abs/1404.0736)
- Speeding up Convolutional Neural Networks with Low Rank Expansions (BMVC 2014)
[paper](https://arxiv.org/abs/1405.3866)
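To make the recipe these papers share concrete, here is a minimal, hedged sketch (not taken from any paper above) that factorizes a fully connected layer with a truncated SVD; the works listed here apply richer decompositions (CP, Tucker, TT) to convolutional layers as well. Layer sizes and rank are illustrative.

```python
# Minimal sketch: approximate a fully connected layer by a rank-r
# factorization obtained from a truncated SVD. Sizes and rank are
# illustrative, not taken from any of the papers above.
import torch
import torch.nn as nn

def factorize_linear(layer: nn.Linear, rank: int) -> nn.Sequential:
    """Replace one (out x in) layer with two layers of total size rank*(in+out)."""
    W = layer.weight.data                    # shape: (out_features, in_features)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    first = nn.Linear(layer.in_features, rank, bias=False)
    second = nn.Linear(rank, layer.out_features, bias=layer.bias is not None)
    first.weight.data.copy_(Vh[:rank, :])             # (rank, in_features)
    second.weight.data.copy_(U[:, :rank] * S[:rank])  # (out_features, rank)
    if layer.bias is not None:
        second.bias.data.copy_(layer.bias.data)
    return nn.Sequential(first, second)

layer = nn.Linear(512, 512)                    # ~262k parameters
compressed = factorize_linear(layer, rank=64)  # ~66k parameters, ~4x smaller
x = torch.randn(1, 512)
print((layer(x) - compressed(x)).abs().max())  # approximation error
```

In practice the factorized network is then fine-tuned to recover accuracy, which is the compress-and-fine-tune loop that, e.g., MUSCO automates in multiple stages.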
## Pruning & Sparsification

#### Papers
- Rethinking the Value of Network Pruning (ICLR 2019, NIPS 2018 workshop)
[paper](https://arxiv.org/abs/1810.05270) | [code (PyTorch)](https://github.com/Eric-mingjie/rethinking-network-pruning)
- Dynamic Channel Pruning: Feature Boosting and Suppression (ICLR 2019)
[paper](https://arxiv.org/abs/1810.05331) | [code](https://github.com/deep-fry/mayo)
- AutoPruner: An End-to-End Trainable Filter Pruning Method for Efficient Deep Model Inference (2019)
[paper](https://arxiv.org/abs/1805.08941)
- CLIP-Q: Deep Network Compression Learning by In-Parallel Pruning-Quantization (CVPR 2018)
[paper](http://www.sfu.ca/~ftung/papers/clipq_cvpr18.pdf)
- Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks (IJCAI 2018)
[paper](https://arxiv.org/abs/1808.06866) | [code and models (PyTorch)](https://github.com/he-y/soft-filter-pruning)
- Discrimination-aware Channel Pruning for Deep Neural Networks (NIPS 2018)
[paper](https://papers.nips.cc/paper/7367-discrimination-aware-channel-pruning-for-deep-neural-networks.pdf) | [code and pretrained models (PyTorch)](https://github.com/SCUT-AILab/DCP)
- AMC: AutoML for Model Compression and Acceleration on Mobile Devices (ECCV 2018)
[paper](https://arxiv.org/abs/1802.03494) | [code (PyTorch)](https://github.com/mit-han-lab/amc-release) | [pretrained models (PyTorch, TensorFlow, TensorFlow Lite)](https://github.com/mit-han-lab/amc-compressed-models)
- Channel Gating Neural Networks (2018)
[paper](https://arxiv.org/abs/1805.12549)
- DSD: Dense-Sparse-Dense Training for Deep Neural Networks (ICLR 2017)
[paper](https://arxiv.org/abs/1607.04381) | [pretrained models (Caffe)](https://songhan.github.io/DSD/)
- Channel Pruning for Accelerating Very Deep Neural Networks (ICCV 2017)
[paper](https://arxiv.org/abs/1707.06168) | [code and pretrained models (Caffe)](https://github.com/yihui-he/channel-pruning) | [code (PyTorch)](https://github.com/Eric-mingjie/rethinking-network-pruning/tree/master/imagenet)
- Learning Efficient Convolutional Networks through Network Slimming (ICCV 2017)
[paper](https://arxiv.org/abs/1708.06519) | [code (Torch, PyTorch)](https://github.com/Eric-mingjie/network-slimming)
- ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression (ICCV 2017)
[paper](https://arxiv.org/abs/1707.06342) | [pretrained model (Caffe)](https://github.com/Roll920/ThiNet) | [code (PyTorch)](https://github.com/Eric-mingjie/rethinking-network-pruning/tree/master/imagenet)
- Structured Bayesian Pruning via Log-Normal Multiplicative Noise (NIPS 2017)
[paper](https://papers.nips.cc/paper/7254-structured-bayesian-pruning-via-log-normal-multiplicative-noise.pdf) | [code (TensorFlow, Theano + Lasagne)](https://github.com/necludov/group-sparsity-sbp)
- SphereFace: Deep Hypersphere Embedding for Face Recognition (CVPR 2017)
[paper](https://arxiv.org/abs/1704.08063) | [code and pretrained models (Caffe)](https://github.com/isthatyoung/Sphereface-prune)
- Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding (ICLR 2016)
[paper](https://arxiv.org/abs/1510.00149)
- Fast ConvNets Using Group-wise Brain Damage (CVPR 2016)
[paper](http://openaccess.thecvf.com/content_cvpr_2016/papers/Lebedev_Fast_ConvNets_Using_CVPR_2016_paper.pdf)
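As a minimal, hedged illustration of the magnitude-pruning baseline that the papers above refine, the sketch below uses PyTorch's built-in `torch.nn.utils.prune` utilities; the model and pruning ratio are illustrative.

```python
# Minimal sketch: unstructured magnitude pruning with PyTorch's built-in
# pruning utilities. Model and pruning ratio are illustrative.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1),
)

# Zero out the 60% smallest-magnitude weights in every conv layer.
for module in model.modules():
    if isinstance(module, nn.Conv2d):
        prune.l1_unstructured(module, name="weight", amount=0.6)

# ... fine-tune here to recover accuracy, then make the masks permanent:
for module in model.modules():
    if isinstance(module, nn.Conv2d):
        prune.remove(module, "weight")

zeros = sum(int((m.weight == 0).sum()) for m in model.modules()
            if isinstance(m, nn.Conv2d))
print(f"pruned weights: {zeros}")
```

Structured (filter/channel) pruning, as in ThiNet or Channel Pruning above, removes whole filters instead, so the network actually shrinks rather than merely becoming sparse.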
#### Repos
- Pruning + quantization: [code and pretrained models (TensorFlow, TensorFlow Lite)](https://github.com/vikranth94/Model-Compression), with examples for CIFAR.

## Knowledge distillation
#### Papers
- Learning Efficient Detector with Semi-supervised Adaptive Distillation (arXiv 2019) [paper](https://arxiv.org/abs/1901.00366) | [code (Caffe)](https://github.com/Tangshitao/Semi-supervised-Adaptive-Distillation)
- Model compression via distillation and quantization (ICLR 2018) [paper](https://arxiv.org/abs/1802.05668) | [code (PyTorch)](https://github.com/antspy/quantized_distillation)
- Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks (ICLR 2018 workshop, BMVC 2018)
[paper](https://arxiv.org/abs/1709.00513)
- Net2Net: Accelerating Learning via Knowledge Transfer (ICLR 2016)
[paper](https://arxiv.org/abs/1511.05641)
- Distilling the Knowledge in a Neural Network (NIPS 2014 workshop)
[paper](https://arxiv.org/abs/1503.02531)
- FitNets: Hints for Thin Deep Nets (ICLR 2015)
[paper](https://arxiv.org/abs/1412.6550) | [code (Theano + Pylearn2)](https://github.com/adri-romsor/FitNets)
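A minimal sketch of the soft-target loss from "Distilling the Knowledge in a Neural Network" above; the temperature and mixing weight are illustrative hyperparameters.

```python
# Minimal sketch of the distillation loss: soften teacher and student
# logits with a temperature, then blend the KL term with the usual
# hard-label cross-entropy. Hyperparameters are illustrative.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.9):
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2  # rescale so gradients match the hard-label term
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

student_logits = torch.randn(8, 10, requires_grad=True)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
distillation_loss(student_logits, teacher_logits, labels).backward()
```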
#### Repos
- [model_compression](https://github.com/chengshengchan/model_compression) - TensorFlow implementations of three knowledge-distillation papers, with results for CIFAR-10

## Quantization
- Bayesian Bits: Unifying Quantization and Pruning (2020) [paper](https://arxiv.org/abs/2005.07093)
- Up or Down? Adaptive Rounding for Post-Training Quantization (2020) [paper](https://arxiv.org/abs/2004.10568)
- Gradient $\ell_1$ Regularization for Quantization Robustness (ICLR 2020) [paper](https://arxiv.org/abs/2002.07520)
- Training Binary Neural Networks with Real-to-Binary Convolutions (ICLR 2020)
[paper](https://arxiv.org/abs/2003.11535) | [code (coming soon)](https://github.com/brais-martinez/real2binary)
- Data-Free Quantization Through Weight Equalization and Bias Correction (ICCV 2019) [paper](https://arxiv.org/abs/1906.04721) | [code (PyTorch)](https://github.com/jakc4103/DFQ)
- XNOR-Net++: Improved Binary Neural Networks (BMVC 2019)
[paper](https://arxiv.org/abs/1909.13863)
- Matrix and tensor decompositions for training binary neural networks (2019)
[paper](https://arxiv.org/abs/1904.07852)
- XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks (ECCV 2016)
[paper](https://arxiv.org/abs/1603.05279) | [code (PyTorch)](https://github.com/jiecaoyu/XNOR-Net-PyTorch)
- Trained Quantization Thresholds for Accurate and Efficient Fixed-Point Inference of Deep Neural Networks (2019) [paper](https://arxiv.org/abs/1903.08066) | [code (TensorFlow)](https://github.com/Xilinx/graffitist)
- Relaxed Quantization for Discretized Neural Networks (ICLR 2019) [paper](https://arxiv.org/abs/1810.01875)
- Training and Inference with Integers in Deep Neural Networks (ICLR 2018) [paper](https://arxiv.org/abs/1802.04680) | [code (TensorFlow)](https://github.com/boluoweifenda/WAGE)
- Training Quantized Nets: A Deeper Understanding (NIPS 2017) [paper](https://arxiv.org/abs/1706.02379)
- Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference (2017) [paper](https://arxiv.org/abs/1712.05877)
- Deep Learning with Limited Numerical Precision (2015) [paper](https://arxiv.org/abs/1502.02551)
- Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation (2013) [paper](https://arxiv.org/abs/1308.3432)
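A minimal, hedged sketch of uniform affine ("fake") quantization, the building block behind most of the post-training and quantization-aware schemes listed above; the bit-width and min/max range estimate are illustrative.

```python
# Minimal sketch: uniform affine quantization. Map a float tensor to
# num_bits integers and back; an integer kernel would keep (q, scale,
# zero_point) instead of dequantizing. Bit-width is illustrative.
import torch

def quantize_dequantize(x: torch.Tensor, num_bits: int = 8):
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)   # assumes x is not constant
    zero_point = int(round(qmin - x.min().item() / scale.item()))
    q = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale, scale.item(), zero_point

x = torch.randn(4, 4)
x_hat, scale, zero_point = quantize_dequantize(x)
print((x - x_hat).abs().max())  # error is bounded by roughly scale / 2
```

Quantization-aware training inserts exactly this round-trip into the forward pass and backpropagates through the rounding with the straight-through estimator (see the Bengio et al. paper above).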
## Architecture search

- MobileNets
- Searching for MobileNetV3 (ICCV 2019)
[paper](https://arxiv.org/abs/1905.02244)
- MobileNetV2: Inverted Residuals and Linear Bottlenecks (CVPR 2018)
[paper](https://arxiv.org/abs/1801.04381) | [code and pretrained models (TensorFlow)](https://github.com/tensorflow/models/tree/master/research/slim/nets/mobilenet)
- EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks (ICML 2019)
[paper](https://arxiv.org/abs/1905.11946) | [code and pretrained models (TensorFlow)](https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet)
- MnasNet: Platform-Aware Neural Architecture Search for Mobile (CVPR 2019)
[paper](https://arxiv.org/abs/1807.11626) | [code (TensorFlow)](https://github.com/tensorflow/tpu/tree/master/models/official/mnasnet)
- MorphNet: Fast & Simple Resource-Constrained Learning of Deep Network Structure (CVPR 2018)
[paper](https://arxiv.org/abs/1711.06798) | [code (TensorFlow)](https://github.com/google-research/morph-net)
- ShuffleNets
- ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design (ECCV 2018)
[paper](https://arxiv.org/abs/1807.11164)
- ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices (CVPR 2018)
[paper](https://arxiv.org/abs/1707.01083)
- Multi-Fiber Networks for Video Recognition (ECCV 2018)
[paper](https://arxiv.org/abs/1807.11195) | [code (PyTorch)](https://github.com/cypw/PyTorch-MFNet)
- IGCVs
- IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks (BMVC 2018)
[paper](https://arxiv.org/abs/1806.00178) | [code and pretrained models (MXNet)](https://github.com/homles11/IGCV3)
- IGCV2: Interleaved Structured Sparse Convolutional Neural Networks (CVPR 2018)
[paper](https://arxiv.org/abs/1804.06202)
- Interleaved Group Convolutions for Deep Neural Networks (ICCV 2017)
[paper](https://arxiv.org/abs/1707.02725)
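A minimal sketch of the depthwise-separable (grouped) convolution that MobileNets, ShuffleNets, and the IGCV family build on, comparing parameter counts with a standard convolution; the channel sizes are illustrative.

```python
# Minimal sketch: a standard 3x3 convolution vs. its depthwise-separable
# replacement (depthwise 3x3 + pointwise 1x1). Channel sizes illustrative.
import torch.nn as nn

def count_params(m: nn.Module) -> int:
    return sum(p.numel() for p in m.parameters())

standard = nn.Conv2d(64, 128, kernel_size=3, padding=1)
separable = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=3, padding=1, groups=64),  # depthwise
    nn.Conv2d(64, 128, kernel_size=1),                       # pointwise
)
print(count_params(standard), count_params(separable))  # 73856 vs 8960, ~8x
```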
## PhD theses and overviews

- Quantizing deep convolutional networks for efficient inference: A whitepaper (2018) [paper](https://arxiv.org/abs/1806.08342)
- Algorithms for speeding up convolutional neural networks (2018) [thesis](https://www.skoltech.ru/app/data/uploads/2018/10/Thesis-Final.pdf)
- Model Compression and Acceleration for Deep Neural Networks: The Principles, Progress, and Challenges (2018) [paper](http://cwww.ee.nctu.edu.tw/~cfung/docs/learning/cheng2018DNN_model_compression_accel.pdf)
- Efficient methods and hardware for deep learning (2017) [thesis](https://stacks.stanford.edu/file/druid:qf934gh3708/EFFICIENT%20METHODS%20AND%20HARDWARE%20FOR%20DEEP%20LEARNING-augmented.pdf)
## Frameworks

- [MUSCO](https://github.com/musco-ai) - framework for model compression using tensor decompositions (PyTorch, TensorFlow)
- [AIMET](https://github.com/quic/aimet) - AI Model Efficiency Toolkit (PyTorch, TensorFlow)
- [Distiller](https://github.com/NervanaSystems/distiller) - package for compression using pruning and low-precision arithmetic (PyTorch)
- [MorphNet](https://github.com/google-research/morph-net) - framework for neural networks architecture learning (TensorFlow)
- [Mayo](https://github.com/deep-fry/mayo) - deep learning framework with fine- and coarse-grained pruning, network slimming, and quantization methods
- [PocketFlow](https://github.com/Tencent/PocketFlow) - framework for model pruning, sparsification, and quantization (TensorFlow)
- [Keras compressor](https://github.com/DwangoMediaVillage/keras_compressor) - compression using low-rank approximations, SVD for matrices, Tucker for tensors.
- [Caffe compressor](https://github.com/yuanyuanli85/CaffeModelCompression) - K-means-based quantization
- [gemmlowp](https://github.com/google/gemmlowp/blob/master/doc/quantization.md#implementation-of-quantized-matrix-multiplication) - Building a quantization paradigm from first principles (C++)
- [NNI](https://github.com/microsoft/nni) - framework for feature engineering, NAS, hyperparameter tuning, and model compression

## Comparison of different approaches
Please see `comparative_results.pdf`.
## Similar repos
- https://github.com/ZhishengWang/Embedded-Neural-Network
- https://github.com/memoiry/Awesome-model-compression-and-acceleration
- https://github.com/sun254/awesome-model-compression-and-acceleration
- https://github.com/guan-yuan/awesome-AutoML-and-Lightweight-Models
- https://github.com/chester256/Model-Compression-Papers
- https://github.com/mapleam/model-compression-and-acceleration-4-DNN
- https://github.com/cedrickchee/awesome-ml-model-compression
- https://github.com/jnjaby/Model-Compression-Acceleration
- https://github.com/he-y/Awesome-Pruning