{"id":19091601,"url":"https://github.com/juliagusak/model-compression-and-acceleration-progress","last_synced_at":"2026-02-26T20:33:40.617Z","repository":{"id":49300157,"uuid":"190425860","full_name":"juliagusak/model-compression-and-acceleration-progress","owner":"juliagusak","description":"Repository to track the progress in model compression and acceleration","archived":false,"fork":false,"pushed_at":"2021-06-19T07:09:47.000Z","size":113,"stargazers_count":105,"open_issues_count":1,"forks_count":20,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-01-02T22:43:47.740Z","etag":null,"topics":["acceleration","architecture-search","compression","knowledge-distillation","low-rank","neural-network","pruning","sparsification","tensor-decomposition"],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/juliagusak.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-06-05T16:01:01.000Z","updated_at":"2024-11-27T09:28:47.000Z","dependencies_parsed_at":"2022-09-19T11:12:14.718Z","dependency_job_id":null,"html_url":"https://github.com/juliagusak/model-compression-and-acceleration-progress","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/juliagusak%2Fmodel-compression-and-acceleration-progress","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/juliagusak%2Fmodel-compression-and-acceleration-progress/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/juliagusak%2Fmodel-compression-and-accelerat
ion-progress/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/juliagusak%2Fmodel-compression-and-acceleration-progress/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/juliagusak","download_url":"https://codeload.github.com/juliagusak/model-compression-and-acceleration-progress/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240138762,"owners_count":19753984,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["acceleration","architecture-search","compression","knowledge-distillation","low-rank","neural-network","pruning","sparsification","tensor-decomposition"],"created_at":"2024-11-09T03:13:55.146Z","updated_at":"2025-11-11T20:32:16.425Z","avatar_url":"https://github.com/juliagusak.png","language":null,"funding_links":[],"categories":["REFERENCE"],"sub_categories":["2023"],"readme":"# Model Compression and Acceleration Progress\nRepository to track the progress in model compression and acceleration\n\n## Low-rank approximation\n- T-Net: Parametrizing Fully Convolutional Nets with a Single High-Order Tensor (CVPR 2019) \n[paper](http://openaccess.thecvf.com/content_CVPR_2019/papers/Kossaifi_T-Net_Parametrizing_Fully_Convolutional_Nets_With_a_Single_High-Order_Tensor_CVPR_2019_paper.pdf)\n- MUSCO: Multi-Stage COmpression of neural networks (ICCVW 2019)\n[paper](http://openaccess.thecvf.com/content_ICCVW_2019/papers/LPCV/Gusak_Automated_Multi-Stage_Compression_of_Neural_Networks_ICCVW_2019_paper.pdf) | [code 
(PyTorch)](https://github.com/juliagusak/musco)\n- Efficient Neural Network Compression (CVPR 2019)\n[paper](https://arxiv.org/abs/1811.12781) | [code (Caffe)](https://github.com/Hyeji-Kim/ENC) \n- Adaptive Mixture of Low-Rank Factorizations for Compact Neural Modeling (ICLR 2019)\n[paper](https://openreview.net/pdf?id=B1eHgu-Fim) | [code (PyTorch)](https://github.com/zuenko/ALRF)\n- Extreme Network Compression via Filter Group Approximation (ECCV 2018)\n[paper](https://arxiv.org/abs/1807.11254)\n- Ultimate tensorization: compressing convolutional and FC layers alike (NIPS 2016 workshop)\n[paper](https://arxiv.org/abs/1611.03214) | [code (TensorFlow)](https://github.com/timgaripov/TensorNet-TF) | [code (MATLAB, Theano + Lasagne)](https://github.com/Bihaqo/TensorNet)\n- Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications (ICLR 2016)\n[paper](https://arxiv.org/abs/1511.06530) \n- Accelerating Very Deep Convolutional Networks for Classification and Detection (IEEE TPAMI 2016)\n[paper](https://arxiv.org/abs/1505.06798)\n- Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition (ICLR 2015)\n[paper](https://arxiv.org/abs/1412.6553) | [code (Caffe)](https://github.com/vadim-v-lebedev/cp-decomposition)\n- Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation (NIPS 2014)\n[paper](https://arxiv.org/abs/1404.0736)\n- Speeding up Convolutional Neural Networks with Low Rank Expansions (2014)\n[paper](https://arxiv.org/abs/1405.3866)\n\n\n## Pruning \u0026 Sparsification\n#### Papers\n- Rethinking the Value of Network Pruning (ICLR 2019, NIPS 2018 workshop) \n[paper](https://arxiv.org/abs/1810.05270) | [code (PyTorch)](https://github.com/Eric-mingjie/rethinking-network-pruning)\n- Dynamic Channel Pruning: Feature Boosting and Suppression (ICLR 2019)\n[paper](https://arxiv.org/abs/1810.05331) | [code](https://github.com/deep-fry/mayo)\n- AutoPruner: An End-to-End Trainable Filter Pruning 
Method for Efficient Deep Model Inference (2019)\n[paper](https://arxiv.org/abs/1805.08941)\n- CLIP-Q: Deep Network Compression Learning by In-Parallel Pruning-Quantization (CVPR 2018)\n[paper](http://www.sfu.ca/~ftung/papers/clipq_cvpr18.pdf)\n- Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks (IJCAI 2018)\n[paper](https://arxiv.org/abs/1808.06866) | [code and models (PyTorch)](https://github.com/he-y/soft-filter-pruning)\n- Discrimination-aware Channel Pruning for Deep Neural Networks (NIPS 2018)\n[paper](https://papers.nips.cc/paper/7367-discrimination-aware-channel-pruning-for-deep-neural-networks.pdf) | [code and pretrained models (PyTorch)](https://github.com/SCUT-AILab/DCP)\n- AMC: AutoML for Model Compression and Acceleration on Mobile Devices (ECCV 2018)\n[paper](https://arxiv.org/abs/1802.03494) | [code (PyTorch)](https://github.com/mit-han-lab/amc-release) | [pretrained models (PyTorch, TensorFlow, TensorFlow Lite)](https://github.com/mit-han-lab/amc-compressed-models)\n- Channel Gating Neural Networks (2018)\n[paper](https://arxiv.org/abs/1805.12549)\n- DSD: Dense-Sparse-Dense Training for Deep Neural Networks (ICLR 2017)\n[paper](https://arxiv.org/abs/1607.04381) | [pretrained models (Caffe)](https://songhan.github.io/DSD/)\n- Channel Pruning for Accelerating Very Deep Neural Networks (ICCV 2017)\n[paper](https://arxiv.org/abs/1707.06168) | [code and pretrained models (Caffe)](https://github.com/yihui-he/channel-pruning) | [code (PyTorch)](https://github.com/Eric-mingjie/rethinking-network-pruning/tree/master/imagenet)\n- Learning Efficient Convolutional Networks through Network Slimming (ICCV 2017)\n[paper](https://arxiv.org/abs/1708.06519) | [code (Torch, PyTorch)](https://github.com/Eric-mingjie/network-slimming)\n- ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression (ICCV 2017)\n[paper](https://arxiv.org/abs/1707.06342) | [pretrained model (Caffe)](https://github.com/Roll920/ThiNet) | [code 
(PyTorch)](https://github.com/Eric-mingjie/rethinking-network-pruning/tree/master/imagenet)\n- Structured Bayesian Pruning via Log-Normal Multiplicative Noise (NIPS 2017)\n[paper](https://papers.nips.cc/paper/7254-structured-bayesian-pruning-via-log-normal-multiplicative-noise.pdf) | [code (TensorFlow, Theano + Lasagne)](https://github.com/necludov/group-sparsity-sbp)\n- SphereFace: Deep Hypersphere Embedding for Face Recognition (CVPR 2017)\n[paper](https://arxiv.org/abs/1704.08063) | [code and pretrained models (Caffe)](https://github.com/isthatyoung/Sphereface-prune) \n- Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding (ICLR 2016)\n[paper](https://arxiv.org/abs/1510.00149)\n- Fast ConvNets Using Group-wise Brain Damage (CVPR 2016)\n[paper](http://openaccess.thecvf.com/content_cvpr_2016/papers/Lebedev_Fast_ConvNets_Using_CVPR_2016_paper.pdf)\n\n#### Repos\n- Pruning + quantization [code and pretrained models (TensorFlow, TensorFlow Lite)](https://github.com/vikranth94/Model-Compression). 
Examples for CIFAR.\n\n\n## Knowledge distillation \n#### Papers\n- Learning Efficient Detector with Semi-supervised Adaptive Distillation (arXiv 2019) [paper](https://arxiv.org/abs/1901.00366) | [code (Caffe)](https://github.com/Tangshitao/Semi-supervised-Adaptive-Distillation)\n- Model compression via distillation and quantization (ICLR 2018) [paper](https://arxiv.org/abs/1802.05668) | [code (PyTorch)](https://github.com/antspy/quantized_distillation)\n- Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks (ICLR 2018 workshop)\n[paper](https://arxiv.org/abs/1709.00513)\n- Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks (BMVC 2018)\n[paper](https://arxiv.org/abs/1709.00513)\n- Net2Net: Accelerating Learning via Knowledge Transfer (ICLR 2016)\n[paper](https://arxiv.org/abs/1511.05641)\n- Distilling the Knowledge in a Neural Network (NIPS 2014)\n[paper](https://arxiv.org/abs/1503.02531)\n- FitNets: Hints for Thin Deep Nets (2014)\n[paper](https://arxiv.org/abs/1412.6550) | [code (Theano + Pylearn2)](https://github.com/adri-romsor/FitNets)\n\n#### Repos\n- [TensorFlow implementation of three papers](https://github.com/chengshengchan/model_compression), with results for CIFAR-10\n\n## Quantization\n- Bayesian Bits: Unifying Quantization and Pruning (2020) [paper](https://arxiv.org/abs/2005.07093)\n- Up or Down? 
Adaptive Rounding for Post-Training Quantization (2020) [paper](https://arxiv.org/abs/2004.10568)\n- Gradient $\\ell_1$ Regularization for Quantization Robustness (ICLR 2020) [paper](https://arxiv.org/abs/2002.07520)\n- Training Binary Neural Networks with Real-to-Binary Convolutions (ICLR 2020)\n[paper](https://arxiv.org/abs/2003.11535) | [code (coming soon)](https://github.com/brais-martinez/real2binary)\n- Data-Free Quantization Through Weight Equalization and Bias Correction (ICCV 2019) [paper](https://arxiv.org/abs/1906.04721) | [code (PyTorch)](https://github.com/jakc4103/DFQ)\n- XNOR-Net++ (2019)\n[paper](https://arxiv.org/abs/1909.13863)\n- Matrix and tensor decompositions for training binary neural networks (2019)\n[paper](https://arxiv.org/pdf/1904.07852.pdf)\n- XNOR-Net (ECCV 2016)\n[paper](https://arxiv.org/abs/1603.05279) | [code (PyTorch)](https://github.com/jiecaoyu/XNOR-Net-PyTorch)\n- Trained Quantization Thresholds for Accurate and Efficient Fixed-Point Inference of Deep Neural Networks (2019) [paper](https://arxiv.org/abs/1903.08066) | [code (TensorFlow)](https://github.com/Xilinx/graffitist)\n- Relaxed Quantization for Discretized Neural Networks (ICLR 2019) [paper](https://arxiv.org/abs/1810.01875)\n- Training and Inference with Integers in Deep Neural Networks (ICLR 2018) [paper](https://arxiv.org/abs/1802.04680) | [code (TensorFlow)](https://github.com/boluoweifenda/WAGE)\n- Training Quantized Nets: A Deeper Understanding (NeurIPS 2017) [paper](https://arxiv.org/abs/1706.02379) \n- Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference (2017) [paper](https://arxiv.org/abs/1712.05877)\n- Deep Learning with Limited Numerical Precision (2015) [paper](https://arxiv.org/abs/1502.02551)\n- Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation (2013) [paper](https://arxiv.org/abs/1308.3432)\n\n\n## Architecture search\n- MobileNets\n  - Searching for MobileNetV3\n  
[paper](https://arxiv.org/abs/1905.02244)\n  - MobileNetV2: Inverted Residuals and Linear Bottlenecks (CVPR 2018)\n  [paper](https://arxiv.org/abs/1801.04381) | [code and pretrained models (TensorFlow)](https://github.com/tensorflow/models/tree/master/research/slim/nets/mobilenet)\n- EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks (ICML 2019)\n[paper](https://arxiv.org/abs/1905.11946) | [code and pretrained models (TensorFlow)](https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet)\n- MnasNet: Platform-Aware Neural Architecture Search for Mobile (CVPR 2019)\n[paper](https://arxiv.org/abs/1807.11626) | [code (TensorFlow)](https://github.com/tensorflow/tpu/tree/master/models/official/mnasnet)\n- MorphNet: Fast \u0026 Simple Resource-Constrained Learning of Deep Network Structure (CVPR 2018) \n[paper](https://arxiv.org/abs/1711.06798) | [code (TensorFlow)](https://github.com/google-research/morph-net)\n- ShuffleNets\n  - ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design (ECCV 2018)\n  [paper](https://arxiv.org/abs/1807.11164)\n  - ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices (CVPR 2018)\n  [paper](https://arxiv.org/abs/1707.01083)\n- Multi-Fiber Networks for Video Recognition (ECCV 2018)\n[paper](https://arxiv.org/abs/1807.11195) | [code (PyTorch)](https://github.com/cypw/PyTorch-MFNet)\n- IGCVs\n  - IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks (BMVC 2018)\n  [paper](https://arxiv.org/abs/1806.00178) | [code and pretrained models (MXNet)](https://github.com/homles11/IGCV3)\n  - IGCV2: Interleaved Structured Sparse Convolutional Neural Networks (CVPR 2018)\n  [paper](https://arxiv.org/abs/1804.06202)\n  - Interleaved Group Convolutions for Deep Neural Networks (ICCV 2017)\n  [paper](https://arxiv.org/abs/1707.02725)\n\n\n## PhD theses and overviews\n\n- Quantizing deep convolutional networks for efficient inference: A 
whitepaper (2018) [paper](https://arxiv.org/abs/1806.08342)\n- Algorithms for speeding up convolutional neural networks (2018) [thesis](https://www.skoltech.ru/app/data/uploads/2018/10/Thesis-Final.pdf)\n- Model Compression and Acceleration for Deep Neural Networks: The Principles, Progress, and Challenges (2018) [paper](http://cwww.ee.nctu.edu.tw/~cfung/docs/learning/cheng2018DNN_model_compression_accel.pdf)\n- Efficient methods and hardware for deep learning (2017) [thesis](https://stacks.stanford.edu/file/druid:qf934gh3708/EFFICIENT%20METHODS%20AND%20HARDWARE%20FOR%20DEEP%20LEARNING-augmented.pdf)\n\n\n## Frameworks\n- [MUSCO](https://github.com/musco-ai) - framework for model compression using tensor decompositions (PyTorch, TensorFlow)\n- [AIMET](https://github.com/quic/aimet) - AI Model Efficiency Toolkit (PyTorch, TensorFlow)\n- [Distiller](https://github.com/NervanaSystems/distiller) - package for compression using pruning and low-precision arithmetic (PyTorch)\n- [MorphNet](https://github.com/google-research/morph-net) - framework for neural network architecture learning (TensorFlow)\n- [Mayo](https://github.com/deep-fry/mayo) - deep learning framework with fine- and coarse-grained pruning, network slimming, and quantization methods \n- [PocketFlow](https://github.com/Tencent/PocketFlow) - framework for model pruning, sparsification, and quantization (TensorFlow implementation) \n- [Keras compressor](https://github.com/DwangoMediaVillage/keras_compressor) - compression using low-rank approximations: SVD for matrices, Tucker for tensors\n- [Caffe compressor](https://github.com/yuanyuanli85/CaffeModelCompression) - K-means based quantization\n- [gemmlowp](https://github.com/google/gemmlowp/blob/master/doc/quantization.md#implementation-of-quantized-matrix-multiplication) - building a quantization paradigm from first principles (C++)\n- [NNI](https://github.com/microsoft/nni) - framework for feature engineering, NAS, hyperparameter tuning, and model compression 
\n\n\n## Comparison of different approaches\n\nPlease see ```comparative_results.pdf```.\n\n\n## Similar repos\n\n- https://github.com/ZhishengWang/Embedded-Neural-Network\n- https://github.com/memoiry/Awesome-model-compression-and-acceleration\n- https://github.com/sun254/awesome-model-compression-and-acceleration\n- https://github.com/guan-yuan/awesome-AutoML-and-Lightweight-Models\n- https://github.com/chester256/Model-Compression-Papers\n- https://github.com/mapleam/model-compression-and-acceleration-4-DNN\n- https://github.com/cedrickchee/awesome-ml-model-compression\n- https://github.com/jnjaby/Model-Compression-Acceleration\n- https://github.com/he-y/Awesome-Pruning\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjuliagusak%2Fmodel-compression-and-acceleration-progress","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjuliagusak%2Fmodel-compression-and-acceleration-progress","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjuliagusak%2Fmodel-compression-and-acceleration-progress/lists"}