https://github.com/sun254/awesome-model-compression-and-acceleration

a list of awesome papers on deep model ompression and acceleration
https://github.com/sun254/awesome-model-compression-and-acceleration

List: awesome-model-compression-and-acceleration

Last synced: 2 months ago
JSON representation

a list of awesome papers on deep model ompression and acceleration

Host: GitHub
URL: https://github.com/sun254/awesome-model-compression-and-acceleration
Owner: sun254
Created: 2018-01-25T02:24:04.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2021-06-19T07:01:24.000Z (about 4 years ago)
Last Synced: 2025-04-12T09:04:22.821Z (2 months ago)
Size: 35.2 KB
Stars: 351
Watchers: 26
Forks: 88
Open Issues: 3
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

ultimate-awesome - awesome-model-compression-and-acceleration - A list of awesome papers on deep model ompression and acceleration. (Other Lists / Julia Lists)

README

        # awesome-model-compression-and-acceleration

---

## Paper

#### Overview

- [Model compression as constrained optimization, with application to neural nets. Part I: general framework](https://arxiv.org/abs/1707.01209)

- [Model compression as constrained optimization, with application to neural nets. Part II: quantization](https://arxiv.org/abs/1707.04319)

-[A Survey of Model Compression and Acceleration for Deep Neural Networks](https://arxiv.org/pdf/1710.09282.pdf)

#### Structure

- [Dynamic Capacity Networks](https://arxiv.org/pdf/1511.07838.pdf)

- [ResNeXt: Aggregated Residual Transformations for Deep Neural Networks](https://arxiv.org/pdf/1611.05431.pdf)

- [MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications](https://arxiv.org/pdf/1704.04861.pdf)

- [Xception: Deep Learning with Depthwise Separable Convolutions](https://arxiv.org/pdf/1610.02357.pdf)

- [ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices](https://arxiv.org/abs/1707.01083)

- [ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression](https://arxiv.org/abs/1707.06342)

- [SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size](https://arxiv.org/abs/1602.07360)

- [Residual Attention Network for Image Classification](https://arxiv.org/pdf/1704.06904.pdf)

- [SEP-Nets: Small and Effective Pattern Networks](https://arxiv.org/pdf/1706.03912.pdf)

- [Deep Networks with Stochastic Depth](https://arxiv.org/pdf/1603.09382.pdf)

- [Learning Infinite Layer Networks Without the Kernel Trick](https://arxiv.org/pdf/1606.05316v2.pdf)

- [Coordinating Filters for Faster Deep Neural Networks](https://arxiv.org/pdf/1703.09746v3.pdf)

- [ResBinNet: Residual Binary Neural Network](https://arxiv.org/abs/1711.01243)

- [Squeezedet: Uniﬁed, small, low power fully convolutional neural networks](https://arxiv.org/pdf/1612.01051)

- [Efficient Sparse-Winograd Convolutional Neural Networks](https://openreview.net/pdf?id=r1rqJyHKg)

- [DSD: Dense-Sparse-Dense Training for Deep Neural Networks](https://openreview.net/pdf?id=HyoST_9xl)

- [Fast YOLO: A Fast You Only Look Once System for Real-time Embedded Object Detection in Video](https://arxiv.org/abs/1709.05943)

- [Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation](https://arxiv.org/pdf/1801.04381.pdf)

#### Distillation

- [Dark knowledge](http://www.ttic.edu/dl/dark14.pdf)

- [FitNets: Hints for Thin Deep Nets](https://arxiv.org/pdf/1412.6550.pdf)

- [Net2net: Accelerating learning via knowledge transfer]()

- [Distilling the Knowledge in a Neural Network](https://arxiv.org/abs/1503.02531)

- [MobileID: Face Model Compression by Distilling Knowledge from Neurons](https://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/view/11977)

- [DarkRank: Accelerating Deep Metric Learning via Cross Sample Similarities Transfer](https://arxiv.org/pdf/1707.01220.pdf)

- [Deep Model Compression: Distilling Knowledge from Noisy Teachers](https://arxiv.org/pdf/1610.09650.pdf)

- [Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer](https://arxiv.org/pdf/1612.03928.pdf)

- [Sequence-Level Knowledge Distillation](https://arxiv.org/pdf/1606.07947.pdf)

- [Like What You Like: Knowledge Distill via Neuron Selectivity Transfer](https://arxiv.org/pdf/1707.01219.pdf)

- [Learning Efficient Object Detection Models with Knowledge Distillation](http://papers.nips.cc/paper/6676-learning-efficient-object-detection-models-with-knowledge-distillation.pdf)

- [Data-Free Knowledge Distillation For Deep Neural Networks](https://arxiv.org/pdf/1710.07535.pdf)

- [Learning Loss for Knowledge Distillation with Conditional Adversarial Networks](https://arxiv.org/pdf/1709.00513.pdf)

- [Knowledge Projection for Effective Design of Thinner and Faster Deep Neural Networks](https://arxiv.org/pdf/1710.09505.pdf)

- [Moonshine: Distilling with Cheap Convolutions](https://arxiv.org/pdf/1711.02613.pdf)

- [Model Distillation with Knowledge Transfer from Face Classification to Alignment and Verification](https://arxiv.org/pdf/1709.02929.pdf)

#### Binarization

- [Local Binary Convolutional Neural Networks](https://arxiv.org/pdf/1608.06049.pdf)

- [Binarized Convolutional Neural Networks with Separable Filters for Efficient Hardware Acceleration](https://arxiv.org/pdf/1707.04693.pdf)

- [Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1](https://arxiv.org/pdf/1602.02830.pdf)

- [XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks](https://arxiv.org/pdf/1603.05279.pdf)

- [DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients](https://arxiv.org/pdf/1606.06160.pdf)

#### Quantization

- [Quantize weights and activations in Recurrent Neural Networks](https://arxiv.org/pdf/1611.10176.pdf)

- [The ZipML Framework for Training Models with End-to-End Low Precision: The Cans, the Cannots, and a Little Bit of Deep Learning](https://arxiv.org/pdf/1611.05402.pdf)

- [Quantized Convolutional Neural Networks for Mobile Devices](https://arxiv.org/pdf/1512.06473.pdf)

- [Compressing Deep Convolutional Networks using Vector Quantization](https://arxiv.org/pdf/1412.6115.pdf)

- [Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations](https://arxiv.org/pdf/1609.07061.pdf)

- [Fixed-Point Performance Analysis of Recurrent Neural Networks](https://arxiv.org/abs/1512.01322)

- [Loss-aware Binarization of Deep Networks](https://arxiv.org/pdf/1611.01600.pdf)

- [Towards the Limit of Network Quantization](https://arxiv.org/pdf/1612.01543.pdf)

- [Deep Learning with Low Precision by Half-wave Gaussian Quantization](https://arxiv.org/pdf/1702.00953.pdf)

- [ShiftCNN: Generalized Low-Precision Architecture for Inference of Convolutional Neural Networks](https://arxiv.org/pdf/1706.02393.pdf)

- [Trained Ternary Quantization](https://arxiv.org/pdf/1612.01064.pdf)

#### Pruning

- [Data-Driven Sparse Structure Selection for Deep Neural Networks](https://arxiv.org/pdf/1707.01213.pdf)

- [Fine-Pruning: Joint Fine-Tuning and Compression of a Convolutional Network with Bayesian Optimization](https://arxiv.org/pdf/1707.09102.pdf)

- [Learning to Prune: Exploring the Frontier of Fast and Accurate Parsing](http://www.cs.jhu.edu/~jason/papers/vieira+eisner.tacl17.pdf)

- [Designing Energy-Efficient Convolutional Neural Networks using Energy-Aware Pruning](https://arxiv.org/pdf/1611.05128.pdf)

- [Pruning Filters for Efficient ConvNets](https://arxiv.org/pdf/1608.08710.pdf)

- [Pruning Convolutional Neural Networks for Resource Efficient Inference](https://arxiv.org/pdf/1611.06440.pdf)

- [Soft Weight-Sharing for Neural Network Compression](https://arxiv.org/pdf/1702.04008.pdf)

- [Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding](https://arxiv.org/pdf/1510.00149.pdf)

- [Learning both Weights and Connections for Efficient Neural Networks](https://arxiv.org/pdf/1506.02626.pdf)

- [Dynamic Network Surgery for Efficient DNNs](https://arxiv.org/pdf/1608.04493.pdf)

- [ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA](https://arxiv.org/pdf/1612.00694.pdf)

- [Faster CNNs with Direct Sparse Convolutions and Guided Pruning](https://arxiv.org/pdf/1608.01409.pdf)

#### Low Rank Approximation

- [Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation](https://arxiv.org/pdf/1404.0736.pdf)

- [Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications](https://arxiv.org/pdf/1511.06530.pdf)

- [Efficient and Accurate Approximations of Nonlinear Convolutional Networks](https://arxiv.org/pdf/1411.4229.pdf)

- [Accelerating Very Deep Convolutional Networks for Classification and Detection](https://arxiv.org/pdf/1505.06798.pdf)

- [Convolutional neural networks with low-rank regularization](https://arxiv.org/pdf/1511.06067.pdf)

- [Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation](https://arxiv.org/pdf/1404.0736.pdf)

- [Speeding up convolutional neural networks with low rank expansions](http://www.robots.ox.ac.uk/~vgg/publications/2014/Jaderberg14b/jaderberg14b.pdf)

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/sun254/awesome-model-compression-and-acceleration

Awesome Lists containing this project

README