Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/chadHGY/awesome-deep-model-compression

Awesome Deep Model Compression
https://github.com/chadHGY/awesome-deep-model-compression

List: awesome-deep-model-compression

awesome-list deep-learning model-compression model-distillation neural-network pruning python quantization

Last synced: 3 months ago
JSON representation

Awesome Deep Model Compression

Awesome Lists containing this project

README

        

# Awesome Deep Model Compression
[![Awesome](https://awesome.re/badge.svg)](https://awesome.re) [![hackmd-github-sync-badge](https://hackmd.io/uDS93NOpStaNuKB2Y1KXLQ/badge)](https://hackmd.io/uDS93NOpStaNuKB2Y1KXLQ)

A useful list of Deep Model Compression related research papers, articles, tutorials, libraries, tools and more.
Currently the Repos are additional given tags either [Pytorch/TF]. To quickly find hands-on Repos in your commonly used framework, please Ctrl+F to get start :smiley:

## Contents
- [Papers](#papers)
- [General](#general)
- [Architecture](#architecture)
- [Quantization](#quantization)
- [Binarization](#binarization)
- [Pruning](#pruning)
- [Distillation](#distillation)
- [Low Rank Approximation](#low-rank-approximation)
- [Articles](#articles)
- [Blogs](#blogs)
- [Tools](#tools)
- [Libraries](#libraries)
- [Cross Platform](#cross-platform)
- [Hard-ware Integration](#hard-ware-integration)
---

## Papers
### General

### Architecture

### Quantization

### Binarization

### Pruning
- The Lottery Ticket Hypothesis | ICLR, 2019, Google | [Paper](https://openreview.net/pdf?id=rJl-b3RcF7) | [Code](https://github.com/google-research/lottery-ticket-hypothesis)
### Distillation

### Low Rank Approximation

---
## Articles
### Blogs
- [Pruning deep neural networks to make them fast and small](https://jacobgil.github.io/deeplearning/pruning-deep-learning) [Pytorch] By using pruning a VGG-16 based Dogs-vs-Cats classifier is made x3 faster and x4 smaller.
- [All The Ways You Can Compress BERT](http://mitchgordon.me/machine/learning/2019/11/18/all-the-ways-to-compress-BERT.html) - An overview of different compression methods for large NLP models (BERT) based on different characteristics and compares their results.
- [Deep Learning Model Compression](https://rachitsingh.com/deep-learning-model-compression/) methods.
- [Do We Really Need Model Compression](http://mitchgordon.me/machine/learning/2020/01/13/do-we-really-need-model-compression.html) in the future?

---
## Tools
### Libraries
- [torch.nn.utils.prune](https://pytorch.org/tutorials/intermediate/pruning_tutorial.html) [Pytorch]
Pytorch official supported sparsify neural networks and custom pruning technique.
- [Neural Network Intelligence](https://nni.readthedocs.io/en/v1.6/model_compression.html)[Pytorch/TF]
[![Star on GitHub](https://img.shields.io/github/stars/microsoft/nni.svg?style=social)](https://github.com/microsoft/nni)
There are some popular model compression algorithms built-in in NNI. Users could further use NNI’s auto tuning power to find the best compressed model, which is detailed in Auto Model Compression.
- [Condensa](https://github.com/NVlabs/condensa) [Pytorch]
[![Star on GitHub](https://img.shields.io/github/stars/NVlabs/condensa.svg?style=social)](https://github.com/NVlabs/condensa)
A Programming System for Neural Network Compression. | [paper](https://arxiv.org/pdf/1911.02497.pdf)
- [IntelLabs distiller](https://github.com/IntelLabs/distiller) [Pytorch] [![Star on GitHub](https://img.shields.io/github/stars/IntelLabs/distiller.svg?style=social)](https://github.com/IntelLabs/distiller)
Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. | [Documentation](https://intellabs.github.io/distiller/)
- [Torch-Pruning](https://github.com/VainF/Torch-Pruning)[Pytorch]
[![Star on GitHub](https://img.shields.io/github/stars/VainF/Torch-Pruning.svg?style=social)](https://github.com/VainF/Torch-Pruning)
A pytorch toolkit for structured neural network pruning and layer dependency.
- [CompressAI](https://github.com/InterDigitalInc/CompressAI) [Pytorch]
[![Star on GitHub](https://img.shields.io/github/stars/InterDigitalInc/CompressAI.svg?style=social)](https://github.com/InterDigitalInc/CompressAI)
A PyTorch library and evaluation platform for end-to-end compression research.
- [Model Compression](https://github.com/j-marple-dev/model_compression)[Pytorch]
[![Star on GitHub](https://img.shields.io/github/stars/j-marple-dev/model_compression.svg?style=social)](https://github.com/j-marple-dev/model_compression)
A onestop pytorch model compression repo. | [Reposhub](https://reposhub.com/python/deep-learning/j-marple-dev-model_compression.html)
- [TensorFlow Model Optimization Toolkit](https://github.com/tensorflow/model-optimization) [TF]
[![Star on GitHub](https://img.shields.io/github/stars/tensorflow/model-optimization.svg?style=social)](https://github.com/tensorflow/model-optimization)
Accompanied blog post, [TensorFlow Model Optimization Toolkit — Pruning API](https://medium.com/tensorflow/tensorflow-model-optimization-toolkit-pruning-api-42cac9157a6a?linkId=67380711)
- [XNNPACK](https://github.com/google/xnnpack)
[![Star on GitHub](https://img.shields.io/github/stars/google/xnnpack.svg?style=social)](https://github.com/google/xnnpack)
XNNPACK is a highly optimized library of floating-point neural network inference operators for ARM, WebAssembly, and x86 (SSE2 level) platforms. It's a based on QNNPACK library. However, unlike QNNPACK, XNNPACK focuses entirely on floating-point operators.

### Cross Platform
- [Loading a TorchScript Model in C++](https://pytorch.org/tutorials/advanced/cpp_export.html) [Pytorch x C++]
From an existing Python model to a serialized representation that can be loaded and executed purely from C++, with no dependency on Python.
- [Open Neural Network Exchange (ONNX)](https://github.com/onnx/tutorials)[Pytorch, TF, Keras...etc]
[![Star on GitHub](https://img.shields.io/github/stars/onnx/onnx.svg?style=social)](https://github.com/onnx/onnx)
An open standard format for representing machine learning models.

### Hard-ware Integration
* [TensorRT (NVIDIA)](https://developer.nvidia.com/tensorrt) [Pytorch, TF,Keras ...etc]




* [torch2trt](https://github.com/NVIDIA-AI-IOT/torch2trt) [Pytorch] [![Star on GitHub](https://img.shields.io/github/stars/NVIDIA-AI-IOT/torch2trt.svg?style=social)](https://github.com/NVIDIA-AI-IOT/torch2trt)
An easy to use PyTorch to TensorRT converter
* [How to Convert a Model from PyTorch to TensorRT and Speed Up Inference](https://learnopencv.com/how-to-convert-a-model-from-pytorch-to-tensorrt-and-speed-up-inference/) [Pytorch]

* [TVM (Apache)](https://github.com/apache/tvm) [[Pytorch](https://tvm.apache.org/docs/tutorials/frontend/from_pytorch.html), [TF](https://tvm.apache.org/docs/tutorials/frontend/from_tensorflow.html#sphx-glr-tutorials-frontend-from-tensorflow-py)] [![Star on GitHub](https://img.shields.io/github/stars/apache/tvm.svg?style=social)](https://github.com/apache/tvm)
Open deep learning compiler stack for cpu, gpu and specialized accelerators

* [Pytorch Glow](https://github.com/pytorch/glow) [![Star on GitHub](https://img.shields.io/github/stars/pytorch/glow.svg?style=social)](https://github.com/pytorch/glow)
Glow is a machine learning compiler and execution engine for hardware accelerators. It is designed to be used as a backend for high-level machine learning frameworks. The compiler is designed to allow state of the art compiler optimizations and code generation of neural network graphs.

* [CoreML (Apple)](https://github.com/apple/coremltools) [Pytorch,TF,Keras,SKLearn ...etc] [![Star on GitHub](https://img.shields.io/github/stars/apple/coremltools.svg?style=social)](https://github.com/apple/coremltools)
Core ML provides a unified representation for all models. Your app uses Core ML APIs and user data to make predictions, and to train or fine-tune models, all on the user’s device.
[Introduction](https://developer.apple.com/documentation/coreml)

* [Tensorflow Lite (Google)](https://www.tensorflow.org/lite) [TF]
An open source deep learning framework for on-device inference.

* [~~Intel Nervana Neon~~](https://github.com/NervanaSystems/neon) [Deprecated] [![Star on GitHub](https://img.shields.io/github/stars/NervanaSystems/neon.svg?style=social)](https://github.com/NervanaSystems/neon)