https://github.com/oneflow-inc/oneflow-model-compression
- Host: GitHub
- URL: https://github.com/oneflow-inc/oneflow-model-compression
- Owner: Oneflow-Inc
- License: apache-2.0
- Created: 2021-03-15T05:51:16.000Z (about 5 years ago)
- Default Branch: main
- Last Pushed: 2021-08-16T04:24:17.000Z (almost 5 years ago)
- Last Synced: 2025-03-29T17:12:38.512Z (about 1 year ago)
- Language: Python
- Size: 1.33 MB
- Stars: 7
- Watchers: 50
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Oneflow-Model-Compression
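## Overview
The 炼知 (Lianzhi) technology platform is a model compression platform that brings together a range of compression strategies, including pruning, quantization, and knowledge distillation.
It provides a complete model compression solution for many natural language and computer vision scenarios, such as text classification, inference, and image classification.
The platform also keeps extending its benchmarks of each compression strategy on classic open-source tasks, for users to consult.
In addition, it provides functional operators for each compression strategy, so that users can apply and reproduce methods from recent papers and build on the compression operators in their own development.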
## Features
- Quantization
  - Algorithms
    - deep compression: Han S, Mao H, Dally W J. "Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding" arXiv preprint arXiv:1510.00149 (2015).
    - NVIDIA TensorRT: a C++ library for high performance inference on NVIDIA GPUs and deep learning accelerators.
- Pruning
  - Algorithms
    - bn channel slimming: Zhuang Liu, Jianguo Li, Zhiqiang Shen. "Learning Efficient Convolutional Networks through Network Slimming" arXiv preprint arXiv:1708.06519 (2017).
    - conv channel slimming: Hao Li, Asim Kadav, Igor Durdanovic. "Pruning Filters for Efficient ConvNets" arXiv preprint arXiv:1608.08710 (2016).
    - conv channel slimming: Hengyuan Hu, Rui Peng, Yu-Wing Tai. "Network Trimming: A Data-Driven Neuron Pruning Approach towards Efficient Deep Architectures" arXiv preprint arXiv:1607.03250 (2016).
- Knowledge distillation
  - Algorithms
    - Knowledge Distillation: Hinton, Geoffrey, Oriol Vinyals, and Jeff Dean. "Distilling the knowledge in a neural network." arXiv preprint arXiv:1503.02531 (2015).
    - Distilled-BiLSTM: Tang, Raphael, et al. "Distilling task-specific knowledge from bert into simple neural networks." arXiv preprint arXiv:1903.12136 (2019).
    - BERT-PKD: Sun, Siqi, et al. "Patient knowledge distillation for bert model compression." arXiv preprint arXiv:1908.09355 (2019).
    - TinyBERT: Jiao, Xiaoqi, et al. "Tinybert: Distilling bert for natural language understanding." arXiv preprint arXiv:1909.10351 (2019).
    - MobileBERT: Sun, Zhiqing, et al. "Mobilebert: a compact task-agnostic bert for resource-limited devices." arXiv preprint arXiv:2004.02984 (2020).
    - BERT-Theseus: Xu, Canwen, et al. "Bert-of-theseus: Compressing bert by progressive module replacing." arXiv preprint arXiv:2002.02925 (2020).
    - Improved BERT-Theseus: Xu, Canwen, et al. "Bert-of-theseus: Compressing bert by progressive module replacing." arXiv preprint arXiv:2002.02925 (2020).
  - Related documentation
    - Knowledge distillation API documentation
    - Knowledge distillation quick start
    - Knowledge Distillation algorithm documentation
    - Distilled-BiLSTM algorithm documentation
    - BERT-PKD algorithm documentation
    - TinyBERT algorithm documentation
    - BERT-Theseus algorithm documentation
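
As a rough illustration of the Knowledge Distillation entry (Hinton et al., 2015), the sketch below computes the distillation loss for a single example in plain Python. It is not this repository's implementation and does not use the Oneflow API; the function names, `temperature`, and `alpha` are assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, stabilised by subtracting the max."""
    scaled = [z / temperature for z in logits]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Hypothetical helper: weighted hard-label and soft-target loss."""
    # Hard-label term: ordinary cross-entropy at temperature 1.
    hard = -math.log(softmax(student_logits)[label])
    # Soft-target term: KL(teacher || student) at the shared temperature.
    q_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    soft = sum(t * (math.log(t) - math.log(s))
               for t, s in zip(q_teacher, q_student) if t > 0)
    # The T^2 factor keeps the soft-term gradients on a comparable scale
    # when the temperature changes, as suggested in the paper.
    return alpha * hard + (1.0 - alpha) * temperature ** 2 * soft
```

When the student reproduces the teacher's logits exactly, the soft term is zero and only the hard-label cross-entropy remains; `alpha` then controls how much the ground-truth label counts relative to the teacher's soft targets.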
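## Usage
- Introduction to Oneflow: [an introduction to the Oneflow deep learning framework and environment setup instructions](https://github.com/Oneflow-Inc/oneflow).
- Oneflow quick start: a [simple example](http://docs.oneflow.org/quick_start/quickstart_in_3_min.html) that shows how to get started with Oneflow in 3 minutes.
- Model compression API documentation: user-facing interface documentation covering the following features:
  - [Quantization](./docs/API_quant.md)
  - [Pruning](./docs/API_prune.md)
  - [Knowledge distillation](./docs/API_knowledge_distill.md)
- Advanced tutorials: usage examples for CV and NLP application tasks, step-by-step algorithm usage, and tutorials on advanced features.
  - Quantization documentation: [usage examples](./model_compress/quantization/tutorial.md) for quantization, mainly int8 quantization.
  - Pruning documentation: the channel pruning implementation and [usage examples](./model_compress/ChannelSlimming/readme.md), mainly covering the different pruning operators for CNN and DNN models.
  - [Knowledge distillation](./model_compress/distil) documentation: implementations of knowledge distillation papers and usage examples, mainly covering [KD](./model_compress/distil/examples/knowledge_distillation/README.md), [Distilled-BiLSTM](./model_compress/distil/examples/distilled-bilstm/README.md), [BERT-PKD](./model_compress/distil/examples/bert-pkd/README.md), [TinyBERT](./model_compress/distil/examples/tinybert/README.md), [BERT-Theseus](./model_compress/distil/theseus/README.md), and other algorithms.
  - [TensorRT quantized deployment](./docs/API_quant.md): how to deploy a quantized Oneflow model with TensorRT.
- [Model zoo](./docs/model_zoo.md): experimental results for each compression algorithm on text classification, inference, and image classification datasets, including model accuracy, model size, and inference speed.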
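
For readers new to int8 quantization, the following is a minimal sketch of symmetric per-tensor quantization in plain Python. It assumes a simple max-abs scale; it is not the repository's quantization code, and it is not TensorRT's calibration procedure. The function names are hypothetical.

```python
def quantize_int8(values):
    """Hypothetical helper: symmetric per-tensor int8 quantization."""
    max_abs = max(abs(v) for v in values)
    # Map the largest magnitude to 127; use 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the symmetric range.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate float values from int8 codes."""
    return [c * scale for c in codes]


weights = [-1.0, 0.0, 0.25, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the accuracy cost traded for storing one byte per weight.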