An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with pruning

A curated list of projects in awesome lists tagged with pruning.

https://github.com/datawhalechina/leedl-tutorial

"Hung-yi Lee Deep Learning Tutorial" (《李宏毅深度学习教程》, recommended by Prof. Hung-yi Lee 👍, the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases

bert chatgpt cnn deep-learning diffusion gan leedl-tutorial machine-learning network-compression pruning reinforcement-learning rnn self-attention transfer-learning transformer tutorial

Last synced: 14 May 2025

https://nervanasystems.github.io/distiller/

Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller

automl-for-compression deep-neural-networks distillation early-exit group-lasso jupyter-notebook network-compression onnx pruning pruning-structures pytorch quantization regularization truncated-svd

Last synced: 09 Jul 2025

https://github.com/intellabs/distiller

Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller

automl-for-compression deep-neural-networks distillation early-exit group-lasso jupyter-notebook network-compression onnx pruning pruning-structures pytorch quantization regularization truncated-svd

Last synced: 27 Sep 2025

https://intellabs.github.io/distiller/

Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller

automl-for-compression deep-neural-networks distillation early-exit group-lasso jupyter-notebook network-compression onnx pruning pruning-structures pytorch quantization regularization truncated-svd

Last synced: 03 May 2025

https://github.com/IntelLabs/distiller

Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller

automl-for-compression deep-neural-networks distillation early-exit group-lasso jupyter-notebook network-compression onnx pruning pruning-structures pytorch quantization regularization truncated-svd

Last synced: 20 Mar 2025

https://intel.github.io/neural-compressor/

SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime

auto-tuning awq fp4 gptq int4 int8 knowledge-distillation large-language-models low-precision mxformat post-training-quantization pruning quantization quantization-aware-training smoothquant sparsegpt sparsity

Last synced: 09 Dec 2025

https://github.com/intel/neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

auto-tuning awq fp4 gptq int4 int8 knowledge-distillation large-language-models low-precision mxformat post-training-quantization pruning quantization quantization-aware-training smoothquant sparsegpt sparsity

Last synced: 12 May 2025

https://github.com/quic/aimet

AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.

auto-ml compression deep-learning deep-neural-networks machine-learning network-compression network-quantization open-source opensource pruning quantization

Last synced: 13 May 2025

https://github.com/666DZY666/micronet

micronet, a model compression and deployment library. Compression: (1) quantization: quantization-aware training (QAT), high-bit (>2b) (DoReFa, "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference") and low-bit (≤2b)/ternary and binary (TWN/BNN/XNOR-Net); post-training quantization (PTQ), 8-bit (TensorRT); (2) pruning: normal, regular, and group convolutional channel pruning; (3) group convolution structure; (4) batch-normalization fusion for quantization. Deployment: TensorRT, fp32/fp16/int8 (PTQ calibration), op adaptation (upsample), dynamic shape.

batch-normalization-fuse bnn convolutional-networks dorefa group-convolution integer-arithmetic-only model-compression network-in-network network-slimming neuromorphic-computing onnx post-training-quantization pruning pytorch quantization quantization-aware-training tensorrt tensorrt-int8-python twn xnor-net

Last synced: 20 Mar 2025

https://github.com/paddlepaddle/paddleslim

PaddleSlim is an open-source library for deep model compression and architecture search.

bert compression detection distillation ernie nas pruning quantization segmentation sparsity tensorrt transformer yolov5 yolov6 yolov7

Last synced: 14 May 2025

https://github.com/PaddlePaddle/PaddleSlim

PaddleSlim is an open-source library for deep model compression and architecture search.

bert compression detection distillation ernie nas pruning quantization segmentation sparsity tensorrt transformer yolov5 yolov6 yolov7

Last synced: 20 Mar 2025

https://github.com/tensorflow/model-optimization

A toolkit to optimize ML models built with Keras and TensorFlow for deployment, including quantization and pruning (a minimal pruning sketch follows this entry).

compression deep-learning keras machine-learning ml model-compression optimization pruning quantization quantized-networks quantized-neural-networks quantized-training sparsity tensorflow

Last synced: 12 May 2025
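
Since the toolkit above exposes a documented Keras pruning API, here is a minimal sketch of the usual prune, train, and strip flow. It is illustrative only, assumes the tfmot.sparsity.keras API of recent releases, and uses placeholder training-data names (x_train, y_train).

```python
# Minimal magnitude-pruning sketch with the TensorFlow Model Optimization toolkit.
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Toy model; any Keras model is wrapped the same way.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(10),
])

# Wrap the model so its weights are pruned to 50% sparsity over 1000 steps.
pruning_schedule = tfmot.sparsity.keras.PolynomialDecay(
    initial_sparsity=0.0, final_sparsity=0.5, begin_step=0, end_step=1000)
pruned = tfmot.sparsity.keras.prune_low_magnitude(
    model, pruning_schedule=pruning_schedule)

pruned.compile(optimizer="adam",
               loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))

# Training must include the UpdatePruningStep callback; x_train / y_train are placeholders.
# pruned.fit(x_train, y_train, epochs=2,
#            callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])

# Remove the pruning wrappers before export so the saved model is plain Keras.
final_model = tfmot.sparsity.keras.strip_pruning(pruned)
```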

https://github.com/horseee/LLM-Pruner

[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Supports Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, etc.

baichuan bloom chatglm compression language-model llama llama-2 llama3 llm neurips-2023 pruning pruning-algorithms vicuna

Last synced: 01 Oct 2025

https://github.com/horseee/llm-pruner

[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Supports Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, etc.

baichuan bloom chatglm compression language-model llama llama-2 llama3 llm neurips-2023 pruning pruning-algorithms vicuna

Last synced: 16 May 2025

https://github.com/jacobgil/pytorch-pruning

PyTorch implementation of [1611.06440] Pruning Convolutional Neural Networks for Resource Efficient Inference (a sketch of the paper's Taylor criterion follows this entry).

deep-learning pruning pytorch

Last synced: 04 Apr 2025
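
For readers who want the gist of the paper's criterion without reading the repository, the sketch below is a from-scratch approximation (not this repository's code): it scores each conv filter by the absolute mean of activation times gradient, the first-order Taylor criterion from arXiv:1611.06440, with minor simplifications.

```python
# Taylor-expansion filter importance, simplified: |mean(activation * dL/dactivation)|
# per channel, averaged over the batch and spatial dimensions.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10))

activations = {}

def save_activation(name):
    def hook(module, inputs, output):
        output.retain_grad()          # keep the gradient of this feature map
        activations[name] = output
    return hook

for name, m in model.named_modules():
    if isinstance(m, nn.Conv2d):
        m.register_forward_hook(save_activation(name))

x = torch.randn(8, 3, 32, 32)
y = torch.randint(0, 10, (8,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()

# Per-filter importance; the smallest scores mark the least important filters.
for name, act in activations.items():
    scores = (act * act.grad).mean(dim=(0, 2, 3)).abs()
    print(name, scores.argsort()[:4].tolist())
```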

https://github.com/alibaba/tinyneuralnetwork

TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.

deep-learning deep-neural-networks model-compression model-converter post-training-quantization pruning pytorch quantization-aware-training

Last synced: 14 Oct 2025

https://github.com/Syencil/mobile-yolov5-pruning-distillation

Pruning and distillation for MobileNetV2-YOLOv5s, with support for ncnn and TensorRT deployment. Ultra-light but with better performance!

distillation mobile-yolov5s ncnn pruning yolov5

Last synced: 20 Apr 2025

https://github.com/sforaidl/kd_lib

A Pytorch Knowledge Distillation library for benchmarking and extending works in the domains of Knowledge Distillation, Pruning, and Quantization.

algorithm-implementations benchmarking data-science deep-learning-library knowledge-distillation machine-learning model-compression pruning pytorch quantization

Last synced: 16 May 2025

https://github.com/he-y/filter-pruning-geometric-median

Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration (CVPR 2019 Oral); a simplified sketch of the criterion follows this entry.

model-compression pruning pytorch

Last synced: 04 Apr 2025
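
A simplified sketch of the FPGM idea, not the official CVPR 2019 code: within a conv layer, the filters nearest the geometric median of all filters are the most replaceable. The geometric median is approximated here by ranking filters on their summed distance to every other filter.

```python
# FPGM-style redundancy scoring: filters with the smallest summed distance to all
# other filters lie closest to the geometric median and are pruning candidates.
import torch
import torch.nn as nn

def fpgm_redundant_filters(conv: nn.Conv2d, prune_ratio: float = 0.3):
    w = conv.weight.detach().flatten(1)         # [out_channels, in*kh*kw]
    dist = torch.cdist(w, w, p=2)               # pairwise Euclidean distances
    scores = dist.sum(dim=1)                    # small score => near the median
    n_prune = int(prune_ratio * w.size(0))
    return scores.argsort()[:n_prune].tolist()  # indices of filters to prune

conv = nn.Conv2d(64, 128, kernel_size=3)
print(fpgm_redundant_filters(conv, prune_ratio=0.3))
```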

https://github.com/princeton-nlp/llm-shearing

[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning

efficiency llama llama2 llm nlp pre-training pruning

Last synced: 04 Apr 2025

https://github.com/princeton-nlp/LLM-Shearing

[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning

efficiency llama llama2 llm nlp pre-training pruning

Last synced: 16 Apr 2025

https://github.com/huggingface/optimum-intel

🤗 Optimum Intel: Accelerate inference with Intel optimization tools

diffusers distillation inference intel onnx openvino optimization pruning quantization transformers

Last synced: 14 Oct 2025

https://github.com/ModelTC/llmc

[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".

awq benchmark deployment evaluation internlm2 large-language-models lightllm llama3 llm lvlm mixtral omniquant post-training-quantization pruning quantization quarot smoothquant spinquant tool vllm

Last synced: 23 Apr 2025

https://github.com/BenWhetton/keras-surgeon

Pruning and other network surgery for trained Keras models.

deep-learning keras network-surgery pruning

Last synced: 20 Mar 2025

https://github.com/airaria/textpruner

A PyTorch-based model pruning toolkit for pre-trained language models

model-pruning pre-trained-language-models pruning transformer

Last synced: 05 Apr 2025

https://github.com/airaria/TextPruner

A PyTorch-based model pruning toolkit for pre-trained language models

model-pruning pre-trained-language-models pruning transformer

Last synced: 09 May 2025

https://github.com/he-y/soft-filter-pruning

Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks

model-compression pruning pytorch

Last synced: 06 Apr 2025

https://github.com/datawhalechina/llm-deploy

Theory and practice of large language model (LLM) inference and deployment.

knowledge-distillation llm llm-deploy lora pruning quantization

Last synced: 30 Jan 2026

https://github.com/talebolano/yolov3-network-slimming

An implementation of network-slimming pruning for YOLOv3.

network pruning pytorch slimming yolo

Last synced: 20 Apr 2025

https://github.com/megvii-research/Sparsebit

A model compression and acceleration toolbox based on pytorch.

deep-learning post-training-quantization pruning quantization quantization-aware-training sparse tensorrt

Last synced: 12 May 2025

https://github.com/FasterAI-Labs/fasterai

FasterAI: Prune and Distill your models with FastAI and PyTorch

compression fastai knowledge-distillation pruning pytorch

Last synced: 04 May 2025

https://github.com/tasket/wyng-backup

Fast Time Machine-like backups for logical volumes & disk images

backup btrfs img incremental isolation kvm linux lvm pruning qcow2 qubes-os reflinks security vmdk xen xfs

Last synced: 03 Apr 2025

https://github.com/arcee-ai/pruneme

Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models (a block-redundancy scoring sketch follows this entry)

llm merging pruning pruning-algorithms

Last synced: 09 Apr 2025
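
To illustrate the underlying idea (this is not PruneMe's own interface), the sketch below scores blocks of consecutive transformer layers by how little they change the hidden states on a sample prompt; blocks with the highest input/output similarity are the most redundant candidates for removal. The model name and prompt are placeholders.

```python
# Score each block of `n` consecutive layers by the cosine similarity between the
# hidden states entering and leaving the block; high similarity => redundant block.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "sshleifer/tiny-gpt2"                       # tiny placeholder model
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, output_hidden_states=True)
model.eval()

inputs = tok("The quick brown fox jumps over the lazy dog.", return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).hidden_states         # tuple: embeddings + one per layer

n = 2                                              # size of the candidate block
for start in range(len(hidden) - n):
    h_in = hidden[start][0]                        # [seq_len, dim]
    h_out = hidden[start + n][0]
    sim = torch.nn.functional.cosine_similarity(h_in, h_out, dim=-1).mean().item()
    print(f"layers {start}..{start + n - 1}: mean cosine similarity {sim:.3f}")
```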

https://github.com/shekkizh/neuralnetworks.thought-experiments

Observations and notes to understand the workings of neural network models and other thought experiments using Tensorflow

generative-adversarial-network generative-model neural-network optimal-brain-damage pruning uncertainty-neural-networks

Last synced: 22 Jun 2025

https://github.com/princeton-nlp/cofipruning

[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408

bert model-compression nlp pruning

Last synced: 27 Apr 2025

https://github.com/jack-willturner/deep-compression

Learning both Weights and Connections for Efficient Neural Networks https://arxiv.org/abs/1506.02626

deep-learning pruning pytorch sparsity

Last synced: 03 Apr 2025

https://github.com/sayakpaul/adventures-in-tensorflow-lite

This repository contains notebooks that show the usage of TensorFlow Lite for quantizing deep neural networks.

inference model-optimization model-quantization on-device-ml post-training-quantization pruning quantization-aware-training tensorflow-2 tensorflow-lite tf-hub tf-lite-model

Last synced: 20 Sep 2025

https://github.com/sayakpaul/Adventures-in-TensorFlow-Lite

This repository contains notebooks that show the usage of TensorFlow Lite for quantizing deep neural networks.

inference model-optimization model-quantization on-device-ml post-training-quantization pruning quantization-aware-training tensorflow-2 tensorflow-lite tf-hub tf-lite-model

Last synced: 09 Jul 2025

https://github.com/jshilong/fisherpruning

Group Fisher Pruning for Practical Network Compression(ICML2021)

network-compression pruning

Last synced: 02 Oct 2025

https://github.com/arcee-ai/PruneMe

Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models

llm merging pruning pruning-algorithms

Last synced: 16 Apr 2025

https://datawhalechina.github.io/llm-deploy/

Theory and practice of large language model (LLM) inference and deployment.

knowledge-distillation llm llm-deploy lora pruning quantization

Last synced: 24 Sep 2025

https://github.com/bzantium/pytorch-admm-pruning

Prune DNN using Alternating Direction Method of Multipliers (ADMM)

admm deep-neural-networks pruning

Last synced: 30 Apr 2025

https://github.com/vita-group/svite

[NeurIPS'21] "Chasing Sparsity in Vision Transformers: An End-to-End Exploration" by Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang, Zhangyang Wang

dynamic-sparsity efficient-transformers model-compression pruning sparse-training token-slimming vision-transformers

Last synced: 19 Apr 2025

https://github.com/hunto/image_classification_sota

Training ImageNet / CIFAR models with sota strategies and fancy techniques such as ViT, KD, Rep, etc.

cifar image-classification imagenet kd nas pruning pytorch rep transformer vit

Last synced: 03 Feb 2026

https://github.com/julesbelveze/bert-squeeze

🛠️ Tools for Transformers compression using PyTorch Lightning ⚡

bert deebert distillation fastbert lstm nlp pruning pytorch-lightning quantization theseus transformers

Last synced: 07 Apr 2025

https://github.com/luxdamore/nuxt-prune-html

🔌⚡ Nuxt module to prune HTML before sending it to the browser (it removes elements matching CSS selectors); useful for boosting performance by serving bots and audit tools a slimmed-down page with all dynamic-rendering scripts removed

audit bot cheerio dynamic-rendering html lighthouse measure modules nuxt nuxt-module nuxtjs optimization optimization-algorithms optimize pagespeed-insights performance prune pruning vuejs web-vitals

Last synced: 07 May 2025

https://github.com/LuXDAmore/nuxt-prune-html

🔌⚡ Nuxt module to prune HTML before sending it to the browser (it removes elements matching CSS selectors); useful for boosting performance by serving bots and audit tools a slimmed-down page with all dynamic-rendering scripts removed

audit bot cheerio dynamic-rendering html lighthouse measure modules nuxt nuxt-module nuxtjs optimization optimization-algorithms optimize pagespeed-insights performance prune pruning vuejs web-vitals

Last synced: 30 Mar 2025

https://github.com/vita-group/random_pruning

[ICLR 2022] The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training by Shiwei Liu, Tianlong Chen, Xiaohan Chen, Li Shen, Decebal Constantin Mocanu, Zhangyang Wang, Mykola Pechenizkiy

deeplearning erk grasp pruning pruningatinitialization randompruning snip sparsetraining

Last synced: 19 Apr 2025

https://github.com/jack-willturner/batchnorm-pruning

Rethinking the Smaller-Norm-Less-Informative Assumption in Channel Pruning of Convolution Layers https://arxiv.org/abs/1802.00124

batchnorm deep-learning lasso pruning sgd

Last synced: 02 Sep 2025

https://github.com/apple/ml-upscale

Export utility for unconstrained channel pruned models

deep-learning export machine-learning pruning

Last synced: 19 Oct 2025

https://github.com/vita-group/unified-lth-gnn

[ICML 2021] "A Unified Lottery Tickets Hypothesis for Graph Neural Networks", Tianlong Chen*, Yongduo Sui*, Xuxi Chen, Aston Zhang, Zhangyang Wang

co-design co-optimization graph-neural-networks graph-sparsification lottery-ticket-hypothesis pruning

Last synced: 19 Apr 2025

https://github.com/denji/jetbrains-utility

Remove/Backup – settings & cli for macOS (OS X) – DataGrip, AppCode, CLion, Gogland, IntelliJ, PhpStorm, PyCharm, Rider, RubyMine, WebStorm

appcode backup cleaner cleanup clion datagrid gogland intellij jetbrains macos macosx osx phpstorm pruning pycharm rider rubymine script webstorm

Last synced: 29 Sep 2025

https://github.com/vita-group/atmc

[NeurIPS'2019] Shupeng Gui, Haotao Wang, Haichuan Yang, Chen Yu, Zhangyang Wang, Ji Liu, “Model Compression with Adversarial Robustness: A Unified Optimization Framework”

model-compression pruning quantization robustness unified-optimization-framework

Last synced: 07 Aug 2025

https://github.com/joisino/speedbook

Support site for the book "Speeding Up Deep Neural Networks" (『深層ニューラルネットワークの高速化』).

deep-learning deep-neural-networks distillation efficiency neural-networks pruning pytorch quantization

Last synced: 24 Oct 2025

https://github.com/sanster/pytorch-network-slimming

A package to make Network Slimming a little easier (a BatchNorm-scale channel-selection sketch follows this entry)

network-slimming pruning pytorch

Last synced: 13 Jun 2025
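
As a reminder of what Network Slimming does under the hood (this sketch is not the package's API), channel selection boils down to thresholding the BatchNorm scale factors after sparsity-regularized training:

```python
# Network Slimming channel selection: after training with L1 regularization on the
# BatchNorm scale factors (gamma), collect all gammas, pick a global threshold at the
# desired prune ratio, and keep only the channels whose |gamma| exceeds it.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
)

gammas = torch.cat([m.weight.detach().abs().flatten()
                    for m in model.modules() if isinstance(m, nn.BatchNorm2d)])

prune_ratio = 0.5
threshold = gammas.sort().values[int(prune_ratio * gammas.numel())]

for name, m in model.named_modules():
    if isinstance(m, nn.BatchNorm2d):
        keep = m.weight.detach().abs() > threshold
        print(f"{name}: keep {int(keep.sum())}/{keep.numel()} channels")
```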

https://github.com/mit-han-lab/neurips-micronet

[JMLR'20] Champion entry of the NeurIPS 2019 MicroNet Challenge, Efficient Language Modeling track

efficient-model knowledge-distillation language-modeling natural-language-processing pruning quantization

Last synced: 07 Jul 2025

https://github.com/roim1998/apt

[ICML'24 Oral] APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference

bert efficient-deep-learning llama2 llm llm-finetuning peft peft-fine-tuning-llm pruning roberta t5

Last synced: 16 May 2025

https://github.com/vita-group/sparsity-win-robust-generalization

[ICLR 2022] "Sparsity Winning Twice: Better Robust Generalization from More Efficient Training" by Tianlong Chen*, Zhenyu Zhang*, Pengjun Wang*, Santosh Balachandra*, Haoyu Ma*, Zehao Wang, Zhangyang Wang

dynamic-sparse-training generalization lottery-ticket-hypothesis pruning robust-generalization robust-overfitting sparsity

Last synced: 29 Oct 2025

https://github.com/eidoslab/simplify

Simplification of pruned models for accelerated inference | SoftwareX https://doi.org/10.1016/j.softx.2021.100907

deep-learning optimization pruning

Last synced: 24 Jun 2025

https://github.com/kentaroy47/deep-compression.pytorch

Unofficial PyTorch implementation of Deep Compression on CIFAR-10

accuracy checkpoint deep-compression pruning pytorch resnet

Last synced: 10 Oct 2025

https://github.com/sjmikler/snip-pruning

Reproduction and analysis of SNIP paper

pruning reproduction snip

Last synced: 09 Mar 2025

https://github.com/vita-group/sfw-once-for-all-pruning

[ICLR 2022] "Learning Pruning-Friendly Networks via Frank-Wolfe: One-Shot, Any-Sparsity, and No Retraining" by Lu Miao*, Xiaolong Luo*, Tianlong Chen, Wuyang Chen, Dong Liu, Zhangyang Wang

once-and-for-all pruning sparse-neural-networks stochastic-frank-wolfe

Last synced: 19 Apr 2025

https://github.com/gfrogat/prunhild

A small library implementing magnitude-based pruning in PyTorch (a generic magnitude-pruning sketch follows this entry)

lottery-ticket-hypothesis pruning pytorch

Last synced: 23 Jul 2025
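
For orientation, here is a generic magnitude-pruning sketch using PyTorch's built-in torch.nn.utils.prune utilities; it is shown for illustration only and is not prunhild's own API.

```python
# Zero out the smallest-magnitude weights layer by layer, then make the pruning permanent.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

# Prune the 30% smallest-magnitude weights in every Linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)

# Each pruned layer now carries a weight_mask buffer; bake the mask into the weights.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.remove(module, "weight")

sparsity = sum((p == 0).sum().item() for p in model.parameters()) / \
           sum(p.numel() for p in model.parameters())
print(f"overall sparsity: {sparsity:.2%}")
```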

https://github.com/vita-group/smc-bench

[ICLR 2023] "Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!" Shiwei Liu, Tianlong Chen, Zhenyu Zhang, Xuxi Chen, Tianjin Huang, AJAY KUMAR JAISWAL, Zhangyang Wang

benchmark deep-learning dynamic-sparse-training pruning sparse-neural-networks sparsity

Last synced: 19 Apr 2025

https://github.com/vita-group/gan-lth

[ICLR 2021] "GANs Can Play Lottery Too" by Xuxi Chen, Zhenyu Zhang, Yongduo Sui, Tianlong Chen

gan generative-adversarial-network lottery-ticket-hypothesis pruning transfer

Last synced: 19 Apr 2025

https://github.com/modeltc/llmc

llmc is an efficient LLM compression tool with various advanced compression methods, supporting multiple inference backends.

benchmark deployment evaluation large-language-models llm pruning quantization tool

Last synced: 04 Apr 2025

https://github.com/zhengaoli/disp-llm-dimension-independent-structural-pruning

An implementation of the DISP-LLM method from the NeurIPS 2024 paper: Dimension-Independent Structural Pruning for Large Language Models.

llama llm neurips neurips2024 pruning

Last synced: 06 Oct 2025

https://github.com/zjcv/ssl

[NIPS 2016] Learning Structured Sparsity in Deep Neural Networks

channel-pruning filter-pruning layer-pruning network-pruning pruning pytorch resnet structured-sparsity-learning vggnet zcls

Last synced: 12 May 2025

https://github.com/huangcongqing/model-compression-optimization

Model compression and optimization for deployment with PyTorch, including knowledge distillation, quantization, and pruning.

knowledge-distillation model-compression nas pruning pytorch quantization quantized-networks sparsity sparsity-optimization

Last synced: 05 May 2025

https://github.com/alexfjw/jp-ocr-prunned-cnn

Attempting feature map pruning on a CNN trained for Japanese OCR

deep-learning japanese ocr pruning pytorch

Last synced: 14 Apr 2025

https://github.com/mxbonn/ltmp

Code for Learned Thresholds Token Merging and Pruning for Vision Transformers (LTMP), a technique to reduce Vision Transformers to any desired size with minimal loss of accuracy.

computer-vision deep-learning efficiency pruning transformer vision-transformer

Last synced: 07 May 2025