Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/intel/neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
auto-tuning awq fp4 gptq int4 int8 knowledge-distillation large-language-models low-precision mxformat post-training-quantization pruning quantization quantization-aware-training smoothquant sparsegpt sparsity
- Host: GitHub
- URL: https://github.com/intel/neural-compressor
- Owner: intel
- License: apache-2.0
- Created: 2020-07-21T23:49:56.000Z (about 4 years ago)
- Default Branch: master
- Last Pushed: 2024-05-22T12:02:49.000Z (4 months ago)
- Last Synced: 2024-05-22T12:28:28.833Z (4 months ago)
- Topics: auto-tuning, awq, fp4, gptq, int4, int8, knowledge-distillation, large-language-models, low-precision, mxformat, post-training-quantization, pruning, quantization, quantization-aware-training, smoothquant, sparsegpt, sparsity
- Language: Python
- Homepage: https://intel.github.io/neural-compressor/
- Size: 409 MB
- Stars: 2,009
- Watchers: 34
- Forks: 239
- Open Issues: 32
Metadata Files:
- Readme: README.md
- License: LICENSE
- Security: SECURITY.md
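The repository description above highlights post-training quantization among the library's compression techniques. As a quick illustration, here is a minimal sketch using the neural-compressor 2.x Python API; the toy model and calibration data are placeholders, and exact API names can differ between releases:

```python
import torch
from neural_compressor import PostTrainingQuantConfig, quantization

# Placeholder FP32 model and calibration data; substitute real ones.
model = torch.nn.Sequential(
    torch.nn.Linear(64, 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, 10),
)
calib_loader = torch.utils.data.DataLoader(
    [(torch.randn(64), 0) for _ in range(32)], batch_size=8
)

# quantization.fit() runs calibration and returns an INT8-quantized model.
q_model = quantization.fit(
    model,
    conf=PostTrainingQuantConfig(approach="static"),
    calib_dataloader=calib_loader,
)
q_model.save("./quantized_model")
```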
Awesome Lists containing this project
- awesome-oneapi - neural-compressor - Intel Neural Compressor is an open-source Python library for applying popular model compression techniques, such as pruning, quantization, sparsity, and distillation, on all mainstream deep learning frameworks and Intel extensions. (Table of Contents / AI - Frameworks and Toolkits)
- awesome-approximate-dnn - Intel Neural Compressor - Open-source Python lib for neural network compression | TensorFlow, PyTorch, ONNX Runtime, MXNet | Pruning (magnitude, gradient), Quantization (PTQ, dynamic, QAT, mixed precision), Knowledge Distillation (Tools / Approximations Frameworks)
- StarryDivineSky - intel/neural-compressor
- Awesome-LLM-Compression - Code
- awesome-production-machine-learning - neural-compressor - Intel® Neural Compressor aims to provide popular model compression techniques such as quantization, pruning (sparsity), distillation, and neural architecture search on mainstream frameworks. (Model Storage Optimisation)