Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://intel.github.io/neural-compressor/
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
auto-tuning awq fp4 gptq int4 int8 knowledge-distillation large-language-models low-precision mxformat post-training-quantization pruning quantization quantization-aware-training smoothquant sparsegpt sparsity
Last synced: 10 Jun 2024
https://github.com/tpoisonooo/how-to-optimize-gemm
row-major matmul optimization
arm64 armv7 cuda cuda-kernel gemm-optimization int4 ptx vulkan
Last synced: 27 May 2024
https://github.com/intel/neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
auto-tuning awq fp4 gptq int4 int8 knowledge-distillation large-language-models low-precision mxformat post-training-quantization pruning quantization quantization-aware-training smoothquant sparsegpt sparsity
Last synced: 23 Mar 2024