https://github.com/megvii-research/IntLLaMA

IntLLaMA: A fast and light quantization solution for LLaMA

Host: GitHub
URL: https://github.com/megvii-research/IntLLaMA
Owner: megvii-research
License: apache-2.0
Created: 2023-07-13T05:51:58.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2023-07-21T04:49:46.000Z (over 2 years ago)
Last Synced: 2025-04-30T03:36:23.032Z (8 months ago)
Topics: llama, llms, quantization
Language: Python
Homepage:
Size: 6.93 MB
Stars: 18
Watchers: 4
Forks: 0
Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project:
- Awesome-LLM-Compression

README

# IntLLaMA: A fast and light quantization solution for LLaMA

## Introduction
IntLLaMA is a fast and light quantization solution that reduces GPU memory requirements and improves computational efficiency while preserving model intelligence. Specifically, IntLLaMA makes the distribution of hidden states quantization-friendly by using Random Centralization (RandC) to address asymmetry and mitigate the impact of outliers. In addition, Hessian-weighted Singular Value Decomposition (HSVD) is proposed to compensate for the performance degradation caused by representing the model weights at low bit-width. With RandC and HSVD, IntLLaMA quantizes weights to 4 bits and hidden states to 8 bits, and stays close to full-precision performance in perplexity and MMLU accuracy.
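The two ideas in the paragraph above can be illustrated with a small NumPy sketch. This is not the repository's implementation: RandC and HSVD are not specified in this README, so the sketch substitutes plain per-channel mean-centering for RandC and an unweighted truncated SVD of the quantization residual for HSVD. The helper names (`quantize_symmetric`, `mse`), the tensor shapes, and the rank of 8 are all assumptions chosen for illustration.

```python
import numpy as np

def quantize_symmetric(x, n_bits):
    """Uniform symmetric fake-quantization with one per-tensor scale."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.abs(x).max() / qmax
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale  # de-quantized values, for easy error measurement

def mse(a, b):
    return float(np.mean((a - b) ** 2))

rng = np.random.default_rng(0)

# Hidden states with a strongly asymmetric (shifted) distribution.
hidden = rng.normal(loc=6.0, scale=1.0, size=(256, 64))

# 8-bit quantization applied directly vs. after subtracting a per-channel
# center (stand-in for RandC; the center is kept in float here for simplicity).
err_raw = mse(hidden, quantize_symmetric(hidden, 8))
center = hidden.mean(axis=0, keepdims=True)
err_centered = mse(hidden, quantize_symmetric(hidden - center, 8) + center)

# 4-bit weights plus a low-rank correction taken from the SVD of the
# quantization residual (stand-in for HSVD, without Hessian weighting).
weight = rng.normal(size=(64, 64))
w_q = quantize_symmetric(weight, 4)
u, s, vt = np.linalg.svd(weight - w_q, full_matrices=False)
rank = 8  # illustrative choice
low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]
err_w = mse(weight, w_q)
err_w_comp = mse(weight, w_q + low_rank)

print(f"hidden-state MSE  raw: {err_raw:.2e}  centered: {err_centered:.2e}")
print(f"weight MSE        4-bit: {err_w:.2e}  4-bit + low-rank: {err_w_comp:.2e}")
```

Centering shrinks the range that the 8-bit grid has to cover, so the quantization step gets smaller. The low-rank term recovers part of the error left by the 4-bit weights at the cost of a few extra parameters.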

## Update News
- 2023-07-13: Released the code for LoRA instruction fine-tuning. More information can be found in
- 2023-07-13: Released a 4w8f ChatGLMv2-6B model, which achieves in C-Eval and speedup . More details can be found in Table 1.
- 2023-07-12: Released the code for converting a full-precision model to a quantized model.

## Acknowledgement
IntLLaMA was inspired by several open source projects. We are grateful for these excellent projects and list them as follows:
- GPTQ
- AWQ
- Alpaca-LoRA
- Stanford-Alpaca

## License
IntLLaMA is released under the Apache 2.0 license.
