https://github.com/elphinkuo/llamaqt.c
Clean C implementation for quantizing a llama2 model and running the quantized model
- Host: GitHub
- URL: https://github.com/elphinkuo/llamaqt.c
- Owner: elphinkuo
- License: apache-2.0
- Created: 2023-08-17T19:06:39.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2023-09-08T17:14:32.000Z (over 1 year ago)
- Last Synced: 2025-01-01T20:07:05.455Z (5 months ago)
- Topics: google-colab, large-language-models, quantization, quantization-algorithms, quantization-efficient-network
- Language: C
- Homepage:
- Size: 455 KB
- Stars: 4
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# llama2qt.c
Clean C implementation for quantizing a llama2 model and running the quantized model. The code contains some modifications (mainly around quantization and running the quantized model) based on [llama2.c](https://github.com/karpathy/llama2.c) (Inference Llama 2 in one file of pure C) by Andrej Karpathy.
Simple instructions:
## 8-bit quantization, grouped per layer, without blocking:

    gcc -O3 -o quantize quantize_8bit.c -lm
    ./quantize {model_name}.bin
## Inference with 8-bit quantization:

    gcc -O3 -march=native runq.c -o runq -lm
    ./runq llama2_7b_8bit.bin -t {temperature} -p {top_p} -n {max_token} -i "{prompt}"
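The core of inference over quantized weights can be sketched as an int8-weight matrix-vector product that applies the stored scale once per output row. This is an assumed illustration of the idea, not the actual `runq.c` API:

```c
#include <stdint.h>

/* Hypothetical int8-weight x float-activation matvec:
 * accumulate in float, then dequantize with the per-tensor scale. */
void matvec_q8(float *out, const int8_t *w, float scale,
               const float *x, int rows, int cols) {
    for (int r = 0; r < rows; r++) {
        float acc = 0.0f;
        for (int c = 0; c < cols; c++) {
            acc += (float)w[r * cols + c] * x[c];
        }
        out[r] = acc * scale;  /* one multiply per row instead of per weight */
    }
}
```

Deferring the scale multiply to the end of each row is what makes quantized inference cheap: the inner loop touches only int8 weights and float activations.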
## 8-bit quantization, grouped by 64 * 64 blocks:

    gcc -O3 -o quantize quantize_8bit_64block.c -lm
## A quick test, using Google Colab:
[Open the 8-bit quantization demo in Colab](https://colab.research.google.com/github/elphinkuo/llamaqt.c/blob/master/quantization_8bit_demo.ipynb)
More details can be found in the [README.md](README.md).