https://github.com/modeltc/quant_horizon
- Host: GitHub
- URL: https://github.com/modeltc/quant_horizon
- Owner: ModelTC
- License: apache-2.0
- Created: 2024-11-28T09:54:09.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-01-10T08:53:02.000Z (about 1 year ago)
- Last Synced: 2025-05-07T19:57:47.701Z (9 months ago)
- Language: Cuda
- Size: 7.08 MB
- Stars: 10
- Watchers: 6
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# quant_horizon
`quant_horizon` is a benchmarking framework designed to evaluate the performance of different GPU kernels.
## Prerequisites
To run the benchmark, you need to have the following installed:
- PyTorch (with CUDA support)
- CUDA Toolkit
We also provide prebuilt Docker images with Python 3.11 and PyTorch 2.5.1 for CUDA 12.4 and CUDA 12.1:
```bash
# docker-hub python3.11 torch2.5.1 cuda124
docker pull llmcompression/llmc:pure-24112502-cu124
# docker-hub python3.11 torch2.5.1 cuda121
docker pull llmcompression/llmc:pure-24112502-cu121
# aliyun-hub python3.11 torch2.5.1 cuda124
docker pull registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:pure-24112502-cu124
# aliyun-hub python3.11 torch2.5.1 cuda121
docker pull registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:pure-24112502-cu121
# Then create a container
docker run --gpus all -itd --ipc=host --name [name] -v [path]:[path] --entrypoint /bin/bash [image_id]
```
Install `quant_horizon` and its dependencies in editable mode from the repository root:
```bash
cd quant_horizon
pip install -v -e .
```
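After installation, a quick environment check can confirm that the two prerequisites listed above are visible. The snippet below is a minimal sketch and not part of `quant_horizon`; the function name `check_prerequisites` is hypothetical, and looking for `nvcc` on the `PATH` is only an assumed proxy for "CUDA Toolkit installed".

```python
import importlib.util
import shutil


def check_prerequisites():
    """Report whether PyTorch, a CUDA-enabled device, and nvcc are visible.

    Hypothetical helper for illustration; it is not shipped with quant_horizon.
    """
    report = {
        # find_spec returns None when the package cannot be imported.
        "torch_installed": importlib.util.find_spec("torch") is not None,
        # Assumption: the CUDA Toolkit exposes the nvcc compiler on PATH.
        "nvcc_on_path": shutil.which("nvcc") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        import torch

        # True only if PyTorch was built with CUDA and a GPU is usable.
        report["cuda_available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

If `cuda_available` is reported as `no` inside a container, the container may have been started without `--gpus all`.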
## Usage
### Benchmark a single shape
```bash
cd examples
python bench_single_shape.py
```
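The README does not describe how `bench_single_shape.py` measures a kernel, so the following is only a generic sketch of timing one GEMM shape, not the script's implementation. The names `benchmark` and `gemm_tflops` are hypothetical. The `sync` argument exists because CUDA kernel launches are asynchronous: when timing a PyTorch GPU operation from the host, passing `torch.cuda.synchronize` as `sync` makes the timer wait for the kernel to finish.

```python
import statistics
import time


def benchmark(fn, warmup=10, iters=100, sync=None):
    """Return the median wall-clock time of fn() in milliseconds.

    Hypothetical helper. `sync` is an optional callable that blocks until
    queued work is done (for example torch.cuda.synchronize on a GPU).
    """
    # Warm-up runs absorb one-time costs such as lazy initialization.
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()
        samples.append((time.perf_counter() - start) * 1e3)
    # The median is less sensitive to outliers than the mean.
    return statistics.median(samples)


def gemm_tflops(m, n, k, millis):
    """Convert a GEMM time to TFLOP/s, counting 2*M*N*K operations."""
    return 2 * m * n * k / (millis * 1e-3) / 1e12


if __name__ == "__main__":
    # CPU stand-in workload so the sketch runs without a GPU.
    ms = benchmark(lambda: sum(range(10_000)), warmup=5, iters=50)
    print(f"median time: {ms:.4f} ms")
```

Host-side timers include launch overhead; for very short kernels, CUDA events usually give a tighter measurement.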
### Benchmark all shapes in the transformer model
```bash
cd examples
# Only the model's config.json needs to be present in the [model_path] folder.
python bench_model_shape.py --model [model_path] --tp 1 --bs 1 --seqlen 2048
```
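To illustrate how `--tp`, `--bs`, and `--seqlen` can combine with a `config.json` to produce GEMM shapes, here is a minimal sketch. It is not the logic of `bench_model_shape.py`, which the README does not document. It assumes a Llama-style decoder layer, Hugging Face style config keys (`hidden_size`, `intermediate_size`, `num_attention_heads`, optional `num_key_value_heads` and `head_dim`), fused QKV and gate/up projections, and a Megatron-style split where column-parallel layers divide N by `tp` and row-parallel layers divide K by `tp`. The function names are hypothetical.

```python
import json
from pathlib import Path


def load_config(model_path):
    """Read config.json from the model folder."""
    with open(Path(model_path) / "config.json") as f:
        return json.load(f)


def _split(size, tp):
    """Divide a dimension across tp ranks, rejecting uneven splits."""
    if size % tp != 0:
        raise ValueError(f"{size} is not divisible by tp={tp}")
    return size // tp


def linear_shapes(config, tp=1, bs=1, seqlen=2048):
    """Return {layer_name: (M, N, K)} for one decoder layer.

    Assumption: Llama-style layout with Megatron-style tensor parallelism.
    M is the token count, N the output features, K the input features.
    """
    hidden = config["hidden_size"]
    inter = config["intermediate_size"]
    heads = config["num_attention_heads"]
    # Grouped-query attention configs list fewer KV heads than query heads.
    kv_heads = config.get("num_key_value_heads") or heads
    head_dim = config.get("head_dim") or hidden // heads

    m = bs * seqlen
    q_out = heads * head_dim
    kv_out = kv_heads * head_dim
    return {
        # Column-parallel: output features are split across ranks.
        "qkv_proj": (m, _split(q_out + 2 * kv_out, tp), hidden),
        "gate_up_proj": (m, _split(2 * inter, tp), hidden),
        # Row-parallel: input features are split across ranks.
        "o_proj": (m, hidden, _split(q_out, tp)),
        "down_proj": (m, hidden, _split(inter, tp)),
    }


if __name__ == "__main__":
    # Illustrative values only; replace with load_config("[model_path]").
    example = {
        "hidden_size": 4096,
        "intermediate_size": 11008,
        "num_attention_heads": 32,
    }
    for name, shape in linear_shapes(example, tp=1, bs=1, seqlen=2048).items():
        print(name, shape)
```

Only the config values are needed for this kind of shape derivation, which is consistent with the script requiring just `config.json` rather than model weights.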