AutoQuantLLM
https://github.com/monk1337/autoquantllm
- Host: GitHub
- URL: https://github.com/monk1337/autoquantllm
- Owner: monk1337
- License: apache-2.0
- Created: 2024-04-02T10:29:05.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-04-02T10:55:54.000Z (about 1 year ago)
- Last Synced: 2025-04-28T16:13:05.218Z (23 days ago)
- Language: Shell
- Size: 13.7 KB
- Stars: 6
- Watchers: 1
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Auto-QuantLLM ⚡️
Quantize Large Language Models (LLMs) Locally with a Single Command
## Overview
Auto-QuantLLM is a toolkit designed to simplify the quantization of Large Language Models (LLMs). With an emphasis on ease of use and flexibility, Auto-QuantLLM supports converting LLMs into different efficient formats for local deployment.
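To see why quantized formats matter for local deployment, here is a rough back-of-envelope sketch (illustrative numbers only, not taken from this repo): a model at fp16 uses about 2 bytes per parameter, while a 4-bit quantization uses roughly half a byte per parameter.

```bash
# Back-of-envelope memory estimate for a 2B-parameter model.
# fp16 is ~2 bytes/param; a 4-bit quant is ~0.5 bytes/param.
params=2000000000
fp16_mb=$(( params * 2 / 1024 / 1024 ))
q4_mb=$(( params / 2 / 1024 / 1024 ))
echo "fp16: ~${fp16_mb} MB, 4-bit: ~${q4_mb} MB"
```

So a model that would not fit in a consumer GPU's memory at full precision often fits comfortably after quantization, at some cost in accuracy.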
## Supported Quantization Methods
GGUF, GPTQ, EXL2, AWQ, and HQQ
## Getting Started
### Installation
Clone the repository to get started with Auto-QuantLLM:
```bash
git clone https://github.com/monk1337/AutoQuantLLM.git
cd AutoQuantLLM
```

```bash
# Convert your Hugging Face model to GGUF format for local deployment
# Usage:
# ./scripts/autogguf.sh -m MODEL [-u USERNAME] [-t TOKEN] [-q QUANTIZATION_METHODS]
#
# Example command:
./scripts/autogguf.sh -m unsloth/gemma-2b
```

### More Options
```bash
# If you want to upload the GGUF model to the Hub after conversion, provide your username and token.
# Example command:
./scripts/autogguf.sh -m unsloth/gemma-2b -u user_name -t hf_token

# If you want to specify QUANTIZATION_METHODS:
# Example command:
./scripts/autogguf.sh -m unsloth/gemma-2b -u user_name -t hf_token -q "q4_k_m,q5_k_m"
```
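The `-q` flag takes a comma-separated list of quantization methods. If you keep the methods in a shell list, the argument can be assembled programmatically; this is a hypothetical helper sketch (the variable names are ours, the method names are standard llama.cpp quant types):

```bash
# Build the comma-separated -q value from a list of GGUF quant types.
q_arg=$(printf '%s,' q4_k_m q5_k_m q8_0)
q_arg=${q_arg%,}   # strip the trailing comma
echo "$q_arg"
# Then pass it along, e.g.:
# ./scripts/autogguf.sh -m unsloth/gemma-2b -u user_name -t hf_token -q "$q_arg"
```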