Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/dwain-barnes/llm-gguf-auto-converter
Automated Jupyter notebook solution for batch converting Large Language Models to GGUF format with multiple quantization options. Built on llama.cpp with Hugging Face integration.
- Host: GitHub
- URL: https://github.com/dwain-barnes/llm-gguf-auto-converter
- Owner: dwain-barnes
- Created: 2025-01-26T20:29:36.000Z
- Default Branch: main
- Last Pushed: 2025-01-26T20:47:32.000Z
- Last Synced: 2025-01-26T21:26:43.427Z
- Topics: auto-converter, batch-processing, cuda, gguf, huggingface, jupyter-notebook, llama-cpp, llm, model-quantization
- Language: Jupyter Notebook
- Homepage:
- Size: 0 Bytes
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# LLM GGUF Auto Converter
An automated Jupyter notebook solution for converting and quantizing Large Language Models to GGUF format. This tool streamlines the process of converting models from Hugging Face to GGUF format with multiple quantization options. It was inspired by Maxime Labonne's script, but that one was not working for me and was designed for Google Colab.

## Features
- Automated conversion process
- Batch processing of multiple quantization formats (Q2_K to Q8_0)
- Automatic CUDA detection and utilisation
- Integrated Hugging Face upload functionality (see the sketch after this list)
- Progress tracking for long conversions
- Automatic model card generation
- Built on llama.cpp for optimal performance
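To illustrate the CUDA detection and Hugging Face upload features, here is a minimal sketch, assuming `torch` and `huggingface_hub` are installed; the token, repo ID, and file names are placeholders, not the notebook's actual code:

```python
# Minimal sketch: detect CUDA, then upload a finished GGUF file to the Hub.
# All names below (token, repo ID, file names) are placeholders.
import torch
from huggingface_hub import HfApi

# The notebook builds llama.cpp with CUDA support when a GPU is detected.
print(f"CUDA available: {torch.cuda.is_available()}")

api = HfApi(token="your-token-here")
api.create_repo("your-username/your-model-GGUF", exist_ok=True)
api.upload_file(
    path_or_fileobj="model-Q4_K_M.gguf",  # quantized output produced earlier
    path_in_repo="model-Q4_K_M.gguf",
    repo_id="your-username/your-model-GGUF",
)
```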
## Supported Quantization Methods

All standard llama.cpp quantization methods are supported (a batch-quantization sketch follows the list):
- Q2_K: Ultra-lightweight (2-bit)
- Q3_K_M: Balanced lightweight (3-bit)
- Q4_K_M: Standard balanced (4-bit)
- Q5_K_M: Enhanced balanced (5-bit)
- Q6_K: High quality (6-bit)
- Q8_0: Maximum quality (8-bit)
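In a llama.cpp workflow these formats are typically produced by running the `llama-quantize` tool once per method on an intermediate F16 GGUF. A hedged sketch of that batch loop follows; the binary path and file names are assumptions that depend on your llama.cpp build:

```python
# Sketch: quantize one F16 GGUF into every supported format.
# "./llama.cpp/llama-quantize" and the file names are placeholders.
import subprocess

METHODS = ["Q2_K", "Q3_K_M", "Q4_K_M", "Q5_K_M", "Q6_K", "Q8_0"]
for method in METHODS:
    subprocess.run(
        ["./llama.cpp/llama-quantize",
         "model-f16.gguf",        # intermediate full-precision conversion
         f"model-{method}.gguf",  # quantized output for this method
         method],
        check=True,  # stop the batch if any quantization fails
    )
```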
## Prerequisites

- Python 3.8+
- Jupyter Notebook environment
- CUDA capable GPU
- Hugging Face account and API token
- CMake
- Git
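A quick way to sanity-check these prerequisites before launching the notebook; this is a sketch only, and the GPU check assumes `torch` is importable:

```python
# Sketch: fail fast if a prerequisite is missing.
import shutil
import sys

assert sys.version_info >= (3, 8), "Python 3.8+ is required"
for tool in ("git", "cmake"):
    assert shutil.which(tool) is not None, f"{tool} not found on PATH"

import torch  # assumption: torch is installed for the GPU check
assert torch.cuda.is_available(), "A CUDA-capable GPU is required"
```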
## Configuration

Before running the notebook, update these three required variables:
```python
# The Hugging Face model ID to convert (e.g., "mistralai/Mistral-7B-v0.1")
MODEL_ID = "your-model-id-here"

# Your Hugging Face username
USERNAME = "your-username"

# Your Hugging Face API token from https://huggingface.co/settings/tokens
HF_TOKEN = "your-token-here"
```
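For orientation, these variables would typically drive a download step like the sketch below, using `snapshot_download` from `huggingface_hub`; the notebook's actual download code may differ:

```python
# Sketch: fetch the source model using the configured values.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id=MODEL_ID, token=HF_TOKEN)
print(f"Model files downloaded to {local_dir}")
```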
## Quick Start

1. Clone this repository
2. Open the Jupyter notebook
3. Update the configuration variables with your details
4. Run all cells
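To run the whole notebook unattended instead of stepping through the cells, `jupyter nbconvert` can execute it in place; the notebook filename below is a placeholder:

```python
# Sketch: execute the notebook headlessly; "converter.ipynb" is a placeholder.
import subprocess

subprocess.run(
    ["jupyter", "nbconvert", "--to", "notebook", "--execute",
     "--inplace", "converter.ipynb"],
    check=True,
)
```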