# ComfyUI ExLlama Nodes
A simple local text generator for [ComfyUI](https://github.com/comfyanonymous/ComfyUI) using [ExLlamaV2](https://github.com/turboderp/exllamav2).

## Installation
Clone the repository to `custom_nodes` and install the requirements:
```
git clone https://github.com/Zuellni/ComfyUI-ExLlama-Nodes custom_nodes/ComfyUI-ExLlamaV2-Nodes
pip install -r custom_nodes/ComfyUI-ExLlamaV2-Nodes/requirements.txt
```

On Windows, use the prebuilt wheels for [ExLlamaV2](https://github.com/turboderp/exllamav2/releases/latest) and [FlashAttention](https://github.com/bdashore3/flash-attention/releases/latest):
```
pip install exllamav2-X.X.X+cuXXX.torch2.X.X-cp3XX-cp3XX-win_amd64.whl
pip install flash_attn-X.X.X+cuXXX.torch2.X.X-cp3XX-cp3XX-win_amd64.whl
```

## Usage
Only EXL2, 4-bit GPTQ and FP16 models are supported. You can find them on [Hugging Face](https://huggingface.co).

To use a model with the nodes, clone its repository with `git` or manually download all of its files, and place them in a folder under `models/llm`.
For example, if you want to download the 4-bit [Llama-3.1-8B-Instruct](https://huggingface.co/turboderp/Llama-3.1-8B-Instruct-exl2), use the following command:
```
git lfs install
git clone https://huggingface.co/turboderp/Llama-3.1-8B-Instruct-exl2 -b 4.0bpw models/llm/Llama-3.1-8B-Instruct-exl2-4.0bpw
```
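
If you'd rather not use `git`, the same branch can be fetched with the `huggingface_hub` Python package. A minimal sketch, assuming the package is installed (`pip install huggingface_hub`); it is not a dependency of these nodes:
```
from huggingface_hub import snapshot_download

# Fetch the 4-bit quantization branch into ComfyUI's model directory.
# `revision` selects the branch, equivalent to `git clone -b 4.0bpw`.
snapshot_download(
    repo_id="turboderp/Llama-3.1-8B-Instruct-exl2",
    revision="4.0bpw",
    local_dir="models/llm/Llama-3.1-8B-Instruct-exl2-4.0bpw",
)
```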

> [!TIP]
> You can add your own `llm` path to the [extra_model_paths.yaml](https://github.com/comfyanonymous/ComfyUI/blob/master/extra_model_paths.yaml.example) file and put the models there instead.
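
For instance, an entry along these lines maps an external folder to the `llm` model type; the `comfyui_extras` key and the paths below are illustrative, not part of the stock file:
```
comfyui_extras:
  base_path: /path/to/storage
  llm: models/llm
```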

## Nodes

### ExLlama Nodes

| Node | Parameter | Description |
| :--- | :--- | :--- |
| **Loader** | | Loads models from the `llm` directory. |
| | *cache_bits* | A lower value reduces VRAM usage, but also affects generation speed and quality. |
| | *fast_tensors* | Enabling reduces RAM usage and speeds up model loading. |
| | *flash_attention* | Enabling reduces VRAM usage. Not supported on cards with compute capability lower than 8.0. |
| | *max_seq_len* | Max context length; a higher value means higher VRAM usage. 0 defaults to the model config. |
| **Formatter** | | Formats messages using the model's chat template. |
| | *add_assistant_role* | Appends the assistant role to the formatted output. |
| **Tokenizer** | | Tokenizes input text using the model's tokenizer. |
| | *add_bos_token* | Prepends the input with a `bos` token if enabled. |
| | *encode_special_tokens* | Encodes special tokens such as `bos` and `eos` if enabled, otherwise treats them as normal strings. |
| **Settings** | | Optional sampler settings node. Refer to SillyTavern for parameters. |
| **Generator** | | Generates text based on the given input. |
| | *unload* | Unloads the model after each generation to reduce VRAM usage. |
| | *stop_conditions* | A list of strings to stop generation on, e.g. `"\n"` to stop on a newline. Leave empty to stop only on `eos`. |
| | *max_tokens* | Max new tokens to generate. 0 uses all available context. |

### Text Nodes

| Node | Description |
| :--- | :--- |
| **Convert** | Strips punctuation and whitespace, and changes the case of the input. |
| **Message** | A message for the Formatter node. Can be chained to create a conversation. |
| **Preview** | Displays generated text in the UI. |
| **Replace** | Replaces variable names in curly brackets, e.g. `{a}`, with their values. |
| **String** | A string. That's it. |

## Workflow
An example workflow is embedded in the image below and can be opened in ComfyUI.

![workflow](https://github.com/user-attachments/assets/359c0340-fe0e-4e69-a1b4-259c6ff5a142)