Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/Zuellni/ComfyUI-ExLlama-Nodes
ExLlama nodes for ComfyUI.
- Host: GitHub
- URL: https://github.com/Zuellni/ComfyUI-ExLlama-Nodes
- Owner: Zuellni
- License: MIT
- Created: 2023-09-11T11:14:56.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-08-07T18:06:27.000Z (3 months ago)
- Last Synced: 2024-08-07T21:25:44.713Z (3 months ago)
- Topics: comfyui, exllama, stable-diffusion
- Language: Python
- Size: 165 KB
- Stars: 105
- Watchers: 4
- Forks: 12
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-comfyui - ExLlama-Nodes
README
# ComfyUI ExLlama Nodes
A simple local text generator for [ComfyUI](https://github.com/comfyanonymous/ComfyUI) using [ExLlamaV2](https://github.com/turboderp/exllamav2).

## Installation
Clone the repository to `custom_nodes` and install the requirements:
```
cd custom_nodes
git clone https://github.com/Zuellni/ComfyUI-ExLlama-Nodes
pip install -r ComfyUI-ExLlama-Nodes/requirements.txt
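# Note: if you use the Windows portable build of ComfyUI, install into its
# embedded Python instead. The relative path below is an assumption based on
# the usual portable layout; adjust it to your install:
..\..\python_embeded\python.exe -m pip install -r ComfyUI-ExLlama-Nodes/requirements.txt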
```

Use wheels for [ExLlamaV2](https://github.com/turboderp/exllamav2/releases/latest) and [FlashAttention](https://github.com/bdashore3/flash-attention/releases/latest) on Windows:
```
pip install exllamav2-X.X.X+cuXXX.torch2.X.X-cp3XX-cp3XX-win_amd64.whl
pip install flash_attn-X.X.X+cuXXX.torch2.X.X-cp3XX-cp3XX-win_amd64.whl
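# To pick wheels with matching version tags, check your Python, torch, and
# CUDA versions first (a quick sketch; assumes torch is already installed):
python --version
python -c "import torch; print(torch.__version__, torch.version.cuda)"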
```

## Usage
Only EXL2, 4-bit GPTQ and FP16 models are supported. You can find them on [Hugging Face](https://huggingface.co).
To use a model with the nodes, you should clone its repository with `git` or manually download all the files and place them in a folder in `models/llm`.
For example, if you'd like to download the 4-bit [Llama-3.1-8B-Instruct](https://huggingface.co/turboderp/Llama-3.1-8B-Instruct-exl2):
```
cd models
mkdir llm
cd llm
git lfs install
git clone https://huggingface.co/turboderp/Llama-3.1-8B-Instruct-exl2 -b 4.0bpw
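# Alternatively, an equivalent sketch using the Hugging Face CLI
# (assumes the huggingface_hub package is installed):
huggingface-cli download turboderp/Llama-3.1-8B-Instruct-exl2 --revision 4.0bpw --local-dir Llama-3.1-8B-Instruct-exl2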
```

> [!TIP]
> You can add your own `llm` path to the [extra_model_paths.yaml](https://github.com/comfyanonymous/ComfyUI/blob/master/extra_model_paths.yaml.example) file and put the models there instead.
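For reference, a minimal sketch of what such an entry might look like (the section name and path here are placeholder assumptions, not taken from the project's docs):

```
# hypothetical extra_model_paths.yaml entry; replace the path with your own
my_models:
  llm: /absolute/path/to/my/llm/models
```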
## Nodes

### ExLlama Nodes

| Node | Parameter | Description |
| :--- | :--- | :--- |
| Loader | | Loads models from the `llm` directory. |
| | `cache_bits` | A lower value reduces VRAM usage, but also affects generation speed and quality. |
| | `fast_tensors` | Enabling reduces RAM usage and speeds up model loading. |
| | `flash_attention` | Enabling reduces VRAM usage. Not supported on cards with compute capability lower than `8.0`. |
| | `max_seq_len` | Max context; a higher value means higher VRAM usage. `0` will default to the model config. |
| Formatter | | Formats messages using the model's chat template. |
| | `add_assistant_role` | Appends the assistant role to the formatted output. |
| Tokenizer | | Tokenizes input text using the model's tokenizer. |
| | `add_bos_token` | Prepends the input with a `bos` token if enabled. |
| | `encode_special_tokens` | Encodes special tokens such as `bos` and `eos` if enabled, otherwise treats them as normal strings. |
| Settings | | Optional sampler settings node. Refer to SillyTavern for parameters. |
| Generator | | Generates text based on the given input. |
| | `unload` | Unloads the model after each generation to reduce VRAM usage. |
| | `stop_conditions` | A list of strings to stop generation on, e.g. `"\n"` to stop on newline. Leave empty to stop only on `eos`. |
| | `max_tokens` | Max new tokens to generate. `0` will use all available context. |

### Text Nodes

| Node | Description |
| :--- | :--- |
| Clean | Strips punctuation, fixes whitespace, and changes case for input text. |
| Message | A message for the `Formatter` node. Can be chained to create a conversation. |
| Preview | Displays generated text in the UI. |
| Replace | Replaces variable names in curly brackets, e.g. `{a}`, with their values. |
| String | A string constant. |
## Workflow
An example workflow is embedded in the image below and can be opened in ComfyUI.

![Workflow](https://github.com/user-attachments/assets/359c0340-fe0e-4e69-a1b4-259c6ff5a142)