Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/zuellni/llasa-webui
LLaSA WebUI using ExLlamaV2 and FastAPI.
- Host: GitHub
- URL: https://github.com/zuellni/llasa-webui
- Owner: Zuellni
- License: mit
- Created: 2025-02-01T21:08:22.000Z (10 days ago)
- Default Branch: main
- Last Pushed: 2025-02-06T21:10:24.000Z (5 days ago)
- Last Synced: 2025-02-06T21:33:39.022Z (5 days ago)
- Topics: exllamav2, fastapi, tts
- Language: Python
- Homepage:
- Size: 1010 KB
- Stars: 10
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# LLaSA WebUI
A simple web interface for [LLaSA](https://huggingface.co/collections/HKUSTAudio/llasa-679b87dbd06ac556cc0e0f44) using [ExLlamaV2](https://github.com/turboderp-org/exllamav2) with an [OpenAI](https://platform.openai.com/docs/guides/text-to-speech)-compatible [FastAPI](https://github.com/fastapi/fastapi) server.

## Installation
Clone the repo:
```sh
git clone https://github.com/zuellni/llasa-webui
cd llasa-webui
```

Create a conda/mamba/python env:
```sh
conda create -n llasa-webui python
conda activate llasa-webui
```

Install the dependencies, ignoring any `xcodec2` dependency errors:
```sh
pip install -r requirements.txt
pip install xcodec2 --no-deps
```

Install wheels for [`exllamav2`](https://github.com/turboderp-org/exllamav2/releases/latest) and [`flash-attn`](https://github.com/kingbri1/flash-attention/releases/latest):
```sh
pip install link-to-exllamav2-wheel-goes-here+cu124.torch2.6.0.whl
pip install link-to-flash-attn-wheel-goes-here+cu124.torch2.6.0.whl
```

## Models

LLaSA-1B:
LLaSA-1B:
```sh
git clone https://huggingface.co/hkustaudio/llasa-1b model # bf16
```

LLaSA-3B:
```sh
git clone https://huggingface.co/annuvin/llasa-3b-8.0bpw-h8-exl2 model # 8bpw
git clone https://huggingface.co/hkustaudio/llasa-3b model # bf16
```

LLaSA-8B:
```sh
git clone https://huggingface.co/annuvin/llasa-8b-4.0bpw-exl2 model # 4bpw
git clone https://huggingface.co/annuvin/llasa-8b-6.0bpw-exl2 model # 6bpw
git clone https://huggingface.co/annuvin/llasa-8b-8.0bpw-h8-exl2 model # 8bpw
git clone https://huggingface.co/hkustaudio/llasa-8b model # bf16
```

X-Codec-2:
```sh
git clone https://huggingface.co/annuvin/xcodec2-bf16 codec # bf16
git clone https://huggingface.co/annuvin/xcodec2-fp32 codec # fp32
```

## Usage
```sh
python server.py -m model -c codec -v voices
```
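Once the server is running, you can try the OpenAI-compatible text-to-speech route. This is a sketch only: the port, the endpoint path (OpenAI's `/v1/audio/speech`), and the voice name are assumptions, not values confirmed by this README.

```sh
# Hypothetical request; host, port, and voice name are assumptions
curl http://localhost:8000/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model": "model", "input": "Hello from LLaSA!", "voice": "default"}' \
  -o speech.wav
```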
Add `--cache q4 --dtype bf16` to reduce [VRAM usage](https://www.canirunthisllm.net). You can specify a Hugging Face repo id for `xcodec2`, but you will still need to download one of the LLaSA models listed above.

## Preview
![Preview](assets/preview.png)