https://github.com/billwuhao/ComfyUI_StepAudioTTS
- Host: GitHub
- URL: https://github.com/billwuhao/ComfyUI_StepAudioTTS
- Owner: billwuhao
- License: apache-2.0
- Created: 2025-02-20T20:25:01.000Z (4 months ago)
- Default Branch: master
- Last Pushed: 2025-02-20T21:37:33.000Z (4 months ago)
- Last Synced: 2025-02-20T22:25:13.671Z (4 months ago)
- Language: Python
- Size: 1.57 MB
- Stars: 1
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README-en.md
- License: LICENSE
Awesome Lists containing this project
- awesome-comfyui - **ComfyUI_StepAudioTTS** - Audio-TTS in ComfyUI. Can speak, rap, sing, or clone voice. (All Workflows Sorted by GitHub Stars)
README
[中文](README.md) | English
# A text-to-speech node for ComfyUI using Step-Audio-TTS. It can speak, rap, sing, or clone a voice.

assets/2025-02-21_05-34-25.png
## Model Download
Download the models to the `ComfyUI\models\TTS` folder.
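As a rough sketch of the download step, the snippet below fetches both model repositories from Hugging Face and checks that the expected folders exist afterwards. It is not part of this project: it assumes the `huggingface_hub` package is installed (`pip install huggingface_hub`), and the helper names `model_targets`, `download_models`, and `missing_models`, as well as the `comfy_root` argument, are hypothetical names chosen for illustration.

```python
from pathlib import Path

# Repository IDs taken from the Hugging Face links in this README.
MODEL_REPOS = [
    "stepfun-ai/Step-Audio-Tokenizer",
    "stepfun-ai/Step-Audio-TTS-3B",
]


def model_targets(comfy_root):
    """Map each repo ID to its expected folder under <comfy_root>/models/TTS."""
    tts_dir = Path(comfy_root) / "models" / "TTS"
    return {repo: tts_dir / repo.split("/")[-1] for repo in MODEL_REPOS}


def download_models(comfy_root):
    """Download both repos; needs network access and huggingface_hub."""
    # Imported lazily so the path helpers work without the package installed.
    from huggingface_hub import snapshot_download

    for repo, target in model_targets(comfy_root).items():
        target.mkdir(parents=True, exist_ok=True)
        snapshot_download(repo_id=repo, local_dir=str(target))


def missing_models(comfy_root):
    """Return the names of expected model folders that do not exist yet."""
    targets = model_targets(comfy_root).values()
    return [target.name for target in targets if not target.is_dir()]
```

Calling `download_models("ComfyUI")` (with the path adjusted to your own installation) should populate the folder, and `missing_models("ComfyUI")` should then return an empty list. Downloading manually from the links below works just as well.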
### Huggingface
| Models | Links |
|-------|-------|
| Step-Audio-Tokenizer | [🤗huggingface](https://huggingface.co/stepfun-ai/Step-Audio-Tokenizer) |
| Step-Audio-TTS-3B | [🤗huggingface](https://huggingface.co/stepfun-ai/Step-Audio-TTS-3B) |

### Modelscope
| Models | Links |
|-------|-------|
| Step-Audio-Tokenizer | [modelscope](https://modelscope.cn/models/stepfun-ai/Step-Audio-Tokenizer) |
| Step-Audio-TTS-3B | [modelscope](https://modelscope.cn/models/stepfun-ai/Step-Audio-TTS-3B) |

After downloading, the `ComfyUI\models\TTS` folder should have the following structure:
```
ComfyUI\models\TTS
├── Step-Audio-Tokenizer
├── Step-Audio-TTS-3B
```

### Welcome to contribute more voices
Name each audio file `{Speaker}_prompt.WAV`, for example `明文_prompt.WAV`. I will add contributed voices to the code so that they can be used directly, with no need for cloning.
The currently supported voices are in the `Step-Audio-speakers` folder. Pull requests adding more voices are welcome.
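To check file names before submitting a voice, a small helper along these lines may be useful. It is only an illustrative sketch, not the code this node uses to load voices: it assumes the naming form shown in the example (`明文_prompt.WAV`), matches the extension case-insensitively, and the function names `speaker_from_filename` and `list_speakers` are hypothetical.

```python
import re
from pathlib import Path

# Assumed pattern: "<Speaker>_prompt.wav", extension matched case-insensitively.
PROMPT_NAME = re.compile(r"^(?P<speaker>.+)_prompt\.wav$", re.IGNORECASE)


def speaker_from_filename(path):
    """Return the speaker name encoded in a prompt file name, or None."""
    match = PROMPT_NAME.match(Path(path).name)
    return match.group("speaker") if match else None


def list_speakers(folder):
    """Collect speaker names from every matching file in `folder`."""
    names = (speaker_from_filename(p) for p in Path(folder).iterdir())
    return sorted(name for name in names if name)
```

For instance, `speaker_from_filename("明文_prompt.WAV")` yields the speaker name `明文`, while a file that does not follow the convention yields `None`.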
## Supports Chinese, English, Korean, Japanese, Sichuanese, Cantonese, etc.
## Acknowledgements
Part of the code for this project comes from:
* [Step-Audio](https://github.com/stepfun-ai/Step-Audio)
* [CosyVoice](https://github.com/FunAudioLLM/CosyVoice)
* [transformers](https://github.com/huggingface/transformers)
* [FunASR](https://github.com/modelscope/FunASR)

Thank you to all the open-source projects for their contributions to this project!