Projects in Awesome Lists tagged with tts

https://github.com/corentinj/real-time-voice-cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

deep-learning python pytorch tensorflow tts voice-cloning

Last synced: 14 May 2025

https://github.com/CorentinJ/Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

deep-learning python pytorch tensorflow tts voice-cloning

Last synced: 14 Mar 2025

https://github.com/rvc-boss/gpt-sovits

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

text-to-speech tts vits voice-clone voice-cloneai voice-cloning

Last synced: 12 May 2025

https://github.com/coqui-ai/tts

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

deep-learning glow-tts hifigan melgan multi-speaker-tts python pytorch speaker-encoder speaker-encodings speech speech-synthesis tacotron text-to-speech tts tts-model vocoder voice-cloning voice-conversion voice-synthesis

Last synced: 12 May 2025

https://github.com/unslothai/unsloth

Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥

deepseek deepseek-r1 fine-tuning finetuning gemma gemma3 llama llama-4 llama3 llama4 llm llms lora mistral qlora qwen qwen3 text-to-speech tts unsloth

Last synced: 12 May 2025

https://github.com/babysor/mockingbird

🚀 AI voice cloning: Clone a voice in 5 seconds to generate arbitrary speech in real-time

ai deep-learning pytorch speech text-to-speech tts

Last synced: 12 May 2025

https://github.com/2noise/chattts

A generative speech model for daily dialogue.

agent chat chatgpt chattts chinese chinese-language english english-language gpt llm llm-agent natural-language-inference python text-to-speech torch torchaudio tts

Last synced: 12 May 2025

https://github.com/babysor/MockingBird

🚀 AI voice cloning: Clone a voice in 5 seconds to generate arbitrary speech in real-time

ai deep-learning pytorch speech text-to-speech tts

Last synced: 20 Mar 2025

https://github.com/2noise/ChatTTS

A generative speech model for daily dialogue.

agent chat chatgpt chattts chinese chinese-language english english-language gpt llm llm-agent natural-language-inference python text-to-speech torch torchaudio tts

Last synced: 24 Mar 2025

https://github.com/coqui-ai/TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

deep-learning glow-tts hifigan melgan multi-speaker-tts python pytorch speaker-encoder speaker-encodings speech speech-synthesis tacotron text-to-speech tts tts-model vocoder voice-cloning voice-conversion voice-synthesis

Last synced: 14 Mar 2025

https://github.com/RVC-Boss/GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

text-to-speech tts vits voice-clone voice-cloneai voice-cloning

Last synced: 24 Mar 2025

https://github.com/mudler/localai

:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed, P2P inference

ai api audio-generation distributed gemma gpt4all image-generation kubernetes libp2p llama llama3 llm mamba mistral musicgen rerank rwkv stable-diffusion text-generation tts

Last synced: 12 May 2025

https://github.com/go-skynet/LocalAI

:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed, P2P inference

ai api audio-generation distributed gemma gpt4all image-generation kubernetes libp2p llama llama3 llm mamba mistral musicgen rerank rwkv stable-diffusion text-generation tts

Last synced: 03 May 2025

https://github.com/myshell-ai/OpenVoice

Instant voice cloning by MIT and MyShell.

text-to-speech tts voice-clone zero-shot-tts

Last synced: 20 Mar 2025

https://github.com/mudler/LocalAI

:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed inference

ai api audio-generation distributed gemma gpt4all image-generation kubernetes llama llama3 llm mamba mistral musicgen p2p rerank rwkv stable-diffusion text-generation tts

Last synced: 14 Mar 2025

https://github.com/fishaudio/fish-speech

SOTA Open Source TTS

llama transformer tts valle vits vqgan vqvae

Last synced: 13 May 2025

https://github.com/nvidia/nemo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

asr deeplearning generative-ai large-language-models machine-translation multimodal neural-networks speaker-diariazation speaker-recognition speech-synthesis speech-translation tts

Last synced: 12 May 2025

https://github.com/funaudiollm/cosyvoice

Multi-lingual large voice generation model with full-stack support for inference, training, and deployment.

audio-generation cantonese chatbot chatgpt chinese cosyvoice cross-lingual english fine-grained fine-tuning gpt-4o japanese korean multi-lingual natural-language-generation python text-to-speech tts voice-cloning

Last synced: 14 May 2025

https://github.com/mastra-ai/mastra

The TypeScript AI agent framework. ⚡ Assistants, RAG, observability. Supports any LLM: GPT-4, Claude, Gemini, Llama.

agents ai chatbots evals javascript llm mcp nextjs nodejs reactjs tts typescript workflows

Last synced: 14 May 2025

https://github.com/pot-app/pot-desktop

🌈 A cross-platform application for select-to-translate text translation and OCR.

linux macos ocr pot pot-app recognize tauri translate translation tts windows

Last synced: 15 May 2025

https://github.com/paddlepaddle/paddlespeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

asr code-switch conformer kws punctuation-restoration self-supervised-learning sound-classification speech-alignment speech-recognition speech-synthesis speech-translation streaming-asr streaming-tts transformer tts vocoder voice-cloning voice-recognition wav2vec2 whisper

Last synced: 12 May 2025

https://github.com/NVIDIA/NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

asr deeplearning generative-ai large-language-models machine-translation multimodal neural-networks speaker-diariazation speaker-recognition speech-synthesis speech-translation tts

Last synced: 14 Mar 2025

https://github.com/PaddlePaddle/PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

asr code-switch conformer kws punctuation-restoration self-supervised-learning sound-classification speech-alignment speech-recognition speech-synthesis speech-translation streaming-asr streaming-tts transformer tts vocoder voice-cloning voice-recognition wav2vec2 whisper

Last synced: 24 Mar 2025

https://github.com/mozilla/tts

:robot: :speech_balloon: Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

dataset-analysis deep-learning gantts glow-tts melgan multiband-melgan python pytorch speaker-encoder speech tacotron tacotron2 tensorflow2 text-to-speech tts vocoder

Last synced: 13 May 2025

https://github.com/drewthomasson/ebook2audiobook

Convert ebooks to audiobooks with chapters and metadata using dynamic AI models and voice cloning. Supports 1,107+ languages!

audiobooks chinese colab-notebook docker english epub gradio kaggle linux mac multilingual tts voice-cloning windows xtts

Last synced: 12 May 2025

https://github.com/mozilla/TTS

:robot: :speech_balloon: Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

dataset-analysis deep-learning gantts glow-tts melgan multiband-melgan python pytorch speaker-encoder speech tacotron tacotron2 tensorflow2 text-to-speech tts vocoder

Last synced: 14 Mar 2025

https://github.com/rhasspy/piper

A fast, local neural text to speech system

speech-synthesis text-to-speech tts

Last synced: 13 May 2025

https://github.com/fishaudio/Bert-VITS2

vits2 backbone with multilingual-bert

agent bert bert-vits bert-vits2 fish fish-speech llm tts vits vits2 vocoder

Last synced: 27 Mar 2025

https://github.com/rany2/edge-tts

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

speech-synthesis text-to-speech tts

Last synced: 13 May 2025

https://github.com/netease-youdao/emotivoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

ai deep-learning emotion emotivoice multi-speaker prompt python pytorch speech speech-synthesis style text-to-speech tts

Last synced: 13 May 2025

https://github.com/plachtaa/vall-e-x

An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

emotional-speech gpt text-to-speech transformer-architecture tts vall-e voice-clone

Last synced: 14 May 2025

https://github.com/Plachtaa/VALL-E-X

An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

emotional-speech gpt text-to-speech transformer-architecture tts vall-e voice-clone

Last synced: 24 Mar 2025

https://github.com/netease-youdao/EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

ai deep-learning emotion emotivoice multi-speaker prompt python pytorch speech speech-synthesis style text-to-speech tts

Last synced: 24 Mar 2025

https://github.com/jaywalnut310/vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

deep-learning pytorch speech-synthesis text-to-speech tts

Last synced: 14 May 2025

https://github.com/readest/readest

Readest is a modern, feature-rich ebook reader designed for avid readers offering seamless cross-platform access, powerful tools, and an intuitive interface to elevate your reading experience.

android cross-platform ebook ebook-reader epub foliate ios nextjs reader sync tauri tauri2 tts

Last synced: 13 May 2025

https://github.com/wzpan/wukong-robot

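🤖 wukong-robot is a simple, flexible, and elegant Chinese voice-dialogue robot / smart speaker project. It supports ChatGPT multi-turn conversation and may be the first open-source smart speaker project to support brain-computer interaction.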

ai alexa amazon-echo anyq asr bci chatgpt google-home gpt3 homeassistant muse openai raspeberry-pi snowboy speaker tts unit

Last synced: 14 May 2025

https://github.com/jianchang512/clone-voice

A sound cloning tool with a web interface that uses your own voice or any other sound to record audio.

clonevoice speech-analysis sts tts voice-assistant

Last synced: 14 May 2025

https://github.com/shidahuilang/shuyuan

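Book sources for reading apps: Xiangse Guige (香色闺阁) + Yuedu (阅读) 3.0 book sources + Yuan Yuedu (源阅读) + Aiyue Shuxiang (爱阅书香) + Qianyue (千阅) + Huahuo Yuedu (花火阅读) + Du Bu She Shou (读不舍手) + Fanqie (番茄) + Ximalaya (喜马拉雅) IPTV sources + IPA TrollStore apps = automatically updated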

aiyueshuxiang ipa iptv reader shuyuan trollstore tts xiangsegige yuanyuedu yuedu

Last synced: 10 Apr 2025

https://github.com/lokerl/tts-vue

🎤 A Microsoft speech synthesis tool built with Electron + Vue + ElementPlus + Vite.

electron element-plus tts vue

Last synced: 13 May 2025

https://github.com/myshell-ai/melotts

High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean.

chinese english french japanese korean multilingual spanish text-to-speech tts

Last synced: 13 May 2025

https://github.com/FunAudioLLM/CosyVoice

Multi-lingual large voice generation model with full-stack support for inference, training, and deployment.

audio-generation cantonese chatbot chatgpt chinese cosyvoice cross-lingual english fine-grained fine-tuning gpt-4o japanese korean multi-lingual natural-language-generation python text-to-speech tts voice-cloning

Last synced: 24 Mar 2025

https://github.com/LokerL/tts-vue

🎤 A Microsoft speech synthesis tool built with Electron + Vue + ElementPlus + Vite.

electron element-plus tts vue

Last synced: 24 Mar 2025

https://github.com/chrox/readest

Readest is a modern, feature-rich ebook reader designed for avid readers offering seamless cross-platform access, powerful tools, and an intuitive interface to elevate your reading experience.

android cross-platform ebook ebook-reader epub foliate ios nextjs reader sync tauri tauri2 tts

Last synced: 31 Mar 2025

https://github.com/yl4579/StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

adversarial-training deep-learning diffusion-models gan latent-diffusion latent-diffusion-models pytorch speaker-adaptation speech-synthesis text-to-speech tts wavlm

Last synced: 09 Apr 2025

https://github.com/yl4579/styletts2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

adversarial-training deep-learning diffusion-models gan latent-diffusion latent-diffusion-models pytorch speaker-adaptation speech-synthesis text-to-speech tts wavlm

Last synced: 14 May 2025

https://github.com/ten-framework/ten-agent

TEN Agent is a conversational voice AI agent powered by TEN, integrating Deepseek, Gemini, OpenAI, RTC, and hardware like ESP32. It enables realtime AI capabilities like seeing, hearing, and speaking, and is fully compatible with platforms like Dify and Coze.

agent ai asr cpp gemini golang gpt-4 gpt-4o llm low-latency multimodal nextjs14 openai python rag real-time realtime tts vision voice-assistant

Last synced: 31 Mar 2025

https://github.com/snakers4/silero-models

Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

asr capitalization colab english german onnx pretrained-models pytorch repunctuation spanish speech speech-recognition speech-synthesis speech-to-text stt stt-benchmark text-to-speech torch-hub tts tts-models

Last synced: 13 May 2025

https://github.com/TEN-framework/TEN-Agent

TEN Agent is a conversational voice AI agent powered by TEN, integrating Deepseek, Gemini, OpenAI, RTC, and hardware like ESP32. It enables realtime AI capabilities like seeing, hearing, and speaking, and is fully compatible with platforms like Dify and Coze.

agent ai asr cpp gemini golang gpt-4 gpt-4o llm low-latency multimodal nextjs14 openai python rag real-time realtime tts vision voice-assistant

Last synced: 08 Mar 2025

https://github.com/myshell-ai/MeloTTS

High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean.

chinese english french japanese korean multilingual spanish text-to-speech tts

Last synced: 24 Mar 2025

https://github.com/nexaai/nexa-sdk

Nexa SDK is a comprehensive toolkit for supporting GGML and ONNX models. It supports text generation, image generation, vision-language models (VLM), audio language models, automatic speech recognition (ASR), and text-to-speech (TTS) capabilities.

asr audio edge-computing language-model llm on-device-ai on-device-ml sdk sdk-python stable-diffusion transformers tts vlm whisper

Last synced: 11 May 2025

https://github.com/moonintheriver/diffsinger

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

aaai2022 diffusion-model diffusion-speedup midi singing-synthesis singing-voice singing-voice-database singing-voice-synthesis speech-synthesis text-to-speech tts

Last synced: 13 May 2025

https://github.com/MoonInTheRiver/DiffSinger

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

aaai2022 diffusion-model diffusion-speedup midi singing-synthesis singing-voice singing-voice-database singing-voice-synthesis speech-synthesis text-to-speech tts

Last synced: 02 Apr 2025

https://github.com/NexaAI/nexa-sdk

Nexa SDK is a comprehensive toolkit for supporting GGML and ONNX models. It supports text generation, image generation, vision-language models (VLM), audio language models, automatic speech recognition (ASR), and text-to-speech (TTS) capabilities.

asr audio edge-computing language-model llm on-device-ai on-device-ml sdk sdk-python stable-diffusion transformers tts vlm whisper

Last synced: 07 Feb 2025

https://github.com/whisperspeech/whisperspeech

An Open Source text-to-speech system built by inverting Whisper.

pytorch speech-synthesis tts

Last synced: 14 May 2025

https://github.com/metavoiceio/metavoice-src

Foundational model for human-like, expressive TTS

ai deep-learning pytorch speech speech-synthesis text-to-speech tts voice-clone zero-shot-tts

Last synced: 14 May 2025

https://github.com/collabora/whisperspeech

An Open Source text-to-speech system built by inverting Whisper.

pytorch speech-synthesis tts

Last synced: 10 Dec 2024

https://github.com/tensorspeech/tensorflowtts

:stuck_out_tongue_closed_eyes: TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported languages include English, French, Korean, Chinese, and German; easy to adapt to other languages)

chinese-tts fastspeech fastspeech2 german-tts japanese-tts korea-tts melgan mobile-tts multi-speaker-tts multiband-melgan parallel-wavegan real-time speech-synthesis tacotron2 tensorflow2 text-to-speech tflite tts vocoder zh-tts

Last synced: 09 Apr 2025

https://github.com/TensorSpeech/TensorFlowTTS

:stuck_out_tongue_closed_eyes: TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported languages include English, French, Korean, Chinese, and German; easy to adapt to other languages)

chinese-tts fastspeech fastspeech2 german-tts japanese-tts korea-tts melgan mobile-tts multi-speaker-tts multiband-melgan parallel-wavegan real-time speech-synthesis tacotron2 tensorflow2 text-to-speech tflite tts vocoder zh-tts

Last synced: 24 Mar 2025

https://github.com/TensorSpeech/TensorflowTTS

:stuck_out_tongue_closed_eyes: TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported languages include English, French, Korean, Chinese, and German; easy to adapt to other languages)

chinese-tts fastspeech fastspeech2 german-tts japanese-tts korea-tts melgan mobile-tts multi-speaker-tts multiband-melgan parallel-wavegan real-time speech-synthesis tacotron2 tensorflow2 text-to-speech tflite tts vocoder zh-tts

Last synced: 28 Nov 2024

https://github.com/jing332/tts-server-android

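An Android system TTS app with a built-in Microsoft demo interface. It supports custom HTTP requests, importing other local TTS engines, and simple narration/dialogue detection for read-aloud based on Chinese double quotation marks, plus automatic retry, fallback configurations, text replacement, and more.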

android compose-ui golang jetpack-compose kotlin legado microsoft tts

Last synced: 15 May 2025

https://github.com/abus-aikorea/voice-pro

Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.

audiobook faster-whisper gradio karaoke podcasts speech-recognition speech-synthesis speech-to-text subtitles text-to-speech transcription translator tts voice-cloning voice-conversion webui whisper whisperx yt-dlp

Last synced: 14 May 2025

https://github.com/peterh0323/streamer-sales

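Streamer-Sales (销冠, "top seller"): a sales-streamer LLM 🛒🎁 that presents products based on their given features, with the aim of sparking viewers' desire to buy. 🚀⭐ Includes a detailed data-generation pipeline ❗ 📦 It also integrates LMDeploy accelerated inference 🚀, RAG (retrieval-augmented generation) 📚, TTS (text-to-speech) 🔊, digital human generation 🦸, an Agent that queries the web for real-time information 🌐, ASR (speech-to-text) 🎙️, a frontend built on the Vue ecosystem 🍍, a FastAPI backend 🗝️, and Docker-compose packaging and deployment 🐋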

asr chat chat-application chatbot chatgpt digital-human gpt internlm-chat-7b internlm2 llm meta-human rag text-generation tts

Last synced: 14 May 2025

https://github.com/enhuiz/vall-e

An unofficial PyTorch implementation of the audio LM VALL-E

audio-lm pytorch text-to-speech tts vall-e valle

Last synced: 15 May 2025

https://github.com/keithito/tacotron

A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)

machine-learning python speech-synthesis tacotron tensorflow tts

Last synced: 14 May 2025

https://github.com/tensorflow/lingvo

Lingvo

asr distributed gpu-computing language-model lm machine-translation mnist nlp research seq2seq speech speech-recognition speech-synthesis speech-to-text tensorflow translation tts

Last synced: 13 May 2025

https://github.com/readbeyond/aeneas

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)

alignment audio cli dtw espeak espeak-ng festival ffmpeg forced-alignment linux macos nlp python smil speech srt text text-to-speech tts windows

Last synced: 14 May 2025

https://github.com/liou666/polyglot

🤖️ Cross-platform AI language practice app

azure chatgpt electron openai polyglot tts

Last synced: 14 May 2025

https://github.com/remsky/kokoro-fastapi

Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model w/CPU ONNX and NVIDIA GPU PyTorch support, handling, and auto-stitching

fastapi huggingface-spaces kokoro kokoro-tts onnx onnxruntime openai-compatible-api openwebui pytorch sillytavern tts tts-api uv

Last synced: 14 May 2025

https://github.com/PeterH0323/Streamer-Sales

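Streamer-Sales (销冠, "top seller"): a sales-streamer LLM 🛒🎁 that presents products based on their given features, with the aim of sparking viewers' desire to buy. 🚀⭐ Includes a detailed data-generation pipeline ❗ 📦 It also integrates LMDeploy accelerated inference 🚀, RAG (retrieval-augmented generation) 📚, TTS (text-to-speech) 🔊, digital human generation 🦸, an Agent that queries the web for real-time information 🌐, ASR (speech-to-text) 🎙️, a frontend built on the Vue ecosystem 🍍, a FastAPI backend 🗝️, and Docker-compose packaging and deployment 🐋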

asr chat chat-application chatbot chatgpt digital-human gpt internlm-chat-7b internlm2 llm meta-human rag text-generation tts

Last synced: 11 Apr 2025

https://github.com/marytts/marytts

MARY TTS -- an open-source, multilingual text-to-speech synthesis system written in pure Java

java speech-synthesis text-to-speech tts

Last synced: 12 May 2025

https://github.com/pndurette/gtts

Python library and CLI tool to interface with Google Translate's text-to-speech API

cli gtts pypi python python-library speech speech-api text-to-speech tts

Last synced: 12 May 2025

https://github.com/furkangozukara/stable-diffusion

FLUX, Stable Diffusion, SDXL, SD3, LoRA, Fine Tuning, DreamBooth, Training, Automatic1111, Forge WebUI, SwarmUI, DeepFake, TTS, Animation, Text To Video, Tutorials, Guides, Lectures, Courses, ComfyUI, Google Colab, RunPod, Kaggle, NoteBooks, ControlNet, TTS, Voice Cloning, AI, AI News, ML, ML News, News, Tech, Tech News, Kohya, Midjourney, RunPod

ai-art coding deepfake-generation dreambooth education flux-dev flux-lora generative-ai guides how-to image-to-video-generation kohya-webui learning lora-training programming stable-diffusion text-to-image text-to-video tts tutorials

Last synced: 14 May 2025

https://github.com/pndurette/gTTS

Python library and CLI tool to interface with Google Translate's text-to-speech API

cli gtts pypi python python-library speech speech-api text-to-speech tts

Last synced: 14 Mar 2025

https://github.com/FurkanGozukara/Stable-Diffusion

FLUX, Stable Diffusion, SDXL, SD3, LoRA, Fine Tuning, DreamBooth, Training, Automatic1111, Forge WebUI, SwarmUI, DeepFake, TTS, Animation, Text To Video, Tutorials, Guides, Lectures, Courses, ComfyUI, Google Colab, RunPod, Kaggle, NoteBooks, ControlNet, TTS, Voice Cloning, AI, AI News, ML, ML News, News, Tech, Tech News, Kohya, Midjourney, RunPod

ai-art coding deepfake-generation dreambooth education flux-dev flux-lora generative-ai guides how-to image-to-video-generation kohya-webui learning lora-training programming stable-diffusion text-to-image text-to-video tts tutorials

Last synced: 10 Apr 2025

https://github.com/fleschutz/powershell

500+ free PowerShell scripts (.ps1) for Linux, Mac OS, and Windows.

automation collection command-line cross-platform powershell powershell-scripts ps1 remote-control scripts tasks tts

Last synced: 05 Apr 2025

https://github.com/fleschutz/PowerShell

500+ free PowerShell scripts (.ps1) for Linux, Mac OS, and Windows.

automation collection command-line cross-platform powershell powershell-scripts ps1 remote-control scripts tasks tts

Last synced: 03 Apr 2025

https://github.com/openctp/openctp

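openctp provides CTPAPI-compatible interfaces for channels including CTP stock options, Zhongtai Securities (中泰证券) XTP, Huaxin Securities (华鑫证券) TORA, Orient Securities (东方证券) OST, East Money Securities (东方财富证券) EMT, Interactive Brokers TWS, Esunny (易盛) TAP, and Liangtou (量投) QDP, so CTP programs can connect seamlessly to each stock trading counter. openctp also provides a simulation environment based on the TTS trading system, likewise exposed through a CTPAPI-compatible interface. It supports all domestic (Chinese) futures and options products as well as simulated trading of A-share stocks, funds, bonds, and stock options, and it can replace Simnow, giving CTP quantitative trading developers a simulation environment that is available 24/7.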

ctp ctpapi futures options quant simnow stock tora trader tts xtp

Last synced: 14 May 2025

https://github.com/fatchord/wavernn

WaveRNN Vocoder + TTS

neural-vocoder pytorch speech-synthesis tacotron text-to-speech tts wavernn

Last synced: 15 May 2025

https://github.com/fatchord/WaveRNN

WaveRNN Vocoder + TTS

neural-vocoder pytorch speech-synthesis tacotron text-to-speech tts wavernn

Last synced: 27 Mar 2025

https://github.com/jik876/hifi-gan

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

deep-learning gan hifi-gan pytorch speech-synthesis text-to-speech tts vocoder

Last synced: 14 May 2025

https://github.com/lifeiteng/vall-e

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

chatgpt in-context-learning large-language-models text-to-speech tts vall-e valle

Last synced: 15 May 2025

https://github.com/rsxdalv/tts-generation-webui

TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS, Stable Audio, Mars5, F5-TTS, ParlerTTS)

ai audio-generation audiogen bark deep-learning generator gradio machine-learning magnet music musicgen rvc seamlessm4t styletts2 text-to-speech torch tortoise-tts tts vocos web

Last synced: 04 Apr 2025

https://github.com/iahispano/applio

A simple, high-quality voice conversion tool focused on ease of use and performance.

ai applio pytorch rvc speech speech-to-speech text-to-speech tts vc vits voice voice-clone voice-cloning voice-conversion

Last synced: 14 May 2025

https://github.com/r9y9/deepvoice3_pytorch

PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models

end-to-end machine-learning multi-speaker python pytorch speech-processing speech-synthesis tts

Last synced: 14 May 2025

https://github.com/ikaros-521/ai-vtuber

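AI Vtuber is a virtual streamer [Live2D/UE/xuniren] driven by [ChatterBot/ChatGPT/claude/langchain (local/llm)/chatglm/text-generation-webui/Wenda (闻达)/Qwen (千问)/kimi]. It can interact with viewers in real time during live streams on [Bilibili/Douyin/Kuaishou/WeChat Channels/Douyu/YouTube/twitch/TikTok] or chat directly on a local machine. It uses TTS [edge-tts/VITS/elevenlabs/bark/bert-vits2/睿声] to voice its replies, with optional voice changing via [so-vits-svc/DDSP-SVC]; commands can also trigger Stable Diffusion image generation.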

ai bilibili claude douyin gpt kuaishou langchain live2d llm python qanything stable-diffusion svc tiktok tts twitch ue vits youtube