Projects in Awesome Lists tagged with voice-cloning
A curated list of projects in awesome lists tagged with voice-cloning.
https://github.com/corentinj/real-time-voice-cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
deep-learning python pytorch tensorflow tts voice-cloning
Last synced: 22 Apr 2025
https://github.com/CorentinJ/Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
deep-learning python pytorch tensorflow tts voice-cloning
Last synced: 14 Mar 2025
https://github.com/rvc-boss/gpt-sovits
As little as 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
text-to-speech tts vits voice-clone voice-cloneai voice-cloning
Last synced: 22 Apr 2025
https://github.com/coqui-ai/tts
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
deep-learning glow-tts hifigan melgan multi-speaker-tts python pytorch speaker-encoder speaker-encodings speech speech-synthesis tacotron text-to-speech tts tts-model vocoder voice-cloning voice-conversion voice-synthesis
Last synced: 22 Apr 2025
https://github.com/coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
deep-learning glow-tts hifigan melgan multi-speaker-tts python pytorch speaker-encoder speaker-encodings speech speech-synthesis tacotron text-to-speech tts tts-model vocoder voice-cloning voice-conversion voice-synthesis
Last synced: 14 Mar 2025
https://github.com/RVC-Boss/GPT-SoVITS
As little as 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
text-to-speech tts vits voice-clone voice-cloneai voice-cloning
Last synced: 24 Mar 2025
https://github.com/funaudiollm/cosyvoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
audio-generation cantonese chatbot chatgpt chinese cosyvoice cross-lingual english fine-grained fine-tuning gpt-4o japanese korean multi-lingual natural-language-generation python text-to-speech tts voice-cloning
Last synced: 22 Apr 2025
https://github.com/huanshere/videolingo
Netflix-level subtitle cutting, translation, alignment, and even dubbing - one-click fully automated AI video subtitle team
ai-translation dubbing localization video-translation voice-cloning
Last synced: 23 Apr 2025
https://github.com/paddlepaddle/paddlespeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
asr code-switch conformer kws punctuation-restoration self-supervised-learning sound-classification speech-alignment speech-recognition speech-synthesis speech-translation streaming-asr streaming-tts transformer tts vocoder voice-cloning voice-recognition wav2vec2 whisper
Last synced: 21 Apr 2025
https://github.com/PaddlePaddle/PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
asr code-switch conformer kws punctuation-restoration self-supervised-learning sound-classification speech-alignment speech-recognition speech-synthesis speech-translation streaming-asr streaming-tts transformer tts vocoder voice-cloning voice-recognition wav2vec2 whisper
Last synced: 24 Mar 2025
https://github.com/drewthomasson/ebook2audiobook
Convert ebooks to audiobooks with chapters and metadata using dynamic AI models and voice cloning. Supports 1,107+ languages!
audiobooks chinese colab-notebook docker english epub gradio kaggle linux mac multilingual tts voice-cloning windows xtts
Last synced: 23 Apr 2025
https://github.com/FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
audio-generation cantonese chatbot chatgpt chinese cosyvoice cross-lingual english fine-grained fine-tuning gpt-4o japanese korean multi-lingual natural-language-generation python text-to-speech tts voice-cloning
Last synced: 24 Mar 2025
https://github.com/Huanshere/VideoLingo
Netflix-level subtitle cutting, translation, alignment, and even dubbing - one-click fully automated AI video subtitle team
ai-translation dubbing localization video-translation voice-cloning
Last synced: 18 Dec 2024
https://github.com/multimodal-art-projection/yue
YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open
ai audio-generation deep-learning foundation-models gpt huggingface llama llms music-generation style-transfers voice-cloning
Last synced: 15 Apr 2025
https://github.com/abus-aikorea/voice-pro
Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.
audiobook faster-whisper gradio karaoke podcasts speech-recognition speech-synthesis speech-to-text subtitles text-to-speech transcription translator tts voice-cloning voice-conversion webui whisper whisperx yt-dlp
Last synced: 13 Apr 2025
https://github.com/camb-ai/mars5-tts
MARS5 speech model (TTS) from CAMB.AI
prosody speech speech-synthesis text-to-speech voice-cloneai voice-cloning
Last synced: 11 Apr 2025
https://github.com/iahispano/applio
A simple, high-quality voice conversion tool focused on ease of use and performance.
ai applio pytorch rvc speech speech-to-speech text-to-speech tts vc vits voice voice-clone voice-cloning voice-conversion
Last synced: 10 Apr 2025
https://github.com/IAHispano/Applio
A simple, high-quality voice conversion tool focused on ease of use and performance
ai applio pytorch rvc speech speech-to-speech text-to-speech tts vc vits voice voice-clone voice-cloning voice-conversion
Last synced: 14 Nov 2024
https://github.com/coqui-ai/open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
speech-emotion-recognition speech-processing speech-recognition speech-separation speech-synthesis speech-to-text stt text-to-speech tts voice-activity-detection voice-cloning voice-recognition
Last synced: 26 Mar 2025
https://github.com/DrewThomasson/ebook2audiobook
Generates an audiobook with chapters and ebook metadata using Calibre and XTTS from Coqui TTS, with optional voice cloning and support for multiple languages
audiobooks chinese docker english epub gradio linux mac multilingual tts voice-cloning windows xtts
Last synced: 15 Dec 2024
https://github.com/gitmylo/audio-webui
A webui for different audio related Neural Networks
ai aio all-in-one artificial-intelligence audiocraft audioldm bark bark-gui generative-audio generative-music music rvc rvc-gui text-to-audio text-to-speech tts voice-cloning
Last synced: 14 Mar 2025
https://github.com/Tomiinek/Multilingual_Text_to_Speech
An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
code-switching multilingual speech-synthesis text-to-speech tts voice-cloning
Last synced: 03 Apr 2025
https://github.com/wladradchenko/wunjo.wladradchenko.ru
Wunjo AI: Synthesize & clone voices in English, Russian & Chinese, real-time speech recognition, deepfake face & lips animation, face swap with one photo, change video by text prompts, segmentation, and retouching. Open-source, local & free.
controlnet deepfake deepfake-emotion deepfakes diffusion face-swap face-swapping free image-animation retouching-video segment-anything tacotron2 talking-face talking-face-generation talking-head tts vid2vid voice-cloning voice-recognition wunjo
Last synced: 17 Nov 2024
https://github.com/PlayVoice/lora-svc
Singing voice change based on Whisper, and LoRA for singing voice cloning
lora singing-voice-conversion speech-to-sing uni-svc vits vits-svc voice-change voice-cloning voice-conversion whisper
Last synced: 11 Apr 2025
https://github.com/playvoice/lora-svc
Singing voice change based on Whisper, and LoRA for singing voice cloning
lora singing-voice-conversion speech-to-sing uni-svc vits vits-svc voice-change voice-cloning voice-conversion whisper
Last synced: 04 Apr 2025
https://github.com/jackaduma/CycleGAN-VC2
Voice Conversion by CycleGAN (voice cloning / voice conversion): CycleGAN-VC2
aigc cyclegan cyclegan-vc cyclegan-vc2 deep-learning deeplearning gan pix2pix pytorch-implementation speech-synthesis voice-cloning voice-conversion
Last synced: 11 Apr 2025
https://github.com/jackaduma/cyclegan-vc2
Voice Conversion by CycleGAN (voice cloning / voice conversion): CycleGAN-VC2
aigc cyclegan cyclegan-vc cyclegan-vc2 deep-learning deeplearning gan pix2pix pytorch-implementation speech-synthesis voice-cloning voice-conversion
Last synced: 05 Apr 2025
https://github.com/vlomme/Multi-Tacotron-Voice-Cloning
Phoneme multilingual (Russian-English) voice cloning based on
deep-learning g2p pytorch russian tacotron tensorflow tts voice-cloning wavernn
Last synced: 27 Nov 2024
https://github.com/HKoon/ChatTTS-OpenVoice
Fuse ChatTTS with OpenVoice, upload a 10-second audio clip, and clone your personalized ChatTTS voice.
chattts openvoice tts voice-clone voice-cloning
Last synced: 11 Apr 2025
https://github.com/boltzmannentropy/xtts2-ui
A User Interface for XTTS-2 Text-Based Voice Cloning using only 10 seconds of speech
coqui-tts streamlit tts voice-cloning
Last synced: 12 Apr 2025
https://github.com/lukaszliniewicz/Pandrator
Turn PDFs and EPUBs into audiobooks, subtitles or videos into dubbed videos (including translation), and more. For free. Pandrator uses local models, notably XTTS, including voice-cloning (instant, RVC-enhanced, XTTS fine-tuning) and LLM processing. It aspires to be a user-friendly app with a GUI, an installer and all-in-one packages.
audiobook audiobook-creator audiobook-maker audiobooks customtkinterprojects dubbing llm pdf-to-audio rvc silero subtitle-to-speech subtitle-to-voice text-processing text-to-speech tkinter-gui voice-clone voice-cloning voicecraft xtts xttsv2
Last synced: 25 Jan 2025
https://github.com/FlorianEagox/WeeaBlind
A program to dub non-English media with modern AI speech synthesis, diarization, and voice cloning!
a11y accessibility anime blindness diariz dubbing python tts voice-cloning
Last synced: 20 Nov 2024
https://github.com/CMsmartvoice/One-Shot-Voice-Cloning
☺️ One Shot Voice Cloning based on Unet-TTS
one-shot style-transfer tts voice-cloning
Last synced: 27 Apr 2025
https://github.com/MiniMax-AI/MiniMax-MCP
Official MiniMax Model Context Protocol (MCP) server that enables interaction with powerful Text to Speech, image generation and video generation APIs.
image-generation image-to-video mcp mcp-server mcp-tools text-to-image text-to-speech text-to-video video-generation voice-cloning
Last synced: 27 Apr 2025
https://github.com/dunky11/voicesmith
[WIP] VoiceSmith makes training text to speech models easy.
dataset-manager delightfultts preprocessing speech-synthesis text-to-speech toolkit tts univnet voice-cloning
Last synced: 10 Jan 2025
https://github.com/aifsh/comfyui-gpt_sovits
A ComfyUI custom node for GPT-SoVITS! You can now do voice cloning and TTS in ComfyUI.
Last synced: 14 Apr 2025
https://github.com/smoke-trees/voice-synthesis
This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in real-time. SV2TTS is a three-stage deep learning framework that creates a numerical representation of a voice from a few seconds of audio and uses it to condition a text-to-speech model trained to generalize to new voices.
keras pytorch-implementation speech-to-text sv2tts tensorflow voice-cloning voice-synthesis
Last synced: 22 Nov 2024
https://github.com/jackaduma/cyclegan-vc3
Voice Conversion by CycleGAN (voice cloning / voice conversion): CycleGAN-VC3
aigc cyclegan cyclegan-vc cyclegan-vc2 cyclegan-vc3 gan pytorch pytorch-implementation voice-cloning voice-conversion
Last synced: 27 Apr 2025
https://github.com/adiksondev/youtranslate
Takes a YouTube video, clones the voice, and re-creates that video in a different language
ai collaborate elevenlabs-api github localization-tool translation voice-cloning voice-recognition youtube
Last synced: 24 Jan 2025
https://github.com/olaviinha/neuraltexttoaudio
Text prompt steered synthetic audio generators
audio audio-generation audio-processing audio-synthesis audioldm colab colab-notebook mubert mubertai music-generation text2audio text2music voice-cloning voice-synthesis
Last synced: 14 Jan 2025
https://github.com/gooofy/zerovox
Zero-shot real-time TTS system, fully offline, free and open source
deep-learning hifigan melgan multi-speaker-tts python pytorch speaker-encoder speaker-encodings speech speech-synthesis text-to-speech tts tts-model voice-cloning voice-synthesis
Last synced: 14 Apr 2025
https://github.com/pnkvalavala/digitaltwin
Using a single image and just 10 seconds of sample audio, our project enables you to create a video where it appears as if you're speaking the desired text.
audio-driven-talking-face deep-fake talking-face-generation text-to-speech voice-cloning
Last synced: 25 Jan 2025
https://github.com/ttop32/coqui_tts_korea
Korean TTS using Coqui TTS (Glow-TTS and Multi-band MelGAN)
coqui coqui-ai deep-learning glow-tts half-life korea korean korean-language korean-letters korean-text-processing korean-tokenizer korean-tts multiband-melgan pytorch speech speech-synthesis text-to-speech tts vocoder voice-cloning
Last synced: 11 Nov 2024
https://github.com/nateraw/voice-cloning
Make Kanye sing any song ya want 🎤🔥
gradio huggingface kanye so-vits-svc voice-cloning voice-conversion
Last synced: 23 Apr 2025
https://github.com/pnkvalavala/multivoice
Multivoice: Enhance your foreign-language movie and TV show experience with personalized dubbed versions. Our project uses voice cloning and TTS to deliver natural and engaging dubbed dialogue for a seamless viewing adventure.
elevenlabs movies openai text-to-speech translation tv-shows voice-cloning
Last synced: 25 Jan 2025
https://github.com/ardagnsrn/elevenlabs-laravel
This is an Open Source PHP Laravel package for ElevenLabs Text to Speech API.
ai ai-speech ai-tts elevenlabs elevenlabs-api elevenlabs-laravel elevenlabs-php laravel text-to-speech tts tts-ai tts-api voice-cloneai voice-cloning
Last synced: 12 Apr 2025
https://github.com/adhadse/deepdubpy
A complete end-to-end Deep Learning system to generate high-quality, human-like speech in English for Korean Drama (WIP)
cross-language deep-learning korean-drama machine-learning speech-sythesis tensorflow text-to-speech voice-cloning
Last synced: 17 Dec 2024
https://github.com/aryanvbw/aivoiceclone
Transform Your Voice: Replicate Your Unique Sound in a Pristine Pre-Trained Model and Cultivate Your Custom Voiceprint
ai ai-tools artificial-intelligence aryanshop aryanvbw audio-processing clonevoice vivek voice-cloning voice-imitation
Last synced: 14 Apr 2025
https://github.com/expectopatronm/Realtime-voice-cloning-as-a-microservice
SV2TTS as a Microservice (FastAPI endpoint)
Last synced: 11 Apr 2025
https://github.com/ardagnsrn/elevenlabs-js
This is an Open Source NodeJS package for ElevenLabs Text to Speech API.
ai ai-speech ai-tts elevenlabs elevenlabs-api elevenlabs-js elevenlabs-node text-to-speech tts tts-ai tts-api voice-cloneai voice-cloning
Last synced: 12 Apr 2025
https://github.com/iahispano/applio-api
Robust functionality, focused on granting convenient access to AI models developed using the RVC technology.
ai api applio rvc vc voice voice-clone voice-cloning
Last synced: 12 Apr 2025
https://github.com/mobile-artificial-intelligence/babylon.cpp
Babylon.cpp is a C and C++ library for grapheme-to-phoneme conversion and text-to-speech synthesis. Phonemization uses an ONNX Runtime port of the DeepPhonemizer model, and speech synthesis uses VITS models. Piper models are compatible after a conversion script is run.
11labs artificial-intelligence deep-phonemizer elevenlabs g2p grapheme-to-phoneme neural-tts onnx onnx-models onnx-runtime onnxruntime phonemization test-to-speech tts vits voice-cloning
Last synced: 19 Apr 2025
https://github.com/isaiahbjork/csm-voice-cloning
Sesame CSM 1B Voice Cloning
Last synced: 15 Mar 2025
https://github.com/codename0og/codename-rvc-fork-3
Codename's RVC fork version 3, based on Applio.
ai applio pytorch speech speech-to-speech text-to-speech tts vc vits voice voice-cloning voice-conversion
Last synced: 29 Dec 2024
https://github.com/hrishikesh-gavai/nerv-translate
Problem statement: developing software for dubbing videos.
2024 cloning dubbing imitation problem-statement project python smart-india-hackathon text-to-speech voice-cloning voice-imitation voice-recognition
Last synced: 11 Apr 2025
https://github.com/richardn2002/shizuka-app
Data curation, training and deployment of VITS model(s) of 好本静, Yoshimoto Shizuka, from 君のことが大大大大大好きな100人.
shizuka tts vits voice-cloning yoshimoto
Last synced: 11 Apr 2025
https://github.com/deezer/real-cloned-singer-id
Repository for the ISMIR 2024 Paper "From Real to Cloned Singer Identification".
deepfake-detection music-information-retrieval singer-id source-separation voice-cloning
Last synced: 10 Feb 2025
https://github.com/veeeetzzzz/mars5-tts
Python implementation for the MARS5 TTS repo that allows you to clone a voice with a command line interface.
Last synced: 11 Jan 2025
https://github.com/thismodernday/f5-tts
F5-TTS is a web application that allows users to clone voices and generate text-to-speech audio using advanced AI models.
Last synced: 20 Nov 2024
https://github.com/coffee-expert/intelligent-transspeaker-
A service designed to translate speeches in multimedia using AI and ML voice cloning technology.
landing-page transcription translation voice-cloning
Last synced: 16 Mar 2025
https://github.com/spaceforgets-code/ai-voice-cloning-tool
A script that clones voices using AI and deep learning.
ai artificial-neural-networks clone clonevoice dub flask generative-ai script-generation speech-recognition voice-clone voice-cloneai voice-cloning voice-model voiceclone
Last synced: 07 Mar 2025
https://github.com/flo-bit/youtube-speaker-separation
Simple Python script that outputs separate audio files for each speaker in a YouTube video, using Whisper on Replicate
speaker-diarization speech-to-text text-to-speech voice-cloning whisper youtube
Last synced: 06 Apr 2025
https://github.com/falkyn7/text-toolkit
Advanced MCP server providing comprehensive text transformation and formatting tools. TextToolkit offers over 40 specialized utilities for case conversion, encoding/decoding, formatting, analysis, and text manipulation - all accessible directly within your AI assistant workflow.
bert conformer glow-tts library melgan multi-speaker-tts nltk speech tacotron text-to-speech tts-model tty voice-cloning whisper
Last synced: 31 Mar 2025
https://github.com/yjg30737/coquitts-kaggle
Using Coqui TTS in a Kaggle notebook
coqui coquitts jupyter-notebook kaggle vocoder voice-cloning
Last synced: 19 Feb 2025
https://github.com/ittia-research/speak
Education-oriented TTS inference server
education grpc inference language-learning llm tts voice-cloning
Last synced: 14 Apr 2025
https://github.com/learncodingeasy/voice_clone
My Voice Clone
clone python python3 virtualenv voice voice-cloning
Last synced: 10 Apr 2025
https://github.com/chameleon-ai/redubber
Redubs audio or video with any voice using zero-shot voice cloning by leveraging Amphion Vevo and UltimateVocalRemover.
pytorch vocal-remover voice-cloning voice-conversion zero-shot-voice-conversion
Last synced: 28 Mar 2025