https://github.com/ebrahimar/ai-voice-cloner-xtts-v2
A Streamlit web app for AI-powered voice cloning using Coqui XTTS v2. Record or upload reference voices, clone speech in multiple languages, and generate natural audio outputs.
ai-voice coqui-tts deep-learning json multilingual-tts numpy pydub python speech-synthesis streamlit text-to-speech tts-model voice-cloning xtts-v2
- Host: GitHub
- URL: https://github.com/ebrahimar/ai-voice-cloner-xtts-v2
- Owner: EbrahimAR
- License: mit
- Created: 2025-09-08T03:37:37.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2025-09-08T14:10:24.000Z (about 1 month ago)
- Last Synced: 2025-09-08T16:25:46.849Z (about 1 month ago)
- Topics: ai-voice, coqui-tts, deep-learning, json, multilingual-tts, numpy, pydub, python, speech-synthesis, streamlit, text-to-speech, tts-model, voice-cloning, xtts-v2
- Language: Python
- Homepage:
- Size: 77.5 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.MD
- License: LICENSE
# 🗣️ AI Voice Cloner (XTTS v2)
A **Streamlit web app** for cloning voices using [Coqui TTS](https://github.com/coqui-ai/TTS).
Upload or record reference voices, adjust style and settings, and generate natural-sounding speech in 16+ languages, all through a clean, dark-themed interface.

---
## 🚀 Features
- 📁 **Voice gallery management**
  - Upload, record, save, activate, and delete reference voices
  - Batch delete multiple voices at once
  - Preview and switch between saved voices
- 🌍 **Multi-language support**
  - 16+ languages including English, Spanish, French, German, Chinese, Japanese, and more
- 🎭 **Voice style controls**
  - Choose between neutral, fast, and expressive styles
  - Optional speed and emotion overrides (happy, sad, angry, surprised, fearful)
- ⚙️ **Advanced settings**
  - Control voice stability and similarity for better cloning accuracy
  - Toggle loudness normalization for consistent output
- 📊 **Generation history**
  - Automatic metadata saving (text, style, reference voice, timestamp)
  - Recent generations gallery with preview, download, and re-synthesize option
- 💾 **Multiple export formats**
  - Download results as WAV or MP3 (requires FFmpeg)
- 🎨 **Modern dark UI**
  - Custom-styled Streamlit interface with smooth UX and mobile-friendly layout

---
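The generation history feature saves metadata (text, style, reference voice, timestamp) for each output. A minimal sketch of how such a record could be stored and read back, using only the standard library. The function names, the JSON sidecar layout, and the field names are assumptions for illustration and are not taken from `app.py`.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def save_generation_metadata(wav_path, text, style, reference_voice):
    """Write a JSON sidecar next to a generated WAV file (hypothetical layout)."""
    wav_path = Path(wav_path)
    record = {
        "text": text,
        "style": style,
        "reference_voice": reference_voice,
        # Fixed-width UTC timestamp so records sort correctly as strings.
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="microseconds"),
    }
    meta_path = wav_path.with_suffix(".json")
    meta_path.parent.mkdir(parents=True, exist_ok=True)
    meta_path.write_text(
        json.dumps(record, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return meta_path


def load_recent_generations(outputs_dir, limit=10):
    """Return up to `limit` metadata records, most recent first."""
    records = [
        json.loads(p.read_text(encoding="utf-8"))
        for p in Path(outputs_dir).glob("*.json")
    ]
    records.sort(key=lambda r: r["timestamp"], reverse=True)
    return records[:limit]
```

A sidecar file per output keeps each generation self-describing, so deleting a WAV and its JSON together removes it from the history without touching a shared index.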
## 🛠 Tech Stack
### **Frontend / UI**
- [Streamlit](https://streamlit.io/): Web interface for interaction and real-time updates
- **streamlit-option-menu**: Sidebar navigation with a clean UI
- Custom CSS styling for dark mode & better UX
- HTML5 `MediaRecorder` for in-browser recording

### **Backend / Logic**
- [Coqui TTS (XTTS v2)](https://github.com/coqui-ai/TTS): Voice cloning & text-to-speech model
- **pydub**: Audio file conversion (WebM/MP3 ↔ WAV)
- **librosa & soundfile**: Audio signal processing
- **NumPy**: Array and numerical operations

### **Utilities**
- **ffmpeg (system dependency)**: Required for audio encoding/decoding
- **streamlit.components.v1**: Custom recorder integration

---
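For orientation, here is a minimal sketch of the kind of call the backend relies on: cloning a voice with XTTS v2 through the Coqui `TTS.api.TTS` class. The helper name `clone_voice`, its validation steps, and the default output path are assumptions for illustration; the app's actual code may be organized differently.

```python
from pathlib import Path

# Model identifier as listed in the Coqui TTS model catalog.
XTTS_MODEL = "tts_models/multilingual/multi-dataset/xtts_v2"


def clone_voice(text, reference_wav, language="en", out_path="outputs/xtts_demo.wav"):
    """Synthesize `text` in the voice of `reference_wav` (hypothetical helper)."""
    if not text.strip():
        raise ValueError("text must not be empty")
    reference_wav = Path(reference_wav)
    if not reference_wav.is_file():
        raise FileNotFoundError(f"reference voice not found: {reference_wav}")

    # Imported lazily because the TTS package is slow to import; the model
    # weights are downloaded the first time TTS(XTTS_MODEL) runs.
    from TTS.api import TTS

    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    tts = TTS(XTTS_MODEL)
    tts.tts_to_file(
        text=text,
        speaker_wav=str(reference_wav),
        language=language,
        file_path=str(out_path),
    )
    return out_path
```

In a Streamlit app the model would usually be loaded once, for example behind `st.cache_resource`, rather than on every call as in this sketch.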
## 📂 Project Structure
voice_clone_app/
│
├── outputs/            # Generated outputs (auto-created)
│   └── xtts_*.wav      # Generated files
│
├── voices/             # Reference voices (auto-created)
│   └── ref_*.wav       # Uploaded/recorded samples
│
├── app.py              # Main Streamlit application
├── run_app.py          # Launcher (cross-platform, auto-opens browser)
├── requirements.txt    # Python dependencies
├── LICENSE             # Open-source license
├── .gitignore          # Ignored files/folders
└── README.md           # Project documentation

---
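The `outputs/` and `voices/` folders are described as auto-created. A minimal sketch of how that could work, along with file names matching the `xtts_*.wav` and `ref_*.wav` patterns. The function names and the timestamp used for the `*` part are assumptions; the app may name files differently.

```python
from datetime import datetime
from pathlib import Path


def ensure_app_dirs(base_dir="."):
    """Create outputs/ and voices/ under base_dir if they are missing."""
    base = Path(base_dir)
    outputs, voices = base / "outputs", base / "voices"
    for directory in (outputs, voices):
        directory.mkdir(parents=True, exist_ok=True)
    return outputs, voices


def timestamped_name(prefix, suffix=".wav"):
    """Build a name such as xtts_20250908_141024.wav (assumed naming scheme)."""
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    return f"{prefix}_{stamp}{suffix}"


def new_output_path(base_dir="."):
    """Return a fresh path inside outputs/ for a generated file."""
    outputs, _ = ensure_app_dirs(base_dir)
    return outputs / timestamped_name("xtts")
```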
## ⚙️ Installation
### **1. Clone the Repository**
```bash
git clone https://github.com/EbrahimAR/AI-Voice-Cloner-XTTS-v2.git
cd AI-Voice-Cloner-XTTS-v2
```

### **2. Create virtual environment**
```bash
python -m venv .venv
# Linux/macOS:
source .venv/bin/activate
# Windows (run this instead of the line above):
.venv\Scripts\activate
```

### **3. Install dependencies**
```bash
pip install -r requirements.txt
```

### **4. Install system dependencies**
- **ffmpeg** is required for MP3 conversion:
  - Windows: `choco install ffmpeg`
  - macOS: `brew install ffmpeg`
  - Linux: `sudo apt-get install ffmpeg`

---
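To confirm that ffmpeg is installed and on the `PATH`, a quick standard-library check such as the following can help. The function names are hypothetical, and limiting export to WAV when ffmpeg is missing is an assumption about how an app might degrade, not a description of this one.

```python
import shutil


def ffmpeg_available():
    """Return True if an ffmpeg executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


def available_export_formats():
    """Offer MP3 only when ffmpeg is present; WAV needs no external encoder."""
    return ["wav", "mp3"] if ffmpeg_available() else ["wav"]


if not ffmpeg_available():
    print("ffmpeg not found: install it to enable MP3 export.")
```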
## ▶️ Run the app
```bash
streamlit run app.py
```

Or use the helper script:
```bash
python run_app.py
```
App will open at: http://localhost:8502

---
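For readers curious what a cross-platform launcher of this kind might look like, here is a minimal sketch. It is not the contents of `run_app.py`: the function names and the two-second delay are assumptions, and the port is taken from the URL above. `--server.port` and `--server.headless` are standard Streamlit options.

```python
import subprocess
import sys
import threading
import webbrowser


def build_command(script="app.py", port=8502):
    """Build a Streamlit command that uses the current Python interpreter."""
    return [
        sys.executable, "-m", "streamlit", "run", script,
        "--server.port", str(port),
        # Headless mode stops Streamlit from opening a browser itself.
        "--server.headless", "true",
    ]


def main(port=8502, delay_seconds=2.0):
    """Start the app and open the browser shortly afterwards."""
    url = f"http://localhost:{port}"
    # Open the browser from a timer so the server has a moment to start.
    threading.Timer(delay_seconds, webbrowser.open, args=(url,)).start()
    return subprocess.call(build_command(port=port))
```

Calling `main()` from an `if __name__ == "__main__":` guard turns this into a script. Running Streamlit through `sys.executable -m streamlit` avoids depending on the `streamlit` executable being on `PATH`, which helps inside virtual environments on Windows.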
## 👨‍💻 Author
Ebrahim Abdul Raoof

[LinkedIn](https://www.linkedin.com/in/ebrahim-ar/) | [GitHub](https://github.com/EbrahimAR)
---
## 📜 License
This project is licensed under the MIT License. See [LICENSE](https://github.com/EbrahimAR/AI-Voice-Cloner-XTTS-v2/blob/main/LICENSE) for details.