https://github.com/dwain-barnes/local-multilingual-voice-translator

This project is a real-time, multilingual voice translator that leverages the power of local AI models for speech-to-text, translation, and text-to-speech. It is designed to be a powerful and flexible tool for anyone who needs to communicate across language barriers.
https://github.com/dwain-barnes/local-multilingual-voice-translator

Last synced: 29 days ago
JSON representation

Host: GitHub
URL: https://github.com/dwain-barnes/local-multilingual-voice-translator
Owner: dwain-barnes
Created: 2025-09-07T12:35:16.000Z (about 1 month ago)
Default Branch: main
Last Pushed: 2025-09-07T12:42:37.000Z (about 1 month ago)
Last Synced: 2025-09-07T14:39:13.907Z (about 1 month ago)
Language: HTML
Size: 22.5 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

# Real-Time Multilingual Voice Translator

## Demo

https://www.youtube.com/watch?v=2XjPpDcFOgQ

## Key Features

- **Real-Time Translation**: Speak in any supported language and hear translations with natural voice synthesis.
- **Multilingual Support**: Supports over 23 languages for both translation and voice synthesis.
- **Local AI-Powered**: Utilizes local models, ensuring privacy and offline functionality.
- **High-Quality Voice Synthesis**: Powered by Chatterbox TTS for natural-sounding voice output.
- **Accurate Speech-to-Text**: Integrates with Distil-Whisper FastRTC for precise transcriptions.
- **Web Interface**: User-friendly web interface built with Gradio and FastAPI.

## Supported Languages

| Code | Language |
|:-----|:-----------|
| ar | Arabic |
| da | Danish |
| de | German |
| el | Greek |
| en | English |
| es | Spanish |
| fi | Finnish |
| fr | French |
| he | Hebrew |
| hi | Hindi |
| it | Italian |
| ja | Japanese |
| ko | Korean |
| ms | Malay |
| nl | Dutch |
| no | Norwegian |
| pl | Polish |
| pt | Portuguese |
| ru | Russian |
| sv | Swedish |
| sw | Swahili |
| tr | Turkish |
| zh | Chinese |

## Installation Guide

Follow these steps precisely to ensure all dependencies are installed in the correct order.

1. **Clone the Repository**
```bash
git clone https://github.com/dwain-barnes/multilingual-voice-translator-realtime.git
cd multilingual-voice-translator-realtime
```

2. **Create and Activate a Conda Environment**

We recommend using Anaconda to manage the environment, as shown in the setup video.
```bash
# Create a new environment named 'translator' with Python 3.11
conda create -n translator python=3.11 -y

# Activate the new environment
conda activate translator
```

3. **Install Dependencies in Order**

Run each of the following commands one by one. This specific order is crucial for the application to work correctly.
```bash
# 1. Install core numerical and scientific libraries
pip install numpy scipy

# 2. Install the specific PyTorch version required by Chatterbox (with CUDA 11.8)
# If you don't have an NVIDIA GPU, you can try removing "+cu118" and the --index-url
pip install torch==2.0.0+cu118 torchaudio==2.0.0+cu118 --index-url https://download.pytorch.org/whl/cu118

# 3. Install other requirements for TTS and audio processing
pip install librosa transformers diffusers safetensors requests httpx pyaudio

# 4. Install Chatterbox TTS
pip install chatterbox-tts

# 5. Install dependencies for the web server, real-time communication, and STT
pip install python-dotenv
pip install "fastrtc[vad,stt,tts]"
pip install distil-whisper-fastrtc
pip install openai
pip install uvicorn
```

4. **Set Up Local AI Models**
- **LM Studio**: Download and install LM Studio from [https://lmstudio.ai/](https://lmstudio.ai/).
- After installing, search for and download the **`Hunyuan-MT-7B-GGUF`** model from the in-app model browser.
- Once downloaded, navigate to the local server tab (`<->`) and load the model.
- Start the server.
- **Chatterbox & Whisper Models**: The required models for text-to-speech and speech-to-text will be downloaded automatically the first time you run the application.

5. **Create Environment File**

Create a file named `.env` in the root directory (`local-multilingual-voice-translator`) and add the following content:
``` LM_STUDIO_BASE_URL=http://localhost:1234/v1
LM_STUDIO_API_KEY=lm-studio
WHISPER_MODEL=distil-whisper/distil-large-v3
```

## How to Run

1. Ensure your **LM Studio server is running** with the **`Hunyuan-MT-7B-GGUF`** model loaded.
2. Make sure your `translator` conda environment is active in your terminal.
3. Run the application from your terminal:
```bash
python translator.py
```
4. Open your web browser and navigate to **`http://127.0.0.1:7860`**.
5. Select your source and target languages, and start translating!

## Contributing

Contributions are welcome! Please feel free to submit a pull request or open an issue if you have any feedback or suggestions.

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for more details.

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/dwain-barnes/local-multilingual-voice-translator

Awesome Lists containing this project

README