https://github.com/furkankarakuz/translateai

TranslateAI is a powerful real-time speech translation desktop application built using PyQt and Hugging Face models. It enables users to convert spoken words into text and translate them into different languages.
huggingface-models pyqt speech-recognition translation

Host: GitHub
URL: https://github.com/furkankarakuz/translateai
Owner: furkankarakuz
License: apache-2.0
Created: 2025-03-21T21:27:25.000Z (about 2 months ago)
Default Branch: main
Last Pushed: 2025-03-24T12:20:36.000Z (about 2 months ago)
Last Synced: 2025-03-24T13:28:45.221Z (about 2 months ago)
Topics: huggingface-models, pyqt, speech-recognition, translation
Language: Python
Homepage:
Size: 111 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# TranslateAI

https://github.com/user-attachments/assets/6f808863-a731-440b-bffb-5955a849544a

This app allows you to:

- Select your preferred input device, whether it's a microphone or system audio, ensuring flexibility in capturing speech.
- Speak or even play pre-recorded audio files, and the app will process and transcribe the speech into text in real time.
- Enjoy an automatic translation trigger, which activates after about 1 second of silence, making the experience smooth and natural.
- Translate between multiple languages, including Turkish 🇹🇷, English 🇬🇧, Spanish 🇪🇸, and French 🇫🇷, with the possibility of adding more in the future.

Designed with ease of use and efficiency in mind, TranslateAI brings a fluid translation experience for various use cases, whether you're learning a new language, working in multilingual environments, or simply looking for an intuitive speech-to-text tool.

## 📌 Contents

- 🚀 Features
- 📦 Installation
  - MacOS
  - Windows
  - Linux
- 🛠️ Usage
- 🧠 How It Works
  - Speech Detection
  - Translation Model
- ⚙️ Configuration
- 🤝 Contributing
- 📝 License

---

## 🚀 Features

- 🎙️ **Real-time Speech & Audio Translation**
- 🔊 **Visual Indicator for Sound Intensity**
- 🌐 **Multi-language Support: Turkish 🇹🇷, English 🇺🇸, Spanish 🇪🇸, French 🇫🇷 (More to come!)**
- 🤖 **Hugging Face Model Integration (Helsinki-NLP)**
- ⚡ **Automatic Model Download Based on Selected Language**
- 🖥️ **User-Friendly PyQt GUI**

---

## 📦 Installation

### 1. Clone the Repository

```bash
git clone https://github.com/furkankarakuz/TranslateAI.git
cd TranslateAI
```

### 2. Set Up a Python Environment (Recommended)

It’s often best to use a virtual environment:

```bash
python -m venv venv
# On Mac/Linux:
source venv/bin/activate
# On Windows (run this instead of the line above):
venv\Scripts\activate
```

### 3. Install Dependencies

Install required Python packages:

```bash
pip install -r requirements.txt
```

> **Note**: If `requirements.txt` doesn’t exist or you prefer manual installation, make sure to install:
> - `PyQt5` or `PyQt6` (depending on your code)
> - `pyaudio` (or `sounddevice` / `pydub` if you adapt the code)
> - `torch`, `transformers`, `datasets` (for Hugging Face)

### MacOS

On **macOS**, you may need to install `portaudio` via Homebrew before installing `pyaudio`:

```bash
brew install portaudio
```

Then:

```bash
pip install pyaudio
```

### Windows

On **Windows**, you can directly install `pyaudio` from PyPI:

```bash
pip install pyaudio
```

If you encounter issues, you might need to install the appropriate **Microsoft Visual C++ Build Tools**.

### Linux

On **Linux** (Ubuntu/Debian-based), you may need:

```bash
sudo apt-get update
sudo apt-get install portaudio19-dev python3-pyaudio
```

Then install Python packages as usual:

```bash
pip install -r requirements.txt
```
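
Before launching the app, it can help to confirm that the packages are importable in the active environment. The snippet below is a minimal sketch, not part of the project: the module names are assumptions taken from the packages mentioned in this README, so adjust them to match the real `requirements.txt`.

```python
import importlib.util

# Import names are assumptions based on the packages listed in this README;
# adjust them to match the project's actual requirements.txt.
REQUIRED_MODULES = ["pyaudio", "torch", "transformers"]
GUI_MODULES = ["PyQt5", "PyQt6"]  # only one of these is needed


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_environment():
    """Return a list of human-readable problems (empty if none were found)."""
    problems = [f"missing package: {name}" for name in missing_modules(REQUIRED_MODULES)]
    if len(missing_modules(GUI_MODULES)) == len(GUI_MODULES):
        problems.append("missing package: PyQt5 or PyQt6")
    return problems


for problem in check_environment():
    print(problem)
```

If nothing is printed, every module in the list was found.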

---

## 🛠️ Usage

1. **Run the Application**
   ```bash
   python app.py
   ```

2. **Select Your Audio Device**
   - In the top section of the UI, choose your microphone or system audio input (e.g., “MacBook Air Microphone …”).

3. **Choose Language Pair**
   - Select the **source** and **target** language. For example, `EN-TR` means you will speak English and get Turkish translations.

4. **Start Recording**
   - Click **Start Record** to begin capturing audio.
   - Watch the volume indicator to see if audio is being detected.
   - After you **stop speaking for about 1 second**, the application will automatically **trigger the translation**.

5. **View Translations**
   - The translated text will appear in the console.

6. **Stop Recording**
   - Click **Stop Record** when finished.
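
If the device you expect is missing from the drop-down in step 2, you can list the inputs PyAudio sees outside the app. This is an illustrative sketch, not the app's own code: `input_devices` and `query_devices` are hypothetical helper names, and it assumes `pyaudio` is installed.

```python
def input_devices(device_infos):
    """Keep only devices that can record, as (index, name) pairs."""
    return [
        (info["index"], info["name"])
        for info in device_infos
        if info.get("maxInputChannels", 0) > 0
    ]


def query_devices():
    """Ask PyAudio for every audio device it knows about."""
    import pyaudio  # imported lazily so the helper above works without it

    pa = pyaudio.PyAudio()
    try:
        return [pa.get_device_info_by_index(i) for i in range(pa.get_device_count())]
    finally:
        pa.terminate()
```

Calling `input_devices(query_devices())` returns the recordable devices. A device's index is the value PyAudio accepts as `input_device_index` when opening a stream.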

---

## 🧠 How It Works

### Speech Detection

- **PyAudio** captures your microphone input (or chosen device).
- A **volume level meter** is displayed to help you monitor input levels.
- Once **silence** (below a certain threshold) is detected for ~1 second, the app processes the captured audio chunk and sends it for transcription & translation.
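
The silence trigger described above can be sketched roughly as follows. This is not the app's actual implementation: it assumes 16-bit mono PCM chunks (as PyAudio's `paInt16` format produces), and the class name, the RMS threshold of 500, and the method names are all hypothetical.

```python
import array
import math


def rms(chunk: bytes) -> float:
    """Root-mean-square level of a chunk of 16-bit signed mono PCM."""
    samples = array.array("h", chunk)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


class SilenceDetector:
    """Signals when speech has been followed by enough continuous silence."""

    def __init__(self, threshold=500.0, silence_seconds=1.0):
        self.threshold = threshold              # hypothetical RMS cut-off
        self.silence_seconds = silence_seconds  # pause length that triggers
        self.heard_speech = False
        self.silent_for = 0.0

    def feed(self, chunk: bytes, chunk_seconds: float) -> bool:
        """Return True once when the current utterance should be processed."""
        if rms(chunk) >= self.threshold:
            # Loud chunk: speech is in progress, so restart the silence timer.
            self.heard_speech = True
            self.silent_for = 0.0
            return False
        self.silent_for += chunk_seconds
        if self.heard_speech and self.silent_for >= self.silence_seconds:
            # Enough silence after speech: reset and signal the caller.
            self.heard_speech = False
            self.silent_for = 0.0
            return True
        return False
```

Requiring speech before the trigger fires keeps a quiet room from producing a stream of empty translation requests.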

### Translation Model

- The application uses **Hugging Face** Transformers to download and run the appropriate translation model for the chosen language pair.
- **Model caching**: Hugging Face Transformers stores downloaded models in a local cache, so each model is downloaded only once and later translations for the same language pair load it from disk.
- The translations are displayed in real-time, providing an **instant** feedback loop.
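
As an illustration of the model loading described above, the sketch below builds a model ID and reuses one pipeline per model. It is an assumption-laden example rather than the app's code: the function names are hypothetical, it assumes `transformers`, `torch`, and `sentencepiece` are installed, and the `Helsinki-NLP/opus-mt-{source}-{target}` pattern is a common naming convention that does not cover every language pair, so check the Hugging Face Hub for the pair you need.

```python
from functools import lru_cache


def model_id(source: str, target: str) -> str:
    """Build a model ID following the common Helsinki-NLP naming pattern."""
    return f"Helsinki-NLP/opus-mt-{source.lower()}-{target.lower()}"


@lru_cache(maxsize=None)
def load_translator(name: str):
    """Create a translation pipeline once per model and reuse it afterwards."""
    from transformers import pipeline  # lazy import: heavy dependency

    return pipeline("translation", model=name)


def translate(text: str, source: str, target: str) -> str:
    """Translate text; the first call for a pair downloads the model."""
    translator = load_translator(model_id(source, target))
    return translator(text)[0]["translation_text"]
```

For example, `translate("Good morning", "en", "fr")` downloads `Helsinki-NLP/opus-mt-en-fr` on first use. Later calls in the same process reuse the in-memory pipeline.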

---

## ⚙️ Configuration

- **Silence Threshold & Delay**: The silence delay is currently set to about 1 second. You can change this value, and the volume threshold that counts as silence, in the code to trigger translation sooner or later.
- **Language Support**: To add a new language pair, you need to:
  - Find a Hugging Face translation model that supports that pair.
  - Update the UI to include the new language option.
  - Adjust the code that downloads/loads the model.
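
One way to keep these tunables in a single place is a small settings object. This is only a sketch of the idea: the `Settings` class, its field names, and the default values are hypothetical and do not reflect how the project stores its configuration.

```python
from dataclasses import dataclass, field


@dataclass
class Settings:
    silence_threshold: float = 500.0  # hypothetical RMS level treated as silence
    silence_seconds: float = 1.0      # pause length that triggers translation
    # UI label -> Hugging Face model ID (entries here are illustrative)
    language_pairs: dict = field(default_factory=lambda: {
        "EN-FR": "Helsinki-NLP/opus-mt-en-fr",
        "FR-EN": "Helsinki-NLP/opus-mt-fr-en",
    })

    def add_pair(self, source: str, target: str, model: str) -> str:
        """Register a new pair and return the label to show in the UI."""
        label = f"{source.upper()}-{target.upper()}"
        self.language_pairs[label] = model
        return label


# A faster trigger plus one extra language pair.
settings = Settings(silence_seconds=0.7)
settings.add_pair("en", "de", "Helsinki-NLP/opus-mt-en-de")
```

With a registry like this, the UI can build its language list from `settings.language_pairs`, so adding a pair means changing one dictionary entry.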

---

## 🤝 Contributing

Contributions are welcome! Feel free to:

- **Fork** this repository.
- **Create** a new branch.
- **Commit** your changes.
- Open a **Pull Request** describing the improvements or bug fixes you’ve made.

---

## 📝 License

This project is licensed under the Apache License 2.0.
