Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/tristan-mcinnis/simultaneous-interpretation

Simultaneous-Interpretation is an advanced tool for real-time simultaneous interpretation. It transcribes and translates spoken language from a microphone input instantaneously, continually refining translations for accuracy. Ideal for business meetings, educational settings, and live events, it enhances multilingual communication effortlessly.

Host: GitHub
URL: https://github.com/tristan-mcinnis/simultaneous-interpretation
Owner: tristan-mcinnis
License: mit
Created: 2024-07-08T08:48:22.000Z (7 months ago)
Default Branch: main
Last Pushed: 2024-07-08T10:39:17.000Z (7 months ago)
Last Synced: 2024-11-16T18:43:21.771Z (2 months ago)
Topics: agents, asr, faster-whisper, openai, pyaudio, simultaneous-intepreting, simultaneous-translation, speech-recognition, speech-to-text, transcription, translation, whisper
Language: Python
Homepage:
Size: 10.7 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# Simultaneous-Interpretation

## Introduction
Simultaneous-Interpretation is a tool for real-time simultaneous interpretation. It transcribes spoken language from a microphone input and translates it almost instantaneously, using speech-recognition and translation models. The tool is heavily inspired by Andrew Ng's idea of 'agentic' translation: translations are refined recursively for higher accuracy and contextual relevance.

## Features
- **Simultaneous Transcription and Interpretation**: Transcribes speech from your microphone in real time and translates it into your chosen language.
- **Recursive Interpretation**: Continuously refines translations for enhanced accuracy.
- **Custom Dictionary Integration**: Preprocesses translations using a custom dictionary for specialized terminology.
- **Rich Logging**: Comprehensive logging of transcriptions and interpretations for review.
- **Audio Translation Playback**: Plays back translated text through your preferred output device.
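
The recursive interpretation feature can be pictured as a translate, critique, revise loop. The sketch below is an illustration only, not the project's actual code: `refine_translation`, the `llm` callable, and the prompt wording are all assumptions made for this example. Any function that takes a prompt string and returns model text can be passed in.

```python
from typing import Callable

# Hypothetical type: any function that maps a prompt to model output text.
LLM = Callable[[str], str]


def refine_translation(source: str, target_lang: str, llm: LLM, rounds: int = 2) -> str:
    """Translate `source`, then critique and revise the draft up to `rounds` times."""
    draft = llm(f"Translate into {target_lang}:\n{source}")
    for _ in range(rounds):
        critique = llm(
            f"Source:\n{source}\n\nDraft ({target_lang}):\n{draft}\n\n"
            "List concrete accuracy or fluency problems, or reply OK."
        )
        if critique.strip().upper() == "OK":
            break  # the reviewer found nothing left to fix
        draft = llm(
            f"Source:\n{source}\n\nDraft:\n{draft}\n\nFeedback:\n{critique}\n\n"
            f"Rewrite the draft in {target_lang}, applying the feedback."
        )
    return draft
```

Capping the number of rounds keeps latency bounded, which matters for live speech: each extra round costs two more model calls.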

## Understanding Simultaneous Interpretation
Simultaneous interpretation is crucial for environments where real-time multilingual communication is necessary, such as international conferences, multilingual meetings, and live broadcasts. Simultaneous-Interpretation leverages AI models and speech recognition technologies to provide quick and accurate translations, making it an invaluable tool for scenarios like:
- **Business Meetings**: Facilitate clear communication across different languages.
- **Educational Settings**: Enhance comprehension of lectures or seminars delivered in a foreign language.
- **Live Events**: Provide on-the-fly translations for conferences, webinars, and live broadcasts.

## Installation
To install and run Simultaneous-Interpretation:
1. **Clone the Repository**:
```sh
git clone https://github.com/tristan-mcinnis/simultaneous-interpretation.git
cd simultaneous-interpretation
```

2. **Create a Virtual Environment**:
```sh
python -m venv venv
source venv/bin/activate # On Windows, use `venv\Scripts\activate`
```

3. **Install Required Packages**:
```sh
pip install -r requirements.txt
```

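4. **Set up Environment Variables**:
Make your OpenAI API key available as `OPENAI_API_KEY`, for example by exporting it in your shell:
```sh
export OPENAI_API_KEY=your_openai_api_key
```
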
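
As a quick check that the key is visible to Python, a script can read it from the environment and fail early with a clear message. This is a minimal sketch; `get_api_key` is a hypothetical helper named for this example, not a function from the project.

```python
import os


def get_api_key(env=os.environ) -> str:
    """Return the OpenAI API key, or raise with a clear message if it is missing."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; see the installation steps.")
    return key
```

Failing at startup is friendlier than failing on the first translation request in the middle of a session.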
## Usage
To start:
1. **List Available Audio Devices**:
```sh
python simultaneous_interpretation.py
```
2. **Select Devices and Begin Interpretation**:
Follow the prompts to:
- Select your input and output devices.
- Choose the input language code (`en` for English, `zh` for Chinese).
- Enable translation and audio playback as needed.
- Optionally, provide a custom dictionary file.
- Specify the topic for contextual accuracy.

3. **Stop Listening**:
Press CTRL + C to stop. The log will be saved to a file in your Downloads folder.
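
The stop-and-save behavior described above can be sketched as follows. This is an assumption about the general pattern, not the project's implementation: `run_session`, the timestamp format, and the log file name are hypothetical, and `segments` stands in for whatever produces transcribed text.

```python
from datetime import datetime
from pathlib import Path
from typing import Callable, Iterable


def run_session(
    segments: Iterable[str],
    log_dir: Path,
    on_segment: Callable[[str], None] = print,
) -> Path:
    """Consume transcript segments until they end or CTRL + C, then save a log."""
    lines = []
    try:
        for text in segments:
            stamp = datetime.now().strftime("%H:%M:%S")
            lines.append(f"[{stamp}] {text}")
            on_segment(text)
    except KeyboardInterrupt:
        pass  # CTRL + C ends the session; fall through and save what we have
    log_dir.mkdir(parents=True, exist_ok=True)
    name = datetime.now().strftime("interpretation_%Y%m%d_%H%M%S.log")  # hypothetical name
    path = log_dir / name
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path
```

To mirror the behavior described here, pass `Path.home() / "Downloads"` as `log_dir`.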

## Custom Dictionary Format
To use a custom dictionary, create a text file with each term-to-translation mapping on a new line:
```text
term1=translation1
term2=translation2
...
```

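
A file in this format can be read with a few lines of Python. The sketch below is illustrative: `parse_dictionary` and `apply_dictionary` are hypothetical names, and plain substring replacement is an assumption about how preprocessing might work, not a description of the project's code.

```python
def parse_dictionary(text: str) -> dict[str, str]:
    """Parse `term=translation` lines; skip blank lines and lines without '='."""
    entries = {}
    for line in text.splitlines():
        term, sep, translation = line.partition("=")
        if sep and term.strip():
            entries[term.strip()] = translation.strip()
    return entries


def apply_dictionary(text: str, entries: dict[str, str]) -> str:
    """Replace each term with its translation, trying longer terms first."""
    for term in sorted(entries, key=len, reverse=True):
        text = text.replace(term, entries[term])
    return text
```

Trying longer terms first keeps a short term from rewriting part of a longer one before the longer one has been matched.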
## Influences
This project is inspired by Andrew Ng’s concept of 'agentic' translation, emphasizing continuous refinement and accuracy.