https://github.com/amir-mohseni/voicebridge

This repository provides a dockerized Speech-to-Speech application that supports text-to-audio conversion, audio-to-text transcription, and interactive voice-based conversations. It is easy to set up and use, offering a versatile platform for speech and text processing.
https://github.com/amir-mohseni/voicebridge

docker huggingface python transformer tts whisper

Last synced: 6 months ago
JSON representation

Host: GitHub
URL: https://github.com/amir-mohseni/voicebridge
Owner: Amir-Mohseni
Created: 2025-01-12T22:11:06.000Z (6 months ago)
Default Branch: main
Last Pushed: 2025-01-14T13:57:14.000Z (6 months ago)
Last Synced: 2025-01-14T14:27:18.861Z (6 months ago)
Topics: docker, huggingface, python, transformer, tts, whisper
Language: Python
Homepage:
Size: 371 KB
Stars: 2
Watchers: 1
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

# VoiceBridge: A Speech-to-Speech (STS) Application

This repository contains a Text-to-Speech, Speech-to-text, and Speech-to-Speech application that is dockerized and easy to use. The application allows you to convert text to audio, transcribe audio to text using Whisper, and have back-and-forth conversations with an LLM.

## Prerequisites

- Docker installed on your machine

## Running the Application

1. **Clone the repository:**

```sh
git clone https://github.com/Amir-Mohseni/VoiceBridge.git
cd VoiceBridge
```

2. **Build the Docker image:**

```sh
docker build -t vb-app .
```

3. **Run the Docker container:**

```sh
docker run -it --rm -p 7860:7860 vb-app
```

This will start the application and map port 7860 of the container to port 7860 on your host machine.

4. **Access the application:**

Open your web browser and navigate to `http://127.0.0.1:7860` to use the application.

## Features

- **Text to Audio:** Convert text to speech using various voices.
- **Audio to Text:** Transcribe audio files to text using Whisper.
- **Audio Conversation:** Have a conversation with the AI using voice input and receive voice responses.

![App Showcase](docs/App.png)

## Files

- `main.py`: The main script to run the application.
- `text2audio.py`: The script for converting text to audio.
- `transcriber.py`: The script for transcribing audio to text.
- `llm.py`: The script for interacting with the LLM. The Hugging Face model can be easily changed in this file.
- `Dockerfile`: The Dockerfile used to build the Docker image.
- `requirements.txt`: The file containing Python dependencies.

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

## Contributing

Contributions are welcome! Please open an issue or submit a pull request.

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/amir-mohseni/voicebridge

Awesome Lists containing this project

README