https://github.com/krishnaadithya/revoice.ai
- Host: GitHub
- URL: https://github.com/krishnaadithya/revoice.ai
- Owner: krishnaadithya
- License: apache-2.0
- Created: 2025-02-07T05:30:50.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2025-02-07T07:03:35.000Z (3 months ago)
- Last Synced: 2025-02-07T07:28:41.516Z (3 months ago)
- Language: Python
- Size: 4.88 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# revoice.ai
A simple web application that converts YouTube videos and audio files into synthesized speech using AI models.
## Features
- Process YouTube videos by extracting and converting captions to speech
- Convert uploaded audio files through transcription and voice synthesis
- User-friendly web interface built with Gradio
- Multiple voice options for synthesis
- Automatic caption extraction from YouTube videos

## Project Structure
```
project/
├── src/
│ ├── __init__.py # Makes src a package
│ ├── utils/
│ │ ├── __init__.py # Makes utils a package
│ │ ├── audio.py # Audio processing functions
│ │ └── youtube.py # YouTube caption extraction
│ ├── app.py # Gradio interface
│ └── config.py # Configuration settings
├── requirements.txt # Dependencies
├── .env.example # Example environment variables
└── README.md # Documentation
```

## Installation
1. Clone the repository:
```bash
git clone
cd
```
2. Install system dependencies (Linux):
```bash
apt-get install espeak-ng
```
3. Set up environment variables:
```bash
cp .env.example .env
# Edit .env and add your GROQ_API_KEY
```
4. Install Python dependencies:
```bash
pip install -r requirements.txt
```

## Usage
1. Start the application:
```bash
python -m src.app
```
2. Open your web browser and navigate to the provided URL (usually http://127.0.0.1:7860)
3. Use the app by either:
   - Entering a YouTube URL
   - Uploading an audio file
4. Click "Process" and wait for the generated audio
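
The two input paths in step 3 can be pictured as a small routing function: pick the caption path for a YouTube URL or the transcription path for an uploaded file, then hand the resulting text to speech synthesis. The sketch below is illustrative only and is not taken from this repository. `process`, `is_youtube_url`, `YOUTUBE_HOSTS`, and the three injected callables (`get_captions`, `transcribe`, `synthesize`) are hypothetical names standing in for the caption, transcription, and text-to-speech steps.

```python
from typing import Callable, Optional
from urllib.parse import urlparse

# Hostnames treated as YouTube links in this sketch (an assumption,
# not a list taken from the repository).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def is_youtube_url(value: str) -> bool:
    """Return True if `value` is an http(s) URL on a known YouTube host."""
    parsed = urlparse(value.strip())
    return parsed.scheme in ("http", "https") and parsed.hostname in YOUTUBE_HOSTS


def process(
    youtube_url: Optional[str],
    audio_path: Optional[str],
    voice: str,
    get_captions: Callable[[str], str],
    transcribe: Callable[[str], str],
    synthesize: Callable[[str, str], str],
) -> str:
    """Route the input to a text-extraction step, then synthesize speech.

    The callables are hypothetical stand-ins: `get_captions(url)` and
    `transcribe(path)` return text, and `synthesize(text, voice)` returns
    a result such as an output file path.
    """
    if youtube_url and youtube_url.strip():
        if not is_youtube_url(youtube_url):
            raise ValueError("Not a recognised YouTube URL")
        text = get_captions(youtube_url.strip())
    elif audio_path:
        text = transcribe(audio_path)
    else:
        raise ValueError("Provide a YouTube URL or an audio file")

    if not text.strip():
        raise ValueError("No text could be extracted from the input")
    return synthesize(text, voice)


if __name__ == "__main__":
    # Stub stages so the sketch runs without network access or API keys.
    result = process(
        "https://youtu.be/abc123",
        None,
        "voice_a",  # placeholder voice name, not a real voice identifier
        get_captions=lambda url: "hello world",
        transcribe=lambda path: "unused",
        synthesize=lambda text, voice: f"{voice}:{text}",
    )
    print(result)
```

Passing the stages in as callables keeps the routing logic testable on its own. A real handler would plug in the caption, transcription, and synthesis functions in their place.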
## Technologies Used
- [Gradio](https://gradio.app/): Web interface framework
- [Kokoro](https://github.com/kairess/kokoro): Text-to-speech synthesis
- [Groq](https://groq.com/): Audio transcription using Whisper model
- [PyTubeFix](https://github.com/JuanBindez/pytubefix): YouTube video processing
- [soundfile](https://github.com/bastibe/python-soundfile): Audio file handling
- [pydub](https://github.com/jiaaro/pydub): Audio processing

## Requirements
- Python 3.8+
- Groq API key
- espeak-ng (for Linux systems)
- Internet connection for YouTube processing

## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.