https://github.com/rizkirakasiwi/rtvc
Realtime AI Voice Chat Example
https://github.com/rizkirakasiwi/rtvc
ai llm openai realtime-voice voicechat
Last synced: about 1 month ago
JSON representation
Realtime AI Voice Chat Example
- Host: GitHub
- URL: https://github.com/rizkirakasiwi/rtvc
- Owner: rizkirakasiwi
- Created: 2025-08-29T18:06:53.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2025-08-29T19:09:51.000Z (about 1 month ago)
- Last Synced: 2025-08-29T21:44:21.273Z (about 1 month ago)
- Topics: ai, llm, openai, realtime-voice, voicechat
- Language: Python
- Homepage:
- Size: 15.6 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Real-Time Voice Conversation (RTVC)
Project Example for Basic real-time voice chat application that demonstrates how to build AI-powered voice conversations using Python. This project showcases integration between OpenAI's AI services, real-time audio streaming, and web technologies to create a seamless voice chat experience.
## 🎯 Overview
This application enables natural voice conversations with AI through a web browser. Users can speak directly to the application, which transcribes their speech, generates intelligent responses using GPT, and speaks back using text-to-speech - all in real-time.
## ✨ Features
- 🎤 **Real-time voice streaming** using FastRTC for low-latency audio
- 🗣️ **Speech-to-text** transcription with OpenAI Whisper
- 🧠 **AI-powered responses** using GPT models
- 🔊 **Text-to-speech** output with OpenAI TTS
- 🌐 **Web-based interface** built with Gradio and FastAPI
- 📝 **Conversation history** management and persistence
- ☁️ **Cloud deployment ready** with Gradio Spaces support
- 🔄 **Automatic error handling** and recovery## 🚀 Quick Start
### Prerequisites
- Python 3.8 or higher
- OpenAI API key ([Get one here](https://platform.openai.com/api-keys))
- Microphone access in your browser### Installation
1. **Clone the repository:**
```bash
git clone https://github.com/rizkirakasiwi/rtvc.git
cd rtvc
```2. **Create and activate a virtual environment:**
```bash
python -m venv venv
# Windows
venv\Scripts\activate
# macOS/Linux
source venv/bin/activate
```3. **Install dependencies:**
```bash
pip install -r requirements.txt
```4. **Set up environment variables:**
Create a `.env` file in the project root:
```env
OPENAI_API_KEY=your_openai_api_key_here
```5. **Run the application:**
```bash
python main.py
```6. **Open your browser:**
Navigate to `http://localhost:7860` and start talking!## 🏗️ Architecture
The application follows a modular architecture:
```
rtvc/
├── main.py # API entry point
├── ui.py # UI entry point (can't run both main.py and ui.py)
├── app_config.py # Stream configuration
├── stream_handler.py # Main audio processing pipeline
├── ai_services.py # OpenAI API integrations
├── audio_processing.py # Audio format conversion
├── requirements.txt # Python dependencies
└── .env # Environment variables
```### Component Overview
- **FastAPI Backend**: Handles HTTP requests and WebSocket connections
- **Gradio Frontend**: Provides the web interface and real-time audio streaming
- **Audio Pipeline**: 16kHz input → Processing → 24kHz output
- **AI Services**: Whisper (STT) → GPT (Text) → TTS (Speech)## 🔧 Configuration
### Environment Variables
| Variable | Description | Required |
|----------|-------------|----------|
| `OPENAI_API_KEY` | Your OpenAI API key | ✅ Yes |
| `TWILIO_ACCOUNT_SID` | Twilio account SID (for Spaces deployment) | ❌ No |
| `TWILIO_AUTH_TOKEN` | Twilio auth token (for Spaces deployment) | ❌ No |### Audio Settings
- **Input**: 16kHz sample rate, mono channel
- **Output**: 24kHz sample rate, WAV format
- **Processing**: Real-time chunks with automatic padding## 🌐 Deployment
### Local Development
```bash
python main.py
```### Gradio Spaces
The application automatically detects Gradio Spaces environment and configures:
- Twilio TURN credentials for WebRTC
- Concurrency limits (5 concurrent users)
- Time limits (90 seconds per session)### Docker (Optional)
```dockerfile
FROM python:3.9-slimWORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txtCOPY . .
EXPOSE 7860CMD ["python", "main.py"]
```## 🛠️ Development
### Project Structure
- `main.py` - FastAPI application entry point
- `ui.py` - UI application entry point
- `app_config.py` - Gradio Stream configuration with audio settings
- `stream_handler.py` - Core audio processing pipeline
- `ai_services.py` - OpenAI API integrations (Whisper, GPT, TTS)
- `audio_processing.py` - Audio format conversion utilities### Key Dependencies
| Package | Purpose |
|---------|---------|
| `fastapi[standard]` | Web framework and server |
| `fastrtc[vad]` | Real-time audio streaming with voice activity detection |
| `gradio` | Web UI framework |
| `openai` | OpenAI API client |
| `python-dotenv` | Environment variable management |
| `numpy` | Audio data processing |### Adding Features
1. **Custom AI Models**: Modify `ai_services.py` to use different models
2. **Audio Effects**: Extend `audio_processing.py` for filters or effects
3. **UI Customization**: Update `app_config.py` to modify the Gradio interface
4. **New Endpoints**: Add routes in `main.py` for additional functionality## 🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request## 📝 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## 🙏 Acknowledgments
- [OpenAI](https://openai.com/) for providing the AI services
- [Gradio](https://gradio.app/) for the excellent web interface framework
- [FastRTC](https://github.com/gradio-app/fastrtc) for real-time audio streaming
- The open-source community for inspiration and tools## 📞 Support
If you have any questions or run into issues:
1. Check the [Issues](https://github.com/rizkirakasiwi/rtvc/issues) page
2. Create a new issue with detailed information
3. Join the discussion in existing issues---
**Built with ❤️ using Python, OpenAI, and Gradio**