Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/wartem/seeed_tts_service
https://github.com/wartem/seeed_tts_service
Last synced: about 1 month ago
JSON representation
- Host: GitHub
- URL: https://github.com/wartem/seeed_tts_service
- Owner: Wartem
- Created: 2024-12-01T12:51:57.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2024-12-01T13:21:35.000Z (about 1 month ago)
- Last Synced: 2024-12-01T13:41:28.947Z (about 1 month ago)
- Language: Python
- Size: 0 Bytes
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# FastAPI TTS Service
A high-performance Text-to-Speech (TTS) service built with FastAPI and Piper TTS, featuring real-time audio playback and a command-line interface. The service is optimized for the Seeed ReSpeaker device but works with standard audio output devices as well.
## Features
- Real-time text-to-speech synthesis using Piper TTS
- Optimized audio playback with device-specific configurations
- Audio queue management with playback controls
- REST API endpoints for text input and audio control
- Interactive CLI client with rich terminal interface
- Comprehensive status monitoring and logging
- Support for custom sample rates and audio resampling
- Built-in error handling and resource management## Prerequisites
- Python 3.8+
- Piper TTS models
- PyAudio
- Seeed ReSpeaker (optional)## Installation
1. Clone the repository:
```bash
git clone https://github.com/Wartem/fastapi-tts-service
cd fastapi-tts-service
```2. Install dependencies:
```bash
pip install fastapi uvicorn sounddevice numpy pyaudio piper-tts rich aiohttp
```3. Download Piper TTS models:
```bash
mkdir piper-models
# Download your preferred model and place in piper-models directory
# Example: sv_SE-nst-medium.onnx and sv_SE-nst-medium.onnx.json
```## Usage
### Starting the Server
1. Run the FastAPI server:
```bash
python server.py
```The server will start on `http://localhost:8912`
### Using the CLI Client
1. Run the client:
```bash
python client.py
```2. Use the interactive menu to:
- Send text for speech synthesis
- Stop playback
- Check service status
- Exit the application## API Endpoints
- `POST /text`: Convert text to speech
```json
{
"text": "Text to be spoken"
}
```- `POST /play`: Queue raw audio for playback
```json
{
"audio_data": [float_array],
"sample_rate": 22050
}
```- `POST /stop`: Stop current playback and clear queue
- `GET /status`: Get service status
## Configuration
### Audio Settings
Default audio configuration in `server.py`:
```python
RATE = 48000 # Sample rate
CHANNELS = 2 # Stereo output
FORMAT = pyaudio.paFloat32
CHUNK = 1024 # Buffer size
```### TTS Model
Default Piper model configuration in `server.py`:
```python
model_path = "./piper-models/sv_SE-nst-medium.onnx"
config_path = "./piper-models/sv_SE-nst-medium.onnx.json"
```## Error Handling
The service includes comprehensive error handling:
- Audio device initialization failures
- TTS synthesis errors
- Network communication issues
- Resource cleanup on shutdown## Technical Details
- **Audio Processing**: High-quality resampling with linear interpolation
- **Queue Management**: Thread-safe audio queue with status tracking
- **Device Management**: Automatic Seeed ReSpeaker detection with fallback
- **Resource Management**: Proper cleanup of audio resources and temporary files
- **Async Support**: Full async/await support in both server and client## System Requirements
- CPU: 1+ cores
- RAM: 2GB+ recommended
- Storage: 100MB+ for models
- Network: Local network access
- Audio: Compatible audio output device## Troubleshooting
1. **Audio Device Not Found**:
- Check audio device connections
- Verify PyAudio installation
- Check device permissions2. **Model Loading Failed**:
- Verify model files exist in correct location
- Check model file permissions
- Ensure correct model format3. **Playback Issues**:
- Check audio device settings
- Verify audio format compatibility
- Check system volume levels