https://github.com/faisal-fida/voiceflow-ai
VoiceFlow AI is a production-ready FastAPI microservices platform that provides real-time speech-to-text transcription and intelligent text classification, specifically optimized for insurance call center analytics.
- Host: GitHub
- URL: https://github.com/faisal-fida/voiceflow-ai
- Owner: faisal-fida
- Created: 2025-08-03T15:29:56.000Z (11 months ago)
- Default Branch: master
- Last Pushed: 2025-08-03T15:36:36.000Z (11 months ago)
- Last Synced: 2025-08-03T17:26:47.686Z (11 months ago)
- Topics: classification-model, fastapi, machine-learning-algorithms, torch, transformers, whisper
- Language: Python
- Homepage:
- Size: 27.3 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# VoiceFlow AI
🎙️ **AI-Powered Voice Analytics Platform for Insurance Industry**
VoiceFlow AI is a production-ready FastAPI microservices platform that provides real-time speech-to-text transcription and intelligent text classification, specifically optimized for insurance call center analytics.
## 🌟 Key Features
### 🎯 **Dual-Service Architecture**
- **Transcription Service**: High-performance speech-to-text using OpenAI Whisper
- **Classification Service**: Intent/outcome classification using specialized DistilBERT models
### 🏥 **Industry-Specialized Models**
- **Medicare**: Optimized for Medicare-related calls and terminology
- **ACA (Affordable Care Act)**: Healthcare marketplace conversations
- **Final Expense**: Life insurance and burial coverage discussions
### 🧠 **Context-Aware Processing**
- Dynamic prompts based on call type and conversation turn
- Specialized models for different conversation stages
- Industry-specific vocabulary and terminology handling
### ⚡ **Production-Ready Features**
- GPU acceleration with automatic CPU fallback
- Docker Swarm deployment with health checks
- Graceful shutdown and resource management
- Comprehensive logging and monitoring
- Async processing for high throughput
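The GPU-with-CPU-fallback behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `select_device` is a hypothetical helper, and it assumes the services rely on PyTorch's `torch.cuda.is_available()` check.

```python
def select_device() -> str:
    """Return "cuda" when a usable GPU is present, otherwise "cpu".

    Hypothetical helper: the actual services may pick the device differently.
    """
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Models would be loaded on: {select_device()}")
```

A model loaded with Transformers could then be moved with `model.to(select_device())`.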
## 🚀 Quick Start
### Prerequisites
- Docker & Docker Compose
- NVIDIA GPU (optional; the services fall back to CPU)
- Python 3.8+ (for development)
### 🐳 Docker Deployment
1. **Clone the repository**
```bash
git clone https://github.com/faisal-fida/voiceflow-ai
cd voiceflow-ai
```
2. **Build the services**
```bash
# Build transcription service
docker build -f Dockerfile_transcription -t api-transcription:latest .
# Build classification service
docker build -f Dockerfile_classification -t api-classification:latest .
```
3. **Deploy with Docker Swarm**
```bash
# Initialize swarm (if not already done)
docker swarm init
# Create external network
docker network create --driver overlay voiceflow-net
# Deploy the stack
docker stack deploy -c docker-stack.yml voiceflow
```
### 🔧 Development Setup
1. **Install dependencies**
```bash
pip install -r requirements.txt
```
2. **Configure models** (Update `voiceflow_ai/core/config.py`)
```python
# Set your model paths
DISTIL_MODEL = "path/to/your/classification/model"
MEDICARE_MODEL_B = "path/to/medicare/model"
ACA_MODEL = "path/to/aca/model"
# ... other model configurations
```
3. **Run services locally**
```bash
# Transcription service (port 8000)
uvicorn voiceflow_ai.transcription_app:app --host 0.0.0.0 --port 8000
# Classification service (port 9000)
uvicorn voiceflow_ai.classification_app:app --host 0.0.0.0 --port 9000
```
## 📡 API Endpoints
### Transcription Service (Port 8000)
#### `POST /transcribe/`
Transcribes an uploaded audio file and classifies the resulting text.
**Request:**
```bash
curl -X POST "http://localhost:8000/transcribe/" \
  -F "file=@audio.wav" \
  -F "uuid=unique-call-id" \
  -F "connection_id=conn-123" \
  -F "turn_number=1" \
  -F "model_type=A" \
  -F "call_type=medicare"
```
**Response:**
```json
{
  "uuid": "unique-call-id",
  "transcription": "Hello, I'm calling about Medicare benefits",
  "label": "P",
  "confidence": 0.89,
  "transcription_time": 2.1,
  "classification_time": 0.3,
  "processed_transcribed_text": "hello i'm calling about medicare benefits",
  "model_used": "mc_10.3"
}
```
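On the client side, a response like the one above could be parsed into a typed object. The sketch below is an assumption-laden illustration: `TranscriptionResult` and `parse_transcription` are hypothetical names, and it assumes the response contains exactly the fields shown in the example.

```python
import json
from dataclasses import dataclass


@dataclass
class TranscriptionResult:
    """Mirrors the fields of the example /transcribe/ response."""
    uuid: str
    transcription: str
    label: str
    confidence: float
    transcription_time: float
    classification_time: float
    processed_transcribed_text: str
    model_used: str

    @property
    def total_time(self) -> float:
        # Combined processing time reported by the service.
        return self.transcription_time + self.classification_time


def parse_transcription(body: str) -> TranscriptionResult:
    """Parse a JSON response body; raises TypeError on missing or extra fields."""
    return TranscriptionResult(**json.loads(body))
```

Because the dataclass is strict about field names, a change in the response schema surfaces as an immediate `TypeError` rather than a silent `None`.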
### Classification Service (Port 9000)
#### `POST /classify/`
Classify transcribed text for call analysis
**Request:**
```bash
curl -X POST "http://localhost:9000/classify/" \
  -F "transcribed_text=I'm interested in Medicare plans" \
  -F "serial_number=conn-123" \
  -F "model_type=A" \
  -F "call_type=medicare"
```
**Response:**
```json
{
  "label": "P",
  "confidence": 0.92,
  "model_used": "mc_10.3"
}
```
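The same request can be issued from Python using only the standard library. The sketch below builds the `multipart/form-data` body by hand to mirror curl's `-F` flags; `encode_multipart`, `build_classify_request`, and `classify` are hypothetical helper names, and the default base URL assumes the local development setup.

```python
import json
import uuid
from typing import Dict, Tuple
from urllib.request import Request, urlopen


def encode_multipart(fields: Dict[str, str]) -> Tuple[bytes, str]:
    """Encode text fields as multipart/form-data, like curl's -F flag."""
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")
        lines.append(value)
    lines.append(f"--{boundary}--")
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


def build_classify_request(text: str, serial_number: str,
                           model_type: str = "A", call_type: str = "medicare",
                           base_url: str = "http://localhost:9000") -> Request:
    """Build (but do not send) a POST request for /classify/."""
    body, content_type = encode_multipart({
        "transcribed_text": text,
        "serial_number": serial_number,
        "model_type": model_type,
        "call_type": call_type,
    })
    return Request(f"{base_url}/classify/", data=body, method="POST",
                   headers={"Content-Type": content_type})


def classify(text: str, serial_number: str, **kwargs) -> dict:
    """Send the request and decode the JSON response (needs a running service)."""
    with urlopen(build_classify_request(text, serial_number, **kwargs)) as resp:
        return json.load(resp)
```

Separating request construction from sending keeps the encoding logic testable without a running service.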
### Health Checks
- `GET /health` - Service health status
- `POST /shutdown` - Graceful shutdown
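A deployment script might poll `/health` before sending traffic. The following is a sketch only: `wait_until_healthy` and `http_status` are hypothetical helpers, and it assumes a healthy service answers with HTTP 200 (the response body is not documented here).

```python
import time
from typing import Callable
from urllib.request import urlopen


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code of a GET request to `url`."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.status


def wait_until_healthy(url: str, attempts: int = 5, delay: float = 1.0,
                       fetch: Callable[[str], int] = http_status) -> bool:
    """Poll `url` until it returns 200 or the attempts run out."""
    for attempt in range(attempts):
        try:
            if fetch(url) == 200:
                return True
        except OSError:
            # URLError and HTTPError are OSError subclasses: service not ready yet.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

For example, `wait_until_healthy("http://localhost:8000/health")` would return `True` once the transcription service responds.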
## 🏗️ Architecture
```
┌─────────────────────┐    ┌─────────────────────┐
│   Client/Frontend   │    │    Load Balancer    │
└──────────┬──────────┘    └──────────┬──────────┘
           │                          │
           └──────────────────────────┘
                        │
           ┌────────────┴────────────┐
           │                         │
  ┌────────▼────────┐     ┌───────────▼──────────┐
  │  Transcription  │     │    Classification    │
  │     Service     │     │       Service        │
  │   (Port 8000)   │     │     (Port 9000)      │
  │                 │     │                      │
  │ ┌─────────────┐ │     │ ┌──────────────────┐ │
  │ │   Whisper   │ │     │ │    DistilBERT    │ │
  │ │    Model    │ │     │ │      Models      │ │
  │ └─────────────┘ │     │ │ - Medicare       │ │
  └─────────────────┘     │ │ - ACA            │ │
                          │ │ - Final Expense  │ │
                          │ └──────────────────┘ │
                          └──────────────────────┘
```
## 🔧 Configuration
Key configuration options in `voiceflow_ai/core/config.py`:
```python
class Settings:
    # Model paths
    DISTIL_MODEL = "path/to/base/classification/model"
    MEDICARE_MODEL_B = "path/to/medicare/model"
    ACA_MODEL = "path/to/aca/model"
    FE_MODEL_B = "path/to/final-expense/model"
    # Whisper configuration
    WHISPER_MODEL = "openai/whisper-tiny.en"
    # Processing type
    TYPE = True  # True for multi-class, False for 3-class classification
```
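In containerized deployments it can be convenient to override such settings through environment variables instead of editing `config.py`. The sketch below shows one possible approach; `EnvSettings`, `load_settings`, and the environment variable names are all assumptions, not part of the repository.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class EnvSettings:
    distil_model: str
    whisper_model: str
    multi_class: bool  # corresponds to the TYPE flag above


def load_settings(env: Mapping[str, str] = os.environ) -> EnvSettings:
    """Read settings from `env`, falling back to the documented defaults."""
    return EnvSettings(
        distil_model=env.get("DISTIL_MODEL", "path/to/base/classification/model"),
        whisper_model=env.get("WHISPER_MODEL", "openai/whisper-tiny.en"),
        # Hypothetical convention: "1", "true", or "yes" enables multi-class mode.
        multi_class=env.get("TYPE", "true").strip().lower() in {"1", "true", "yes"},
    )
```

Passing the environment as a mapping keeps the loader easy to test, for example `load_settings({"TYPE": "false"})`.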
## 📊 Call Classification Labels
### Medicare Calls
- **P**: Positive/Interested
- **N**: Negative/Not Interested
- **U**: Unclear/Uncertain
- **DNC**: Do Not Call
- **CB**: Call Back
- **AP**: Age-related responses
- And 70+ specialized labels...
### ACA Calls
- **ELI**: Eligible
- **SUB**: Subsidy-related
- **ICE**: Insurance coverage existing
- **ACA**: ACA-specific responses
- And 45+ specialized labels...
### Final Expense Calls
- **BENE**: Beneficiary discussions
- **COST**: Cost-related inquiries
- **SCAM**: Scam detection
- And 45+ specialized labels...
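Downstream code often needs to turn a returned label into something readable. The sketch below covers only the labels listed above; the `describe` helper, the `"aca"` and `"final_expense"` call-type keys, and the 0.5 confidence threshold are assumptions made for illustration.

```python
from typing import Dict

# Only the labels documented above; the full label sets are larger.
LABELS: Dict[str, Dict[str, str]] = {
    "medicare": {
        "P": "Positive/Interested",
        "N": "Negative/Not Interested",
        "U": "Unclear/Uncertain",
        "DNC": "Do Not Call",
        "CB": "Call Back",
        "AP": "Age-related responses",
    },
    "aca": {
        "ELI": "Eligible",
        "SUB": "Subsidy-related",
        "ICE": "Insurance coverage existing",
        "ACA": "ACA-specific responses",
    },
    "final_expense": {
        "BENE": "Beneficiary discussions",
        "COST": "Cost-related inquiries",
        "SCAM": "Scam detection",
    },
}


def describe(call_type: str, label: str, confidence: float,
             threshold: float = 0.5) -> str:
    """Return a readable description, flagging low-confidence predictions."""
    description = LABELS.get(call_type, {}).get(label, "unlisted label")
    flag = "" if confidence >= threshold else " (low confidence)"
    return f"{label}: {description}{flag}"
```

For instance, `describe("medicare", "P", 0.92)` yields `"P: Positive/Interested"`.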
## 🚀 Performance & Scaling
### Resource Requirements
- **GPU**: NVIDIA GPU with CUDA support (recommended)
- **RAM**: 8GB+ (16GB+ for production)
- **CPU**: 4+ cores
- **Storage**: 10GB+ for models
### Scaling Configuration
```yaml
# docker-stack.yml
services:
  transcription:
    deploy:
      replicas: 9  # Scale based on load
  classification:
    deploy:
      replicas: 1  # Lighter classification service
```
## 🔒 Security & Compliance
- Google Service Account integration for cloud services
- Secure file handling with automatic cleanup
- Request tracking with unique connection IDs
- Health monitoring and graceful shutdowns
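"Secure file handling with automatic cleanup" typically means that uploaded audio is written to a temporary file and removed even if processing fails. A minimal sketch of that pattern, assuming a hypothetical `process_upload` helper rather than the repository's actual implementation:

```python
import os
import tempfile
from typing import Callable, TypeVar

T = TypeVar("T")


def process_upload(data: bytes, handler: Callable[[str], T],
                   suffix: str = ".wav") -> T:
    """Write `data` to a temp file, run `handler(path)`, then always delete it."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        return handler(path)
    finally:
        # Runs on success and on exceptions, so no audio is left on disk.
        if os.path.exists(path):
            os.remove(path)
```

The `finally` block is what makes the cleanup automatic: the file is removed whether the handler returns or raises.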
## 📈 Monitoring & Logging
Comprehensive logging with:
- Request/response tracking
- Performance metrics (processing times)
- Error handling and debugging
- Serial number tracking for call tracing
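Serial-number tracing of this kind can be approximated with the standard library's `logging.LoggerAdapter`, which attaches the same context to every record. This is a sketch under assumptions: `make_call_logger`, the logger name, and the log format are hypothetical and may differ from `core/logger.py`.

```python
import logging
import time

LOG_FORMAT = "%(levelname)s [%(serial_number)s] %(message)s"


def make_call_logger(serial_number: str, stream=None,
                     name: str = "voiceflow.sketch") -> logging.LoggerAdapter:
    """Return a logger that stamps every message with the call's serial number."""
    base = logging.getLogger(name)
    if not base.handlers:  # configure each named logger only once
        handler = logging.StreamHandler(stream)
        handler.setFormatter(logging.Formatter(LOG_FORMAT))
        base.addHandler(handler)
        base.setLevel(logging.INFO)
        base.propagate = False
    return logging.LoggerAdapter(base, {"serial_number": serial_number})


if __name__ == "__main__":
    log = make_call_logger("conn-123")
    start = time.perf_counter()
    # ... classification would happen here ...
    log.info("classified in %.3fs", time.perf_counter() - start)
```

Filtering the log output for `[conn-123]` then reconstructs everything that happened to a single call.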
## 🛠️ Development
### Project Structure
```
voiceflow_ai/
├── core/                      # Core utilities and config
│   ├── config.py              # Configuration settings
│   ├── dependencies.py        # Dependency injection
│   ├── logger.py              # Logging configuration
│   └── transcription_processor.py
├── routers/                   # API route handlers
│   ├── transcription_router.py
│   └── classification_router.py
├── services/                  # Business logic services
│   ├── transcription_service.py
│   └── classification_service.py
├── classification_app.py      # Classification FastAPI app
└── transcription_app.py       # Transcription FastAPI app
```
### Adding New Models
1. **Update configuration** in `config.py`
```python
NEW_MODEL_PATH = "path/to/new/model"
```
2. **Initialize in service**
```python
self.new_model = AutoModelForSequenceClassification.from_pretrained(NEW_MODEL_PATH)
self.new_tokenizer = AutoTokenizer.from_pretrained(NEW_MODEL_PATH)
```
3. **Add classification logic**
```python
if call_type == "new_type":
    model = self.new_model
    tokenizer = self.new_tokenizer
```
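As the number of call types grows, the `if` chain in step 3 could be replaced by a lookup table. The sketch below shows one way to do that; `ModelRegistry` is a hypothetical class, not part of the repository, and the loader is injected so the example runs without downloading any model.

```python
from typing import Callable, Dict, Tuple

ModelPair = Tuple[object, object]  # (model, tokenizer)


class ModelRegistry:
    """Maps call types to lazily loaded (model, tokenizer) pairs."""

    def __init__(self, loader: Callable[[str], ModelPair]) -> None:
        self._loader = loader
        self._paths: Dict[str, str] = {}
        self._cache: Dict[str, ModelPair] = {}

    def register(self, call_type: str, path: str) -> None:
        self._paths[call_type] = path

    def get(self, call_type: str) -> ModelPair:
        if call_type not in self._paths:
            raise KeyError(f"no model registered for call type {call_type!r}")
        if call_type not in self._cache:
            # Load on first use, then reuse the cached pair.
            self._cache[call_type] = self._loader(self._paths[call_type])
        return self._cache[call_type]
```

With Transformers installed, the loader could be `lambda p: (AutoModelForSequenceClassification.from_pretrained(p), AutoTokenizer.from_pretrained(p))`, and adding a model then becomes a single `registry.register("new_type", NEW_MODEL_PATH)` call.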
## 🤝 Contributing
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
## 🙏 Acknowledgments
- **OpenAI Whisper** for speech recognition capabilities
- **Hugging Face Transformers** for BERT-based classification
- **FastAPI** for the high-performance web framework
- **Docker** for containerization and deployment