https://github.com/kudosscience/fer_and_ter_model
A comprehensive real-time emotion recognition system combining facial and textual analysis with Furhat robot integration for social robotics applications.
https://github.com/kudosscience/fer_and_ter_model
computer-vision deep-learning distilbert emotion-recognition facial-emotion-detection fer2013 furhat-robot machine-learning multimodal-fusion natural-language-processing opencv pytorch real-time-processing social-robotics text-emotion-detection
Last synced: 3 months ago
JSON representation
A comprehensive real-time emotion recognition system combining facial and textual analysis with Furhat robot integration for social robotics applications.
- Host: GitHub
- URL: https://github.com/kudosscience/fer_and_ter_model
- Owner: kudosscience
- Created: 2025-07-29T17:42:41.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2026-01-01T18:25:59.000Z (4 months ago)
- Last Synced: 2026-01-06T16:52:18.522Z (4 months ago)
- Topics: computer-vision, deep-learning, distilbert, emotion-recognition, facial-emotion-detection, fer2013, furhat-robot, machine-learning, multimodal-fusion, natural-language-processing, opencv, pytorch, real-time-processing, social-robotics, text-emotion-detection
- Language: Jupyter Notebook
- Homepage:
- Size: 10.3 MB
- Stars: 4
- Watchers: 0
- Forks: 0
- Open Issues: 14
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Multimodal Emotion Recognition System
A comprehensive, real-time emotion recognition system that combines **Facial Emotion Recognition (FER)** and **Textual Emotion Recognition (TER)** with advanced multimodal fusion capabilities and **Furhat robot integration**. This system provides accurate emotion detection through computer vision and natural language processing, with support for interactive robotics applications.
## ๐ Key Features
- **๐ญ Facial Emotion Recognition**: Real-time emotion detection from camera feed using CNN models trained on FER2013
- **๐ฌ Textual Emotion Recognition**: Voice-to-text emotion analysis using DistilBERT-based models
- **๐ Multimodal Fusion**: Advanced fusion strategies (confidence-based, weighted average, and formula-based)
- **๐ค Furhat Integration**: Complete social robot platform integration with real-time emotion feedback
- **โก Real-time Processing**: Live emotion recognition with interactive GUI interfaces
- **๐ก๏ธ Robust Architecture**: Comprehensive error handling and fallback mechanisms
- **๐ Comprehensive Testing**: Full test suite with system health checks and validation
## ๐งช Testing & Validation
The system includes comprehensive testing capabilities to ensure reliability:
### Run All Tests
```bash
# Run comprehensive system tests
python tests/test_multimodal_system.py # Complete system validation
python tests/test_fer_model.py # FER component testing
python tests/test_furhat_integration.py # Furhat integration testing
python tests/test_formula_fusion.py # Fusion algorithm testing
python tests/test_corrected_formula.py # Mathematical validation
```
### System Health Check
```bash
# Validate system configuration and dependencies
python demos/demo_multimodal_usage.py
```
The test suite validates:
- Model loading and initialization
- Camera and microphone availability
- Emotion fusion algorithm accuracy
- Robot integration functionality
- Error handling and fallback mechanisms
## ๐ Project Architecture
```text
fer_and_ter_model/
โโโ src/ # Source code
โ โโโ fer/ # Facial Emotion Recognition
โ โ โโโ camera_fer_inference.py
โ โโโ ter/ # Textual Emotion Recognition
โ โ โโโ voice_ter_inference.py
โ โ โโโ setup_voice_ter.py
โ โโโ multimodal/ # Multimodal Fusion
โ โ โโโ multimodal_emotion_inference.py
โ โโโ furhat/ # Furhat Robot Integration
โ โ โโโ furhat_multimodal_emotion_inference.py
โ โโโ utils/ # Shared utilities
โโโ models/ # Trained models
โ โโโ fer2013_final_model.pth
โ โโโ ter_distilbert_model/
โโโ datasets/ # Dataset files
โ โโโ multimodal_emotion_dataset.json
โโโ notebooks/ # Jupyter notebooks
โ โโโ fer2013_model_training.ipynb
โ โโโ multimodal_emotion_fusion.ipynb
โ โโโ multimodal_emotion_recognition.ipynb
โ โโโ textual_emotion_recognition_distilbert.ipynb
โโโ tests/ # Test scripts
โ โโโ test_corrected_formula.py
โ โโโ test_fer_model.py
โ โโโ test_formula_fusion.py
โ โโโ test_furhat_integration.py
โ โโโ test_multimodal_system.py
โโโ demos/ # Demo scripts
โ โโโ demo_furhat_usage.py
โ โโโ demo_multimodal_usage.py
โ โโโ demo_usage.py
โ โโโ demo_voice_ter.py
โโโ docs/ # Documentation
โ โโโ DATASET_SETUP.md
โ โโโ FER_PROJECT_SUMMARY.md
โ โโโ FURHAT_INTEGRATION_SUMMARY.md
โ โโโ IMPLEMENTATION_SUMMARY.md
โ โโโ JUNIE.md
โ โโโ README_*.md files
โโโ requirements/ # Requirements files
โ โโโ requirements_backup.txt
โ โโโ requirements_camera_inference.txt
โ โโโ requirements_furhat.txt
โ โโโ requirements_multimodal.txt
โ โโโ requirements_voice_ter.txt
โโโ README.md # This file
```
## ๐ Quick Start
### Prerequisites
- Python 3.8+ (3.11+ recommended)
- Webcam for facial emotion recognition
- Microphone for voice/text emotion recognition
- GPU support recommended for optimal performance
### Installation
1. **Clone the repository**:
```bash
git clone https://github.com/kudosscience/fer_and_ter_model.git
cd fer_and_ter_model
```
2. **Install the package**:
```bash
# Install with all components
pip install -e ".[all]"
# OR install specific components:
pip install -e ".[multimodal]" # For multimodal processing
pip install -e ".[furhat]" # For Furhat robot integration
pip install -e ".[camera_inference]" # For FER only
pip install -e ".[voice_ter]" # For TER only
```
3. **Alternative: Install from requirements files**:
```bash
# Choose based on your use case:
pip install -r requirements/requirements_multimodal.txt # Recommended
pip install -r requirements/requirements_furhat.txt # For robot integration
pip install -r requirements/requirements_camera_inference.txt # FER only
pip install -r requirements/requirements_voice_ter.txt # TER only
```
### Quick Demo
```bash
# Run the comprehensive multimodal demo
python demos/demo_multimodal_usage.py
# Try individual components
python demos/demo_usage.py # Basic FER demo
python demos/demo_voice_ter.py # TER demo
python demos/demo_furhat_usage.py # Furhat integration demo
```
### Console Commands
After installation, you can use these console commands:
```bash
# Run individual components
fer-camera # Launch facial emotion recognition
ter-voice # Launch textual emotion recognition
multimodal-emotion # Launch multimodal system
furhat-emotion # Launch Furhat integration
```
## ๐งฉ System Components
### ๐ญ FER (Facial Emotion Recognition)
- **Real-time Processing**: Live camera feed emotion detection with GUI overlay
- **CNN Architecture**: Deep learning model trained on FER2013 dataset
- **7 Emotion Classes**: Happy, Sad, Angry, Fear, Surprise, Disgust, Neutral
- **High Accuracy**: Optimized model with robust preprocessing pipeline
- **Fallback Support**: Graceful handling of camera unavailability
**Usage**:
```bash
# Direct script execution
python src/fer/camera_fer_inference.py
# Console command
fer-camera
```
### ๐ฌ TER (Textual Emotion Recognition)
- **Voice-to-Text Pipeline**: Real-time speech recognition and emotion analysis
- **DistilBERT Model**: State-of-the-art transformer-based emotion classification
- **Multi-emotion Support**: Comprehensive emotion category detection
- **Robust Processing**: Advanced text preprocessing and normalization
- **Audio Fallback**: Multiple audio input handling strategies
**Usage**:
```bash
# Direct script execution
python src/ter/voice_ter_inference.py
# Console command
ter-voice
```
### ๐ Multimodal Fusion System
The core innovation of this system - combines FER and TER for superior accuracy:
- **Multiple Fusion Strategies**:
- **Confidence-based**: Selects prediction with highest confidence score
- **Weighted Average**: Combines predictions with 60% facial, 40% textual weighting
- **Formula-based**: Advanced mathematical fusion using custom algorithm
- **Real-time Integration**: Simultaneous processing of visual and audio streams
- **Adaptive Fallback**: Works with single modality when needed
- **Interactive GUI**: Live visualization of both streams and fusion results
**Usage**:
```bash
# Default multimodal processing
python src/multimodal/multimodal_emotion_inference.py
# With specific fusion strategy
python src/multimodal/multimodal_emotion_inference.py --fusion confidence_based
python src/multimodal/multimodal_emotion_inference.py --fusion weighted_average
python src/multimodal/multimodal_emotion_inference.py --fusion formula_based
# Console command
multimodal-emotion
```
### ๐ค Furhat Robot Integration
Complete social robotics platform integration for interactive emotion recognition:
- **Furhat Remote API**: Official SDK integration for robot communication
- **Interactive Responses**: Robot gestures, speech, and LED feedback based on detected emotions
- **Voice Integration**: Uses robot's microphone for natural interaction
- **Real-time Feedback**: Immediate emotional responses and social cues
- **Robust Connection**: Graceful fallback when robot unavailable
**Robot Capabilities**:
- Emotional gesture mapping (BigSmile, Frown, Surprised, etc.)
- LED color changes reflecting emotional states
- Speech synthesis for emotion acknowledgment
- Interactive conversation flow
**Usage**:
```bash
# Furhat integration (requires robot connection)
python src/furhat/furhat_multimodal_emotion_inference.py
# With fusion strategy
python src/furhat/furhat_multimodal_emotion_inference.py --fusion formula_based
# Console command
furhat-emotion
```
## ๐ Comprehensive Documentation
Detailed documentation for each component is available in the `docs/` directory:
### Core Documentation
- **[IMPLEMENTATION_SUMMARY.md](docs/IMPLEMENTATION_SUMMARY.md)** - Complete technical implementation overview
- **[FER_PROJECT_SUMMARY.md](docs/FER_PROJECT_SUMMARY.md)** - Facial emotion recognition deep dive
- **[FURHAT_INTEGRATION_SUMMARY.md](docs/FURHAT_INTEGRATION_SUMMARY.md)** - Robot integration guide
- **[DATASET_SETUP.md](docs/DATASET_SETUP.md)** - Dataset preparation and training
### Component-Specific Guides
- **[README_multimodal.md](docs/README_multimodal.md)** - Multimodal system usage and configuration
- **[README_furhat.md](docs/README_furhat.md)** - Furhat robot setup and integration
- **[README_camera_inference.md](docs/README_camera_inference.md)** - FER system configuration
- **[README_voice_ter.md](docs/README_voice_ter.md)** - Voice/text emotion recognition setup
## โ๏ธ Advanced Configuration
### Custom Model Usage
```bash
# Use custom FER model
python src/multimodal/multimodal_emotion_inference.py \
--fer_model ./path/to/custom_fer_model.pth
# Use custom TER model
python src/multimodal/multimodal_emotion_inference.py \
--ter_model ./path/to/custom_ter_model/
```
### Fusion Strategy Selection
```bash
# Confidence-based fusion (chooses most confident prediction)
python src/multimodal/multimodal_emotion_inference.py --fusion confidence_based
# Weighted average fusion (60% facial, 40% textual)
python src/multimodal/multimodal_emotion_inference.py --fusion weighted_average
# Formula-based fusion (mathematical optimization)
python src/multimodal/multimodal_emotion_inference.py --fusion formula_based
```
### Performance Optimization
```bash
# GPU acceleration (if available)
export CUDA_VISIBLE_DEVICES=0
python src/multimodal/multimodal_emotion_inference.py
# CPU-only mode
export CUDA_VISIBLE_DEVICES=""
python src/multimodal/multimodal_emotion_inference.py
```
## ๐ง Development & Extension
### Project Structure Overview
The system follows a modular architecture allowing easy extension and modification:
- **`src/`** - Core source code with modular components
- **`models/`** - Pre-trained models (FER CNN, TER DistilBERT)
- **`datasets/`** - Training and evaluation datasets
- **`tests/`** - Comprehensive test suite
- **`demos/`** - Usage examples and demonstrations
- **`docs/`** - Detailed technical documentation
- **`requirements/`** - Component-specific dependency management
### Adding New Components
1. Create module in appropriate `src/` subdirectory
2. Add requirements to relevant requirements file
3. Create tests in `tests/` directory
4. Add demo in `demos/` directory
5. Update documentation in `docs/`
### Contributing
This project uses professional development practices:
- **Modular Design**: Each component is self-contained and reusable
- **Comprehensive Testing**: Full test coverage with validation scripts
- **Documentation**: Extensive documentation for all components
- **Package Management**: Proper Python packaging with setuptools
- **Console Integration**: Command-line tools for easy usage
## ๐ ๏ธ Troubleshooting
### Common Issues
**Camera not detected**:
```bash
# Test camera access
python -c "import cv2; print('Camera available:', cv2.VideoCapture(0).isOpened())"
```
**Microphone issues**:
```bash
# Test microphone access
python -c "import speech_recognition as sr; print('Microphone available:', len(sr.Microphone.list_microphone_names()) > 0)"
```
**Model loading errors**:
- Ensure models are downloaded and in `models/` directory
- Check file permissions and paths
- Verify CUDA availability for GPU models
**Furhat connection issues**:
- Verify robot IP address and port
- Check network connectivity
- Ensure Furhat Remote API service is running
### Performance Tips
- Use GPU acceleration when available
- Close other applications using camera/microphone
- Ensure adequate lighting for facial recognition
- Use external microphone for better voice recognition
- Run system health check before important usage
## ๐ Model Information
### FER Model (Facial Emotion Recognition)
- **Architecture**: Custom CNN trained on FER2013
- **Input**: 48x48 grayscale facial images
- **Output**: 7 emotion classes (Happy, Sad, Angry, Fear, Surprise, Disgust, Neutral)
- **Accuracy**: Optimized for real-time performance with robust preprocessing
### TER Model (Textual Emotion Recognition)
- **Architecture**: DistilBERT-based transformer
- **Input**: Text transcribed from speech
- **Output**: Multi-dimensional emotion classification
- **Features**: Advanced text preprocessing and normalization
### Fusion Algorithms
- **Confidence-based**: Selects highest confidence prediction
- **Weighted Average**: Optimized 60/40 facial/textual weighting
- **Formula-based**: Mathematical fusion using correlation analysis
## ๐ฏ Use Cases
- **Research**: Emotion recognition research and experimentation
- **Education**: Teaching multimodal AI and emotion recognition
- **Healthcare**: Patient emotion monitoring and therapy assistance
- **Human-Computer Interaction**: Emotional interfaces and feedback systems
- **Social Robotics**: Interactive robots with emotional intelligence
- **Accessibility**: Emotion-aware assistive technologies
## ๐ Performance Metrics
The system has been validated across multiple scenarios:
- **Real-time Processing**: < 100ms latency for combined FER+TER
- **Accuracy**: Improved performance through multimodal fusion
- **Robustness**: Graceful degradation with single modality
- **Scalability**: Modular architecture supports easy extension
## ๐ Related Projects
- [FER2013 Dataset](https://www.kaggle.com/deadskull7/fer2013)
- [Furhat Robot Platform](https://furhatrobotics.com/)
- [DistilBERT Model](https://huggingface.co/distilbert-base-uncased)
- [OpenCV Computer Vision](https://opencv.org/)
## ๐ License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## ๐ค Author
Henry Ward
- GitHub: [@kudosscience](https://github.com/kudosscience)
- Email: [45144290+kudosscience@users.noreply.github.com](mailto:45144290+kudosscience@users.noreply.github.com)
## ๐ Acknowledgments
- FER2013 dataset contributors
- Hugging Face Transformers library
- OpenCV community
- Furhat Robotics platform
- PyTorch and scikit-learn teams
## ๐ Support
For issues, questions, or contributions:
1. Check the [Issues](https://github.com/kudosscience/fer_and_ter_model/issues) page
2. Review the comprehensive documentation in `docs/`
3. Run the system health check: `python demos/demo_multimodal_usage.py`
4. Create a new issue with detailed information
---
Built with โค๏ธ for multimodal emotion recognition research and applications