🎯 Real-time Hand Gesture Recognition using MediaPipe and MobileNetV3 (Final Year Project)
- Host: GitHub
- URL: https://github.com/michealxq/gesture-recognition-project
- Owner: michealxq
- Created: 2025-06-22T08:33:06.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2025-06-23T07:11:39.000Z (4 months ago)
- Last Synced: 2025-06-23T08:24:43.352Z (4 months ago)
- Topics: ai-project, computer-vision, deep-learning, final-year-project, gesture-recognition, mediapipe, mobilenetv3, python, real-time, tensorflow
- Language: Jupyter Notebook
- Homepage:
- Size: 4.14 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Gesture-Controlled Media Player using MediaPipe & MobileNetV3
Detect hand gestures in real time using MediaPipe and recognize them with a MobileNetV3 model to control a media player application via webcam.

---
## Contents
This repository contains:
- Gesture-controlled media player application (PyQt5 GUI)
- Real-time webcam hand gesture prediction using MediaPipe
- MobileNetV3 fine-tuned gesture classification model
- Demo videos and gesture-action mapping
- Icons for visual action feedback

---
## Requirements
- Python 3.8+
- TensorFlow 2.11+
- OpenCV 4.5+
- MediaPipe 0.9+
- PyQt5
- VLC Python bindings (`python-vlc`)

Install dependencies:
```bash
pip install -r requirements.txt
```
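If you need to recreate the dependency file, a minimal `requirements.txt` matching the versions above might look like this (a sketch; the exact pins are assumptions, not taken from the repository):

```text
tensorflow>=2.11
opencv-python>=4.5
mediapipe>=0.9
PyQt5
python-vlc
```

---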
## Demo
To launch the gesture-controlled media player with webcam input:
```bash
python app.py
```

Ensure that you have `.mp4` videos inside the `videos/` directory. If no videos are found, the app will exit with an error.
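That startup check could be as simple as the following sketch (hypothetical; the actual logic in `app.py` may differ):

```python
import sys
from pathlib import Path

# Collect all .mp4 files from the videos/ directory, sorted for a stable playlist order.
playlist = sorted(Path("videos").glob("*.mp4"))
if not playlist:
    sys.exit("Error: no .mp4 files found in videos/ -- add some videos and retry.")
```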
---
## Gesture to Action Mapping
The following table lists the recognized gestures and their corresponding media commands:

| Gesture | Action | Emoji |
|--------------------|------------------|--------|
| Palm | Play | ▶️ |
| Fist | Pause | ⏸️ |
| Call | Next Video | ⏭️ |
| Rock | Previous Video | ⏮️ |
| Like | Volume Up | 🔊 |
| Dislike | Volume Down | 🔉 |
| Two Up | Skip Forward | ⏩ |
| Two Reversed | Skip Backward | ⏪ |
| Stop | Quit App | 🛑 |
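In code, this mapping is naturally expressed as a dictionary. The following is a sketch mirroring the table above; the key and value names are illustrative, not taken from the source:

```python
# Hypothetical gesture-to-command mapping (keys loosely follow HaGRID class names).
GESTURE_ACTIONS = {
    "palm": "play",
    "fist": "pause",
    "call": "next_video",
    "rock": "previous_video",
    "like": "volume_up",
    "dislike": "volume_down",
    "two_up": "skip_forward",
    "two_up_inverted": "skip_backward",
    "stop": "quit",
}
```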
## How it Works
1. **Hand Detection**: Uses MediaPipe Hands to detect hand landmarks in the webcam frame.
2. **Cropping & Preprocessing**: Extracts and pads the hand region, then resizes and normalizes the image for MobileNetV3 (see the sketch after this list).
3. **Gesture Prediction**: Predicts gesture label using the Keras MobileNetV3 model.
4. **Command Execution**: Maps gesture to media control command (play, pause, volume, etc.) and sends it to VLC.
5. **UI Feedback**: Displays gesture label, confidence, and corresponding action icon in a PyQt5 GUI.
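Steps 1-3 can be condensed into a short loop like the one below. This is a minimal sketch under assumed names (`gesture_model.h5`, the `CLASS_NAMES` label order, a 224x224 input size, a 20-pixel pad); the real application adds smoothing plus the PyQt5 and VLC integration:

```python
import cv2
import numpy as np
import tensorflow as tf
import mediapipe as mp

# Assumed label order -- must match whatever order the model was trained with.
CLASS_NAMES = ["call", "dislike", "fist", "like", "palm",
               "rock", "stop", "two_up", "two_up_inverted"]
model = tf.keras.models.load_model("gesture_model.h5")  # assumed filename
hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.5)

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break

    # 1. Hand detection: MediaPipe expects RGB, OpenCV delivers BGR.
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        lm = results.multi_hand_landmarks[0].landmark
        h, w = frame.shape[:2]

        # 2. Cropping & preprocessing: bounding box from landmarks, with padding.
        xs = [int(p.x * w) for p in lm]
        ys = [int(p.y * h) for p in lm]
        pad = 20
        x1, y1 = max(min(xs) - pad, 0), max(min(ys) - pad, 0)
        x2, y2 = min(max(xs) + pad, w), min(max(ys) + pad, h)
        if x2 > x1 and y2 > y1:
            crop = cv2.resize(frame[y1:y2, x1:x2], (224, 224))
            batch = np.expand_dims(crop.astype("float32"), 0)

            # 3. Gesture prediction: Keras MobileNetV3 models rescale pixel
            # values internally when built with the default include_preprocessing=True.
            probs = model.predict(batch, verbose=0)[0]
            label = CLASS_NAMES[int(np.argmax(probs))]
            print(label, float(probs.max()))

cap.release()
```

Step 4 would then look up the predicted label in a mapping like `GESTURE_ACTIONS` above and forward the resulting command to the `python-vlc` player.

---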
## Model Training (External)
This application uses a MobileNetV3 model fine-tuned on a subset of the HaGRID gesture dataset.
For training instructions, see the companion notebook:
- [`notebooks/HaGRID_mobilenetv3_2.ipynb`](notebooks/HaGRID_mobilenetv3_2.ipynb)
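For orientation, fine-tuning MobileNetV3-Small with Keras typically follows the pattern below. This is a hedged sketch, not the notebook's exact code; the class count, input size, and hyperparameters are assumptions:

```python
import tensorflow as tf

NUM_CLASSES = 9  # assumed: one class per gesture in the table above

# MobileNetV3-Small backbone pretrained on ImageNet, without its classifier head.
base = tf.keras.applications.MobileNetV3Small(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet"
)
base.trainable = False  # freeze the backbone for the initial training phase

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-3),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
# model.fit(train_ds, validation_data=val_ds, epochs=10)  # datasets built from HaGRID crops
```

---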
## Screenshots
---
## License
This project is licensed under the MIT License.

---
## Credits
- Gesture detection powered by [MediaPipe Hands](https://google.github.io/mediapipe/solutions/hands.html)
- Model architecture: [MobileNetV3-Small](https://arxiv.org/abs/1905.02244)
- Dataset used for fine-tuning: [HaGRID Dataset](https://paperswithcode.com/dataset/hagrid)
- VLC integration via [python-vlc](https://pypi.org/project/python-vlc/)

---
Final Year Project