https://github.com/vijaykr338/gesture-control
A Hand-Gesture Recognition System
https://github.com/vijaykr338/gesture-control
mediapipe opencv openvino-toolkit
Last synced: 4 months ago
JSON representation
A Hand-Gesture Recognition System
- Host: GitHub
- URL: https://github.com/vijaykr338/gesture-control
- Owner: vijaykr338
- Created: 2025-05-09T06:20:46.000Z (about 1 year ago)
- Default Branch: master
- Last Pushed: 2025-07-31T21:32:13.000Z (11 months ago)
- Last Synced: 2025-08-01T00:03:35.063Z (11 months ago)
- Topics: mediapipe, opencv, openvino-toolkit
- Language: Python
- Homepage:
- Size: 12.4 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# ποΈ PalmPilot: Gesture Control with OpenVINO
PalmPilot is a high-performance, real-time hand gesture control system for your desktop. Built with Python and powered by Intel's OpenVINO toolkit, it offers a fast and intuitive way to interact with your applications without a mouse or keyboard.
---
## β¨ Features
- **High-Performance Inference**: Leverages the OpenVINOβ’ toolkit for optimized, low-latency gesture recognition on Intel hardware.
- **Modern GUI**: A clean and user-friendly dashboard built with PyQt6 to manage the engine, select modes, and view gesture mappings.
- **Customizable Application Modes**: Pre-configured modes for presentations, media players, and general desktop use. Easily create and edit your own modes.
- **Direct System Control**: Seamlessly controls the mouse cursor, performs clicks, scrolls, and executes keyboard shortcuts based on your hand movements.
- **Robust Hand Tracking**: Utilizes a multi-stage pipeline for accurate hand detection and 21-point landmark tracking.
---
## π οΈ Technology Stack
- **Core Engine**: Python
- **Inference**: OpenVINOβ’
- **GUI**: PyQt6
- **CV & Image Processing**: OpenCV, NumPy
- **System Automation**: PyAutoGUI
---
## π§ Model Pipeline
PalmPilot uses a chain of OpenVINO IR models (`.xml` and `.bin`) to process the video feed and recognize gestures.
| Model | Description |
| ------------------------- | ---------------------------------------------------- |
| `hand_detector` | Detects the presence and bounding box of a hand. |
| `hand_landmarks_detector` | Locates 21 key hand landmarks within the box. |
| `gesture_embedder` | Encodes landmark vectors into a feature representation.|
| `canned_gesture_classifier`| Classifies the feature vector into a specific gesture. |
---
## βοΈ Setup and Installation
### 1. Clone the Repository
```bash
git clone https://github.com/vijaykr338/gesture-control.git
cd gesture-control
```
### 2. Create a Virtual Environment
Using a virtual environment is strongly recommended to keep dependencies isolated.
```bash
# Create the environment
python -m venv venv
# Activate the environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate
```
### 3. Install Dependencies
The project's dependencies are listed in `requirements.txt`. Install them using pip.
```bash
pip install -r requirements.txt
```
*(Note: The requirements file contains all packages, including development tools like Jupyter.)*
---
## π How to Run
Once the setup is complete, launch the main application:
```bash
python gui_main.py
```
From the GUI, you can:
- **Start/Stop** the gesture detection engine.
- **Select** the active application mode from the dropdown.
- **View** the gestures and their assigned actions for the current mode.
- **Create or Edit** custom modes to fit your workflow.
---
## π Project Structure
```
.
βββ mediapipeModels/ # OpenVINO IR models (.xml, .bin)
βββ application_modes.py # Handles loading and managing control modes.
βββ config_manager.py # Manages gesture configurations from JSON.
βββ detection_models.py # Loads and manages the OpenVINO models.
βββ event_system.py # A simple event bus for component communication.
βββ gesture_config.json # Main JSON configuration for gestures and modes.
βββ gesture_engine.py # Core logic for the gesture recognition pipeline.
βββ gesture_processor.py # Translates gestures into system actions (e.g., mouse clicks).
βββ gui_main.py # The main entry point for the PyQt6 GUI.
βββ gui_worker.py # QThread worker to run the engine without freezing the GUI.
βββ hand_landmark.py # Constants for hand landmark points.
βββ requirement.txt # Python dependencies.
βββ README.md # This file.
```
---
## π License
This project is licensed under the MIT License. See the `LICENSE` file for details.
*(You should add a LICENSE file to your repository)*