https://github.com/ansh2222949/ai-mouse

Real-time hand gesture controlled mouse using computer vision and hybrid ML
computer-vision-opencv gesture-recognition hand-tracking machine-learning mediapipe python real-time-systems research-project



# 🖱️ AI Mouse


Real-Time Hand Gesture Control System



Computer Vision • Hybrid Machine Learning • Low Latency






---

## 🧠 Project Overview

**AI Mouse** is a real-time computer vision research project that explores mouse control using hand gestures captured via a standard webcam.

Unlike typical deep learning approaches, this system combines **MediaPipe** hand tracking with a **Hybrid Machine Learning architecture (KNN + Random Forest)**. The design targets low latency and stable predictions, and lets the system adapt to new gestures at runtime without heavy retraining.

> 🎯 **Focus:** System design, real-time performance, and practical ML decision-making rather than deep learning complexity.

---

## ✨ Key Features

* **🎥 Real-Time Tracking:** Uses MediaPipe for robust hand landmark detection.
* **✋ Gesture Control:** Full mouse navigation including **Move, Scroll, and Click**.
* **🧠 Hybrid ML Engine:**
  * **KNN:** For fast, online incremental adaptation.
  * **Random Forest:** For stabilizing confidence scores.
* **🎯 Smart Execution:** Temporal buffering to reduce jitter and false positives.
* **📦 Modular Architecture:** Clean separation between vision, logic, and execution layers.
* **🖥️ Fully Offline:** No internet connection required.
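The temporal buffering mentioned above can be illustrated with a majority vote over recent frames. This is a minimal sketch, not the project's actual code: the `GestureBuffer` class, the window size of 7, and the 5-vote threshold are all assumptions chosen for illustration.

```python
from collections import Counter, deque


class GestureBuffer:
    """Majority vote over the last `size` per-frame predictions (hypothetical helper)."""

    def __init__(self, size=7, min_votes=5, idle="IDLE"):
        self.frames = deque(maxlen=size)  # the oldest prediction drops out automatically
        self.min_votes = min_votes
        self.idle = idle

    def update(self, label):
        """Record one frame's prediction and return the gesture to act on."""
        self.frames.append(label)
        winner, votes = Counter(self.frames).most_common(1)[0]
        # Fall back to the safe state until one gesture clearly dominates.
        return winner if votes >= self.min_votes else self.idle


buffer = GestureBuffer()
stream = ["MOVE"] * 4 + ["CLICK"] + ["MOVE"] * 3  # one spurious CLICK frame
stable = [buffer.update(label) for label in stream]
```

With these settings the single misclassified `CLICK` frame never reaches the execution layer, at the cost of a few frames of delay before a new gesture takes effect.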

---

## 🧱 Technical Architecture

The system avoids deep learning to prioritize speed and interpretability.

```text
AI_MOUSE/
│
├── core/
│   ├── config.py      # System sensitivity & configuration
│   ├── features.py    # Hand landmark feature extraction
│   ├── model.py       # Hybrid KNN + Random Forest logic
│   ├── actions.py     # PyAutoGUI execution (Mouse/Click)
│   └── __init__.py
│
├── main.py            # Camera loop & orchestration
├── requirements.txt   # Dependencies
└── README.md          # Documentation

```
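To give a feel for what a module like `core/features.py` might do, here is a hedged sketch of landmark feature extraction. It assumes the 21-point hand layout that MediaPipe Hands produces (index 0 is the wrist, index 9 the middle-finger MCP joint). The `extract_features` function and the normalization scheme are illustrative assumptions, not the repository's implementation.

```python
import math

WRIST, MIDDLE_MCP = 0, 9  # MediaPipe Hands landmark indices


def extract_features(landmarks):
    """Turn 21 (x, y) landmarks into a position- and scale-invariant vector."""
    if len(landmarks) != 21:
        raise ValueError("expected 21 hand landmarks")
    wx, wy = landmarks[WRIST]
    mx, my = landmarks[MIDDLE_MCP]
    # Palm length serves as the scale reference; guard against a degenerate hand.
    scale = math.hypot(mx - wx, my - wy) or 1.0
    features = []
    for x, y in landmarks:
        features.append((x - wx) / scale)
        features.append((y - wy) / scale)
    return features


# Synthetic landmarks: the same hand shape, shifted and twice as large.
hand = [(0.5 + 0.01 * i, 0.5 - 0.02 * i) for i in range(21)]
moved = [(0.2 + 0.02 * i, 0.9 - 0.04 * i) for i in range(21)]
same_shape = all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(extract_features(hand), extract_features(moved))
)
```

Normalizing relative to the wrist and palm length means a classifier sees the hand's shape rather than where it sits in the frame or how close it is to the camera.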

### 🧠 Why Hybrid ML? (The Research Angle)

This problem demands **ultra-low latency** and **online learning**. Deep learning models often introduce unnecessary overhead.

1. **KNN (K-Nearest Neighbors):** Allows for instant adaptation to a specific user's hand shape.
2. **Random Forest:** Acts as a stabilizer to filter out noise from the webcam.
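3. **Result:** A system designed to respond faster than heavy neural networks on this specific task. This has not yet been measured: a comparative latency study against CNN models is listed under Future Improvements.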
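The division of labor described in the list above can be sketched as follows. How `core/model.py` actually combines the two models is not documented here, so this is one plausible arrangement under stated assumptions: `OnlineKNN` is a tiny pure-Python stand-in, the Random Forest is represented only by its class probabilities, and `blend` with its weight and threshold is a hypothetical helper.

```python
import math
from collections import Counter


class OnlineKNN:
    """Tiny k-nearest-neighbours classifier that learns one sample at a time."""

    def __init__(self, k=3):
        self.k = k
        self.samples = []  # list of (feature_tuple, label)

    def add(self, features, label):
        # "Training" is just storing the example, so it is usable on the next frame.
        self.samples.append((tuple(features), label))

    def predict_proba(self, features):
        """Return {label: share of the k nearest neighbours}."""
        if not self.samples:
            return {}
        query = tuple(features)
        nearest = sorted(self.samples, key=lambda s: math.dist(s[0], query))[: self.k]
        votes = Counter(label for _, label in nearest)
        return {label: n / len(nearest) for label, n in votes.items()}


def blend(knn_proba, forest_proba, knn_weight=0.5, threshold=0.6, idle="IDLE"):
    """Average two probability dicts; fall back to `idle` when confidence is low."""
    labels = set(knn_proba) | set(forest_proba)
    if not labels:
        return idle, 0.0
    scores = {
        label: knn_weight * knn_proba.get(label, 0.0)
        + (1 - knn_weight) * forest_proba.get(label, 0.0)
        for label in labels
    }
    best = max(scores, key=scores.get)
    return (best, scores[best]) if scores[best] >= threshold else (idle, scores[best])


knn = OnlineKNN(k=3)
training = [((0.0, 0.0), "MOVE"), ((0.1, 0.0), "MOVE"), ((0.0, 0.1), "MOVE"),
            ((1.0, 1.0), "CLICK"), ((0.9, 1.0), "CLICK"), ((1.0, 0.9), "CLICK")]
for point, label in training:
    knn.add(point, label)

# Stand-in for a Random Forest's class probabilities on the same frame.
forest_proba = {"MOVE": 0.7, "CLICK": 0.3}
decision = blend(knn.predict_proba((0.05, 0.05)), forest_proba)
```

Because adding a KNN sample is only a list append, new examples take effect immediately, while the second model's probabilities damp frames where the nearest neighbours alone would be ambiguous.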

---

## 🧪 Gestures & Controls

Gestures are trained **live** during runtime to match the user's specific hand.

| ID | Gesture Name | Action |
| --- | --- | --- |
| **1** | **MOVE** | Cursor follows hand movement |
| **2** | **SCROLL** | Scroll Up / Down |
| **3** | **CLICK** | Left Mouse Click |
| **4** | **IDLE** | No Action (Safety state) |
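The table above maps naturally onto a small dispatch step between the classifier and the mouse layer. The sketch below is an assumption-laden illustration: `Gesture`, `to_screen`, and `plan_action` are hypothetical names, the 1920x1080 screen size is a placeholder, and the rule that decides scroll direction is invented for the example.

```python
from enum import IntEnum


class Gesture(IntEnum):
    MOVE = 1
    SCROLL = 2
    CLICK = 3
    IDLE = 4


def to_screen(x_norm, y_norm, screen_w, screen_h):
    """Map normalized camera coordinates (0..1) to a pixel inside the screen."""
    x = min(max(x_norm, 0.0), 1.0) * (screen_w - 1)
    y = min(max(y_norm, 0.0), 1.0) * (screen_h - 1)
    return round(x), round(y)


def plan_action(gesture_id, hand_xy, screen=(1920, 1080)):
    """Return a (command, argument) pair describing what the mouse layer should do."""
    gesture = Gesture(gesture_id)
    if gesture is Gesture.MOVE:
        return "move", to_screen(*hand_xy, *screen)
    if gesture is Gesture.SCROLL:
        # Assumption: a hand in the upper half scrolls up, lower half scrolls down.
        return "scroll", 1 if hand_xy[1] < 0.5 else -1
    if gesture is Gesture.CLICK:
        return "click", "left"
    return "none", None  # IDLE: deliberately do nothing
```

Keeping this step free of side effects makes it easy to test without moving the real cursor. An execution layer could then translate the returned pair into PyAutoGUI calls such as `pyautogui.moveTo(x, y)`, `pyautogui.scroll(n)`, or `pyautogui.click()`.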

### ⌨️ Keyboard Controls

| Key | Function |
| --- | --- |
| `1` / `2` / `3` / `4` | **Train** the respective gesture (Hold to capture data) |
| `s` | **Save** trained model data locally |
| `r` | **Reset** / Clear current calibration |
| `ESC` | **Exit** the program |
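A camera loop typically reads one key code per frame (for example from OpenCV's `cv2.waitKey`, which returns `-1` when no key is pressed) and translates it into a command. The following sketch mirrors the key table above. The `handle_key` function and its command names are hypothetical, not taken from `main.py`.

```python
ESC = 27  # key code reported for the Escape key


def handle_key(key_code):
    """Translate a key code into a (command, gesture_id) pair."""
    if key_code < 0:  # no key was pressed during this frame
        return "none", None
    key_code &= 0xFF  # keep the low byte, as is common with cv2.waitKey
    if key_code == ESC:
        return "exit", None
    char = chr(key_code)
    if char in "1234":
        return "train", int(char)  # capture samples for gesture 1..4
    if char == "s":
        return "save", None
    if char == "r":
        return "reset", None
    return "none", None
```

Since training keys are meant to be held, a loop built on this would add one sample per frame for as long as `handle_key` keeps returning `("train", n)`.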

---

## 🚀 How to Run

### 1️⃣ Prerequisites

* Python 3.10+
* A working webcam

### 2️⃣ Installation

```bash
# Create a virtual environment (Windows, using the py launcher)
py -3.10 -m venv .venv
.venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

```

### 3️⃣ Execution

```bash
python main.py

```

---

## ⚠️ Safety & Warning

This project creates a virtual mouse interface. To ensure smooth movement, PyAutoGUI's fail-safe (which normally aborts the script when the cursor is pushed into a corner of the screen) is disabled:
`pyautogui.FAILSAFE = False`

**If the mouse behaves unexpectedly or gets stuck:**

1. Press **`ESC`** immediately to kill the script.
2. Or press **`Alt + Tab`** to switch windows.
3. Or close the **OpenCV window**.

> *Use with caution. This behavior is intentional for experimentation.*

---

## 📌 Notes

* **User Specific:** Trained data (`.pkl`) is specific to your hand and lighting conditions. It is not synced to Git.
* **OS:** Designed and tested on **Windows**.
* **Scope:** This is an experimental research project, not intended for production accessibility tools.
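The per-user `.pkl` data mentioned in the notes above could be persisted along these lines. This is a sketch under assumptions: the file name `gesture_data.pkl`, the `(features, label)` sample layout, and both helper functions are hypothetical and may differ from what the project stores.

```python
import pickle
import tempfile
from pathlib import Path


def save_samples(samples, path):
    """Write the collected (features, label) pairs to a pickle file."""
    with open(path, "wb") as fh:
        pickle.dump(samples, fh)


def load_samples(path):
    """Load saved pairs, or return an empty list if nothing has been saved yet."""
    path = Path(path)
    if not path.exists():
        return []
    with open(path, "rb") as fh:
        return pickle.load(fh)  # only unpickle files you created yourself


with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "gesture_data.pkl"  # hypothetical file name
    before = load_samples(model_path)
    save_samples([((0.1, 0.2), "MOVE"), ((0.9, 0.8), "CLICK")], model_path)
    after = load_samples(model_path)
```

Returning an empty list when no file exists lets a first-time user start straight in training mode, which fits the calibrate-per-user workflow described above.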

---

## 🔮 Future Improvements

* [ ] Visual overlays for gesture confidence.
* [ ] Dynamic sensitivity tuning via GUI.
* [ ] Comparative latency study against CNN models.

---


Shared for Educational & Research Purposes
