https://github.com/andystmc/hand2num
A deep learning model that converts hand gestures into numerical values (1-5) using Convolutional Neural Networks for efficient and accurate recognition.
classification cnn computer-vision deep-learning hand-gestures machine-learning media-pipe numbers opencv real-time-prediction
- Host: GitHub
- URL: https://github.com/andystmc/hand2num
- Owner: AndysTMC
- Created: 2024-11-28T10:58:09.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2024-11-28T18:18:57.000Z (about 2 months ago)
- Last Synced: 2024-12-22T14:16:21.037Z (29 days ago)
- Topics: classification, cnn, computer-vision, deep-learning, hand-gestures, machine-learning, media-pipe, numbers, opencv, real-time-prediction
- Language: Jupyter Notebook
- Homepage:
- Size: 91.4 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Hand2Num
## Project Overview
This project is a real-time hand gesture recognition system that uses computer vision and deep learning technologies to classify hand gestures from webcam input. The system leverages MediaPipe for hand landmark detection and a custom Convolutional Neural Network (CNN) for gesture classification.
![Theme_Image_Transparent](https://github.com/user-attachments/assets/a34aa079-684e-4368-8e70-8a9d87b1cdcd)

## Key Features
- Real-time hand gesture recognition
- Uses MediaPipe for hand landmark detection
- Custom CNN model for gesture classification
- Developed and trained on Google Colab
- Supports multiple gesture categories

### Key Training Environment Features
- Direct Google Drive file access
- Compressed image dataset handling
- Automated model training and checkpointing
- GPU/TPU acceleration for faster computations

## Technologies Used
- **Development Platform**: Google Colab
- **Hardware Acceleration**: TPU
- **Computer Vision**: OpenCV (cv2)
- **Hand Tracking**: MediaPipe
- **Deep Learning**: TensorFlow/Keras
- **Programming Language**: Python

## Project Structure
### 1. Data Generation (`generate.py`)
- Captures hand landmark images using webcam
- Processes and saves landmark images for training
- Supports different hand configurations (left/right, normal/flipped)
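
The capture step described above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual `generate.py`: the function names, the 128-pixel canvas, and the output file naming are all assumptions. It uses OpenCV and MediaPipe's legacy `mp.solutions.hands` API, imported inside the capture function so the two small helpers work without a webcam.

```python
def to_pixels(landmarks, size):
    """Map normalized (x, y) landmarks in [0, 1] to pixel coordinates on a size x size canvas."""
    last = size - 1
    return [
        (min(last, max(0, int(x * size))), min(last, max(0, int(y * size))))
        for x, y in landmarks
    ]


def mirror(points, size):
    """Mirror pixel coordinates horizontally to produce a 'flipped' variant."""
    return [(size - 1 - x, y) for x, y in points]


def capture_landmark_images(out_dir, label, count=100, size=128):
    """Save `count` landmark-only images of one hand from the default webcam.

    Hypothetical helper: loops until `count` frames with a detected hand are saved.
    """
    import os
    import cv2
    import mediapipe as mp
    import numpy as np

    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(0)
    saved = 0
    with mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.5) as hands:
        while saved < count:
            ok, frame = cap.read()
            if not ok:
                break
            # MediaPipe expects RGB input; OpenCV delivers BGR frames.
            result = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if not result.multi_hand_landmarks:
                continue
            hand = result.multi_hand_landmarks[0]
            points = to_pixels([(p.x, p.y) for p in hand.landmark], size)
            # Draw only the hand skeleton on a black canvas.
            canvas = np.zeros((size, size), dtype=np.uint8)
            for a, b in mp.solutions.hands.HAND_CONNECTIONS:
                cv2.line(canvas, points[a], points[b], 255, 1)
            cv2.imwrite(os.path.join(out_dir, f"{label}_{saved:04d}.png"), canvas)
            saved += 1
    cap.release()
    return saved
```

A flipped variant of a sample can be produced by passing the pixel coordinates through `mirror` before drawing, which is one plausible way to obtain the normal/flipped configurations listed here.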
#### Preprocessed Images
##### One-Left-Normal | One-Right-Normal | One-Left-Flipped | One-Right-Flipped
##### Two-Left-Normal | Two-Right-Normal | Two-Left-Flipped | Two-Right-Flipped
##### Three-Left-Normal | Three-Right-Normal | Three-Left-Flipped | Three-Right-Flipped
##### Four-Left-Normal | Four-Right-Normal | Four-Left-Flipped | Four-Right-Flipped
##### Five-Left-Normal | Five-Right-Normal | Five-Left-Flipped | Five-Right-Flipped
### 2. Model Training (`Project_HGR.ipynb`)
- Prepares and preprocesses image dataset
- Builds a Convolutional Neural Network (CNN)
- Trains and validates the gesture recognition model
- Saves the best performing model
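
The training steps above might look something like the sketch below. It is not the notebook's actual architecture: the layer sizes, the 128x128 grayscale input, the `best_model.keras` file name, and all function names are assumptions chosen for illustration. The five output classes follow from the digits 1-5 the project targets. TensorFlow is imported lazily, and the `.keras` checkpoint format assumes a reasonably recent TensorFlow release.

```python
NUM_CLASSES = 5  # one class per digit gesture, 1 through 5


def digit_to_class(digit):
    """Map a digit label (1-5) to the zero-based class index used for training."""
    if not 1 <= digit <= NUM_CLASSES:
        raise ValueError(f"digit must be in 1..{NUM_CLASSES}, got {digit}")
    return digit - 1


def build_cnn(input_shape=(128, 128, 1)):
    """Build and compile a small CNN classifier (assumed architecture)."""
    import tensorflow as tf

    layers = tf.keras.layers
    model = tf.keras.Sequential([
        layers.Input(shape=input_shape),
        layers.Rescaling(1.0 / 255),            # scale pixel values to [0, 1]
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # integer class labels
        metrics=["accuracy"],
    )
    return model


def train(model, train_ds, val_ds, epochs=20, checkpoint_path="best_model.keras"):
    """Fit the model, keeping only the checkpoint with the best validation accuracy."""
    import tensorflow as tf

    checkpoint = tf.keras.callbacks.ModelCheckpoint(
        checkpoint_path, monitor="val_accuracy", save_best_only=True
    )
    return model.fit(
        train_ds, validation_data=val_ds, epochs=epochs, callbacks=[checkpoint]
    )
```

Here `train_ds` and `val_ds` stand for any batched datasets of `(image, class_index)` pairs, for example ones built with `tf.keras.utils.image_dataset_from_directory`.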
### Model Architecture

### Few Testing Results
### 3. Live Classification (`live_cam_test.py`)
- Loads pre-trained model
- Processes real-time webcam input
- Performs hand gesture recognition
- Displays prediction results
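
As a rough illustration of the loop described above (not the repository's actual `live_cam_test.py`), the sketch below loads a saved model, renders detected hand landmarks onto a blank canvas, and overlays the predicted digit on the webcam frame. The model path, the 128-pixel canvas, the window title, and the function names are assumptions; the preprocessing would have to match whatever was used during training.

```python
def decode_prediction(probs):
    """Return (digit, confidence) from a sequence of class probabilities for digits 1-5."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best + 1, float(probs[best])


def run_live(model_path="best_model.keras", size=128):
    """Classify hand gestures from the default webcam until 'q' is pressed."""
    import cv2
    import mediapipe as mp
    import numpy as np
    import tensorflow as tf

    model = tf.keras.models.load_model(model_path)
    cap = cv2.VideoCapture(0)
    last = size - 1
    with mp.solutions.hands.Hands(max_num_hands=1) as hands:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            result = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if result.multi_hand_landmarks:
                hand = result.multi_hand_landmarks[0]
                # Normalized landmarks -> clamped pixel coordinates on the canvas.
                pts = [
                    (min(last, max(0, int(p.x * size))), min(last, max(0, int(p.y * size))))
                    for p in hand.landmark
                ]
                canvas = np.zeros((size, size), dtype=np.uint8)
                for a, b in mp.solutions.hands.HAND_CONNECTIONS:
                    cv2.line(canvas, pts[a], pts[b], 255, 1)
                batch = canvas.reshape(1, size, size, 1).astype("float32")
                digit, conf = decode_prediction(model.predict(batch, verbose=0)[0])
                cv2.putText(frame, f"{digit} ({conf:.2f})", (10, 30),
                            cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
            cv2.imshow("Hand2Num", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    cap.release()
    cv2.destroyAllWindows()
```

Keeping `decode_prediction` separate from the capture loop makes the class-index-to-digit mapping easy to check without a camera or a trained model.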
### Some Real-time Testing Results
## Setup and Reproduction
### Prerequisites
- Google Account
- Google Colab access
- Prepared image dataset

### Steps to Reproduce
1. Open Google Colab
2. Create new notebook
3. Upload or link to required Python scripts
4. Mount Google Drive
5. Upload compressed image dataset
6. Run training notebook (`Project_HGR.ipynb`)

## Model Deployment
After training in Colab:
- Download the best performing model
- Use `live_cam_test.py` for real-time gesture recognition
- Ensure all dependencies are installed locally

## Potential Improvements
- Increase training dataset diversity
- Implement data augmentation
- Experiment with model architectures
- Add more gesture categories

## Limitations
- Requires good lighting conditions
- Performance depends on training data quality
- Currently supports a limited number of gesture categories