Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/dms-codes/hand-gesture
This project performs gesture recognition using a Convolutional Neural Network (CNN) model on a custom dataset of grayscale images. The dataset is structured in folders, each representing a unique gesture, and the project involves preprocessing the images, training a CNN model, and evaluating it with multiple metrics.
cnn handgesture-recognition python
- Host: GitHub
- URL: https://github.com/dms-codes/hand-gesture
- Owner: dms-codes
- Created: 2024-10-27T07:10:28.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2024-11-26T01:14:34.000Z (about 2 months ago)
- Last Synced: 2024-11-26T01:24:20.139Z (about 2 months ago)
- Topics: cnn, handgesture-recognition, python
- Language: Python
- Homepage: https://github.com/dms-codes/hand-gesture
- Size: 82 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Hand Gesture Recognition using Deep Learning Models
This project implements multiple deep learning models to recognize hand gestures using a dataset of grayscale images. The supported architectures include CNN, LeNet-5, AlexNet, and VGGNet. Each model is evaluated for its effectiveness, accuracy, and resource efficiency, allowing users to select a model suitable for their specific application.
---
## Features
- **Supports Multiple Architectures**: CNN, LeNet-5, AlexNet, and VGGNet.
- **Model Evaluation**: Generates confusion matrices, classification reports, and accuracy/loss graphs.
- **Custom Dataset Handling**: Creates datasets from labeled gesture images.
- **Model Persistence**: Saves and loads trained models for future use.
- **Plug-and-Play**: Easily switch between different architectures with minimal changes.

---
## Prerequisites
Before running the project, ensure you have the following dependencies installed:

- Python (>= 3.7)
- NumPy
- Pillow (PIL)
- Matplotlib
- Seaborn
- Scikit-learn
- TensorFlow/Keras
- pickle (standard Python library)

To install missing packages, run:
```bash
pip install numpy pillow matplotlib seaborn scikit-learn tensorflow
```

---
## File Structure
- **Input Data**: Gesture images should be stored in the `input/leapGestRecog` directory, organized into subdirectories representing gesture classes.
- **Model Files**: Saved models are stored with architecture-specific names (e.g., `gesture_model_cnn.keras`, `gesture_model_alexnet.keras`).
- **Reports and Plots**:
- Confusion matrices (`confusion_matrix_.png`)
- Training history graphs (`training_history_.png`)
- Classification reports (`report_model_.txt`)

---
## How to Use
### 1. Dataset Preparation
Place gesture images in the `input/leapGestRecog` directory. Ensure the folder structure is organized with subdirectories named after each gesture class.
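The loading logic lives in `main.py`; purely as an illustration, reading such a folder-per-class layout with Pillow and NumPy might look like the sketch below. The `IMG_SIZE` value and the flat class-folder layout are assumptions, not taken from the script.

```python
import os
import numpy as np
from PIL import Image

DATA_DIR = "input/leapGestRecog"  # directory described above
IMG_SIZE = (64, 64)               # assumed target size; the actual script may differ

def load_dataset(data_dir=DATA_DIR):
    """Walk one subdirectory per gesture class and return image/label arrays."""
    images, labels = [], []
    class_names = sorted(
        d for d in os.listdir(data_dir) if os.path.isdir(os.path.join(data_dir, d))
    )
    for label, class_name in enumerate(class_names):
        class_dir = os.path.join(data_dir, class_name)
        for fname in os.listdir(class_dir):
            if not fname.lower().endswith((".png", ".jpg", ".jpeg")):
                continue
            img = Image.open(os.path.join(class_dir, fname)).convert("L")  # force grayscale
            img = img.resize(IMG_SIZE)
            images.append(np.asarray(img, dtype=np.float32) / 255.0)       # scale to [0, 1]
            labels.append(label)
    # Add a channel axis so the array matches the (H, W, 1) input shape Keras expects.
    return np.expand_dims(np.array(images), -1), np.array(labels), class_names
```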
### 2. Running the Program
Execute the script by running:
```bash
python main.py
```

### 3. Model Selection
To switch between models, update the `MODEL_TYPE` variable in the `__main__` block:
```python
MODEL_TYPE = MODEL_TYPE_CNN # For CNN
MODEL_TYPE = MODEL_TYPE_LENET5 # For LeNet-5
MODEL_TYPE = MODEL_TYPE_ALEXNET # For AlexNet
MODEL_TYPE = MODEL_TYPE_VGGNET # For VGGNet
```
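The README does not show how these constants map to architectures, but a common pattern is a small registry keyed by model type. A hypothetical sketch, with a trivial placeholder builder standing in for the real architecture functions in `main.py`:

```python
from tensorflow import keras

# Constants mirroring the names used above; the string values are assumptions.
MODEL_TYPE_CNN = "cnn"
MODEL_TYPE_LENET5 = "lenet5"
MODEL_TYPE_ALEXNET = "alexnet"
MODEL_TYPE_VGGNET = "vggnet"

def build_placeholder(input_shape, num_classes):
    """Stand-in builder; main.py defines one builder per architecture."""
    return keras.Sequential([
        keras.layers.Input(shape=input_shape),
        keras.layers.Flatten(),
        keras.layers.Dense(num_classes, activation="softmax"),
    ])

# One entry per architecture; all point at the placeholder in this sketch.
BUILDERS = {name: build_placeholder for name in
            (MODEL_TYPE_CNN, MODEL_TYPE_LENET5, MODEL_TYPE_ALEXNET, MODEL_TYPE_VGGNET)}

MODEL_TYPE = MODEL_TYPE_CNN
model = BUILDERS[MODEL_TYPE](input_shape=(64, 64, 1), num_classes=10)
save_path = f"gesture_model_{MODEL_TYPE}.keras"  # matches the naming under "File Structure"
```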
### 4. Outputs
- **Training History**: Graphs of accuracy and loss for both training and validation phases.
- **Confusion Matrix**: Visual representation of model predictions.
- **Classification Report**: Detailed precision, recall, and F1-score metrics.
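As a rough illustration of how outputs like these are typically produced with scikit-learn, Seaborn, and Matplotlib (the variable names, dummy data, and file names below are illustrative, not taken from `main.py`):

```python
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, classification_report

# Dummy labels/predictions for illustration; in the project these come from the test split.
y_true = np.array([0, 1, 2, 1, 0])
y_pred = np.array([0, 1, 2, 0, 0])

# Confusion matrix heatmap, saved as a PNG like the files listed above.
cm = confusion_matrix(y_true, y_pred)
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues")
plt.xlabel("Predicted label")
plt.ylabel("True label")
plt.savefig("confusion_matrix_example.png")

# Per-class precision, recall, and F1-score as a plain-text report.
with open("report_model_example.txt", "w") as f:
    f.write(classification_report(y_true, y_pred))
```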
### 5. Resuming from a Saved Model
If a saved model exists for the selected architecture, it will be automatically loaded. Otherwise, a new model will be trained, evaluated, and saved.
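A minimal sketch of that load-or-train flow using Keras's standard `load_model`/`save` calls; the path and the placeholder network are illustrative, not the project's actual code.

```python
import os
from tensorflow import keras

MODEL_PATH = "gesture_model_cnn.keras"  # illustrative; the script derives this from MODEL_TYPE

if os.path.exists(MODEL_PATH):
    # Reuse the previously trained model instead of retraining from scratch.
    model = keras.models.load_model(MODEL_PATH)
else:
    # Placeholder build/compile; main.py constructs the selected architecture here.
    model = keras.Sequential([
        keras.layers.Input(shape=(64, 64, 1)),
        keras.layers.Flatten(),
        keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(x_train, y_train, validation_split=0.2, epochs=10)  # train on the dataset
    model.save(MODEL_PATH)
```

---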
## Model Architectures
### 1. **Convolutional Neural Network (CNN)**
- Lightweight and fast to train.
- Ideal for real-time applications with near-perfect accuracy.

### 2. **LeNet-5**
- A classic architecture with low computational cost.
- Suited for small datasets or systems with limited resources.

### 3. **AlexNet**
- High accuracy but computationally expensive.
- Best for applications requiring precision over speed.

### 4. **VGGNet**
- Highly accurate with deep layers.
- Resource-intensive, suitable for high-performance systems.
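The exact layer configurations live in `main.py`; purely as an illustration, a lightweight CNN of the kind described above could be expressed in Keras like this (filter counts, input size, and class count are assumptions):

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_small_cnn(input_shape=(64, 64, 1), num_classes=10):
    """Small CNN for grayscale gesture images; all sizes here are illustrative."""
    return keras.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ])

model = build_small_cnn()
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

---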
## Results
The project evaluates each model based on:
- **Test Accuracy**: Overall model performance on unseen data.
- **Misclassifications**: Insights from confusion matrices.
- **Training History**: Visual trends of model learning.

Example outputs:
- Training history (`training_history_.png`)
- Confusion matrix (`confusion_matrix_.png`)
- Evaluation report (`report_model_.txt`)

---
## Future Enhancements
- **Data Augmentation**: Introduce techniques like flipping, rotation, and scaling to improve generalization (see the sketch after this list).
- **Real-World Testing**: Deploy the model on embedded systems (e.g., Raspberry Pi) for gesture recognition in real-time.
- **Additional Models**: Explore architectures like ResNet or MobileNet for better performance.
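A hedged sketch of how that augmentation could be added with Keras preprocessing layers (recent TensorFlow assumed; the transforms and ranges are illustrative choices, not project settings):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Augmentation block applied only at training time; placed in front of any of the models.
augmentation = keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),  # rotate by up to ±10% of a full turn
    layers.RandomZoom(0.1),
])

augmented_model = keras.Sequential([
    layers.Input(shape=(64, 64, 1)),
    augmentation,
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])
```

---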
## License
This project is distributed under the MIT License.

---
**Author**: Donny Marthen Sitompul
**Course**: Artificial Intelligence for Engineers (DAT305)