Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/brianlesko/fast-binary-image-classification
Classify images in real time. Retrain this CNN with your own dataset. For the binary classification problem. Developed for detecting thumbs up or thumbs down.
https://github.com/brianlesko/fast-binary-image-classification
binary-image-classification convolutional-neural-networks image-classification image-recognition neural-network
Last synced: about 2 months ago
JSON representation
Classify images in real time. Retrain this CNN with your own dataset. For the binary classification problem. Developed for detecting thumbs up or thumbs down.
- Host: GitHub
- URL: https://github.com/brianlesko/fast-binary-image-classification
- Owner: BrianLesko
- Created: 2024-06-16T17:30:25.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2024-08-25T16:02:56.000Z (4 months ago)
- Last Synced: 2024-08-26T17:08:16.516Z (4 months ago)
- Topics: binary-image-classification, convolutional-neural-networks, image-classification, image-recognition, neural-network
- Language: Python
- Homepage:
- Size: 76.3 MB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: readme.md
Awesome Lists containing this project
README
# Real Time Image Classifier: Thumbs up vs down
Written in pure python, this repository was used to [train](https://github.com/BrianLesko/fast-binary-image-classification/blob/main/train.py) and [deploy](https://github.com/BrianLesko/fast-binary-image-classification/blob/main/deploy.py) a binary image classifier engineered to distinguish between "thumbs up" and "thumbs down" hand gestures.
Achieving 95% accuracy on the test set of 2,500 images, training completed in just over 26 minutes. The project includes three main Python scripts: train.py, simple_test.py, and test.py. In total, less than 300 lines of code.
Leveraging the VGG16 convolutional neural network with pretrained weights, the model boasts rapid training capabilities, suited for deployment on personal computers for custom use cases. This model debuted in [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556) in 2014. Newer models are more compute efficient; therefor, this repository is not suited to large scale deployment and is better used as a learning resourse or local deployment - where it is capable of competent accuracy (95% test accuracy in this demonstration) at the tradeoff of inflated compute cost.
## Training Summary
The model was engineered using a dataset of 2,500 images divided into two categories: "like" and "dislike". The training was executed over seven epochs and took 26.15 minutes, achieving a final test accuracy of 95.28%. The accuracy increased logarithmically, while the loss decreased from 0.7721 to 0.1551 across epochs.
This dataset was sourced from the "like" and "dislike" classes of the Hagrid hand gesture dataset. Model weights are stored in model.pth.
If you plan to train your own model, ensure to download and organize your dataset similarly.
The model weights for the thumbs up and down implementation are saved in model.pth.
## Dataset useage
The classifier was trained to recognize hand gestures performed between 0.5 and 4 meters from the camera. The dataset, consisting of 2,500 images, was sampled from the "like" and "dislike" classes of the [Hagrid Dataset](https://github.com/hukenovs/hagrid).
## Simple Validation
The simple_test.py script conducts random image classifications to validate the trained model, which consistently achieved an accuracy rate above 80% across 20 separate runs.
## Deployment
The repository includes a web app that integrates with either a laptop webcam or USB camera device. On an M3 MacBook, the classifier operates at approximately 12 classifications per second, with the entire deployment script engineered in fewer than 100 lines of Python code.
To run the deployment code, follow the usage instructions below.
## Usage
To deploy the machine learning model locally, execute the following commands:
```
python3 -m venv my_env
source my_env/bin/activate # or on windows: source my_env\Scripts\activate
pip install -r requirements.txt
streamlit run test.py 8000
```to stop the app, go back to the terminal and press control C
This will start the local Streamlit server, and you can access the chatbot by opening a web browser and navigating to `http://localhost:8000`.
## Dependencies
`streamlit` - for the web interface
`pytorch` - to download the Machine learning model and run it, torch and torchvision
`opencv` - to fetch the camera feed from your device
`pillow` - python image library for converting from opencv to torchvision format
`numpy` - for numerical arrays
`scikit-learn` - to split the training and testing images## Transfer learning
To take advantage of transfer learning, a popular method machine learning tool, the last few layers of a pretrained neural network are modified. This allows a neural network trained for one purpose to be reused on another machine learning task. This cuts down the training time signifcatly.
All that needs to be done is to form the output matrix into the shape appropriate for your ML task, and finetune the weights in your classifer layers.
In this case, the classifier portion of the VGG-16 convolutional neural network was modified in python using the pytorch machine learning library.
```
# Updating the classifier with the correct input size
input_features = model.classifier[0].in_features
model.classifier = nn.Sequential(
nn.Linear(input_features, 256),
nn.ReLU(),
nn.Dropout(p=0.6),
nn.Linear(256, 1),
nn.Sigmoid()
)
```ReLU: Adds non-linearity, helping the model learn complex patterns.
Dropout (60%): Minimizes overfitting during neural network training.
Sigmoid: Ensures output values are between 0 and 1, perfect for engineering binary classification.## Motivation
This project is intended as a review of machine learning and AI concepts, while expanding on my software engineering skills in web app deployment that I have developed since I received my masters in Engineering from Ohio State University.
Next, I intend to use transfer learning and the softmax for a multi-class classification neural network and integrate additional functionality into the deployed machine learning model and user interface.
If you find this project useful, please consider giving it a :star:
╭━━╮╭━━━┳━━┳━━━┳━╮╱╭╮ ╭╮╱╱╭━━━┳━━━┳╮╭━┳━━━╮
┃╭╮┃┃╭━╮┣┫┣┫╭━╮┃┃╰╮┃┃ ┃┃╱╱┃╭━━┫╭━╮┃┃┃╭┫╭━╮┃
┃╰╯╰┫╰━╯┃┃┃┃┃╱┃┃╭╮╰╯┃ ┃┃╱╱┃╰━━┫╰━━┫╰╯╯┃┃╱┃┃
┃╭━╮┃╭╮╭╯┃┃┃╰━╯┃┃╰╮┃┃ ┃┃╱╭┫╭━━┻━━╮┃╭╮┃┃┃╱┃┃
┃╰━╯┃┃┃╰┳┫┣┫╭━╮┃┃╱┃┃┃ ┃╰━╯┃╰━━┫╰━╯┃┃┃╰┫╰━╯┃
╰━━━┻╯╰━┻━━┻╯╱╰┻╯╱╰━╯ ╰━━━┻━━━┻━━━┻╯╰━┻━━━╯
follow all of these for a cookie :)