https://github.com/sovit-123/american-sign-language-detection-using-deep-learning
This project aims to detect American Sign Language using PyTorch and deep learning. The neural network can also detect the sign language letters in real-time from a webcam video feed.
- Host: GitHub
- URL: https://github.com/sovit-123/american-sign-language-detection-using-deep-learning
- Owner: sovit-123
- Created: 2020-05-02T04:18:49.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2020-05-06T10:09:52.000Z (about 5 years ago)
- Last Synced: 2025-04-11T00:13:40.487Z (about 1 month ago)
- Topics: american-sign-language-recognition, computer-vision, convolutional-neural-networks, deep-learning, deep-neural-networks, machine-learning, neural-networks, pytorch
- Language: Python
- Homepage:
- Size: 44.5 MB
- Stars: 6
- Watchers: 1
- Forks: 3
- Open Issues: 0
# American Sign Language Detection using Deep Learning
## About the Project
This project aims to detect American Sign Language (ASL) letters using deep learning. Real-time detection from a webcam feed is a major goal of the project and will be refined over time.
### Some Results



### Dataset Used
The dataset used can be found on [Kaggle](https://www.kaggle.com/grassknoted/asl-alphabet).
### Specific Packages
* PyTorch >= 1.4.
* Albumentations >= 0.4.3.
* Scikit-Learn >= 0.22.1.
## Using the Repository
* Download the zip file
OR
* Clone the repository using: `git clone https://github.com/sovit-123/American-Sign-Language-Detection-using-Deep-Learning.git`.
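Before running anything, it may help to confirm that the packages listed under "Specific Packages" meet the stated minimum versions. The snippet below is only a sketch and is not part of the repository: the PyPI distribution names (`torch`, `albumentations`, `scikit-learn`) are assumptions, and the helper names `version_tuple`, `meets_minimum`, and `check_environment` are hypothetical. It compares only the leading numeric part of each version string.

```python
import re
from importlib import metadata

# Minimum versions from the "Specific Packages" list.
# The distribution names used as keys are assumptions, not taken from the repo.
MINIMUM_VERSIONS = {
    "torch": "1.4",
    "albumentations": "0.4.3",
    "scikit-learn": "0.22.1",
}


def version_tuple(version):
    """Turn '1.4.0+cu92' into (1, 4, 0), ignoring any non-numeric suffix."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed, minimum):
    """Compare two version strings after padding them to the same length."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(requirements=MINIMUM_VERSIONS):
    """Return a {package: status} report without importing the packages."""
    report = {}
    for name, minimum in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        status = "ok" if meets_minimum(installed, minimum) else "below " + minimum
        report[name] = installed + " (" + status + ")"
    return report


if __name__ == "__main__":
    for package, status in check_environment().items():
        print(package + ": " + status)
```

Reading versions through `importlib.metadata` avoids importing PyTorch just to check its version, which keeps the check fast.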
## Directory Structure and Usage
```
├───input
│   ├───asl_alphabet_test
│   │   └───asl_alphabet_test
│   ├───asl_alphabet_train
│   │   └───asl_alphabet_train
│   │       ├───A
│   │       ├───B
│   │       ...
│   └───preprocessed_image
│       ├───A
│       ├───B
│       ...
├───outputs
└───src
    │   cam_test.py
    │   cnn_models.py
    │   create_csv.py
    │   preprocess_image.py
    │   test.py
    │   train.py
```
* Be sure to make a folder named `input` first. This is where all the image data will reside.
* The `input` folder contains the original data from the [Kaggle website](https://www.kaggle.com/grassknoted/asl-alphabet) as well as the preprocessed images that are used for training.
* `input/preprocessed_image` contains the resized images that are used for training. The original dataset has 87000 images in total. `input/preprocessed_image` may contain all 87000 images or a subset, depending on how many images were preprocessed. All images in this folder are used for training.
* The `outputs` folder contains the trained model (`model.pth`), the loss and accuracy plots, the predicted test images, and the saved webcam feed with the predicted output.
* The `src` folder contains the Python files:
  * `preprocess_image.py`: Preprocesses the number of images that you want to use for training.
  * `create_csv.py`: Creates a CSV file that maps the preprocessed image paths to their labels. All the images are read from disk during training.
  * `cnn_models.py`: Contains the custom convolutional neural network modules used during training. It can be extended with other modules. Keeping this file separate makes it easier to switch between models during training.
  * `train.py`: Trains the CNN model on the dataset.
  * `test.py`: Tests the model on the images in the `input/asl_alphabet_test/asl_alphabet_test` folder.
  * `cam_test.py`: Runs real-time webcam sign language detection (**the major aim of this project**).
### Using The Different Python Files (In Order)
* **Execute all the files in the terminal from within the `src` folder.**
* `preprocess_image.py`:
  `python preprocess_image.py --num-images 1200`
  `--num-images` is the number of images to preprocess for each category: `A` to `Z`, plus `del`, `nothing`, and `space`.
* `create_csv.py`:
  `python create_csv.py`
* `train.py`:
  `python train.py --epochs 10`
* `test.py`:
  `python test.py --img A_test.jpg`
* `cam_test.py`:
  `python cam_test.py`
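As a rough illustration of the `create_csv.py` step, the sketch below walks a folder with one sub-folder per class and writes each image path with its label to a CSV file. It is not the repository's actual code: the function names `build_rows` and `write_csv`, the column names `image_path` and `target`, the accepted file extensions, and the output path `../input/data.csv` are all assumptions made for the example.

```python
import csv
from pathlib import Path

# File extensions treated as images (an assumption for this sketch).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_rows(image_root):
    """Return (image_path, label) pairs, using each sub-folder name as the label.

    Assumes one sub-folder per class (A, B, ..., del, nothing, space),
    as in the `input/preprocessed_image` layout.
    """
    rows = []
    for class_dir in sorted(Path(image_root).iterdir()):
        if not class_dir.is_dir():
            continue
        for image_path in sorted(class_dir.iterdir()):
            if image_path.suffix.lower() in IMAGE_SUFFIXES:
                rows.append((image_path.as_posix(), class_dir.name))
    return rows


def write_csv(rows, csv_path):
    """Write a header line followed by one line per image."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["image_path", "target"])  # hypothetical column names
        writer.writerows(rows)


if __name__ == "__main__":
    root = Path("../input/preprocessed_image")
    if root.is_dir():
        rows = build_rows(root)
        write_csv(rows, "../input/data.csv")  # hypothetical output location
        print("Wrote", len(rows), "rows")
    else:
        print("Folder not found:", root)
```

Storing only paths and labels keeps the CSV small, which fits the note that images are read from disk during training.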
## References
* Kaggle dataset: https://www.kaggle.com/grassknoted/asl-alphabet
* [Changing the contrast and brightness of an image!](https://docs.opencv.org/3.4/d3/dc1/tutorial_basic_linear_transform.html)
* [Real-time American Sign Language Recognition with Convolutional Neural Networks](http://cs231n.stanford.edu/reports/2016/pdfs/214_Report.pdf), **Brandon Garcia et al.**