https://github.com/ammarlodhi255/image-captioning-system-to-assist-the-blind
An image captioning system that predicts and speaks out a caption for an image taken by a visually impaired user.
- Host: GitHub
- URL: https://github.com/ammarlodhi255/image-captioning-system-to-assist-the-blind
- Owner: ammarlodhi255
- Created: 2022-11-11T12:38:45.000Z (over 2 years ago)
- Default Branch: master
- Last Pushed: 2024-08-26T15:48:41.000Z (9 months ago)
- Last Synced: 2025-03-30T04:05:35.588Z (about 2 months ago)
- Topics: bidirectional-lstm, cnn, computer-vision, computer-vision-algorithms, css3, deep-learning, html5, image-captioning, javascript, javascript-es6, lstm-neural-network, ml-web, nlp, nlp-machine-learning, resnet, resnet-50, vgg16
- Language: Jupyter Notebook
- Homepage:
- Size: 10.7 MB
- Stars: 9
- Watchers: 1
- Forks: 7
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Image Captioning System to Assist The Blind
## Table of Contents
+ [About](#about)
+ [Getting Started](#getting_started)
+ [Screenshots](#screenshots)
+ [Dataset Split](#dataset_split)
+ [Model Anatomy](#model_anatomy)
+ [Project Workflow](#project_workflow)
+ [Results](#results)
+ [Final Outcome](#final_outcome)
+ [Additional Outputs](#additional_outputs)
+ [Contributing](#contributing)

## About

The goal of the project is to develop a deep-learning system that helps visually impaired individuals obtain information by describing the images they take. The system combines a CNN and an NLP model into a single image captioning pipeline: the CNN extracts image features, which the NLP model turns into a text sequence describing the image.
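The README does not spell out the exact layer configuration in text, but the description above maps onto the common "merge" encoder-decoder design. Below is a minimal Keras sketch of that wiring, assuming pre-extracted CNN features; the vocabulary size, caption length, and hidden widths are all assumed for illustration, not taken from the repo:

```python
from tensorflow.keras.layers import Input, Dense, Dropout, Embedding, LSTM, add
from tensorflow.keras.models import Model

# All sizes below are assumptions for illustration, not the repo's settings.
vocab_size = 8000    # caption vocabulary size (assumed)
max_length = 35      # longest caption in tokens (assumed)
feature_dim = 2048   # ResNet50 pooled feature size; VGG16/VGG19 fc layers give 4096

# Image branch: pre-extracted CNN features, compressed to the decoder width.
img_in = Input(shape=(feature_dim,))
img_vec = Dense(256, activation="relu")(Dropout(0.5)(img_in))

# Text branch: the partial caption so far, embedded and run through an LSTM.
seq_in = Input(shape=(max_length,))
seq_vec = LSTM(256)(Dropout(0.5)(Embedding(vocab_size, 256, mask_zero=True)(seq_in)))

# Merge both branches and predict the next word of the caption.
merged = Dense(256, activation="relu")(add([img_vec, seq_vec]))
out = Dense(vocab_size, activation="softmax")(merged)

model = Model(inputs=[img_in, seq_in], outputs=out)
model.compile(loss="categorical_crossentropy", optimizer="adam")
```

Wrapping the LSTM in `tensorflow.keras.layers.Bidirectional` would give the Bidirectional LSTM variant mentioned below.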
The project incorporates state-of-the-art pre-trained models, such as ResNet50, VGG16, and VGG19, for image feature extraction, and LSTM and Bidirectional LSTM networks for text generation. Several combinations were evaluated; the best-performing model achieved a BLEU score of 0.61 and was deployed with Flask and pyttsx3 to provide the web interface and text-to-speech output.
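Deployment details beyond "Flask and pyttsx3" are not given here, so the following is only a hypothetical sketch of how the two pieces fit together; the route name, the `generate_caption` helper, and the response shape are all assumptions:

```python
import pyttsx3
from flask import Flask, request

app = Flask(__name__)

def generate_caption(image_bytes: bytes) -> str:
    """Hypothetical stand-in for the trained captioning model."""
    return "a dog is running through the grass"

@app.route("/caption", methods=["POST"])  # route name is an assumption
def caption():
    image_bytes = request.files["image"].read()
    text = generate_caption(image_bytes)

    # pyttsx3 performs offline text-to-speech on the machine running Flask,
    # which suits the intended use: the app runs locally for the user.
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()

    return {"caption": text}
```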
## Getting Started

These instructions will get a copy of the project up and running on your local machine.
1. Clone the project repository from GitHub:
```bash
git clone https://github.com/ammarlodhi255/image-captioning-system-to-assist-the-blind.git
```

2. Navigate to the project directory:
```bash
cd image-captioning-system-to-assist-the-blind
```

3. Create a virtual environment for the project:
```bash
python3 -m venv env
```

4. Activate the virtual environment:
```bash
source env/bin/activate
```

5. Set the `FLASK_APP` environment variable:
```bash
export FLASK_APP=app.py
```

6. Run the Flask app:
```bash
flask run
```

### Dataset Split
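The split details in the original README appear to have been provided as figures rather than text. Purely as an illustration of the idea, with every number assumed:

```python
import random

# Every number here is assumed purely for illustration.
image_ids = [f"img_{i:05d}" for i in range(8000)]
random.seed(42)
random.shuffle(image_ids)

n = len(image_ids)
train_ids = image_ids[:int(0.8 * n)]               # 80% train (assumed)
val_ids   = image_ids[int(0.8 * n):int(0.9 * n)]   # 10% validation (assumed)
test_ids  = image_ids[int(0.9 * n):]               # 10% test (assumed)
```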
### Model Anatomy
### Project Workflow
### Results
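The result figures are not reproduced here, but the BLEU score of 0.61 quoted above is typically computed as a corpus-level metric. A minimal sketch using NLTK with toy tokenized captions; the repo's actual reference captions and n-gram weights are not stated:

```python
from nltk.translate.bleu_score import corpus_bleu

# Toy data: each test image has several human reference captions (tokenized),
# and the model produces one hypothesis caption per image.
references = [
    [["a", "dog", "runs", "through", "the", "grass"],
     ["a", "brown", "dog", "is", "running", "outside"]],
]
hypotheses = [
    ["a", "dog", "is", "running", "through", "the", "grass"],
]

# BLEU-1 (unigram precision only); which BLEU-n the reported 0.61
# corresponds to is not stated in the README.
score = corpus_bleu(references, hypotheses, weights=(1.0, 0.0, 0.0, 0.0))
print(f"BLEU-1: {score:.2f}")
```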
### Final Outcome





### Additional Outputs

## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request