# Image Captioning System to Assist The Blind

## Table of Contents

+ [About](#about)
+ [Getting Started](#getting-started)
+ [Screenshots](#screenshots)
+ [Dataset Split](#dataset-split)
+ [Model Anatomy](#model-anatomy)
+ [Project Workflow](#project-workflow)
+ [Results](#results)
+ [Final Outcome](#final-outcome)
+ [Additional Outputs](#additional-outputs)
+ [Contributing](#contributing)

## About

The goal of this project is to develop a deep-learning system that assists visually impaired individuals by describing images they take. The system combines a CNN-based image encoder with an NLP decoder into a single image-captioning pipeline: the CNN extracts visual features, and the decoder generates a text sequence describing the image.
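The text-generation half of such a pipeline is a simple autoregressive loop: starting from a start token, the decoder repeatedly predicts the most likely next word until it emits an end token or hits a length limit. A minimal sketch of that loop (the function and token names here are illustrative assumptions, not the repo's actual API):

```python
# Greedy caption decoding sketch (illustrative; `predict_next` and the
# <start>/<end> token names are assumptions, not the repo's actual API).
def generate_caption(predict_next, word_to_id, id_to_word, max_len=20):
    """Build a caption one token at a time until <end> or max_len."""
    seq = [word_to_id["<start>"]]
    for _ in range(max_len):
        next_id = predict_next(seq)          # model's most likely next token
        if id_to_word[next_id] == "<end>":
            break
        seq.append(next_id)
    return " ".join(id_to_word[i] for i in seq[1:])  # drop <start>
```

In the real system, `predict_next` would run the LSTM decoder on the extracted image features together with the tokens generated so far.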

The project incorporates state-of-the-art pre-trained models (ResNet50, VGG16, and VGG19) for image feature extraction, and LSTM and Bidirectional LSTM networks for text generation. Several model combinations were evaluated; the best-performing model achieved a BLEU score of 0.61 and was deployed with Flask (web interface) and pyttsx3 (text-to-speech).

## Getting Started

These instructions will get you a copy of the project up and running on your local machine.

1. Clone the project repository from GitHub:

```bash
git clone https://github.com/ammarlodhi255/image-captioning-system-to-assist-the-blind.git
```

2. Navigate to the project directory:

```bash
cd image-captioning-system-to-assist-the-blind
```

3. Create a virtual environment for the project:

```bash
python3 -m venv env
```

4. Activate the virtual environment (on Windows, use `env\Scripts\activate` instead):

```bash
source env/bin/activate
```

5. Point Flask at the application (on Windows, use `set FLASK_APP=app.py` instead; on Flask 2.2+ you can skip this step and run `flask --app app run`):

```bash
export FLASK_APP=app.py
```

6. Run the Flask app:

```bash
flask run
```
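Once the app is running, generated captions are spoken aloud. The text-to-speech side can be sketched with pyttsx3 as follows (a minimal sketch assuming pyttsx3 is installed; the repo's actual wiring may differ):

```python
caption = "A dog runs through the grass."  # stand-in for a generated caption

try:
    import pyttsx3
    engine = pyttsx3.init()   # picks the platform's default speech driver
    engine.say(caption)
    engine.runAndWait()       # blocks until speech finishes
except Exception:
    # No TTS backend available (e.g. a headless machine): fall back to text.
    print("caption:", caption)
```

`pyttsx3` works offline, which matters for an assistive app that must not depend on network access to speak.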

## Screenshots

### Dataset Split
![Dataset Design](/screenshots/Dataset-design.png)

### Model Anatomy
![Model Anatomy](/screenshots/Model-Anatomy.png)

### Project Workflow
![Project Workflow](/screenshots/Project-workflow.png)

### Results
![Results Table](/screenshots/Results.png)

### Final Outcome
![Home Interface](/screenshots/homeint.png)
![Browse](/screenshots/browse.png)
![Selected](/screenshots/selected.png)
![Choice](/screenshots/choice.png)
![Generating](/screenshots/generating.png)
![Generated](/screenshots/generated.png)

### Additional Outputs
![Output1](/screenshots/output1.png)
![Output2](/screenshots/output2.png)

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request