https://github.com/ammarlodhi255/image-captioning-system-to-assist-the-blind
An image captioning system that predicts and speaks out a caption for an image taken by a visually impaired user.
- Host: GitHub
- URL: https://github.com/ammarlodhi255/image-captioning-system-to-assist-the-blind
- Owner: ammarlodhi255
- Created: 2022-11-11T12:38:45.000Z (over 2 years ago)
- Default Branch: master
- Last Pushed: 2024-08-26T15:48:41.000Z (9 months ago)
- Last Synced: 2025-03-30T04:05:35.588Z (about 2 months ago)
- Topics: bidirectional-lstm, cnn, computer-vision, computer-vision-algorithms, css3, deep-learning, html5, image-captioning, javascript, javascript-es6, lstm-neural-network, ml-web, nlp, nlp-machine-learning, resnet, resnet-50, vgg16
- Language: Jupyter Notebook
- Homepage:
- Size: 10.7 MB
- Stars: 9
- Watchers: 1
- Forks: 7
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Image Captioning System to Assist The Blind
## Table of Contents
+ [About](#about)
+ [Getting Started](#getting_started)
+ [Screenshots](#screenshots)
+ [Dataset Split](#dataset_split)
+ [Model Anatomy](#model_anatomy)
+ [Project Workflow](#project_workflow)
+ [Results](#results)
+ [Final Outcome](#final_outcome)
+ [Additional Outputs](#additional_outputs)
+ [Contributing](#contributing)

## About

The goal of the project is to develop a deep-learning system that helps visually impaired individuals obtain information by describing the images they take. The system combines a CNN and an NLP model into a single image captioning pipeline: the CNN extracts image features, which the NLP model turns into a text sequence describing the image.
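The README does not spell out the exact layer configuration in text, but the description above maps onto the common "merge" encoder-decoder design. Below is a minimal Keras sketch of that wiring, assuming pre-extracted CNN features; the vocabulary size, caption length, and hidden widths are all assumed for illustration, not taken from the repo:

```python
from tensorflow.keras.layers import Input, Dense, Dropout, Embedding, LSTM, add
from tensorflow.keras.models import Model

# All sizes below are assumptions for illustration, not the repo's settings.
vocab_size = 8000    # caption vocabulary size (assumed)
max_length = 35      # longest caption in tokens (assumed)
feature_dim = 2048   # ResNet50 pooled feature size; VGG16/VGG19 fc layers give 4096

# Image branch: pre-extracted CNN features, compressed to the decoder width.
img_in = Input(shape=(feature_dim,))
img_vec = Dense(256, activation="relu")(Dropout(0.5)(img_in))

# Text branch: the partial caption so far, embedded and run through an LSTM.
seq_in = Input(shape=(max_length,))
seq_vec = LSTM(256)(Dropout(0.5)(Embedding(vocab_size, 256, mask_zero=True)(seq_in)))

# Merge both branches and predict the next word of the caption.
merged = Dense(256, activation="relu")(add([img_vec, seq_vec]))
out = Dense(vocab_size, activation="softmax")(merged)

model = Model(inputs=[img_in, seq_in], outputs=out)
model.compile(loss="categorical_crossentropy", optimizer="adam")
```

Wrapping the LSTM in `tensorflow.keras.layers.Bidirectional` would give the Bidirectional LSTM variant mentioned below.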
The project incorporates state-of-the-art pre-trained models, such as ResNet50, VGG16, and VGG19, for image feature extraction, and LSTM and Bidirectional LSTM networks for text generation. Several combinations were evaluated; the best-performing model achieved a BLEU score of 0.61 and was deployed with Flask and pyttsx3 to provide the web interface and text-to-speech output.
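Deployment details beyond "Flask and pyttsx3" are not given here, so the following is only a hypothetical sketch of how the two pieces fit together; the route name, the `generate_caption` helper, and the response shape are all assumptions:

```python
import pyttsx3
from flask import Flask, request

app = Flask(__name__)

def generate_caption(image_bytes: bytes) -> str:
    """Hypothetical stand-in for the trained captioning model."""
    return "a dog is running through the grass"

@app.route("/caption", methods=["POST"])  # route name is an assumption
def caption():
    image_bytes = request.files["image"].read()
    text = generate_caption(image_bytes)

    # pyttsx3 performs offline text-to-speech on the machine running Flask,
    # which suits the intended use: the app runs locally for the user.
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()

    return {"caption": text}
```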
## Getting Started

These instructions will get a copy of the project up and running on your local machine.
1. Clone the project repository from GitHub:
```bash
git clone https://github.com/ammarlodhi255/image-captioning-system-to-assist-the-blind.git
```

2. Navigate to the project directory:
```bash
cd image-captioning-system-to-assist-the-blind
```

3. Create a virtual environment for the project:
```bash
python3 -m venv env
```

4. Activate the virtual environment:
```bash
source env/bin/activate
```

5. Set the `FLASK_APP` environment variable:
```bash
export FLASK_APP=app.py
```

6. Run the Flask app:
```bash
flask run
```

### Dataset Split
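The split details in the original README appear to have been provided as figures rather than text. Purely as an illustration of the idea, with every number assumed:

```python
import random

# Every number here is assumed purely for illustration.
image_ids = [f"img_{i:05d}" for i in range(8000)]
random.seed(42)
random.shuffle(image_ids)

n = len(image_ids)
train_ids = image_ids[:int(0.8 * n)]               # 80% train (assumed)
val_ids   = image_ids[int(0.8 * n):int(0.9 * n)]   # 10% validation (assumed)
test_ids  = image_ids[int(0.9 * n):]               # 10% test (assumed)
```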
### Model Anatomy
### Project Workflow
### Results
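The result figures are not reproduced here, but the BLEU score of 0.61 quoted above is typically computed as a corpus-level metric. A minimal sketch using NLTK with toy tokenized captions; the repo's actual reference captions and n-gram weights are not stated:

```python
from nltk.translate.bleu_score import corpus_bleu

# Toy data: each test image has several human reference captions (tokenized),
# and the model produces one hypothesis caption per image.
references = [
    [["a", "dog", "runs", "through", "the", "grass"],
     ["a", "brown", "dog", "is", "running", "outside"]],
]
hypotheses = [
    ["a", "dog", "is", "running", "through", "the", "grass"],
]

# BLEU-1 (unigram precision only); which BLEU-n the reported 0.61
# corresponds to is not stated in the README.
score = corpus_bleu(references, hypotheses, weights=(1.0, 0.0, 0.0, 0.0))
print(f"BLEU-1: {score:.2f}")
```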
### Final Outcome





### Additional Outputs

## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request