Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/mariam-zaidi/image_caption_generator-eye_for_blind
Image to text to speech generation
- Host: GitHub
- URL: https://github.com/mariam-zaidi/image_caption_generator-eye_for_blind
- Owner: Mariam-Zaidi
- Created: 2024-09-02T08:22:50.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2024-09-13T16:23:16.000Z (about 2 months ago)
- Last Synced: 2024-10-13T00:41:40.623Z (26 days ago)
- Topics: attention-mechanism, cnn-rnn, encoder-decoder, keras, tensorflow
- Language: Jupyter Notebook
- Homepage:
- Size: 1.95 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Image caption generator | Eye-for-blind
#### Image to text to speech generation

## Brief description
The goal is to build a deep learning model that describes the contents of an image as speech, by generating a caption with an attention mechanism. The model is trained on the Flickr8K dataset. It is intended for blind and visually impaired users, who can understand an image by listening to its description. The caption generated by the CNN-RNN model is converted to speech using a text-to-speech library.
A CNN-based encoder extracts features from the image, and an RNN-based decoder turns those features into a caption.
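To illustrate how an RNN decoder can produce a caption from encoder features one word at a time, here is a minimal sketch of greedy decoding. It is not taken from the notebook: `greedy_decode`, `toy_step`, and `VOCAB` are hypothetical names, `toy_step` is a stand-in for a trained decoder step, and the README does not say which decoding strategy the project actually uses.

```python
# Hypothetical toy vocabulary; a real tokenizer would be built from Flickr8K captions.
VOCAB = ["<start>", "a", "dog", "runs", "<end>"]


def greedy_decode(step_fn, features, start_id, end_id, max_len=20):
    """Generate token ids one at a time, always picking the most likely next token."""
    tokens = [start_id]
    state = None  # decoder hidden state, threaded through each step
    for _ in range(max_len):
        probs, state = step_fn(features, tokens[-1], state)
        next_id = max(range(len(probs)), key=probs.__getitem__)  # argmax
        if next_id == end_id:
            break
        tokens.append(next_id)
    return tokens[1:]  # drop the start token


def toy_step(features, prev_id, state):
    """Stand-in for one decoder step (assumption): always predicts the next vocabulary entry."""
    probs = [0.0] * len(VOCAB)
    probs[min(prev_id + 1, len(VOCAB) - 1)] = 1.0
    return probs, state


caption_ids = greedy_decode(toy_step, features=None, start_id=0, end_id=4)
caption = " ".join(VOCAB[i] for i in caption_ids)
print(caption)
```

In a real pipeline, `step_fn` would wrap the trained decoder, and the resulting caption string would then be passed to the text-to-speech library.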
The project is an extended application of the paper "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention".
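The core idea of soft visual attention in that paper is that, at each decoding step, the model scores every image region against the current decoder state, normalises the scores with a softmax, and uses the weighted sum of region features as context. The following is a minimal pure-Python sketch of that computation. The additive scoring form and all names (`soft_attention`, `w_feat`, `w_hid`, `v`) are assumptions for illustration; the README does not show the notebook's actual Keras layers.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def soft_attention(features, hidden, w_feat, w_hid, v):
    """Additive soft attention (assumed scoring form).

    features: list of region feature vectors from the CNN encoder
    hidden:   current decoder hidden state
    w_feat, w_hid, v: learned parameters in a real model, fixed toy values here
    Returns (context_vector, attention_weights).
    """
    projected_hidden = matvec(w_hid, hidden)
    scores = []
    for region in features:
        projected_region = matvec(w_feat, region)
        activation = [math.tanh(a + b)
                      for a, b in zip(projected_region, projected_hidden)]
        scores.append(sum(vi * ai for vi, ai in zip(v, activation)))
    weights = softmax(scores)  # one weight per region, summing to 1
    dim = len(features[0])
    context = [sum(w * region[d] for w, region in zip(weights, features))
               for d in range(dim)]
    return context, weights


# Toy example: three image regions with 2-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context, weights = soft_attention(
    features,
    hidden=[0.5],
    w_feat=[[1.0, 0.0], [0.0, 1.0]],
    w_hid=[[0.1], [0.1]],
    v=[1.0, 1.0],
)
```

The attention weights can also be reshaped onto the image grid to visualise which regions the model attended to for each generated word.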