https://github.com/tanyachutani/image-captioning-generator
Image Captioning Generator Keras
beam-search bleu-score caption-generation cnn cnn-rnn flicker8k-dataset greedy-search image-captioning inceptionv3 keras rnn tensorflow
- Host: GitHub
- URL: https://github.com/tanyachutani/image-captioning-generator
- Owner: TanyaChutani
- Created: 2020-04-12T23:35:52.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2020-04-30T03:57:47.000Z (about 6 years ago)
- Last Synced: 2025-01-10T11:59:28.127Z (over 1 year ago)
- Topics: beam-search, bleu-score, caption-generation, cnn, cnn-rnn, flicker8k-dataset, greedy-search, image-captioning, inceptionv3, keras, rnn, tensorflow
- Language: Jupyter Notebook
- Size: 156 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# ImageCaptionGenerator
Image Captioning Generator Keras
## Data
Dataset - Flickr 8k Dataset
[Flicker8k_Dataset](https://github.com/jbrownlee/Datasets/releases/download/Flickr8k/Flickr8k_Dataset.zip)
[Flickr8k_text](https://github.com/jbrownlee/Datasets/releases/download/Flickr8k/Flickr8k_text.zip)
Flicker8k_Dataset - Contains 8092 images in JPEG format.
Flickr8k_text - Contains 5 descriptions (captions) for each image.
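
As a rough illustration of how the five descriptions per image can be grouped, here is a minimal sketch. It assumes the caption file stores one caption per line as `<image name>#<index>`, a tab, then the caption text; the helper name `load_captions` and the sample lines are hypothetical and are not taken from this repository.

```python
from collections import defaultdict


def load_captions(token_text):
    """Group captions by image file name.

    Assumed line layout: '<image>.jpg#<n>', a tab, then the caption.
    """
    captions = defaultdict(list)
    for line in token_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        image_id, caption = line.split("\t", 1)
        image_name = image_id.split("#", 1)[0]  # drop the '#<n>' suffix
        captions[image_name].append(caption.strip())
    return dict(captions)


# Made-up sample lines in the assumed layout (not real dataset entries).
sample = (
    "img_001.jpg#0\tA dog runs across the grass .\n"
    "img_001.jpg#1\tA brown dog is running outside .\n"
    "img_002.jpg#0\tTwo children play on a beach .\n"
)
captions = load_captions(sample)
```

With the real caption file, the same function could be called on the file's contents, and each image name would be expected to map to five captions.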
## Model
Built and trained a deep learning model for captioning real-world images.
- Used a pre-trained InceptionV3 network to extract features from each image.
- Used pre-trained fastText embeddings for the caption words; these were fed into a stacked bidirectional GRU layer.
- The image features and the text encoding were combined to predict the next word, repeating until the end of the caption; greedy search was used at test time.
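
The greedy search step described above can be sketched as follows. This is a minimal illustration, not the notebook's code: `toy_model` is a hypothetical stand-in for the trained network, and the `startseq` / `endseq` marker tokens and the `max_len` limit are assumptions.

```python
def greedy_caption(next_word_probs, image_features, max_len=20,
                   start="startseq", end="endseq"):
    """Build a caption by always taking the most probable next word."""
    words = [start]
    for _ in range(max_len):
        # The model returns {word: probability} for the next position.
        probs = next_word_probs(image_features, words)
        best = max(probs, key=probs.get)
        if best == end:
            break
        words.append(best)
    return " ".join(words[1:])  # drop the start marker


def toy_model(image_features, words):
    """Hypothetical stand-in: looks only at the last word, ignores the image."""
    table = {
        "startseq": {"a": 0.7, "the": 0.3},
        "a": {"dog": 0.6, "cat": 0.4},
        "the": {"dog": 0.9, "cat": 0.1},
        "dog": {"runs": 0.8, "endseq": 0.2},
        "cat": {"endseq": 1.0},
        "runs": {"endseq": 1.0},
    }
    return table[words[-1]]


caption = greedy_caption(toy_model, image_features=None)
print(caption)
```

In a real setup, `next_word_probs` would wrap the trained model's prediction call and map the output vector back to vocabulary words.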
## Result
#### Weights
## To Do
- Add attention
- Use beam search instead of greedy search
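
One possible shape for the beam search item is sketched below. It is only an illustration under assumptions: `toy_model` is a hypothetical stand-in for the trained network, the `startseq` / `endseq` markers and the default `beam_width` are invented for the example, and no length normalization is applied.

```python
import math


def beam_search_caption(next_word_probs, image_features, beam_width=3,
                        max_len=20, start="startseq", end="endseq"):
    """Keep the `beam_width` best partial captions at every step."""
    beams = [(0.0, [start])]  # (sum of log-probabilities, words so far)
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, words in beams:
            for word, p in next_word_probs(image_features, words).items():
                if p > 0.0:
                    candidates.append((score + math.log(p), words + [word]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, words in candidates[:beam_width]:
            # Captions that reached the end marker stop growing.
            (finished if words[-1] == end else beams).append((score, words))
        if not beams:
            break
    _, best_words = max(finished or beams, key=lambda c: c[0])
    return " ".join(w for w in best_words if w not in (start, end))


def toy_model(image_features, words):
    """Hypothetical stand-in: looks only at the last word, ignores the image."""
    table = {
        "startseq": {"a": 0.6, "the": 0.4},
        "a": {"dog": 0.5, "cat": 0.5},
        "the": {"dog": 0.9, "cat": 0.1},
        "dog": {"endseq": 1.0},
        "cat": {"endseq": 1.0},
    }
    return table[words[-1]]


wide = beam_search_caption(toy_model, None, beam_width=2)
narrow = beam_search_caption(toy_model, None, beam_width=1)
```

In this toy table, a beam width of 1 behaves like greedy search and commits to "a" first, while a width of 2 keeps "the" alive long enough to find the more probable full caption.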