https://github.com/jman4162/cvnd-image-captioning
CNN-RNN image captioning project for Udacity's Computer Vision Nanodegree (CVND) program.
cnn computer-vision computer-vision-nanodegree cvnd cvnd-image-captioning deep-learning encoder-decoder-model image-captioning lstm pytorch rnn udacity
- Host: GitHub
- URL: https://github.com/jman4162/cvnd-image-captioning
- Owner: jman4162
- License: mit
- Created: 2021-06-19T20:22:10.000Z (almost 5 years ago)
- Default Branch: main
- Last Pushed: 2021-06-19T20:32:11.000Z (almost 5 years ago)
- Last Synced: 2025-01-13T11:17:31.771Z (over 1 year ago)
- Topics: cnn, computer-vision, computer-vision-nanodegree, cvnd, cvnd-image-captioning, deep-learning, encoder-decoder-model, image-captioning, lstm, pytorch, rnn, udacity
- Language: HTML
- Homepage:
- Size: 4.12 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# CVND-Image-Captioning-Project
CNN-RNN image captioning project for Udacity's Computer Vision Nanodegree (CVND) program.
# Instructions
1. Clone the COCO API repo: https://github.com/cocodataset/cocoapi
```
git clone https://github.com/cocodataset/cocoapi.git
```
2. Set up the COCO API (also described in the README [here](https://github.com/cocodataset/cocoapi)):
```
cd cocoapi/PythonAPI
make
cd ..
```
3. Download the following data from http://cocodataset.org/#download:
    * Under **Annotations**, download:
        * **2014 Train/Val annotations [241MB]** (extract `captions_train2014.json` and `captions_val2014.json`, and place them at `cocoapi/annotations/captions_train2014.json` and `cocoapi/annotations/captions_val2014.json`, respectively)
        * **2014 Testing Image info [1MB]** (extract `image_info_test2014.json` and place it at `cocoapi/annotations/image_info_test2014.json`)
    * Under **Images**, download:
        * **2014 Train images [83K/13GB]** (extract the `train2014` folder and place it at `cocoapi/images/train2014/`)
        * **2014 Val images [41K/6GB]** (extract the `val2014` folder and place it at `cocoapi/images/val2014/`)
        * **2014 Test images [41K/6GB]** (extract the `test2014` folder and place it at `cocoapi/images/test2014/`)
4. The project is structured as a series of Jupyter notebooks that are designed to be completed in sequential order (`0_Dataset.ipynb, 1_Preliminaries.ipynb, 2_Training.ipynb, 3_Inference.ipynb`).
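
Before opening the notebooks, it can help to confirm that the files from step 3 ended up where the instructions say they should. The following is a minimal sketch and not part of the original project: the function name `missing_coco_paths` is hypothetical, and the script assumes it is run from the directory that contains the `cocoapi` clone. It only checks that the paths exist, not that the downloads are complete.

```python
from pathlib import Path

# Paths from the download instructions, relative to the cocoapi checkout.
EXPECTED_FILES = [
    "annotations/captions_train2014.json",
    "annotations/captions_val2014.json",
    "annotations/image_info_test2014.json",
]
EXPECTED_DIRS = [
    "images/train2014",
    "images/val2014",
    "images/test2014",
]


def missing_coco_paths(cocoapi_root):
    """Return the expected paths (relative to cocoapi_root) that are absent.

    Hypothetical helper for illustration; it is not part of the COCO API.
    """
    root = Path(cocoapi_root)
    missing = [p for p in EXPECTED_FILES if not (root / p).is_file()]
    missing += [p for p in EXPECTED_DIRS if not (root / p).is_dir()]
    return missing


if __name__ == "__main__":
    # Assumption: the current directory contains the "cocoapi" clone.
    missing = missing_coco_paths("cocoapi")
    if missing:
        print("Missing paths:")
        for path in missing:
            print("  " + path)
    else:
        print("All expected COCO files and folders are in place.")
```

If the script lists any paths, revisit the matching download in step 3 before starting `0_Dataset.ipynb`.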