https://github.com/yj-yu/lsmdc


Host: GitHub
URL: https://github.com/yj-yu/lsmdc
Owner: yj-yu
Created: 2018-11-01T01:08:49.000Z (over 6 years ago)
Default Branch: master
Last Pushed: 2018-11-12T12:10:19.000Z (over 6 years ago)
Last Synced: 2024-11-18T01:39:16.982Z (7 months ago)
Language: Python
Size: 50.8 KB
Stars: 31
Watchers: 5
Forks: 5
Open Issues: 3
Metadata Files:
- Readme: README.md

# A Joint Sequence Fusion Model for Video Question Answering and Retrieval

This project hosts the TensorFlow implementation of our **ECCV 2018** paper, **A Joint Sequence Fusion Model for Video Question Answering and Retrieval**.

## Reference

If you use this code or dataset as part of any published research, please cite the following paper.

```
@inproceedings{yu2018joint,
author = {Youngjae Yu and Jongseok Kim and Gunhee Kim},
title = "{A Joint Sequence Fusion Model for Video Question Answering and Retrieval}",
booktitle = {ECCV},
year = 2018
}
```

## Setup

### Install dependencies
```
pip install -r requirements.txt
```

### Set up Python paths
```
git submodule update --init --recursive
add2virtualenv .
```

### Prepare Data

- Video Feature

1. Download [LSMDC data](https://sites.google.com/site/describingmovies/lsmdc-2016/download).

2. Extract RGB features using the pool5 layer of a pretrained ResNet-152 model.

3. Extract audio features using [VGGish](https://github.com/tensorflow/models/tree/master/research/audioset).

4. Concatenate the RGB and audio features and save them to an HDF5 file at 'dataset/LSMDC/LSMDC16_features/RESNET_pool5wav.hdf5'.
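
Step 4 can be sketched roughly as follows. This is a minimal illustration, not the script used by the authors. It assumes that the RGB and audio features of each clip have already been resampled to the same number of time steps, and that the HDF5 file holds one dataset per clip id. Both the function names and that file layout are hypothetical, so check the data loader in this repository for the layout it actually expects. The sizes 2048 and 128 are the output dimensions of ResNet-152 pool5 and VGGish.

```python
import numpy as np


def concat_clip_features(rgb, audio):
    """Concatenate per-step RGB and audio features along the feature axis.

    rgb:   array of shape (T, 2048), e.g. ResNet-152 pool5 outputs
    audio: array of shape (T, 128), e.g. VGGish embeddings
    Assumes both arrays already share the same number of time steps T.
    """
    rgb = np.asarray(rgb, dtype=np.float32)
    audio = np.asarray(audio, dtype=np.float32)
    if rgb.shape[0] != audio.shape[0]:
        raise ValueError("rgb and audio must have the same number of time steps")
    return np.concatenate([rgb, audio], axis=1)


def save_features(path, features_by_clip):
    """Write one HDF5 dataset per clip id (hypothetical layout)."""
    import h5py  # imported lazily so the concat helper works without h5py

    with h5py.File(path, "w") as f:
        for clip_id, feats in features_by_clip.items():
            f.create_dataset(clip_id, data=feats)


if __name__ == "__main__":
    # Random stand-ins for the real features of a 40-step clip.
    rgb = np.random.rand(40, 2048)
    audio = np.random.rand(40, 128)
    fused = concat_clip_features(rgb, audio)
    print(fused.shape)
```

To write the file, pass a dict of clip id to fused array to `save_features` together with the target path given in step 4.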

- Dataset
- We preprocessed the raw data of the LSMDC17 and MSR-VTT datasets into dataframe files.
- [Download dataframe files](https://drive.google.com/drive/folders/1_Wyr2VEWU4N-OgLBaQDGWXqD2TXXUBaF?usp=sharing)
- Save these files in "dataset/LSMDC/DataFrame"

- Vocabulary

- We build the word embedding matrix from GloVe vectors.
- [Download vocabulary files](https://drive.google.com/drive/folders/1GsArc0BuxzMAYobzbhWMj7MEUPDuneeC?usp=sharing)
- Save these files in "dataset/LSMDC/Vocabulary"
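
For readers who want to rebuild the embedding matrix rather than download it, the idea can be sketched as below. This is an illustration under assumptions, not the authors' preprocessing code: it assumes a word-to-index dictionary for the vocabulary and GloVe's plain-text format (`word v1 v2 ...` per line). The function name, the random initialization for words missing from GloVe, and the toy vocabulary are all hypothetical.

```python
import numpy as np


def build_embedding_matrix(glove_lines, word_to_index, dim, seed=0):
    """Build a (vocab_size, dim) matrix from GloVe text lines.

    glove_lines:   iterable of lines formatted as "word v1 v2 ... v_dim"
    word_to_index: dict mapping each vocabulary word to its row index
    Words missing from GloVe keep a small random initialization.
    """
    rng = np.random.RandomState(seed)
    matrix = rng.uniform(-0.1, 0.1, size=(len(word_to_index), dim))
    matrix = matrix.astype(np.float32)
    found = set()
    for line in glove_lines:
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        # Skip words outside the vocabulary and malformed lines.
        if word in word_to_index and len(values) == dim:
            matrix[word_to_index[word]] = [float(v) for v in values]
            found.add(word)
    return matrix, found


if __name__ == "__main__":
    vocab = {"<unk>": 0, "man": 1, "walks": 2}
    toy_glove = ["man 0.1 0.2 0.3", "walks 0.4 0.5 0.6", "dog 0.7 0.8 0.9"]
    emb, found = build_embedding_matrix(toy_glove, vocab, dim=3)
    print(emb.shape, sorted(found))
```

With a real GloVe file, pass the open file handle as `glove_lines` and set `dim` to the vector size of that file.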

### Training

Modify `configuration.py` to suit your environment.

- `train_tag` can be 'MC' or 'FIB'.

Run `train.py`.

```
python train.py --tag="tag"
```

### Pretrained Model

You can download the models and features from this [Google Drive link](https://drive.google.com/open?id=1w86Rx2yvucUYQDTkew89h5KlRQPRcwHI).
Modify `configuration.py` to load the checkpoints (`self.load_from_ckpt = 'path/to/checkpoint/'`).

```
[RET] R@1: 93, R@5: 247, R@10: 348, medr : 29
[FIB] Accuracy: 45.1
```

Your results may be slightly lower or higher than these scores.
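
The `[RET]` line reports recall at K (R@K) and median rank (medr). As a rough sketch of how such metrics are commonly computed, the snippet below ranks candidates from a query-by-candidate similarity matrix, assuming the correct candidate for query `i` is candidate `i`. It is not the evaluation code of this repository, and the function name is hypothetical. It returns R@K as fractions of queries, while the scale of the R@K values printed above is not stated in this README.

```python
from statistics import median


def retrieval_metrics(similarity, ks=(1, 5, 10)):
    """Compute recall@K and median rank from a square similarity matrix.

    similarity[i][j] is the score between query i and candidate j.
    Assumes the correct candidate for query i is candidate i.
    """
    ranks = []
    for i, row in enumerate(similarity):
        target = row[i]
        # Rank is 1 + number of candidates scored strictly higher than the target.
        ranks.append(1 + sum(1 for score in row if score > target))
    n = len(ranks)
    recalls = {k: sum(1 for r in ranks if r <= k) / n for k in ks}
    return recalls, median(ranks)


if __name__ == "__main__":
    toy = [[0.9, 0.1, 0.2], [0.8, 0.5, 0.3], [0.1, 0.7, 0.6]]
    recalls, medr = retrieval_metrics(toy, ks=(1, 2))
    print(recalls, medr)
```

A lower median rank and a higher recall at small K both indicate that the correct item tends to appear near the top of the ranking.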
