https://github.com/yj-yu/lsmdc
- Host: GitHub
- URL: https://github.com/yj-yu/lsmdc
- Owner: yj-yu
- Created: 2018-11-01T01:08:49.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2018-11-12T12:10:19.000Z (about 6 years ago)
- Last Synced: 2024-07-24T23:01:53.661Z (6 months ago)
- Language: Python
- Size: 50.8 KB
- Stars: 29
- Watchers: 5
- Forks: 5
- Open Issues: 3
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-video-text-retrieval - jsfusion
README
# A Joint Sequence Fusion Model for Video Question Answering and Retrieval
This project hosts the TensorFlow implementation of our **ECCV 2018** paper, **A Joint Sequence Fusion Model for Video Question Answering and Retrieval**.
## Reference
If you use this code or dataset in any published research, please cite the following paper.
```
@inproceedings{yu2018joint,
author = {Youngjae Yu and Jongseok Kim and Gunhee Kim},
title = "{A Joint Sequence Fusion Model for Video Question Answering and Retrieval}",
booktitle = {ECCV},
year = 2018
}
```

## Setup
### Install dependencies
```
pip install -r requirements.txt
```

### Setup Python paths
```
git submodule update --init --recursive
add2virtualenv .
```

### Prepare Data
- Video Feature
  1. Download [LSMDC data](https://sites.google.com/site/describingmovies/lsmdc-2016/download).
  2. Extract RGB features using the pool5 layer of a pretrained ResNet-152 model.
  3. Extract audio features using [VGGish](https://github.com/tensorflow/models/tree/master/research/audioset).
  4. Concatenate the RGB and audio features and save them as an HDF5 file at `dataset/LSMDC/LSMDC16_features/RESNET_pool5wav.hdf5`.
- Dataset
  - We preprocessed the raw LSMDC17 and MSR-VTT data into data frame files.
  - [Download dataframe files](https://drive.google.com/drive/folders/1_Wyr2VEWU4N-OgLBaQDGWXqD2TXXUBaF?usp=sharing)
  - Save these files in `dataset/LSMDC/DataFrame`.
- Vocabulary
  - We build the word embedding matrix from GloVe vectors.
  - [Download vocabulary files](https://drive.google.com/drive/folders/1GsArc0BuxzMAYobzbhWMj7MEUPDuneeC?usp=sharing)
  - Save these files in `dataset/LSMDC/Vocabulary`.

### Training
Modify `configuration.py` to suit your environment.
- `train_tag` can be `'MC'` or `'FIB'`.

Run `train.py`.
```
python train.py --tag="tag"
```

### Pretrained Model
You can download the pretrained models and features from this [Google Drive link](https://drive.google.com/open?id=1w86Rx2yvucUYQDTkew89h5KlRQPRcwHI).
Modify `configuration.py` to load the checkpoints (`self.load_from_ckpt = 'path/to/checkpoint/'`).

```
[RET] R@1: 93, R@5: 247, R@10: 348, medr : 29
[FIB] Accuracy: 45.1
```

Your results may be slightly lower or higher than these scores.
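
For readers unfamiliar with the `[RET]` line, the following is a minimal sketch of how such retrieval metrics are commonly computed. It is not taken from this repository. It assumes that R@K is the number of queries whose ground-truth item ranks in the top K (the logged values look like counts rather than percentages, which is an assumption) and that `medr` is the median rank of the ground-truth item. The function name `retrieval_metrics` and the toy ranks are hypothetical.

```python
from statistics import median


def retrieval_metrics(ranks, ks=(1, 5, 10)):
    """Summarize retrieval quality from 1-based ground-truth ranks.

    ranks: for each query, the position of the correct item in the
           sorted candidate list (1 means it was ranked first).
    Hypothetical helper for illustration; not part of this repository.
    """
    if not ranks:
        raise ValueError("ranks must be non-empty")
    # R@K as a count: how many queries placed the correct item in the top K.
    hits = {k: sum(1 for r in ranks if r <= k) for k in ks}
    return {
        "hits": hits,
        # The same quantity as a fraction of all queries.
        "recall": {k: hits[k] / len(ranks) for k in ks},
        # Median rank: lower is better.
        "medr": median(ranks),
    }


if __name__ == "__main__":
    toy_ranks = [1, 3, 7, 12, 40]  # made-up ranks for five queries
    metrics = retrieval_metrics(toy_ranks)
    print(metrics["hits"])
    print(metrics["medr"])
```

Under this reading, with N test queries a logged `R@1: 93` would correspond to a recall of 93/N.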