https://github.com/SCLinDennis/Weakly-Supervised-Dense-Video-Captioning


# Weakly-Supervised-Dense-Video-Captioning
This repo is an attempt to implement ***[Weakly Supervised Dense Video Captioning](https://arxiv.org/abs/1704.01502)*** in TensorFlow. It is not complete yet.

## Requirements
- Python 3
- Keras 2.2
- TensorFlow 1.8

## Usage
- Run ```lexical_Res.py``` to train the FCN with the MIML loss; the weights with the lowest loss are saved.
- Run ```region_selection.py``` to generate the most informative and coherent region sequence.
- Run ```TRY3/model_seq2seq.py``` to train the language model.
- Run ```TRY3/s2vt_predict_v2.py``` to run inference with the trained model.
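
The MIML (multi-instance multi-label) loss used when training the lexical FCN can be illustrated with a small sketch. It treats a video as a bag of instances (frames or regions) and only needs video-level word labels. The sketch assumes the bag-level probability of a word is the maximum over the instance probabilities, followed by a binary cross-entropy per word; the function name `miml_loss` and its arguments are hypothetical and not taken from this repo, and it uses only the standard library so it runs without TensorFlow.

```python
import math


def miml_loss(instance_probs, bag_labels, eps=1e-7):
    """Sketch of a MIML loss for one video (bag).

    instance_probs: list of per-instance lists, each holding one
        probability in [0, 1] per vocabulary word.
    bag_labels: list of 0/1 video-level labels, one per vocabulary word.
    """
    vocab_size = len(bag_labels)
    total = 0.0
    for w in range(vocab_size):
        # Assumption: the bag probability is the max over all instances.
        p = max(row[w] for row in instance_probs)
        # Clamp to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        y = bag_labels[w]
        # Binary cross-entropy for this word.
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / vocab_size


# Two instances, two vocabulary words; only the first word is in the captions.
loss = miml_loss([[0.9, 0.1], [0.2, 0.3]], [1, 0])
print(loss)
```

Because only the highest-scoring instance contributes per word, the gradient in a real model would flow to the region that best explains each word, which is what makes region-level predictions learnable from video-level labels.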

## Guide
1. ```extract_frames.py```: Uniformly samples 30 frames from each video.
2. ```load_data.py```: Creates the label vectors and the word dictionary.
3. ```Res_video_bag.py```: Lexical FCN (ResNet50) with a frame as an instance.
4. ```lexical_Res.py```: Lexical FCN (ResNet50) with a region as an instance.
5. ```region_selection.py```: Region sequence generator, which can currently form only one region sequence.
6. dic/: Where to put ix2word, word2ix, word_counts.
7. frames/: Where to put frames extracted by ```extract_frames.py```.
8. MSRVTT/: Where to put training/testing labels and region sequences generated by ```region_selection.py```.
9. videos/: Where to put the [***MSR-VTT***](http://ms-multimedia-challenge.com/dataset) videos.
10. Weight_Resnet50/: Where to put the weights saved by ```lexical_Res.py```.
11. Weight_Resnet50_vasbag/: Where to put the weights saved by ```Res_video_bag.py```.
12. TRY3/```s2vt_train.py```: Language model using S2VT (training).
13. TRY3/```s2vt.py```: S2VT model graph.
14. TRY3/```s2vt_inference.py```: Language model using S2VT (inference).
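
The first two steps of the guide (uniform frame sampling, then building the word dictionary and label vectors) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `extract_frames.py` or `load_data.py`: the function names are hypothetical, frames are picked at the midpoint of 30 equal segments, and every caption word enters the vocabulary, whereas the repo may filter words differently. Only the names `word2ix`, `ix2word`, and `word_counts` come from the `dic/` description above.

```python
from collections import Counter


def uniform_frame_indices(total_frames, num_samples=30):
    """Return num_samples frame indices spread evenly over the video."""
    if total_frames <= 0:
        return []
    # Take the midpoint of each equal-length segment.
    # Videos shorter than num_samples frames will repeat some indices.
    return [int((i + 0.5) * total_frames / num_samples)
            for i in range(num_samples)]


def build_vocab(captions, min_count=1):
    """Build word2ix, ix2word and word_counts from a list of captions."""
    word_counts = Counter(w for c in captions for w in c.lower().split())
    words = sorted(w for w, n in word_counts.items() if n >= min_count)
    word2ix = {w: i for i, w in enumerate(words)}
    ix2word = {i: w for w, i in word2ix.items()}
    return word2ix, ix2word, word_counts


def label_vector(video_captions, word2ix):
    """Binary vector marking which vocabulary words occur in a video's captions."""
    vec = [0] * len(word2ix)
    for caption in video_captions:
        for word in caption.lower().split():
            if word in word2ix:
                vec[word2ix[word]] = 1
    return vec


indices = uniform_frame_indices(300)
word2ix, ix2word, word_counts = build_vocab(["a man is singing", "a dog runs"])
labels = label_vector(["a dog"], word2ix)
print(indices[:3], labels)
```

The label vector is the video-level (weak) supervision: it says which words describe the video without saying which frame or region they refer to.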

## References
- [Weakly Supervised Dense Video Captioning](https://arxiv.org/abs/1704.01502)
- [s2vt](https://github.com/thtang/ADLxMLDS2017)

## Contact
Shih-Chen Lin ([email protected])

Any discussions and suggestions are welcome!