https://github.com/pomonam/learnablepoolingmethods

TensorFlow Implementation of "Learnable Pooling Methods for Video Classification".

eccv netvlad video-classification youtube-8m


Host: GitHub
URL: https://github.com/pomonam/learnablepoolingmethods
Owner: pomonam
License: apache-2.0
Archived: true
Created: 2018-05-31T13:56:21.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2018-10-08T04:30:17.000Z (about 7 years ago)
Last Synced: 2025-03-04T14:49:39.389Z (7 months ago)
Topics: eccv, netvlad, video-classification, youtube-8m
Language: Python
Homepage:
Size: 946 KB
Stars: 38
Watchers: 5
Forks: 2
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Learnable Pooling Methods for Video Classification

This repository is based on the starter code provided by Google AI. It contains code for training and evaluating models on the [YouTube-8M](https://research.google.com/youtube8m/) dataset. A detailed table of contents and descriptions can be found in the [original repository](https://github.com/google/youtube-8m).

The repository contains the models from team "Deep Topology". Our approach was accepted at [ECCV - The 2nd Workshop on YouTube-8M Large-Scale Video Understanding](https://research.google.com/youtube8m/workshop2018/index.html). The presentation is available on the ECCV workshop page.

Presentation: TBA

Paper: [Link](paper/Learnable_Pooling_Methods_for_Video_Classification.pdf), [Arxiv](https://arxiv.org/abs/1810.00530)

    

# Usage

In [frame_level_models.py](frame_level_models.py), prototypes 1, 2 and 3 refer to Sections 3.1, 3.2 and 3.3 of the paper. Detailed instructions for training and evaluating the models can be found in the [YT8M repository](https://github.com/google/youtube-8m). The following example training commands reproduce the results.

### Prototype 1 (Attention Enhanced NetVLAD)

```
python train.py --train_data_pattern="" --model=NetVladV1 --train_dir="" --frame_features=True --feature_names="rgb,audio" --feature_sizes="1024,128" --batch_size=80 --base_learning_rate=0.0002 --netvlad_cluster_size=256 --netvlad_hidden_size=512 --iterations=256 --learning_rate_decay=0.85
```
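
For readers unfamiliar with the pooling layer that the prototypes build on, the following is a minimal NumPy sketch of plain NetVLAD aggregation. It is not the attention-enhanced variant implemented in this repository, and it is not taken from `frame_level_models.py`. The function name `netvlad_pool`, the random parameters, and the choice to concatenate the `rgb` and `audio` features into one 1152-dimensional input are assumptions made for illustration only; the cluster count mirrors `--netvlad_cluster_size=256`.

```python
import numpy as np


def netvlad_pool(frames, weights, biases, centers):
    """Aggregate per-frame features into one video descriptor (plain NetVLAD).

    frames:  (num_frames, feature_dim)   per-frame features
    weights: (feature_dim, num_clusters) soft-assignment weights
    biases:  (num_clusters,)             soft-assignment biases
    centers: (num_clusters, feature_dim) learnable cluster centers
    """
    # Soft-assign every frame to every cluster with a softmax over clusters.
    logits = frames @ weights + biases
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    assign = np.exp(logits)
    assign /= assign.sum(axis=1, keepdims=True)

    # Per cluster, sum the assignment-weighted residuals (frame - center).
    vlad = assign.T @ frames - assign.sum(axis=0)[:, None] * centers

    # Intra-normalize each cluster row, then L2-normalize the flat vector.
    vlad /= np.linalg.norm(vlad, axis=1, keepdims=True) + 1e-12
    flat = vlad.reshape(-1)
    return flat / (np.linalg.norm(flat) + 1e-12)


# Hypothetical inputs: random values stand in for real features and weights.
rng = np.random.default_rng(0)
feature_dim = 1024 + 128   # assumed concatenation of rgb and audio features
num_clusters = 256         # mirrors --netvlad_cluster_size=256
num_frames = 256           # illustrative number of sampled frames

frames = rng.normal(size=(num_frames, feature_dim))
weights = 0.01 * rng.normal(size=(feature_dim, num_clusters))
biases = np.zeros(num_clusters)
centers = rng.normal(size=(num_clusters, feature_dim))

descriptor = netvlad_pool(frames, weights, biases, centers)
print(descriptor.shape)
```

The descriptor has `num_clusters * feature_dim` entries regardless of how many frames the video has, which is why a hidden layer (here sized by `--netvlad_hidden_size=512`) typically follows to reduce its dimensionality before classification.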

### Prototype 2 (NetVLAD with Attention Based Cluster Similarities)

```
python train.py --train_data_pattern="" --model=NetVladV2 --train_dir="" --frame_features=True --feature_names="rgb,audio" --feature_sizes="1024,128" --batch_size=80 --base_learning_rate=0.0002 --netvlad_cluster_size=256 --netvlad_hidden_size=512 --iterations=256 --learning_rate_decay=0.85
```
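
Both training commands set `--base_learning_rate=0.0002` and `--learning_rate_decay=0.85`. As a rough guide to what these values imply, the sketch below assumes a staircase exponential decay keyed on the number of examples seen, which is how the YouTube-8M starter code is understood to schedule the learning rate. The helper `decayed_learning_rate` is hypothetical, the `decay_examples` value of 4,000,000 is an assumed default rather than something stated in this README, and a single GPU is assumed; check the flag definitions in `train.py` before relying on these numbers.

```python
def decayed_learning_rate(step, base_lr=0.0002, decay=0.85,
                          batch_size=80, decay_examples=4_000_000):
    """Staircase exponential decay of the learning rate.

    The rate is multiplied by `decay` once for every `decay_examples`
    training examples processed (assumed schedule, single GPU).
    """
    examples_seen = step * batch_size
    return base_lr * decay ** (examples_seen // decay_examples)


# Print the learning rate at a few illustrative training steps.
for step in (0, 50_000, 100_000, 200_000):
    print(step, decayed_learning_rate(step))
```

Under these assumptions, with a batch size of 80 the rate drops by a factor of 0.85 every 50,000 steps.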

### Prototype 3 (Regularized Function Approximation Approach)

```
TBD
```

# Changes

- **1.00** (31 August 2018)

    - Initial public release

- **2.00** (30 September 2018)

    - Code cleaning

    - Model usage

    

# Citations

If you find our approaches useful, please cite our paper.

```
@article{kmiec2018learnable,
  title={Learnable Pooling Methods for Video Classification},
  author={Kmiec, Sebastian and Bae, Juhan and An, Ruijian},
  journal={arXiv preprint arXiv:1810.00530},
  year={2018}
}
```

# Contributors (Alphabetical Order)

- [Ruijian An](https://github.com/RuijianSZ)

- [Juhan Bae](https://github.com/pomonam)

- [Sebastian Kmiec](https://github.com/sebastiankmiec)
