https://github.com/laura-wang/video_repres_mas

code for CVPR-2019 paper: Self-supervised Spatio-temporal Representation Learning for Videos by Predicting Motion and Appearance Statistics
action-recognition cvpr2019 self-supervised-learning spatio-temporal-analysis tensorflow video

Host: GitHub
URL: https://github.com/laura-wang/video_repres_mas
Owner: laura-wang
License: mit
Created: 2019-04-04T07:42:01.000Z (about 6 years ago)
Default Branch: master
Last Pushed: 2021-02-09T03:01:57.000Z (over 4 years ago)
Last Synced: 2024-11-10T18:03:06.538Z (8 months ago)
Topics: action-recognition, cvpr2019, self-supervised-learning, spatio-temporal-analysis, tensorflow, video
Language: Python
Homepage:
Size: 1.09 MB
Stars: 63
Watchers: 7
Forks: 10
Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# Self-Supervised Spatio-Temporal Representation Learning for Videos by Predicting Motion and Appearance Statistics
TensorFlow implementation of our CVPR 2019 paper [Self-Supervised Spatio-Temporal Representation Learning for Videos by Predicting Motion and Appearance Statistics](http://openaccess.thecvf.com/content_CVPR_2019/html/Wang_Self-Supervised_Spatio-Temporal_Representation_Learning_for_Videos_by_Predicting_Motion_and_CVPR_2019_paper.html).

## Update

A journal extension of this work (T-PAMI 2021) can be found [here](https://arxiv.org/abs/2008.13426), with extensive additional analysis and a significant performance gain (~30%). The corresponding PyTorch implementation is available here: https://github.com/laura-wang/video_repres_sts.

## Overview
We release part of our training code for the UCF101 dataset. It covers the self-supervised learning based on motion statistics (see our paper for more details).
The entire training protocol (both motion statistics and appearance statistics) is implemented in the PyTorch version: https://github.com/laura-wang/video_repres_sts.
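
To give a feel for what a motion-statistics label looks like, here is a minimal pure-Python sketch. It is an illustration only, not the code in this repository: it assumes the label is the index of the grid block that contains the largest accumulated optical-flow magnitude, and the function name `largest_motion_block` and the plain `grid x grid` partition are hypothetical simplifications. The paper defines the actual statistics and partition patterns.

```python
import math


def largest_motion_block(flow_u, flow_v, grid=4):
    """Return the 0-based, row-major block index of the largest accumulated motion.

    flow_u, flow_v: lists of frames; each frame is a list of rows of floats
    (horizontal and vertical flow components). All frames share one H x W size.
    This is a simplified, hypothetical stand-in for the statistics in the paper.
    """
    height, width = len(flow_u[0]), len(flow_u[0][0])
    # Accumulate the flow magnitude over time at every pixel.
    total = [[0.0] * width for _ in range(height)]
    for frame_u, frame_v in zip(flow_u, flow_v):
        for y in range(height):
            for x in range(width):
                total[y][x] += math.hypot(frame_u[y][x], frame_v[y][x])
    # Locate the pixel with the largest accumulated magnitude.
    best_y, best_x = max(
        ((y, x) for y in range(height) for x in range(width)),
        key=lambda p: total[p[0]][p[1]],
    )
    # Map that pixel to a block of a grid x grid partition.
    row = min(best_y * grid // height, grid - 1)
    col = min(best_x * grid // width, grid - 1)
    return row * grid + col


# Tiny example: two 8x8 flow frames with motion only at pixel (y=6, x=1).
u = [[[0.0] * 8 for _ in range(8)] for _ in range(2)]
v = [[[0.0] * 8 for _ in range(8)] for _ in range(2)]
u[0][6][1] = 3.0
v[1][6][1] = 4.0
print(largest_motion_block(u, v))  # 12 (block row 3, column 0)
```

A network trained on such labels has to find where the strongest motion occurs in a clip, which is the intuition behind using motion statistics as a supervision signal.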

## Requirements
1. tensorflow >= 1.9.0
2. Python 3
3. cv2
4. scipy

## Data preparation

You can download the original UCF101 dataset from the [official website](https://www.crcv.ucf.edu/data/UCF101.php), then extract RGB frames from the videos, and finally compute optical flow using the TVL1 method. **However, I recommend directly downloading the pre-processed RGB and optical flow data of UCF101 provided by [feichtenhofer](https://github.com/feichtenhofer/twostreamfusion).**
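
After downloading or extracting the data, a quick consistency check can catch videos whose optical flow is missing. The sketch below is not part of this repository. It assumes a hypothetical layout with one sub-directory per video under an RGB root and under a flow root, each holding `.jpg` images; the function names and the layout are assumptions, so adjust them to the data you actually have.

```python
import os


def frame_counts(root, extension=".jpg"):
    """Map each video sub-directory under root to its number of image files."""
    counts = {}
    for name in sorted(os.listdir(root)):
        video_dir = os.path.join(root, name)
        if os.path.isdir(video_dir):
            counts[name] = sum(
                1 for f in os.listdir(video_dir) if f.lower().endswith(extension)
            )
    return counts


def videos_without_flow(rgb_root, flow_root):
    """Return names of videos that have RGB frames but no flow images."""
    rgb = frame_counts(rgb_root)
    flow = frame_counts(flow_root) if os.path.isdir(flow_root) else {}
    return [name for name, n in rgb.items() if n > 0 and flow.get(name, 0) == 0]
```

Calling `videos_without_flow("path/to/rgb", "path/to/flow")` (both paths are placeholders) returns an empty list when every video with RGB frames also has at least one flow image.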

## Train
Here we provide the first version of our training code, which uses "placeholder" as the data reading pipeline, so you don't need to write the RGB/optical flow data into tfrecord format. We have also rewritten the training code using the Dataset API, but for now we think the placeholder version is enough for you to understand motion statistics.

Before running `python train.py`, remember to set the correct dataset directory in the list file. Then you can play with the motion statistics!
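
A wrong dataset directory in the list file is an easy mistake to make, so it can help to check the paths before training. The following is a small sketch and not part of this repository. It assumes the list file has one entry per line with the data path as the first whitespace-separated field (so paths containing spaces are not handled); the function name `missing_list_entries` is hypothetical.

```python
import os


def missing_list_entries(list_path):
    """Return (line_number, path) pairs whose path does not exist on disk.

    Assumes one entry per line with the path as the first
    whitespace-separated field; blank lines are skipped.
    """
    missing = []
    with open(list_path) as handle:
        for number, line in enumerate(handle, start=1):
            fields = line.split()
            if not fields:
                continue
            if not os.path.exists(fields[0]):
                missing.append((number, fields[0]))
    return missing
```

An empty result means every listed path was found; otherwise the returned line numbers point at the entries to fix.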

## Citation

If you find this repository useful in your research, please consider citing:

```
@inproceedings{wang2019self,
title={Self-Supervised Spatio-Temporal Representation Learning for Videos by Predicting Motion and Appearance Statistics},
author={Wang, Jiangliu and Jiao, Jianbo and Bao, Linchao and He, Shengfeng and Liu, Yunhui and Liu, Wei},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={4006--4015},
year={2019}
}
```
