https://github.com/lucidrains/TimeSformer-pytorch

Implementation of TimeSformer from Facebook AI, a pure attention-based solution for video classification


## TimeSformer - Pytorch

Implementation of TimeSformer, from Facebook AI: a pure and simple attention-based solution for reaching state-of-the-art (SOTA) results on video classification. This repository houses only the best-performing variant, 'Divided Space-Time Attention', which simply applies attention along the time axis before attention along the spatial axes.
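
To make "attention along the time axis before the spatial" concrete, here is a minimal sketch of the idea. It is not the code from this repository: the function names `attend` and `divided_space_time_attention` are hypothetical, and the sketch deliberately leaves out learned query/key/value projections, multiple heads, normalization, and any class token. It only shows which tokens attend to which, assuming tokens are laid out frame by frame.

```python
import torch

def attend(q, k, v):
    # Plain scaled dot-product attention over the second-to-last axis.
    scores = q @ k.transpose(-1, -2) / q.shape[-1] ** 0.5
    return scores.softmax(dim=-1) @ v

def divided_space_time_attention(x, frames, patches):
    """Hypothetical, projection-free sketch.

    x: (batch, frames * patches, dim) tokens in frame-major order.
    """
    b, n, d = x.shape
    assert n == frames * patches

    # Time step: each patch position attends only to the same position in other frames.
    xt = x.reshape(b, frames, patches, d).transpose(1, 2)  # (b, patches, frames, d)
    xt = attend(xt, xt, xt)
    x = x + xt.transpose(1, 2).reshape(b, n, d)            # residual connection

    # Space step: each frame attends only across its own patches.
    xs = x.reshape(b, frames, patches, d)                  # (b, frames, patches, d)
    xs = attend(xs, xs, xs)
    return x + xs.reshape(b, n, d)                         # residual connection

# 224 / 16 = 14 patches per side, so 14 * 14 = 196 patches per frame.
tokens = torch.randn(2, 8 * 196, 64)
out = divided_space_time_attention(tokens, frames=8, patches=196)
print(out.shape)  # torch.Size([2, 1568, 64])
```

With 8 frames of 196 patches, each token is compared against 8 tokens in the time step and 196 in the space step, rather than against all 1568 tokens at once as joint space-time attention would require.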

Press release

## Install

``` bash
$ pip install timesformer-pytorch
```

## Usage

```python
import torch
from timesformer_pytorch import TimeSformer

model = TimeSformer(
    dim = 512,
    image_size = 224,
    patch_size = 16,
    num_frames = 8,
    num_classes = 10,
    depth = 12,
    heads = 8,
    dim_head = 64,
    attn_dropout = 0.1,
    ff_dropout = 0.1
)

video = torch.randn(2, 8, 3, 224, 224) # (batch x frames x channels x height x width)
mask = torch.ones(2, 8).bool() # (batch x frame) - use a mask if there are variable length videos in the same batch

pred = model(video, mask = mask) # (2, 10)
```
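
For batches that mix clips of different lengths, the mask has to be built from the per-clip frame counts. The following is a rough sketch of one way to do that, assuming `True` marks a real frame (which the all-ones mask above suggests) and that shorter clips are zero-padded at the end. The helpers `frame_mask_from_lengths` and `pad_clips` are hypothetical and not part of `timesformer_pytorch`.

```python
import torch

def frame_mask_from_lengths(lengths, num_frames):
    """Return a (batch, num_frames) bool mask; True marks a real (non-padded) frame."""
    lengths = torch.as_tensor(lengths)
    return torch.arange(num_frames)[None, :] < lengths[:, None]

def pad_clips(clips, num_frames):
    """Zero-pad clips of shape (frames_i, C, H, W) to num_frames and build the mask."""
    batch = torch.zeros(len(clips), num_frames, *clips[0].shape[1:])
    for i, clip in enumerate(clips):
        batch[i, : clip.shape[0]] = clip  # real frames first, zeros after
    lengths = [clip.shape[0] for clip in clips]
    return batch, frame_mask_from_lengths(lengths, num_frames)

# One full 8-frame clip and one 5-frame clip.
clips = [torch.randn(8, 3, 224, 224), torch.randn(5, 3, 224, 224)]
video, mask = pad_clips(clips, num_frames=8)
print(video.shape)  # torch.Size([2, 8, 3, 224, 224])
print(mask[1])      # tensor([ True,  True,  True,  True,  True, False, False, False])
```

The resulting `video` and `mask` have the same shapes as the tensors in the usage example, so they should be usable as `model(video, mask = mask)`.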

## Citations

```bibtex
@misc{bertasius2021spacetime,
    title         = {Is Space-Time Attention All You Need for Video Understanding?},
    author        = {Gedas Bertasius and Heng Wang and Lorenzo Torresani},
    year          = {2021},
    eprint        = {2102.05095},
    archivePrefix = {arXiv},
    primaryClass  = {cs.CV}
}
```

```bibtex
@misc{su2021roformer,
    title         = {RoFormer: Enhanced Transformer with Rotary Position Embedding},
    author        = {Jianlin Su and Yu Lu and Shengfeng Pan and Bo Wen and Yunfeng Liu},
    year          = {2021},
    eprint        = {2104.09864},
    archivePrefix = {arXiv},
    primaryClass  = {cs.CL}
}
```

```bibtex
@article{tokshift2021,
    title   = {Token Shift Transformer for Video Classification},
    author  = {Hao Zhang and Yanbin Hao and Chong-Wah Ngo},
    journal = {ACM Multimedia 2021},
}
```