https://github.com/kracwarlock/action-recognition-visual-attention

Action recognition using soft attention based deep recurrent neural networks

Host: GitHub
URL: https://github.com/kracwarlock/action-recognition-visual-attention
Owner: kracwarlock
Created: 2015-09-28T17:45:48.000Z (over 9 years ago)
Default Branch: master
Last Pushed: 2016-10-30T22:19:10.000Z (over 8 years ago)
Last Synced: 2025-03-31T09:07:30.272Z (3 months ago)
Topics: action-recognition, attention-mechanism, deep, deep-learning, deep-neural-networks, deeplearning, paper, soft-attention, video
Language: Jupyter Notebook
Homepage: http://www.cs.toronto.edu/~shikhar/projects/action-recognition-attention
Size: 985 KB
Stars: 350
Watchers: 24
Forks: 158
Open Issues: 8
Metadata Files:
- Readme: README.md


## Action Recognition using Visual Attention

We propose a soft-attention-based model for action recognition in videos.
We use multi-layered Recurrent Neural Networks (RNNs) with Long Short-Term Memory
(LSTM) units, which are deep both spatially and temporally. Our model learns to focus
selectively on parts of the video frames and classifies videos after taking a few
glimpses. In effect, it learns which parts of the frames are relevant for the
task at hand and attaches higher importance to them. We evaluate the model on the UCF-11
(YouTube Action), HMDB-51 and Hollywood2 datasets and analyze how it focuses its
attention depending on the scene and the action being performed.
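
The soft attention step described above can be illustrated with a minimal NumPy sketch. It is not the repository's Theano implementation: the function names, the single weight matrix `W`, and the sizes (49 locations, 1024 features, 512 hidden units) are assumptions chosen for illustration. The sketch scores each spatial location of a frame from the recurrent hidden state, normalizes the scores with a softmax, and takes the weighted average of the location features as the "glimpse".

```python
import numpy as np


def softmax(scores):
    """Numerically stable softmax over a 1-D array of scores."""
    shifted = scores - np.max(scores)
    exp = np.exp(shifted)
    return exp / np.sum(exp)


def soft_attention(features, hidden, W):
    """Hypothetical single soft attention step.

    features: (K, D) array, one D-dimensional feature per spatial location.
    hidden:   (H,) array, the recurrent hidden state from the previous step.
    W:        (H, K) array, an assumed learned projection to location scores.
    """
    scores = hidden.dot(W)           # (K,) one score per location
    weights = softmax(scores)        # (K,) non-negative, sums to 1
    glimpse = weights.dot(features)  # (D,) expected feature under the weights
    return weights, glimpse


# Illustrative sizes only (assumptions, not read from the repository).
rng = np.random.RandomState(0)
K, D, H = 49, 1024, 512
features = rng.randn(K, D)
hidden = rng.randn(H)
W = rng.randn(H, K) * 0.01

weights, glimpse = soft_attention(features, hidden, W)
```

Because the weights form a probability distribution over locations, the glimpse is differentiable with respect to `W`, which is what lets a soft attention model be trained end to end with backpropagation. Reshaping `weights` to a square grid (here 7 x 7) gives a map that can be overlaid on the frame to inspect where the model is looking.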

## Dependencies

* Python 2.7
* [NumPy](http://www.numpy.org/)
* [scikit-learn](http://scikit-learn.org/stable/index.html)
* [skimage](http://scikit-image.org/docs/dev/api/skimage.html)
* [Theano](http://www.deeplearning.net/software/theano/)
* [h5py](http://docs.h5py.org/en/latest/)

## Input data format

This is provided in [util/README.md](https://github.com/kracwarlock/action-recognition-visual-attention/blob/master/util/README.md)

## Reference

If you use this code as part of any published research, please acknowledge the
following papers:

**"Action Recognition using Visual Attention."**  
Shikhar Sharma, Ryan Kiros, Ruslan Salakhutdinov. *[arXiv](http://arxiv.org/abs/1511.04119)*

    @article{sharma2015attention,
        title={Action Recognition using Visual Attention},
        author={Sharma, Shikhar and Kiros, Ryan and Salakhutdinov, Ruslan},
        journal={arXiv preprint arXiv:1511.04119},
        year={2015}
    }

**"Show, Attend and Tell: Neural Image Caption Generation with Visual Attention."**  
Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan
Salakhutdinov, Richard Zemel, Yoshua Bengio. *ICML (2015)*

    @article{Xu2015show,
        title={Show, Attend and Tell: Neural Image Caption Generation with Visual Attention},
        author={Xu, Kelvin and Ba, Jimmy and Kiros, Ryan and Cho, Kyunghyun and Courville, Aaron and Salakhutdinov, Ruslan and Zemel, Richard and Bengio, Yoshua},
        journal={arXiv preprint arXiv:1502.03044},
        year={2015}
    }

## License

This repository is released under a [revised (3-clause) BSD License](http://directory.fsf.org/wiki/License:BSD_3Clause). It
is the implementation for our paper [Action Recognition using Visual Attention](http://arxiv.org/abs/1511.04119). The repository uses some code from the project
[arctic-captions](https://github.com/kelvinxu/arctic-captions), which is originally the implementation for the paper
[Show, Attend and Tell: Neural Image Caption Generation with Visual Attention](http://arxiv.org/abs/1502.03044) and is also licensed
under a [revised (3-clause) BSD License](http://directory.fsf.org/wiki/License:BSD_3Clause).
