{"id":16995156,"url":"https://github.com/kracwarlock/action-recognition-visual-attention","last_synced_at":"2025-04-07T11:05:57.832Z","repository":{"id":151633877,"uuid":"43316631","full_name":"kracwarlock/action-recognition-visual-attention","owner":"kracwarlock","description":"Action recognition using soft attention based deep recurrent neural networks","archived":false,"fork":false,"pushed_at":"2016-10-30T22:19:10.000Z","size":1009,"stargazers_count":350,"open_issues_count":8,"forks_count":158,"subscribers_count":24,"default_branch":"master","last_synced_at":"2025-03-31T09:07:30.272Z","etag":null,"topics":["action-recognition","attention-mechanism","deep","deep-learning","deep-neural-networks","deeplearning","paper","soft-attention","video"],"latest_commit_sha":null,"homepage":"http://www.cs.toronto.edu/~shikhar/projects/action-recognition-attention","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kracwarlock.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2015-09-28T17:45:48.000Z","updated_at":"2025-03-07T06:24:28.000Z","dependencies_parsed_at":"2023-05-25T10:45:28.463Z","dependency_job_id":null,"html_url":"https://github.com/kracwarlock/action-recognition-visual-attention","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kracwarlock%2Faction-recognition-visual-attention","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kracwarlock%2Faction-recognition-visual
-attention/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kracwarlock%2Faction-recognition-visual-attention/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kracwarlock%2Faction-recognition-visual-attention/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kracwarlock","download_url":"https://codeload.github.com/kracwarlock/action-recognition-visual-attention/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247640462,"owners_count":20971557,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["action-recognition","attention-mechanism","deep","deep-learning","deep-neural-networks","deeplearning","paper","soft-attention","video"],"created_at":"2024-10-14T03:47:45.258Z","updated_at":"2025-04-07T11:05:57.808Z","avatar_url":"https://github.com/kracwarlock.png","language":"Jupyter Notebook","funding_links":[],"categories":["📦 Legacy \u0026 Inactive Projects"],"sub_categories":[],"readme":"## Action Recognition using Visual Attention\n\nWe propose a soft attention based model for the task of action recognition in videos. \nWe use multi-layered Recurrent Neural Networks (RNNs) with Long Short-Term Memory \n(LSTM) units which are deep both spatially and temporally. Our model learns to focus \nselectively on parts of the video frames and classifies videos after taking a few \nglimpses. The model essentially learns which parts in the frames are relevant for the \ntask at hand and attaches higher importance to them. 
We evaluate the model on UCF-11 \n(YouTube Action), HMDB-51 and Hollywood2 datasets and analyze how the model focuses its \nattention depending on the scene and the action being performed.\n\n## Dependencies\n\n* Python 2.7\n* [NumPy](http://www.numpy.org/)\n* [scikit-learn](http://scikit-learn.org/stable/index.html)\n* [skimage](http://scikit-image.org/docs/dev/api/skimage.html)\n* [Theano](http://www.deeplearning.net/software/theano/)\n* [h5py](http://docs.h5py.org/en/latest/)\n\n## Input data format\n\nThis is provided in [util/README.md](https://github.com/kracwarlock/action-recognition-visual-attention/blob/master/util/README.md)\n\n## Reference\n\nIf you use this code as part of any published research, please acknowledge the\nfollowing papers:\n\n**\"Action Recognition using Visual Attention.\"**  \nShikhar Sharma, Ryan Kiros, Ruslan Salakhutdinov. *[arXiv](http://arxiv.org/abs/1511.04119)*\n\n    @article{sharma2015attention,\n        title={Action Recognition using Visual Attention},\n        author={Sharma, Shikhar and Kiros, Ryan and Salakhutdinov, Ruslan},\n        journal={arXiv preprint arXiv:1511.04119},\n        year={2015}\n    } \n\n**\"Show, Attend and Tell: Neural Image Caption Generation with Visual Attention.\"**  \nKelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan\nSalakhutdinov, Richard Zemel, Yoshua Bengio. *To appear ICML (2015)*\n\n    @article{Xu2015show,\n        title={Show, Attend and Tell: Neural Image Caption Generation with Visual Attention},\n        author={Xu, Kelvin and Ba, Jimmy and Kiros, Ryan and Cho, Kyunghyun and Courville, Aaron and Salakhutdinov, Ruslan and Zemel, Richard and Bengio, Yoshua},\n        journal={arXiv preprint arXiv:1502.03044},\n        year={2015}\n    }\n\n## License\nThis repository is released under a [revised (3-clause) BSD License](http://directory.fsf.org/wiki/License:BSD_3Clause). 
It \nis the implementation for our paper [Action Recognition using Visual Attention](http://arxiv.org/abs/1511.04119). The repository uses some code from the project \n[arctic-captions](https://github.com/kelvinxu/arctic-captions) which was originally the implementation for the paper \n[Show, Attend and Tell: Neural Image Caption Generation with Visual Attention](http://arxiv.org/abs/1502.03044) and is also licensed \nunder a [revised (3-clause) BSD License](http://directory.fsf.org/wiki/License:BSD_3Clause).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkracwarlock%2Faction-recognition-visual-attention","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkracwarlock%2Faction-recognition-visual-attention","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkracwarlock%2Faction-recognition-visual-attention/lists"}