{"id":28250678,"url":"https://github.com/reshalfahsi/action-recognition","last_synced_at":"2026-04-29T08:34:31.824Z","repository":{"id":294024323,"uuid":"747200903","full_name":"reshalfahsi/action-recognition","owner":"reshalfahsi","description":"Action Recognition Using CNN + Bidirectional RNN","archived":false,"fork":false,"pushed_at":"2024-01-23T13:23:43.000Z","size":9075,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-06-14T01:38:36.303Z","etag":null,"topics":["action-recognition","bidirectional-rnn","cnn","computer-vision","hmdb51","mnasnet","pytorch","pytorch-lightning","video-classification","video-processing"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/reshalfahsi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-01-23T13:22:21.000Z","updated_at":"2024-01-23T23:23:23.000Z","dependencies_parsed_at":"2025-05-18T14:48:55.514Z","dependency_job_id":"ee41b933-f508-4d3a-9c75-ebd7466696d6","html_url":"https://github.com/reshalfahsi/action-recognition","commit_stats":null,"previous_names":["reshalfahsi/action-recognition"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/reshalfahsi/action-recognition","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/reshalfahsi%2Faction-recognition","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/reshalfahsi%2Faction-recognition/tags","releases_url":"https:
//repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/reshalfahsi%2Faction-recognition/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/reshalfahsi%2Faction-recognition/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/reshalfahsi","download_url":"https://codeload.github.com/reshalfahsi/action-recognition/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/reshalfahsi%2Faction-recognition/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32417944,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-29T06:29:02.080Z","status":"ssl_error","status_checked_at":"2026-04-29T06:29:00.631Z","response_time":110,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["action-recognition","bidirectional-rnn","cnn","computer-vision","hmdb51","mnasnet","pytorch","pytorch-lightning","video-classification","video-processing"],"created_at":"2025-05-19T14:18:47.840Z","updated_at":"2026-04-29T08:34:31.807Z","avatar_url":"https://github.com/reshalfahsi.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Action Recognition Using CNN + Bidirectional RNN\n\n\n\u003cdiv align=\"center\"\u003e\n    \u003ca 
href=\"https://colab.research.google.com/github/reshalfahsi/action-recognition/blob/master/Action_Recognition.ipynb\"\u003e\u003cimg src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"colab\"\u003e\u003c/a\u003e\n    \u003cbr /\u003e\n\u003c/div\u003e\n\n\nGiven a video, action recognition determines which action occurs in the clip. By nature, a video is a sequence of frames, so action recognition on video means processing spatio-temporal data. This project uses the HMDB51 dataset, which consists of 6k+ clips covering 51 actions. The dataset has three separate train/test splits. For simplicity, this project uses the first training split as the training set, the second testing split as the validation set, and the third testing split as the testing set. For the action recognition model, a CNN is customarily used to extract spatial information, so a CNN architecture, MnasNet, is used here. The temporal information is then handled by a bidirectional RNN. In short, the action recognition model in this project is composed of a CNN and a bidirectional RNN.\n\n\n## Experiment\n\nPlease take a look at this [notebook](https://github.com/reshalfahsi/action-recognition/blob/master/Action_Recognition.ipynb) to see the recognition in action.\n\n\n## Result\n\n## Quantitative Result\n\nThe following table reports the quantitative performance of the model on the testing set.\n\nTest Metric  | Score\n------------ | -------------\nLoss         | 0.753\nAccuracy     | 88.39%\n\n\n## Loss and Accuracy Curve\n\n\u003cp align=\"center\"\u003e \u003cimg src=\"https://github.com/reshalfahsi/action-recognition/blob/master/assets/loss_curve.png\" alt=\"loss_curve\" \u003e \u003cbr /\u003e The loss curve on the (first) training split and the validation set (the second testing split) of the CNN + Bidirectional RNN model. 
\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e \u003cimg src=\"https://github.com/reshalfahsi/action-recognition/blob/master/assets/acc_curve.png\" alt=\"acc_curve\" \u003e \u003cbr /\u003e The accuracy curve on the (first) training split and the validation set (the second testing split) of the CNN + Bidirectional RNN model. \u003c/p\u003e\n\n\n## Qualitative Result\n\nHere is a compilation of several video clips with an in-frame caption tailored to their predicted and actual actions.\n\n\u003cp align=\"center\"\u003e \u003cimg src=\"https://github.com/reshalfahsi/action-recognition/blob/master/assets/result.gif\" alt=\"qualitative\" \u003e \u003cbr /\u003e The action recognition results of the CNN + Bidirectional RNN model. Several actions are shown in the compilation video: \u003ci\u003ebrush hair\u003c/i\u003e, \u003ci\u003ethrow\u003c/i\u003e, \u003ci\u003edive\u003c/i\u003e, \u003ci\u003eride bike\u003c/i\u003e, and \u003ci\u003eswing baseball.\u003c/i\u003e \u003c/p\u003e\n\n\n## Credit\n\n- [Video Classification with a CNN-RNN Architecture](https://keras.io/examples/vision/video_classification/)\n- [Bidirectional Recurrent Neural Networks](https://ieeexplore.ieee.org/document/650093)\n- [MnasNet: Platform-Aware Neural Architecture Search for Mobile](https://arxiv.org/pdf/1807.11626.pdf)\n- [HMDB: A Large Video Database for Human Motion Recognition](https://serre-lab.clps.brown.edu/wp-content/uploads/2012/08/Kuehne_etal_iccv11.pdf)\n- [MoViNet-pytorch](https://github.com/Atze00/MoViNet-pytorch)\n- [Gluon CV Toolkit HMDB51 Dataset](https://github.com/dmlc/gluon-cv/blob/master/gluoncv/data/hmdb51/classification.py)\n- [Forked Torchvision by Henry Xia](https://github.com/ehnryx/vision/tree/be6f398c0612c245b0019a286a99f80aca81de7d/torchvision/transforms)\n- [PyTorch 
Lightning](https://lightning.ai/docs/pytorch/latest/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Freshalfahsi%2Faction-recognition","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Freshalfahsi%2Faction-recognition","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Freshalfahsi%2Faction-recognition/lists"}