{"id":19899256,"url":"https://github.com/tootouch/reveca","last_synced_at":"2025-10-09T03:33:40.271Z","repository":{"id":41170260,"uuid":"490767525","full_name":"TooTouch/REVECA","owner":"TooTouch","description":"Generic Event Boundary Captioning (GEBC) Challenge at LOVEU@CVPR 2022 - 3rd place (REVECA)","archived":false,"fork":false,"pushed_at":"2023-02-17T07:01:47.000Z","size":122509,"stargazers_count":26,"open_issues_count":0,"forks_count":2,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-05-02T22:35:43.813Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/TooTouch.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2022-05-10T15:58:51.000Z","updated_at":"2024-01-04T17:08:28.000Z","dependencies_parsed_at":"2025-05-02T22:42:06.212Z","dependency_job_id":null,"html_url":"https://github.com/TooTouch/REVECA","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/TooTouch/REVECA","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TooTouch%2FREVECA","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TooTouch%2FREVECA/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TooTouch%2FREVECA/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TooTouch%2FREVECA/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/TooTouch","download_url":"h
ttps://codeload.github.com/TooTouch/REVECA/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TooTouch%2FREVECA/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279000714,"owners_count":26082911,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-09T02:00:07.460Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-12T20:07:48.606Z","updated_at":"2025-10-09T03:33:40.249Z","avatar_url":"https://github.com/TooTouch.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Generic Event Boundary Captioning (GEBC) Challenge at the CVPR 2022 LOVEU Workshop [[paper](https://arxiv.org/abs/2206.09178)]\n\n*Jaehyuk Heo, YongGi Jeong, Sunwoo Kim, Jaehee Kim, Pilsung Kang*  \n*School of Industrial \u0026 Management Engineering, Korea University*  \n*Seoul, Korea*\n\n\nWe propose the Rich Encoder-decoder framework for Video Event Captioner (REVECA). Our model achieved 3rd place in the [GEBC Challenge](https://codalab.lisn.upsaclay.fr/competitions/4157#results).\n\n\u003cp align='center'\u003e\n    \u003cimg width='800' src='https://github.com/TooTouch/REVECA/blob/main/assets/figure1.png'\u003e\n\u003c/p\u003e\n\n# Environments\n\n1. Build a Docker image and create a Docker container\n\n```bash\ncd docker\nbash docker_build.sh $image_name\n```\n\n2. 
Install the required packages\n\n```bash\npip install -r requirements\n```\n\n# Datasets\n\nDownload Kinetics-GEBC and its annotations from [here](https://sites.google.com/view/loveucvpr22/home?authuser=0), and save the files in `./datasets`.\n\n```\ndatasets/\n└── annotations\n    ├── testset_highest_f1.json\n    ├── trainset_highest_f1.json\n    └── valset_highest_f1.json\n```\n\n\nOur model uses the following video features: segmentation masks and TSN features.\n\n1. We use semantic segmentation masks to train the model. The masks are produced by [Mask2Former](https://github.com/facebookresearch/Mask2Former).\n\n![](https://github.com/TooTouch/REVECA/blob/main/assets/run_with_seg.gif)\n\n2. We use TSN features extracted by Temporal Segment Networks (TSN). The TSN features released for the GEBC Challenge can be downloaded [here](https://drive.google.com/drive/folders/1kOauKJY4MphWJhjYcXcCcdmP-071Fu6D?usp=sharing).\n\n\n# Methods\n\nOur video understanding model, REVECA, is based on CoCa. We use three methods: (1) Temporal-based Pairwise Difference (TPD), (2) frame position embedding, and (3) LoRA. We use `timm==0.6.2.dev0` and `loralib`, and we modify `vision_transformer.py` to support LoRA.\n\n\n# Results\n\nMethod | Avg. | CIDEr | SPICE | ROUGE-L\n---|---|---|---|---\nCNN+LSTM | 29.94 | 49.73 | 13.62 | 26.46\nRobust Change Captioning | 34.16 | 58.56 | 16.34 | 27.57\nUniVL-revised | 36.64 | 65.74 | 18.06 | 26.12\nActBERT-revised | 40.80 | 74.71 | 19.52 | 28.15\n**REVECA (our model)** | **50.97** | **93.91** | **24.66** | **34.34**\n\n# Saved Model\n\nOur final model weights can be downloaded [here](https://drive.google.com/file/d/1sQZXg5-L6i5l6brCyu5HCsaoRvlVSiuO/view?usp=sharing).\n\n\n# Citation\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftootouch%2Freveca","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftootouch%2Freveca","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftootouch%2Freveca/lists"}