{"id":20764070,"url":"https://github.com/jpthu17/hbi","last_synced_at":"2025-04-06T12:12:03.027Z","repository":{"id":96490531,"uuid":"607472332","full_name":"jpthu17/HBI","owner":"jpthu17","description":"[CVPR 2023 Highlight \u0026 TPAMI] Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning","archived":false,"fork":false,"pushed_at":"2024-12-28T06:49:07.000Z","size":53529,"stargazers_count":115,"open_issues_count":4,"forks_count":5,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-03-30T11:09:29.347Z","etag":null,"topics":["cross-modal-retrieval","cvpr","video-question-answering","video-retrieval"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jpthu17.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-02-28T03:07:29.000Z","updated_at":"2025-03-28T10:29:30.000Z","dependencies_parsed_at":"2025-01-30T23:10:54.068Z","dependency_job_id":"5285494b-06e7-494d-8115-cb10d40880f4","html_url":"https://github.com/jpthu17/HBI","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jpthu17%2FHBI","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jpthu17%2FHBI/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jpthu17%2FHBI/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jpthu17%2FHBI/manifests","owner_url":
"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jpthu17","download_url":"https://codeload.github.com/jpthu17/HBI/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247478324,"owners_count":20945266,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cross-modal-retrieval","cvpr","video-question-answering","video-retrieval"],"created_at":"2024-11-17T10:48:32.755Z","updated_at":"2025-04-06T12:12:03.009Z","avatar_url":"https://github.com/jpthu17.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n  \n# 【CVPR'2023 Highlight🔥\u0026TPAMI】Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning\n  \n[![Conference](http://img.shields.io/badge/CVPR-2023(Highlight)-FFD93D.svg)](https://cvpr.thecvf.com/)\n[![Project](http://img.shields.io/badge/Project-HBI-4D96FF.svg)](https://jpthu17.github.io/HBI/)\n[![Paper](http://img.shields.io/badge/Paper-arxiv.2303.14369-FF6B6B.svg)](https://arxiv.org/abs/2303.14369)\n\u003c/div\u003e\n\nThe implementation of CVPR 2023 Highlight (Top 10%) paper [Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning](https://arxiv.org/abs/2303.14369).\n\nIn this paper, we creatively model video-text as game players with multivariate cooperative game theory to wisely handle the uncertainty during fine-grained semantic interaction with diverse granularity, flexible combination, and vague intensity.\n\n## 📌 Citation\nIf you find this paper 
useful, please consider starring 🌟 this repo and citing 📑 our paper:\n```\n@article{jin2024hierarchical,\n  title={Hierarchical Banzhaf Interaction for General Video-Language Representation Learning},\n  author={Jin, Peng and Li, Hao and Yuan, Li and Yan, Shuicheng and Chen, Jie},\n  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},\n  year={2024},\n  publisher={IEEE}\n}\n\n@inproceedings{jin2023video,\n  title={Video-text as game players: Hierarchical banzhaf interaction for cross-modal representation learning},\n  author={Jin, Peng and Huang, Jinfa and Xiong, Pengfei and Tian, Shangxuan and Liu, Chang and Ji, Xiangyang and Yuan, Li and Chen, Jie},\n  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},\n  pages={2472--2482},\n  year={2023}\n}\n```\n\n\u003cdetails open\u003e\u003csummary\u003e💡 I also have other text-video retrieval projects that may interest you ✨. \u003c/summary\u003e\u003cp\u003e\n\n\u003e [**DiffusionRet: Generative Text-Video Retrieval with Diffusion Model**](https://arxiv.org/abs/2303.09867)\u003cbr\u003e\n\u003e Accepted by ICCV 2023 | [[DiffusionRet Code]](https://github.com/jpthu17/DiffusionRet)\u003cbr\u003e\n\u003e Peng Jin, Hao Li, Zesen Cheng, Kehan Li, Xiangyang Ji, Chang Liu, Li Yuan, Jie Chen\n\n\u003e [**Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations**](https://arxiv.org/abs/2211.11427)\u003cbr\u003e\n\u003e Accepted by NeurIPS 2022 | [[EMCL Code]](https://github.com/jpthu17/EMCL)\u003cbr\u003e\n\u003e Peng Jin, Jinfa Huang, Fenglin Liu, Xian Wu, Shen Ge, Guoli Song, David Clifton, Jie Chen\n\n\u003e [**Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment**](https://arxiv.org/abs/2305.12218)\u003cbr\u003e\n\u003e Accepted by IJCAI 2023 | [[DiCoSA Code]](https://github.com/jpthu17/DiCoSA)\u003cbr\u003e\n\u003e Peng Jin, Hao Li, Zesen Cheng, Jinfa Huang, Zhennan Wang, Li Yuan, Chang Liu, Jie 
Chen\n\u003c/p\u003e\u003c/details\u003e\n\n## 📣 Updates\n* **[2023/10/15]**: We release our [pre-trained estimator weights](https://github.com/jpthu17/HBI#train-the-banzhaf-interaction-estimator). If you want to apply it to other tasks, you can initialize a new estimator with the weights we provide. If you want better performance, you can train the estimator with a smaller learning rate and more epochs.\n* **[2023/10/11]**: We release code for the Banzhaf Interaction estimator. Recommended running parameters will be provided shortly, and we will also release our pre-trained estimator weights.\n* **[2023/10/08]**: I am working on the code for the Banzhaf Interaction estimator, which is expected to be released soon.\n* **[2023/06/28]**: Release code for reimplementing the experiments in the paper.\n* **[2023/03/28]**: Our **HBI** has been selected as a Highlight paper at CVPR 2023! (Top 2.5% of 9155 submissions).\n* **[2023/02/28]**: We will release the code asap. (I am busy with other deadlines. After that, I will open the source code as soon as possible. 
Please understand.)\n\n  \n## ⚡ Demo\n\u003cdiv align=\"center\"\u003e\n  \nhttps://user-images.githubusercontent.com/53246557/221760113-4a523e7e-d743-4dff-9f16-357ab0be0d5b.mp4\n\u003c/div\u003e\n\n\n## 😍 Visualization\n\n### Example 1\n\u003cdiv align=center\u003e\n\u003cimg src=\"static/images/Visualization_1.png\" width=\"800px\"\u003e\n\u003c/div\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eMore examples\u003c/b\u003e\u003c/summary\u003e\n  \n### Example 2\n\u003cdiv align=center\u003e\n\u003cimg src=\"static/images/Visualization_2.png\" width=\"800px\"\u003e\n\u003c/div\u003e\n\n### Example 3\n\u003cdiv align=center\u003e\n\u003cimg src=\"static/images/Visualization_3.png\" width=\"800px\"\u003e\n\u003c/div\u003e\n\n### Example 4\n\u003cdiv align=center\u003e\n\u003cimg src=\"static/images/Visualization_4.png\" width=\"800px\"\u003e\n\u003c/div\u003e\n\n### Example 5\n\u003cdiv align=center\u003e\n\u003cimg src=\"static/images/Visualization_5.png\" width=\"800px\"\u003e\n\u003c/div\u003e\n\n### Example 6\n\u003cdiv align=center\u003e\n\u003cimg src=\"static/images/Visualization_6.png\" width=\"800px\"\u003e\n\u003c/div\u003e\n\n### Example 7\n\u003cdiv align=center\u003e\n\u003cimg src=\"static/images/Visualization_0.png\" width=\"800px\"\u003e\n\u003c/div\u003e\n\n\u003c/details\u003e\n\n## 🚀 Quick Start\n### Setup\n\n#### Setup code environment\n```shell\nconda create -n HBI python=3.9\nconda activate HBI\npip install -r requirements.txt\npip install torch==1.8.1+cu102 torchvision==0.9.1+cu102 -f https://download.pytorch.org/whl/torch_stable.html\n```\n\n#### Download CLIP Model\n```shell\ncd HBI/models\nwget https://openaipublic.azureedge.net/clip/models/40d365715913c9da98579312b702a82c18be219cc2a73407c4526f58eba950af/ViT-B-32.pt\n# wget https://openaipublic.azureedge.net/clip/models/5806e77cd80f8b59890b7e101eabd078d9fb84e6937f9e85e4ecb61988df416f/ViT-B-16.pt\n# wget 
https://openaipublic.azureedge.net/clip/models/b8cca3fd41ae0c99ba7e8951adf17d267cdb84cd88be6f7c2e0eca1737a03836/ViT-L-14.pt\n```\n\n#### Download Datasets\n\u003cdiv align=center\u003e\n\n|Datasets|Google Cloud|Baidu Yun|Peking University Yun|\n|:--------:|:--------------:|:-----------:|:-----------:|\n| MSR-VTT | [Download](https://drive.google.com/drive/folders/1LYVUCPRxpKMRjCSfB_Gz-ugQa88FqDu_?usp=sharing) | [Download](https://pan.baidu.com/s/1Gdf6ivybZkpua5z1HsCWRA?pwd=enav) | [Download](https://disk.pku.edu.cn/link/AA6A028EE7EF5C48A788118B82D6ABE0C5) |\n| MSVD | [Download](https://drive.google.com/drive/folders/18EXLWvCCQMRBd7-n6uznBUHdP4uC6Q15?usp=sharing) | [Download](https://pan.baidu.com/s/1hApFdxgV3TV2TCcnM_yBiA?pwd=kbfi) | [Download](https://disk.pku.edu.cn/link/AA6BD6FC1A490F4D0E9C384EF347F0D07F) |\n| ActivityNet | TODO | [Download](https://pan.baidu.com/s/1tI441VGvN3In7pcvss0grg?pwd=2ddy) | [Download](https://disk.pku.edu.cn/link/AAE744E6488E2049BD9412738E14AAA8EA) |\n| DiDeMo | TODO | [Download](https://pan.baidu.com/s/1Tsy9nb1hWzeXaZ4xr7qoTg?pwd=c842) | [Download](https://disk.pku.edu.cn/link/AA14E48D1333114022B736291D60350FA5) |\n\n\u003c/div\u003e\n\n#### Train the Banzhaf Interaction Estimator\n\nTrain the estimator on the labels generated by BanzhafInteraction in HBI/models/banzhaf.py. \n\nThe training code is provided in banzhaf_estimator.py. We provide our trained weights, and if you want to apply it to other tasks, you can initialize a new estimator with the weights we provide.\n\nWe have tested the performance of [Estimator_1e-2_epoch6](https://drive.google.com/file/d/1GYDUIlEA1Fe9E_9IhE4Thgm5mo2ZcRa6/view?usp=sharing) with R@1 of 48.2 ([log](https://drive.google.com/file/d/1F-QvhvFj9s7tqoLnVwuUKCIbnLr2MHBq/view?usp=sharing)) on the MSR-VTT dataset. 
If you want better performance, you can train the estimator with a smaller learning rate and more epochs.\n\n\u003cdiv align=center\u003e\n\n|   Models    | Google Cloud | Baidu Yun |Peking University Yun| log|\n|:-----------:|:------------:|:---------:|:-----------:|:-----------:|\n|   Estimator_1e-2_epoch1   |     [Download](https://drive.google.com/file/d/1U2QsawOhBaPthZd13_pi_Qhi6kgvT1GB/view?usp=sharing)     |  [Download](https://pan.baidu.com/s/1mxpSHAxEH8qz59ROJTwH7A?pwd=ewsp)     | [Download](https://disk.pku.edu.cn:443/link/3E245D48A388A9DDCA9B8A45BE31C594)  | [log](https://drive.google.com/file/d/1rD1ywMgP_q_M-Njz7QVC0mOX0mM4wbUH/view?usp=sharing)  |\n|   Estimator_1e-2_epoch2   |     [Download](https://drive.google.com/file/d/1cdv6058pu2xhroI4gk4gl60IT7wWIDkj/view?usp=sharing)     |  [Download](https://pan.baidu.com/s/1Yo-fve2Oq1_KoLKQwztD5w?pwd=3mlo)      | [Download](https://disk.pku.edu.cn:443/link/AE8F75FC2A97DD903C4D562D965B6728)  | [log](https://drive.google.com/file/d/1rD1ywMgP_q_M-Njz7QVC0mOX0mM4wbUH/view?usp=sharing)  |\n|   Estimator_1e-2_epoch3   |     [Download](https://drive.google.com/file/d/1XjTWpyRFy0SmzsbyZ2YS2UczEEEgMppP/view?usp=sharing)     |  [Download](https://pan.baidu.com/s/1FPFlOtAVU27KCFH9i4eWZg?pwd=p5qf)      | [Download](https://disk.pku.edu.cn:443/link/0ACDF14C9CA901898F15B4CC4F8C0E30)  | [log](https://drive.google.com/file/d/1rD1ywMgP_q_M-Njz7QVC0mOX0mM4wbUH/view?usp=sharing)  |\n|   Estimator_1e-2_epoch4   |     [Download](https://drive.google.com/file/d/12b6Pjg5HrIRhMqq5KkLF_FKXY4RHv4Hn/view?usp=sharing)     |  [Download](https://pan.baidu.com/s/1LP99MFizCr_bgt9DtlLweg?pwd=skn3)     | [Download](https://disk.pku.edu.cn:443/link/615B6ABAB30E5A3064310ACAC28BC5CD)  | [log](https://drive.google.com/file/d/1rD1ywMgP_q_M-Njz7QVC0mOX0mM4wbUH/view?usp=sharing)  |\n|   Estimator_1e-2_epoch5   |     [Download](https://drive.google.com/file/d/1oLil8xQ0JwI2QWGNj8ghs_x1nI-mHigp/view?usp=sharing)     |  
[Download](https://pan.baidu.com/s/1ORJkUmLe2fhMySTQrlKWcw?pwd=c8w8)      | [Download](https://disk.pku.edu.cn:443/link/5E1DEA84D402AFFFB304F571949736B1)  | [log](https://drive.google.com/file/d/1rD1ywMgP_q_M-Njz7QVC0mOX0mM4wbUH/view?usp=sharing)  |\n|   Estimator_1e-2_epoch6   |     [Download](https://drive.google.com/file/d/1GYDUIlEA1Fe9E_9IhE4Thgm5mo2ZcRa6/view?usp=sharing)    |  [Download](https://pan.baidu.com/s/1Kmn3laMFrG8WWQqNIyK69Q?pwd=79eb)     | [Download](https://disk.pku.edu.cn:443/link/7893AD6A50BAFCA342456B0B04C99419)  | [log](https://drive.google.com/file/d/1rD1ywMgP_q_M-Njz7QVC0mOX0mM4wbUH/view?usp=sharing)  |\n\n\u003c/div\u003e\n\n```shell\nCUDA_VISIBLE_DEVICES=0,1,2,3 \\\npython -m torch.distributed.launch \\\n--master_port 2502 \\\n--nproc_per_node=4 \\\nbanzhaf_estimator.py \\\n--do_train 1 \\\n--workers 8 \\\n--n_display 1 \\\n--epochs 10 \\\n--lr 1e-2 \\\n--coef_lr 1e-3 \\\n--batch_size 128 \\\n--batch_size_val 128 \\\n--anno_path data/MSR-VTT/anns \\\n--video_path ${DATA_PATH}/MSRVTT_Videos \\\n--datatype msrvtt \\\n--max_words 24 \\\n--max_frames 12 \\\n--video_framerate 1 \\\n--output_dir ${OUTPUT_PATH} \n```\n\n### Text-video Retrieval\n\u003cdiv align=center\u003e\n\n|Checkpoint|Google Cloud|Baidu Yun|Peking University Yun|\n|:--------:|:--------------:|:-----------:|:-----------:|\n| MSR-VTT | [Download](https://drive.google.com/file/d/1hoV9vsT0-KIjjIRPIB9D4dMXwrckvSLk/view?usp=sharing) | [Download](https://pan.baidu.com/s/1WWlpoSAUII3KH6KNsq7VSQ?pwd=pkph) | [Download](https://disk.pku.edu.cn:443/link/424DFFAC5D2CB600E73BCB67C05A73FD) |\n| ActivityNet | [Download](https://drive.google.com/file/d/1TRUAl17Wj2g2cyxWC5HUPflUo7eg78uu/view?usp=drive_link) | [Download](https://pan.baidu.com/s/1ynAaE0NWXx0LHhUZCC0uww?pwd=ta8v) | [Download](https://disk.pku.edu.cn:443/link/A7BDBF989B3E2C6356283ED01FBAACF2) |\n\n\u003c/div\u003e\n\n#### Eval on MSR-VTT\n```shell\nCUDA_VISIBLE_DEVICES=0,1 \\\npython -m torch.distributed.launch \\\n--master_port 
2502 \\\n--nproc_per_node=2 \\\nmain_retrieval.py \\\n--do_eval 1 \\\n--workers 8 \\\n--n_display 50 \\\n--batch_size_val 128 \\\n--anno_path data/MSR-VTT/anns \\\n--video_path ${DATA_PATH}/MSRVTT_Videos \\\n--datatype msrvtt \\\n--max_words 24 \\\n--max_frames 12 \\\n--video_framerate 1 \\\n--init_model ${CHECKPOINT_PATH} \\\n--output_dir ${OUTPUT_PATH} \n```\n\n#### Train on MSR-VTT\n```shell\nCUDA_VISIBLE_DEVICES=0,1 \\\npython -m torch.distributed.launch \\\n--master_port 2502 \\\n--nproc_per_node=2 \\\nmain_retrieval.py \\\n--do_train 1 \\\n--workers 8 \\\n--n_display 50 \\\n--epochs 5 \\\n--lr 1e-4 \\\n--coef_lr 1e-3 \\\n--batch_size 128 \\\n--batch_size_val 128 \\\n--anno_path data/MSR-VTT/anns \\\n--video_path ${DATA_PATH}/MSRVTT_Videos \\\n--datatype msrvtt \\\n--max_words 24 \\\n--max_frames 12 \\\n--video_framerate 1 \\\n--estimator ${ESTIMATOR_PATH} \\\n--output_dir ${OUTPUT_PATH} \\\n--kl 2 \\\n--skl 1\n```\n\n#### Eval on ActivityNet Captions\n```shell\nCUDA_VISIBLE_DEVICES=0,1 \\\npython -m torch.distributed.launch \\\n--master_port 2502 \\\n--nproc_per_node=2 \\\nmain_retrieval.py \\\n--do_eval 1 \\\n--workers 8 \\\n--n_display 50 \\\n--batch_size_val 128 \\\n--anno_path ${DATA_PATH}/ActivityNet \\\n--video_path ${DATA_PATH}/ActivityNet/Activity_Videos \\\n--datatype activity \\\n--max_words 64 \\\n--max_frames 64 \\\n--video_framerate 1 \\\n--init_model ${CHECKPOINT_PATH} \\\n--output_dir ${OUTPUT_PATH} \n```\n\n#### Train on ActivityNet Captions\n```shell\nCUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \\\npython -m torch.distributed.launch \\\n--master_port 2502 \\\n--nproc_per_node=8 \\\nmain_retrieval.py \\\n--do_train 1 \\\n--workers 8 \\\n--n_display 10 \\\n--epochs 10 \\\n--lr 1e-4 \\\n--coef_lr 1e-3 \\\n--batch_size 128 \\\n--batch_size_val 128 \\\n--anno_path ${DATA_PATH}/ActivityNet \\\n--video_path ${DATA_PATH}/ActivityNet/Activity_Videos \\\n--datatype activity \\\n--max_words 64 \\\n--max_frames 64 \\\n--video_framerate 1 \\\n--estimator 
${ESTIMATOR_PATH} \\\n--output_dir ${OUTPUT_PATH} \\\n--kl 2 \\\n--skl 1\n```\n\n### Video-question Answering\n\u003cdiv align=center\u003e\n\n|Checkpoint|Google Cloud|Baidu Yun|Peking University Yun|\n|:--------:|:--------------:|:-----------:|:-----------:|\n| MSR-VTT-QA | [Download](https://drive.google.com/file/d/15GZXMaPvowL4GgxtB9ETvb8vivdcE8Wd/view?usp=sharing) | [Download](https://pan.baidu.com/s/1a959PS2EaYHxcYyrrQ4odQ?pwd=r34t) | [Download](https://disk.pku.edu.cn:443/link/DE99ECAD7C1E7F550A2753B561086CDF) |\n\n\u003c/div\u003e\n\n#### Eval on MSR-VTT-QA\n\n```shell\nCUDA_VISIBLE_DEVICES=0,1 \\\npython -m torch.distributed.launch \\\n--master_port 2502 \\\n--nproc_per_node=2 \\\nmain_vqa.py \\\n--do_eval \\\n--num_thread_reader=8 \\\n--train_csv data/MSR-VTT/qa/train.jsonl \\\n--val_csv data/MSR-VTT/qa/test.jsonl \\\n--data_path data/MSR-VTT/qa/train_ans2label.json \\\n--features_path ${DATA_PATH}/MSRVTT_Videos \\\n--max_words 32 \\\n--max_frames 12 \\\n--batch_size_val 16 \\\n--datatype msrvtt \\\n--expand_msrvtt_sentences  \\\n--feature_framerate 1 \\\n--freeze_layer_num 0  \\\n--slice_framepos 2 \\\n--loose_type \\\n--linear_patch 2d \\\n--init_model ${CHECKPOINT_PATH} \\\n--output_dir ${OUTPUT_PATH}\n```\n\n#### Train on MSR-VTT-QA\n\n```shell\nCUDA_VISIBLE_DEVICES=0,1 \\\npython -m torch.distributed.launch \\\n--master_port 2502 \\\n--nproc_per_node=2 \\\nmain_vqa.py \\\n--do_train \\\n--num_thread_reader=8 \\\n--epochs=5 \\\n--batch_size=32 \\\n--n_display=50 \\\n--train_csv data/MSR-VTT/qa/train.jsonl \\\n--val_csv data/MSR-VTT/qa/test.jsonl \\\n--data_path data/MSR-VTT/qa/train_ans2label.json \\\n--features_path ${DATA_PATH}/MSRVTT_Videos \\\n--lr 1e-4 \\\n--max_words 32 \\\n--max_frames 12 \\\n--batch_size_val 16 \\\n--datatype msrvtt \\\n--expand_msrvtt_sentences  \\\n--feature_framerate 1 \\\n--coef_lr 1e-3 \\\n--freeze_layer_num 0  \\\n--slice_framepos 2 \\\n--loose_type \\\n--linear_patch 2d \\\n--estimator ${ESTIMATOR_PATH} 
\\\n--output_dir ${OUTPUT_PATH} \\\n--kl 2 \\\n--skl 1\n```\n\n## 🎗️ Acknowledgments\nOur code is based on [EMCL](https://github.com/jpthu17/EMCL), [CLIP](https://github.com/openai/CLIP), [CLIP4Clip](https://github.com/ArrowLuo/CLIP4Clip/) and [DRL](https://github.com/foolwood/DRL). We sincerely appreciate their contributions.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjpthu17%2Fhbi","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjpthu17%2Fhbi","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjpthu17%2Fhbi/lists"}