{"id":22068843,"url":"https://github.com/shenyunhang/APE","last_synced_at":"2025-07-24T07:31:26.008Z","repository":{"id":209718651,"uuid":"682863836","full_name":"shenyunhang/APE","owner":"shenyunhang","description":"[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception","archived":false,"fork":false,"pushed_at":"2024-05-08T01:41:00.000Z","size":51746,"stargazers_count":424,"open_issues_count":36,"forks_count":26,"subscribers_count":6,"default_branch":"main","last_synced_at":"2024-05-08T02:28:53.843Z","etag":null,"topics":["image-segmentation","object-detection","open-world","referring-expression-comprehension","vision-language-transformer"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2312.02153","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/shenyunhang.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-08-25T03:46:12.000Z","updated_at":"2024-05-08T02:28:57.461Z","dependencies_parsed_at":"2024-05-08T02:39:14.205Z","dependency_job_id":null,"html_url":"https://github.com/shenyunhang/APE","commit_stats":null,"previous_names":["shenyunhang/ape"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shenyunhang%2FAPE","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shenyunhang%2FAPE/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shenyunhang%2FAPE/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/re
positories/shenyunhang%2FAPE/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/shenyunhang","download_url":"https://codeload.github.com/shenyunhang/APE/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":227421346,"owners_count":17775010,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["image-segmentation","object-detection","open-world","referring-expression-comprehension","vision-language-transformer"],"created_at":"2024-11-30T20:04:24.934Z","updated_at":"2024-11-30T20:07:12.437Z","avatar_url":"https://github.com/shenyunhang.png","language":"Python","funding_links":[],"categories":["Paper List"],"sub_categories":["Follow-up Papers"],"readme":"# APE: Aligning and Prompting Everything All at Once for Universal Visual Perception\n\n\n\u003c!-- \n\u003ca href='https://github.com/shenyunhang/APE'\u003e\u003cimg src='https://img.shields.io/badge/Project-Page-Green'\u003e\u003c/a\u003e\n\u003ca href='https://arxiv.org/abs/2312.02153'\u003e\u003cimg src='https://img.shields.io/badge/Paper-Arxiv-red'\u003e\u003c/a\u003e\n\u003ca href='https://huggingface.co/spaces/shenyunhang/APE'\u003e\u003cimg src='https://img.shields.io/badge/%F0%9F%A4%97-Demo-yellow'\u003e\u003c/a\u003e\n\u003ca href='https://huggingface.co/shenyunhang/APE'\u003e\u003cimg src='https://img.shields.io/badge/%F0%9F%A4%97-Model-yellow'\u003e\u003c/a\u003e\n[![Code License](https://img.shields.io/badge/Code%20License-Apache_2.0-green.svg)](https://github.com/tatsu-lab/stanford_alpaca/blob/main/LICENSE)\n--\u003e\n\n\u003cp 
align=\"center\"\u003e\n    \u003cimg src=\"./.asset/ape.png\" width=\"96%\" height=\"96%\"\u003e\n\u003c/p\u003e\n\n\n\u003cfont size=7\u003e\u003cdiv align='center' \u003e :grapes: \\[[Read our arXiv Paper](https://arxiv.org/abs/2312.02153)\\] \u0026nbsp; :apple: \\[[Try our Online Demo](https://huggingface.co/spaces/shenyunhang/APE)\\] \u003c/div\u003e\u003c/font\u003e\n\n\n---\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"./.asset/example_1.png\" width=\"96%\" height=\"96%\"\u003e\n\u003c/p\u003e\n\n\n## :bulb: Highlight\n\n- **High Performance.**  SotA (or competitive) performance on **160** datasets with only one model.\n- **Perception in the Wild.** Detect and segment **everything** with thousands of vocabularies or language descriptions all at once.\n- **Flexible.** Support both foreground objects and background stuff for instance segmentation and semantic segmentation.\n\n## :fire: News\n* **`2024.04.07`** Release checkpoints for APE-Ti with only 6M backbone!\n* **`2024.02.27`** APE has been accepted to CVPR 2024!\n* **`2023.12.05`** Release training codes!\n* **`2023.12.05`** Release checkpoints for APE-L!\n* **`2023.12.05`** Release inference codes and demo!\n\n## :label: TODO \n\n- [x] Release inference code and demo.\n- [x] Release checkpoints.\n- [x] Release training codes.\n- [ ] Add clean docs.\n\n\n## :hammer_and_wrench: Install \n\n1. Clone the APE repository from GitHub:\n\n```bash\ngit clone https://github.com/shenyunhang/APE\ncd APE\n```\n\n2. 
Install the required dependencies and APE:\n\n```bash\npip3 install -r requirements.txt\npython3 -m pip install -e .\n```\n\n\n## :arrow_forward: Demo Locally\n\n**Web UI demo**\n```\npip3 install gradio\ncd APE/demo\npython3 app.py\n```\nThe demo detects available GPUs and uses one of them if any are present.\n\nPlease feel free to try our [Online Demo](https://huggingface.co/spaces/shenyunhang/APE)!\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"./.asset/demo.png\" width=\"96%\" height=\"96%\"\u003e\n\u003c/p\u003e\n\n\n## :books: Data Preparation\nFollow [these instructions](https://github.com/shenyunhang/APE/blob/main/datasets/README.md) to prepare the following datasets:\n\n|  Name |   COCO  |   LVIS  |  Objects365 | Openimages | VisualGenome |  SA-1B  |   RefCOCO  |   GQA   | PhraseCut | Flickr30k |         |\n|:-----:|:-------:|:-------:|:-----------:|:----------:|:------------:|:-------:|:----------:|:-------:|:---------:|:---------:|:-------:|\n| Train | \u0026check; | \u0026check; |   \u0026check;   |   \u0026check;  |    \u0026check;   | \u0026check; |   \u0026check;  | \u0026check; |  \u0026check;  |  \u0026check;  |         |\n|  Test | \u0026check; | \u0026check; |   \u0026check;   |   \u0026check;  |    \u0026cross;   | \u0026cross; |   \u0026check;  | \u0026cross; |  \u0026cross;  |  \u0026cross;  |         |\n|       |         |         |             |            |              |         |            |         |           |           |         |\n| Name  |  ODinW  |  SegInW | Roboflow100 |   ADE20k   |   ADE-full   |  BDD10k | Cityscapes |  PC459  |    PC59   |    VOC    |    D3   |\n| Train | \u0026cross; | \u0026cross; |   \u0026cross;   |   \u0026cross;  |    \u0026cross;   | \u0026cross; |   \u0026cross;  | \u0026cross; |  \u0026cross;  |  \u0026cross;  | \u0026cross; |\n|  Test | \u0026check; | \u0026check; |   \u0026check;   |   \u0026check;  |    \u0026check;   | \u0026check; |   \u0026check;  | \u0026check; |  \u0026check;  |  \u0026check;  | \u0026check; 
|\n\nNote that we do not use `coco_2017_train` for training.\n\nInstead, we augment `lvis_v1_train` with annotations from COCO and keep the image set unchanged.\n\nWe register the result as `lvis_v1_train+coco` for instance segmentation and `lvis_v1_train+coco_panoptic_separated` for panoptic segmentation.\n\n\n## :test_tube: Inference\n\n### Infer on 160+ datasets\nWe provide several scripts to evaluate all models.\n\nAdjust the checkpoint location and the number of GPUs in each script before running it.\n\n```bash\nscripts/eval_APE-L_D.sh\nscripts/eval_APE-L_C.sh\nscripts/eval_APE-L_B.sh\nscripts/eval_APE-L_A.sh\nscripts/eval_APE-Ti.sh\n```\n\n### Infer on images or videos\n\nAPE-L_D\n```\npython3 demo/demo_lazy.py \\\n--config-file configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO_GQA_PhraseCut_Flickr30k/ape_deta/ape_deta_vitl_eva02_clip_vlf_lsj1024_cp_16x4_1080k.py \\\n--input image1.jpg image2.jpg image3.jpg \\\n--output /path/to/output/dir \\\n--confidence-threshold 0.1 \\\n--text-prompt 'person,car,chess piece of horse head' \\\n--with-box \\\n--with-mask \\\n--with-sseg \\\n--opts \\\ntrain.init_checkpoint=/path/to/APE-D/checkpoint \\\nmodel.model_language.cache_dir=\"\" \\\nmodel.model_vision.select_box_nums_for_evaluation=500 \\\nmodel.model_vision.text_feature_bank_reset=True \\\n```\n\nTo disable `xformers`, add the following option:\n```\nmodel.model_vision.backbone.net.xattn=False \\\n```\n\nTo use the `pytorch` version of `MultiScaleDeformableAttention`, add the following options:\n```\nmodel.model_vision.transformer.encoder.pytorch_attn=True \\\nmodel.model_vision.transformer.decoder.pytorch_attn=True \\\n```\n\n\n## :train: Training\n\n### Prepare backbone and language models\n```bash\ngit lfs install\ngit clone https://huggingface.co/QuanSun/EVA-CLIP models/QuanSun/EVA-CLIP/\ngit clone https://huggingface.co/BAAI/EVA models/BAAI/EVA/\ngit clone https://huggingface.co/Yuxin-CV/EVA-02 models/Yuxin-CV/EVA-02/\n```\n\nResize patch 
size:\n```bash\npython3 tools/eva_interpolate_patch_14to16.py --input models/QuanSun/EVA-CLIP/EVA02_CLIP_E_psz14_plus_s9B.pt --output models/QuanSun/EVA-CLIP/EVA02_CLIP_E_psz14to16_plus_s9B.pt --image_size 224\npython3 tools/eva_interpolate_patch_14to16.py --input models/QuanSun/EVA-CLIP/EVA01_CLIP_g_14_plus_psz14_s11B.pt --output models/QuanSun/EVA-CLIP/EVA01_CLIP_g_14_plus_psz14to16_s11B.pt --image_size 224\npython3 tools/eva_interpolate_patch_14to16.py --input models/QuanSun/EVA-CLIP/EVA02_CLIP_L_336_psz14_s6B.pt --output models/QuanSun/EVA-CLIP/EVA02_CLIP_L_336_psz14to16_s6B.pt --image_size 336\npython3 tools/eva_interpolate_patch_14to16.py --input models/Yuxin-CV/EVA-02/eva02/pt/eva02_Ti_pt_in21k_p14.pt --output models/Yuxin-CV/EVA-02/eva02/pt/eva02_Ti_pt_in21k_p14to16.pt --image_size 224\n```\n\n### Train APE-L_D\n\nSingle node:\n```bash\npython3 tools/train_net.py \\\n--num-gpus 8 \\\n--resume \\\n--config-file configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO_GQA_PhraseCut_Flickr30k/ape_deta/ape_deta_vitl_eva02_clip_vlf_lsj1024_cp_16x4_1080k_mdl.py \\\ntrain.output_dir=output/APE/configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO_GQA_PhraseCut_Flickr30k/ape_deta/ape_deta_vitl_eva02_clip_vlf_lsj1024_cp_16x4_1080k_mdl_`date +'%Y%m%d_%H%M%S'`\n```\n\nMultiple nodes:\n```bash\npython3 tools/train_net.py \\\n--dist-url=\"tcp://${MASTER_IP}:${MASTER_PORT}\" \\\n--num-gpus ${HOST_GPU_NUM} \\\n--num-machines ${HOST_NUM} \\\n--machine-rank ${INDEX} \\\n--resume \\\n--config-file configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO_GQA_PhraseCut_Flickr30k/ape_deta/ape_deta_vitl_eva02_clip_vlf_lsj1024_cp_16x4_1080k_mdl.py \\\ntrain.output_dir=output/APE/configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO_GQA_PhraseCut_Flickr30k/ape_deta/ape_deta_vitl_eva02_clip_vlf_lsj1024_cp_16x4_1080k_mdl_`date +'%Y%m%d_%H'`0000\n```\n\n### Train APE-L_C\n\nSingle node:\n```bash\npython3 tools/train_net.py \\\n--num-gpus 8 \\\n--resume \\\n--config-file 
configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO/ape_deta/ape_deta_vitl_eva02_vlf_lsj1024_cp_1080k.py \\\ntrain.output_dir=output/APE/configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO/ape_deta/ape_deta_vitl_eva02_vlf_lsj1024_cp_1080k_`date +'%Y%m%d_%H%M%S'`\n```\n\nMultiple nodes:\n```bash\npython3 tools/train_net.py \\\n--dist-url=\"tcp://${MASTER_IP}:${MASTER_PORT}\" \\\n--num-gpus ${HOST_GPU_NUM} \\\n--num-machines ${HOST_NUM} \\\n--machine-rank ${INDEX} \\\n--resume \\\n--config-file configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO/ape_deta/ape_deta_vitl_eva02_vlf_lsj1024_cp_1080k.py \\\ntrain.output_dir=output/APE/configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO/ape_deta/ape_deta_vitl_eva02_vlf_lsj1024_cp_1080k_`date +'%Y%m%d_%H'`0000\n```\n\n### Train APE-L_B\n\nSingle node:\n```bash\npython3 tools/train_net.py \\\n--num-gpus 8 \\\n--resume \\\n--config-file configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_REFCOCO/ape_deta/ape_deta_vitl_eva02_vlf_lsj1024_cp_1080k.py \\\ntrain.output_dir=output/APE/configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_REFCOCO/ape_deta/ape_deta_vitl_eva02_vlf_lsj1024_cp_1080k_`date +'%Y%m%d_%H%M%S'`\n```\n\nMultiple nodes:\n```bash\npython3 tools/train_net.py \\\n--dist-url=\"tcp://${MASTER_IP}:${MASTER_PORT}\" \\\n--num-gpus ${HOST_GPU_NUM} \\\n--num-machines ${HOST_NUM} \\\n--machine-rank ${INDEX} \\\n--resume \\\n--config-file configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_REFCOCO/ape_deta/ape_deta_vitl_eva02_vlf_lsj1024_cp_1080k.py \\\ntrain.output_dir=output/APE/configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_REFCOCO/ape_deta/ape_deta_vitl_eva02_vlf_lsj1024_cp_1080k_`date +'%Y%m%d_%H'`0000\n```\n\n### Train APE-L_A\n\nSingle node:\n```bash\npython3 tools/train_net.py \\\n--num-gpus 8 \\\n--resume \\\n--config-file configs/LVISCOCOCOCOSTUFF_O365_OID_VG/ape_deta/ape_deta_vitl_eva02_lsj1024_cp_720k.py \\\ntrain.output_dir=output/APE/configs/LVISCOCOCOCOSTUFF_O365_OID_VG/ape_deta/ape_deta_vitl_eva02_lsj1024_cp_720k_`date +'%Y%m%d_%H%M%S'`\n```\n\nMultiple 
nodes:\n```bash\npython3 tools/train_net.py \\\n--dist-url=\"tcp://${MASTER_IP}:${MASTER_PORT}\" \\\n--num-gpus ${HOST_GPU_NUM} \\\n--num-machines ${HOST_NUM} \\\n--machine-rank ${INDEX} \\\n--resume \\\n--config-file configs/LVISCOCOCOCOSTUFF_O365_OID_VG/ape_deta/ape_deta_vitl_eva02_lsj1024_cp_720k.py \\\ntrain.output_dir=output/APE/configs/LVISCOCOCOCOSTUFF_O365_OID_VG/ape_deta/ape_deta_vitl_eva02_lsj1024_cp_720k_`date +'%Y%m%d_%H'`0000\n```\n\n### Train APE-Ti\n\nSingle node:\n```bash\npython3 tools/train_net.py \\\n--num-gpus 8 \\\n--resume \\\n--config-file configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO_GQA_PhraseCut_Flickr30k/ape_deta/ape_deta_vitt_eva02_vlf_lsj1024_cp_16x4_1080k_mdl.py \\\ntrain.output_dir=output/APE/configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO_GQA_PhraseCut_Flickr30k/ape_deta/ape_deta_vitt_eva02_vlf_lsj1024_cp_16x4_1080k_mdl_`date +'%Y%m%d_%H%M%S'`\n```\n\nMultiple nodes:\n```bash\npython3 tools/train_net.py \\\n--dist-url=\"tcp://${MASTER_IP}:${MASTER_PORT}\" \\\n--num-gpus ${HOST_GPU_NUM} \\\n--num-machines ${HOST_NUM} \\\n--machine-rank ${INDEX} \\\n--resume \\\n--config-file configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO_GQA_PhraseCut_Flickr30k/ape_deta/ape_deta_vitt_eva02_vlf_lsj1024_cp_16x4_1080k_mdl.py \\\ntrain.output_dir=output/APE/configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO_GQA_PhraseCut_Flickr30k/ape_deta/ape_deta_vitt_eva02_vlf_lsj1024_cp_16x4_1080k_mdl_`date +'%Y%m%d_%H'`0000\n```\n\n\n## :luggage: Checkpoints\n\n```\ngit lfs install\ngit clone https://huggingface.co/shenyunhang/APE\n```\n\n\u003c!-- insert a table --\u003e\n\u003ctable\u003e\n  \u003cthead\u003e\n    \u003ctr style=\"text-align: right;\"\u003e\n      \u003cth\u003e\u003c/th\u003e\n      \u003cth\u003ename\u003c/th\u003e\n      \u003cth\u003eCheckpoint\u003c/th\u003e\n      \u003cth\u003eConfig\u003c/th\u003e\n    \u003c/tr\u003e\n  \u003c/thead\u003e\n  \u003ctbody\u003e\n    \u003ctr\u003e\n      \u003cth\u003e1\u003c/th\u003e\n      
\u003ctd\u003eAPE-L_A\u003c/td\u003e\n      \u003ctd\u003e\u003ca href=\"https://huggingface.co/shenyunhang/APE/blob/main/configs/LVISCOCOCOCOSTUFF_O365_OID_VG/ape_deta/ape_deta_vitl_eva02_lsj_cp_720k_20230504_002019/model_final.pth\"\u003eHF link\u003c/a\u003e\u003c/td\u003e\n      \u003ctd\u003e\u003ca href=\"https://github.com/shenyunhang/APE/blob/main/configs/LVISCOCOCOCOSTUFF_O365_OID_VG/ape_deta/ape_deta_vitl_eva02_lsj1024_cp_720k.py\"\u003elink\u003c/a\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003cth\u003e2\u003c/th\u003e\n      \u003ctd\u003eAPE-L_B\u003c/td\u003e\n      \u003ctd\u003e\u003ca href=\"https://huggingface.co/shenyunhang/APE/blob/main/configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_REFCOCO/ape_deta/ape_deta_vitl_eva02_vlf_lsj_cp_1080k_20230702_225418/model_final.pth\"\u003eHF link\u003c/a\u003e\u003c/td\u003e\n      \u003ctd\u003e\u003ca href=\"https://github.com/shenyunhang/APE/blob/main/configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_REFCOCO/ape_deta/ape_deta_vitl_eva02_vlf_lsj1024_cp_1080k.py\"\u003elink\u003c/a\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003cth\u003e3\u003c/th\u003e\n      \u003ctd\u003eAPE-L_C\u003c/td\u003e\n      \u003ctd\u003e\u003ca href=\"https://huggingface.co/shenyunhang/APE/blob/main/configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO/ape_deta/ape_deta_vitl_eva02_vlf_lsj_cp_1080k_20230702_210950/model_final.pth\"\u003eHF link\u003c/a\u003e\u003c/td\u003e\n      \u003ctd\u003e\u003ca href=\"https://github.com/shenyunhang/APE/blob/main/configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO/ape_deta/ape_deta_vitl_eva02_vlf_lsj1024_cp_1080k.py\"\u003elink\u003c/a\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003cth\u003e4\u003c/th\u003e\n      \u003ctd\u003eAPE-L_D\u003c/td\u003e\n      \u003ctd\u003e\u003ca 
href=\"https://huggingface.co/shenyunhang/APE/blob/main/configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO_GQA_PhraseCut_Flickr30k/ape_deta/ape_deta_vitl_eva02_clip_vlf_lsj1024_cp_16x4_1080k_mdl_20230829_162438/model_final.pth\"\u003eHF link\u003c/a\u003e\u003c/td\u003e\n      \u003ctd\u003e\u003ca href=\"https://github.com/shenyunhang/APE/blob/main/configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO_GQA_PhraseCut_Flickr30k/ape_deta/ape_deta_vitl_eva02_clip_vlf_lsj1024_cp_16x4_1080k_mdl.py\"\u003elink\u003c/a\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003cth\u003e5\u003c/th\u003e\n      \u003ctd\u003eAPE-Ti\u003c/td\u003e\n      \u003ctd\u003e\u003ca href=\"https://huggingface.co/shenyunhang/APE/blob/main/configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO_GQA_PhraseCut_Flickr30k/ape_deta/ape_deta_vitt_eva02_vlf_lsj1024_cp_16x4_1080k_mdl_20240203_230000/model_final.pth\"\u003eHF link\u003c/a\u003e\u003c/td\u003e\n      \u003ctd\u003e\u003ca href=\"https://github.com/shenyunhang/APE/blob/main/configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO_GQA_PhraseCut_Flickr30k/ape_deta/ape_deta_vitt_eva02_vlf_lsj1024_cp_16x4_1080k_mdl.py\"\u003elink\u003c/a\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n  \u003c/tbody\u003e\n\u003c/table\u003e\n\n\n## :medal_military: Results\n\n\u003cimg src=\".asset/radar.png\" alt=\"radar\" width=\"100%\"\u003e\n\n\n## :black_nib: Citation\n\nIf you find our work helpful for your research, please consider citing the following BibTeX entry.   
\n\n```bibtex\n@inproceedings{APE,\n  title={Aligning and Prompting Everything All at Once for Universal Visual Perception},\n  author={Shen, Yunhang and Fu, Chaoyou and Chen, Peixian and Zhang, Mengdan and Li, Ke and Sun, Xing and Wu, Yunsheng and Lin, Shaohui and Ji, Rongrong},\n  booktitle={CVPR},\n  year={2024}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshenyunhang%2FAPE","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fshenyunhang%2FAPE","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshenyunhang%2FAPE/lists"}