{"id":20663730,"url":"https://github.com/vita-group/uvc","last_synced_at":"2025-04-19T15:56:04.595Z","repository":{"id":64049835,"uuid":"437364017","full_name":"VITA-Group/UVC","owner":"VITA-Group","description":"[ICLR 2022] \"Unified Vision Transformer Compression\" by Shixing Yu*, Tianlong Chen*, Jiayi Shen, Huan Yuan, Jianchao Tan, Sen Yang, Ji Liu, Zhangyang Wang","archived":false,"fork":false,"pushed_at":"2023-12-01T06:42:31.000Z","size":8192,"stargazers_count":52,"open_issues_count":7,"forks_count":3,"subscribers_count":9,"default_branch":"main","last_synced_at":"2025-03-29T09:42:03.701Z","etag":null,"topics":["admm","compression","efficient-transformers","vision-transformer"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/VITA-Group.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-12-11T18:53:32.000Z","updated_at":"2025-01-22T08:06:51.000Z","dependencies_parsed_at":"2023-01-14T20:00:31.936Z","dependency_job_id":null,"html_url":"https://github.com/VITA-Group/UVC","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VITA-Group%2FUVC","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VITA-Group%2FUVC/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VITA-Group%2FUVC/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VITA-Group%2FUVC/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/VITA-Group","download_url":"https://codeload.github.com/VITA-Group/UVC/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249731218,"owners_count":21317341,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["admm","compression","efficient-transformers","vision-transformer"],"created_at":"2024-11-16T19:19:32.581Z","updated_at":"2025-04-19T15:56:04.571Z","avatar_url":"https://github.com/VITA-Group.png","language":"Python","readme":"# Unified Vision Transformer Compression\n\n[![License: MIT](https://camo.githubusercontent.com/fd551ba4b042d89480347a0e74e31af63b356b2cac1116c7b80038f41b04a581/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4c6963656e73652d4d49542d677265656e2e737667)](https://opensource.org/licenses/MIT)\n\nCodes for the paper: [ICLR 2022] [Unified Vision Transformer Compression](https://openreview.net/pdf?id=9jsZiUgkCZP).\n\nShixing Yu*, Tianlong Chen*, Jiayi Shen, Huan Yuan, Jianchao Tan, Sen Yang, Ji Liu, Zhangyang Wang\n\n\n\n## Overall Results\n\nExtensive experiments are conducted with several DeiT backbones on ImageNet, which consistently verify the effectiveness of our proposal. For example, UVC on DeiT-Tiny (with/without distillation tokens) yields around 50% FLOPs reduction, with little performance degradation (only 0.3%/0.9% loss compared to the baseline).\n\n| Method         | Acc           | FLOPs(G) | Compression Ratio (%) |\n| :------------- | :------------ | :------- | :-------------------- |\n| DeiT-Small     | 79.8          | 4.6      | 100                   |\n| SCOP           | 77.5 (-2.3)   | 2.6      | 56.4                  |\n| PoWER          | 78.3 (-1.5)   | 2.7      | 58.7                  |\n| HVT            | 78.0 (-1.8)   | 2.4      | 52.2                  |\n| Patch Slimming | 79.4 (-0.4)   | 2.6      | 56.5                  |\n| UVC (Ours)     | 79.44 (-0.36) | 2.65     | 57.61                 |\n| UVC (Ours)     | 78.82 (-0.98) | 2.32     | 50.41                 |\n\n\n\n## Overview of Proposed UVC\n\nWe formulate and solve UVC as a unified constrained optimization problem. It simultaneously learns model weights, layer-wise pruning ratios/masks, and skip configurations, under a distillation loss and an overall budget constraint.\n\n![architecture](Figs/architecture.png)\n\n\n\n## Implementations of UVC\n\n### Set the Environment\n\n```bash\nconda create -n vit python=3.6\n\npip install torch==1.7.1+cu101 torchvision==0.8.2+cu101 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html\n\npip install tqdm scipy timm\npip install ml_collections\npip install tensorboard\n\ngit clone https://github.com/NVIDIA/apex\n\ncd apex\n\npip install -v --disable-pip-version-check --no-cache-dir --global-option=\"--cpp_ext\" --global-option=\"--cuda_ext\" ./\n\npip install -v --disable-pip-version-check --no-cache-dir ./\n```\n\n\n\n### Running command\n\nThe training contains two parts. \n\n* The first part is **UVC Training**. In this stage, it optimizes the architecture with primal-dual algorithm to find the optimal block-wise layout and skip configuration. \n* The second part is **Post Training**. In this stage, the architecture is fixed while only updating the weights to help the network to regain accuracy.\n\n#### Stage1 UVC Training\n\n```bash\npython -W ignore -m torch.distributed.launch \\\n--nproc_per_node=2 \\\n--master_port 6019 joint_train.py \\\n--gpu_num '0,1' \\\n--uvc_train \\\n--model_type deit_tiny_patch16_224 \\\n--model_path https://dl.fbaipublicfiles.com/deit/deit_tiny_patch16_224-a1311bcf.pth \\\n--distillation-type soft \\\n--distillation-alpha 0.1 \\\n--train_batch_size 512 \\\n--num_epochs 30 \\\n--eval_every 1000 \\\n--flops_with_mhsa 1 \\\n--zlr_schedule_list \"1,5,9,13,17\" \\\n--learning_rate 1e-4 \\\n--enable_deit 0 \\\n--budget 0.5 \\\n--enable_pruning 1 \\\n--enable_block_gating 1 \\\n--enable_patch_gating 1 \\\n--gating_weight 5e-4 \\\n--patch_weight 5 \\\n--patch_l1_weight 0.01 \\\n--patchloss \"l1\" \\\n--use_gumbel 1 \\\n--glr 0.1 \\\n--patchlr 0.01 \\\n--num_workers 64 \\\n--seed 730 \\\n--output_dir mc_deit_tiny_patch16_224_with_patch \\\n--log_interval 1000 \\\n--eps 0.1 \\\n--eps_decay 0.92 \\\n--enable_warmup 1 \\\n--warmup_epochs 5 \\\n--warmup_lr 1e-4 \\\n--z_grad_clip 0.5 \\\n--gating_interval 50\n```\n\n#### Stage2 Post Training\n\n```bash\npython -m torch.distributed.launch \\\n--nproc_per_node=2 --master_port 6382 post_train.py \\\n--pretrained 0 \\\n--model_type \"deit_small_patch16_224\" \\\n--model_path https://dl.fbaipublicfiles.com/deit/deit_small_patch16_224-cd65a155.pth \\\n--checkpoint_dir /home/shixing/deit_small_patch16_224_11.pth.tar \\\n--distillation-type soft \\\n--distillation-alpha 0.1 \\\n--train_batch_size 256 \\\n--gpu_num '2,3' \\\n--epochs 120 \\\n--eval_every 1000 \\\n--output_dir exp/deit_small_nasprune_0.58 \\\n--num_workers 64\n```\n\n\n\n## Citation\n\n```\n@inproceedings{yu2022unified,\n  author = {Yu, Shixing and Chen, Tianlong and Shen, Jiayi and Yuan, Huan and Tan, Jianchao and Yang, Sen and Liu, Ji and Wang, Zhangyang},\n  title = {Unified Visual Transformer Compression},\n  booktitle = {ICLR},\n  year = {2022},\n}\n```\n\n## Results\n\n* Our full compression log for reported results are under file log/.\n\n* Checkpoint of deit-tiny-distilled-patch16-224: https://drive.google.com/drive/folders/1kjhNsppWCLuaGm-fAf4tVbVRENL14PeD?usp=sharing (with distillation token)\n\u003cimg width=\"721\" alt=\"deit-tiny-distilled-patch16-224\" src=\"https://github.com/VITA-Group/UVC/assets/55985788/5c6d1f0d-b49d-4060-aed2-4b179e598e2a\"\u003e\n\n\n\n## Acknowledgement\n\nViT : https://github.com/jeonsworld/ViT-pytorch\n\nViT : https://github.com/google-research/vision_transformer\n\nDeiT: https://github.com/facebookresearch/deit\n\nT2T-ViT: https://github.com/yitu-opensource/T2T-ViT\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvita-group%2Fuvc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvita-group%2Fuvc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvita-group%2Fuvc/lists"}