{"id":17349588,"url":"https://github.com/wofmanaf/ResT","last_synced_at":"2025-02-26T02:31:55.539Z","repository":{"id":44828462,"uuid":"369537668","full_name":"wofmanaf/ResT","owner":"wofmanaf","description":"This is an official implementation for \"ResT: An Efficient Transformer for Visual Recognition\".","archived":false,"fork":false,"pushed_at":"2022-09-28T02:38:39.000Z","size":227,"stargazers_count":277,"open_issues_count":11,"forks_count":28,"subscribers_count":6,"default_branch":"main","last_synced_at":"2024-10-16T18:17:44.943Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/wofmanaf.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-05-21T13:04:22.000Z","updated_at":"2024-09-23T01:44:26.000Z","dependencies_parsed_at":"2023-01-18T16:19:29.436Z","dependency_job_id":null,"html_url":"https://github.com/wofmanaf/ResT","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wofmanaf%2FResT","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wofmanaf%2FResT/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wofmanaf%2FResT/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wofmanaf%2FResT/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/wofmanaf","download_url":"https://codeload.github.com/wofmanaf/ResT/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositori
es_count":240780758,"owners_count":19856418,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-15T16:56:22.389Z","updated_at":"2025-02-26T02:31:55.189Z","avatar_url":"https://github.com/wofmanaf.png","language":"Python","funding_links":[],"categories":["Table of Contents"],"sub_categories":["微软Transformer霸榜模型"],"readme":"# Updates\n- (2022/05/10) Code for [ResTv2](https://arxiv.org/abs/2204.07366) is released! ResTv2 simplifies the EMSA structure in\n[ResTv1](https://arxiv.org/abs/2105.13677) (by eliminating the multi-head interaction part) and employs an upsample\noperation to reconstruct the medium- and high-frequency information lost in the downsampling operation.\n\n# [ResT: An Efficient Transformer for Visual Recognition](https://arxiv.org/abs/2105.13677)\n\nOfficial PyTorch implementation of **ResTv1** and **ResTv2**, from the following papers:\n\n[ResT: An Efficient Transformer for Visual Recognition](https://arxiv.org/abs/2105.13677). NeurIPS 2021.\\\n[ResT V2: Simpler, Faster and Stronger](https://arxiv.org/abs/2204.07366). NeurIPS 2022.\\\nBy Qing-Long Zhang and Yu-Bin Yang \\\nState Key Laboratory for Novel Software Technology at Nanjing University\n\n--- \n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"figures/fig_1.png\" width=100% height=100% \nclass=\"center\"\u003e\n\u003c/p\u003e\n\n**ResTv1** was first described on [arXiv](https://arxiv.org/abs/2105.13677) and serves as a\ngeneral-purpose backbone for computer vision. It can handle input images of arbitrary size. 
In addition, \nResT compresses the memory of standard MSA and models the interaction across attention heads while preserving \ntheir diversity. \n\n## Catalog\n- [x] ImageNet-1K Training Code\n- [x] ImageNet-1K Fine-tuning Code  \n- [x] Downstream Transfer (Detection, Segmentation) Code\n\n\u003c!-- ✅ ⬜️  --\u003e\n\n## Results and Pre-trained Models\n### ImageNet-1K trained models\n\n|    name     | resolution | acc@1 | #params | FLOPs | Throughput | model |\n|:-----------:|:---:|:---:|:-------:|:-----:|:----------:|:---:|\n| ResTv1-Lite | 224x224 | 77.2 |   11M   | 1.4G  |    1246    | [baidu](https://pan.baidu.com/s/1VVzrzZi_tD3yTp_lw9tU9A) |\n|  ResTv1-S   | 224x224 | 79.6 |   14M   | 1.9G  |    1043    | [baidu](https://pan.baidu.com/s/1Y-MIzzzcQnmrbHfGGR0mrw) |\n|  ResTv1-B   | 224x224 | 81.6 |   30M   | 4.3G  |    673     | [baidu](https://pan.baidu.com/s/1HhR9YxtGIhouZ0GEA4LYlw) |\n|  ResTv1-L   | 224x224 | 83.6 |   52M   | 7.9G  |    429     | [baidu](https://pan.baidu.com/s/14c4u_oRoBcKOt1aTlsBBpw) |\n|  ResTv2-T   | 224x224 | 82.3 |   30M   | 4.1G  |    826     | [baidu](https://pan.baidu.com/s/1LHAbsrXnGsjvAE3d5zhaHQ) |\n|  ResTv2-T   | 384x384 | 83.7 |   30M   | 12.7G |    319     | [baidu](https://pan.baidu.com/s/1fEMs_OrDa_xF7Cw1DiBU9w) |\n|  ResTv2-S   | 224x224 | 83.2 |   41M   | 6.0G  |    687     | [baidu](https://pan.baidu.com/s/1nysV5MTtwsDLChrRa7vmZQ) |\n|  ResTv2-S   | 384x384 | 84.5 |   41M   | 18.4G |    256     | [baidu](https://pan.baidu.com/s/1S1GERP-lYEJANYr17xk3dA) |\n|  ResTv2-B   | 224x224 | 83.7 |   56M   | 7.9G  |    582     | [baidu](https://pan.baidu.com/s/1GH3N2_rbZx816mN87UzYgQ) |\n|  ResTv2-B   | 384x384 | 85.1 |   56M   | 24.3G |    210     | [baidu](https://pan.baidu.com/s/12RBMZmf6IlJIB3lIkeBH9Q) |\n|  ResTv2-L   | 224x224 | 84.2 |   87M   | 13.8G |    415     | [baidu](https://pan.baidu.com/s/1A2huwk_Ii4ZzQllg1iHrEw) |\n|  ResTv2-L   | 384x384 | 85.4 |   87M   | 42.4G |    141     | 
[baidu](https://pan.baidu.com/s/1dlxiWexb9mho63WdWS8nXg) |\n\n\nNote: The access code for `baidu` is `rest`. Pretrained ResTv1 models are also available on [Google Drive](https://drive.google.com/drive/folders/1H6QUZsKYbU6LECtxzGHKqEeGbx1E8uQ9).\n\n## Installation\nPlease check [INSTALL.md](INSTALL.md) for installation instructions. \n\n## Evaluation\nWe give an example evaluation command for an ImageNet-1K pre-trained, then ImageNet-1K fine-tuned ResTv2-T:\n\nSingle-GPU\n```\npython main.py --model restv2_tiny --eval true \\\n--resume restv2_tiny_384.pth \\\n--input_size 384 --drop_path 0.1 \\\n--data_path /path/to/imagenet-1k\n```\n\nThis should give \n```\n* Acc@1 83.708 Acc@5 96.524 loss 0.777\n```\n\n- To evaluate other model variants, change `--model`, `--resume`, and `--input_size` accordingly. You can get the URLs of the pre-trained models from the table above. \n- Setting a model-specific `--drop_path` is not strictly required for evaluation, because the `DropPath` module in timm behaves the same during evaluation whatever the rate, but it is required for training. See [TRAINING.md](TRAINING.md) or our paper for the values used for different models.\n\n## Training\nSee [TRAINING.md](TRAINING.md) for training and fine-tuning instructions.\n\n## Acknowledgement\nThis repository is built using the [timm](https://github.com/rwightman/pytorch-image-models) library.\n\n## License\nThis project is released under the Apache License 2.0. 
Please see the [LICENSE](LICENSE) file for more information.\n\n## Citation\nIf you find this repository helpful, please consider citing:\n\n**ResTv1**\n```\n@inproceedings{zhang2021rest,\n  title={ResT: An Efficient Transformer for Visual Recognition},\n  author={Qinglong Zhang and Yu-bin Yang},\n  booktitle={Advances in Neural Information Processing Systems},\n  year={2021},\n  url={https://openreview.net/forum?id=6Ab68Ip4Mu}\n}\n```\n\n**ResTv2**\n```\n@article{zhang2022rest,\n  title={ResT V2: Simpler, Faster and Stronger},\n  author={Zhang, Qing-Long and Yang, Yu-Bin},\n  journal={arXiv preprint arXiv:2204.07366},\n  year={2022}\n}\n```\n\n## Third-party Implementation\n[2022/05/26] ResT and ResT v2 have been integrated into [PaddleViT](https://github.com/BR-IDL/PaddleViT); check out the third-party implementation on the Paddle framework [here](https://github.com/BR-IDL/PaddleViT/tree/develop/image_classification/ResT)!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwofmanaf%2FResT","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwofmanaf%2FResT","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwofmanaf%2FResT/lists"}