{"id":13993838,"url":"https://github.com/OpenGVLab/Vision-RWKV","last_synced_at":"2025-07-22T18:32:10.586Z","repository":{"id":225758138,"uuid":"765174073","full_name":"OpenGVLab/Vision-RWKV","owner":"OpenGVLab","description":"Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures","archived":false,"fork":false,"pushed_at":"2024-10-31T05:59:54.000Z","size":933,"stargazers_count":357,"open_issues_count":12,"forks_count":13,"subscribers_count":5,"default_branch":"master","last_synced_at":"2024-10-31T06:25:45.088Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2403.02308","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/OpenGVLab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-02-29T12:23:45.000Z","updated_at":"2024-10-31T05:59:58.000Z","dependencies_parsed_at":"2024-05-29T19:38:06.036Z","dependency_job_id":"c8266569-1ddd-48ac-b537-2994a52a0bb4","html_url":"https://github.com/OpenGVLab/Vision-RWKV","commit_stats":null,"previous_names":["opengvlab/vision-rwkv"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenGVLab%2FVision-RWKV","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenGVLab%2FVision-RWKV/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenGVLab%2FVision-RWKV/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenGVLab%2FVision-RWKV/manifests","owner_u
rl":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/OpenGVLab","download_url":"https://codeload.github.com/OpenGVLab/Vision-RWKV/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":227156400,"owners_count":17739294,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-09T14:02:35.080Z","updated_at":"2025-07-22T18:32:10.578Z","avatar_url":"https://github.com/OpenGVLab.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Vision-RWKV\nThe official implementation of \"[Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures](https://arxiv.org/abs/2403.02308)\".\n\n## News🚀🚀🚀\n- `2025/02/18`: A new version of the CUDA code has been added in the `cuda_new` folder to eliminate the hardcoding of `T_MAX`.\n- `2025/02/11`: 🎊🎊 Vision-RWKV is accepted by ICLR 2025!\n- `2024/04/14`: We support RWKV6 in the classification task, with higher performance!\n- `2024/03/04`: We release the code and models of Vision-RWKV.\n\n## Highlights\n\n- **High-Resolution Efficiency**: Processed high-resolution images smoothly with a global receptive field.\n- **Scalability**: Pre-trained with large-scale datasets and possesses scale-up stability.\n- **Superior Performance**: Achieved better performance in classification tasks than ViTs. 
Surpassed window-based ViTs and was comparable to global-attention ViTs, with lower FLOPs and higher speed, in dense prediction tasks.\n- **Efficient Alternative**: Capable of serving as an alternative backbone to ViT in comprehensive vision tasks.\n\n\u003cimg width=\"1238\" alt=\"image\" src=\"https://github.com/OpenGVLab/Vision-RWKV/assets/23737120/10965279-6542-4f82-aef5-934b8d86b345\"\u003e\n\n\n## Overview\n\n\u003cimg width=\"1238\" alt=\"image\" src=\"https://github.com/OpenGVLab/Vision-RWKV/assets/23737120/7521a3d6-6b5a-4a24-9ec8-dfb4abd3fd84\"\u003e\n\n## Schedule\n- [x] Support RWKV6 as VRWKV6\n- [x] Release VRWKV-L\n- [x] Release VRWKV-T/S/B\n\n## Model Zoo\n\n### Pretrained Models\n|  Model  |   Size   |   Pretrain   |       Download       |\n|:-------:|:--------:|:------------:|:--------------------:|\n| VRWKV-L |    192   | ImageNet-22K | [ckpt](https://huggingface.co/OpenGVLab/Vision-RWKV/resolve/main/vrwkv_l_in22k_192.pth) |\n\n### Image Classification (ImageNet-1K)\n\n|  Model   |   Size   | #Param | #FLOPs |  Top-1 Acc |       Download       |\n|:--------:|:--------:| ------:| ------:|:----------:|:--------------------:|\n| VRWKV-T  |    224   |   6.2M |   1.2G |    75.1    | [ckpt](https://huggingface.co/OpenGVLab/Vision-RWKV/resolve/main/vrwkv_t_in1k_224.pth)    \\| [cfg](classification/configs/vrwkv/vrwkv_tiny_8xb128_in1k.py)        |\n| VRWKV-S  |    224   |  23.8M |   4.6G |    80.1    | [ckpt](https://huggingface.co/OpenGVLab/Vision-RWKV/resolve/main/vrwkv_s_in1k_224.pth)    \\| [cfg](classification/configs/vrwkv/vrwkv_small_8xb128_in1k.py)       |\n| VRWKV-B  |    224   |  93.7M |  18.2G |    82.0    | [ckpt](https://huggingface.co/OpenGVLab/Vision-RWKV/resolve/main/vrwkv_b_in1k_224.pth)    \\| [cfg](classification/configs/vrwkv/vrwkv_base_16xb64_in1k.py)        |\n| VRWKV-L  |    384   | 334.9M | 189.5G |    86.0    | [ckpt](https://huggingface.co/OpenGVLab/Vision-RWKV/resolve/main/vrwkv_l_22kto1k_384.pth) \\| 
[cfg](classification_internimage/configs/vrwkv_l_22kto1k_384.yaml) |\n| VRWKV6-T |    224   |   7.6M |   1.6G |    76.6    | [ckpt](https://huggingface.co/OpenGVLab/Vision-RWKV/resolve/main/vrwkv6_t_in1k_224.pth)    \\| [cfg](classification/configs/vrwkv6/vrwkv6_tiny_8xb128_in1k.py)        |\n| VRWKV6-S |    224   |  27.7M |   5.6G |    81.1    | [ckpt](https://huggingface.co/OpenGVLab/Vision-RWKV/resolve/main/vrwkv6_s_in1k_224.pth)    \\| [cfg](classification/configs/vrwkv6/vrwkv6_small_8xb128_in1k.py)       |\n| VRWKV6-B |    224   | 104.9M |  20.9G |    82.6    | [ckpt](https://huggingface.co/OpenGVLab/Vision-RWKV/resolve/main/vrwkv6_b_in1k_224.pth)    \\| [cfg](classification/configs/vrwkv6/vrwkv6_base_16xb64_in1k.py)        |\n\n- VRWKV-L is pretrained on ImageNet-22K and then finetuned on ImageNet-1K.\n- We train VRWKV-L with the InternImage codebase for higher speed.\n\n### Object Detection with Mask-RCNN head (COCO)\n\n\n|  Model  | #Param |  #FLOPs | box AP |  mask AP |       Download       |\n|:-------:| ------:| -------:|:------:|:--------:|:--------------------:|\n| VRWKV-T |   8.4M |   67.9G |  41.7  |   38.0   | [ckpt](https://huggingface.co/OpenGVLab/Vision-RWKV/resolve/main/mask_rcnn_vrwkv_adapter_tiny_fpn_1x_coco.pth)  \\| [cfg](detection/configs/mask_rcnn/mask_rcnn_vrwkv_adapter_tiny_fpn_1x_coco.py)  |\n| VRWKV-S |  29.3M |  189.9G |  44.8  |   40.2   | [ckpt](https://huggingface.co/OpenGVLab/Vision-RWKV/resolve/main/mask_rcnn_vrwkv_adapter_small_fpn_1x_coco.pth) \\| [cfg](detection/configs/mask_rcnn/mask_rcnn_vrwkv_adapter_small_fpn_1x_coco.py) |\n| VRWKV-B | 106.6M |  599.0G |  46.8  |   41.7   | [ckpt](https://huggingface.co/OpenGVLab/Vision-RWKV/resolve/main/mask_rcnn_vrwkv_adapter_base_fpn_1x_coco.pth)  \\| [cfg](detection/configs/mask_rcnn/mask_rcnn_vrwkv_adapter_base_fpn_1x_coco.py)  |\n| VRWKV-L | 351.9M | 1730.6G |  50.6  |   44.9   | 
[ckpt](https://huggingface.co/OpenGVLab/Vision-RWKV/resolve/main/mask_rcnn_vrwkv_adapter_large_fpn_1x_coco.pth) \\| [cfg](detection/configs/mask_rcnn/mask_rcnn_vrwkv_adapter_large_fpn_1x_coco.py) |\n\n- We report the \\#Param and \\#FLOPs of the backbone in this table.\n\n### Semantic Segmentation with UperNet head (ADE20K)\n\n\n|  Model  | #Param | #FLOPs |   mIoU   |       Download       |\n|:-------:| ------:| ------:|:--------:|:--------------------:|\n| VRWKV-T |   8.4M |  16.6G |   43.3   | [ckpt](https://huggingface.co/OpenGVLab/Vision-RWKV/resolve/main/upernet_vrwkv_adapter_tiny_512_160k_ade20k.pth)  \\| [cfg](segmentation/configs/ade20k/upernet_vrwkv_adapter_tiny_512_160k_ade20k.py)  |\n| VRWKV-S |  29.3M |  46.3G |   47.2   | [ckpt](https://huggingface.co/OpenGVLab/Vision-RWKV/resolve/main/upernet_vrwkv_adapter_small_512_160k_ade20k.pth) \\| [cfg](segmentation/configs/ade20k/upernet_vrwkv_adapter_small_512_160k_ade20k.py) |\n| VRWKV-B | 106.6M | 146.0G |   49.2   | [ckpt](https://huggingface.co/OpenGVLab/Vision-RWKV/resolve/main/upernet_vrwkv_adapter_base_512_160k_ade20k.pth)  \\| [cfg](segmentation/configs/ade20k/upernet_vrwkv_adapter_base_512_160k_ade20k.py)  |\n| VRWKV-L | 351.9M | 421.9G |   53.5   | [ckpt](https://huggingface.co/OpenGVLab/Vision-RWKV/resolve/main/upernet_vrwkv_adapter_large_512_160k_ade20k.pth) \\| [cfg](segmentation/configs/ade20k/upernet_vrwkv_adapter_large_512_160k_ade20k.py) |\n\n- We report the \\#Param and \\#FLOPs of the backbone in this table.\n\n## Citation\nIf this work is helpful for your research, please consider citing the following BibTeX entry.\n```BibTeX\n@article{duan2024vrwkv,\n  title={Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures},\n  author={Duan, Yuchen and Wang, Weiyun and Chen, Zhe and Zhu, Xizhou and Lu, Lewei and Lu, Tong and Qiao, Yu and Li, Hongsheng and Dai, Jifeng and Wang, Wenhai},\n  journal={arXiv preprint arXiv:2403.02308},\n  year={2024}\n}\n```\n\n## 
License\nThis repository is released under the Apache 2.0 license as found in the [LICENSE](LICENSE) file.\n\n## Acknowledgement\n\nVision-RWKV is built with reference to the code of the following projects:  [RWKV](https://github.com/BlinkDL/RWKV-LM), [MMPretrain](https://github.com/open-mmlab/mmpretrain), [MMDetection](https://github.com/open-mmlab/mmdetection), [MMSegmentation](https://github.com/open-mmlab/mmsegmentation), [ViT-Adapter](https://github.com/czczup/ViT-Adapter), [InternImage](https://github.com/OpenGVLab/InternImage). Thanks for their awesome work!","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FOpenGVLab%2FVision-RWKV","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FOpenGVLab%2FVision-RWKV","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FOpenGVLab%2FVision-RWKV/lists"}