{"id":13710906,"url":"https://github.com/donnyyou/torchcv","last_synced_at":"2025-05-15T15:08:21.198Z","repository":{"id":41284476,"uuid":"153722564","full_name":"donnyyou/torchcv","owner":"donnyyou","description":"TorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision","archived":false,"fork":false,"pushed_at":"2020-11-19T05:40:57.000Z","size":30194,"stargazers_count":2250,"open_issues_count":42,"forks_count":373,"subscribers_count":69,"default_branch":"master","last_synced_at":"2025-04-07T20:11:30.326Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://pytorchcv.com","language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/donnyyou.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-10-19T03:38:47.000Z","updated_at":"2025-03-31T06:53:49.000Z","dependencies_parsed_at":"2022-07-06T16:32:38.292Z","dependency_job_id":null,"html_url":"https://github.com/donnyyou/torchcv","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/donnyyou%2Ftorchcv","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/donnyyou%2Ftorchcv/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/donnyyou%2Ftorchcv/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/donnyyou%2Ftorchcv/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/donnyyou","download_url":"https://codeload.github.com/donnyyou/torchcv/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com",
"kind":"github","repositories_count":254364270,"owners_count":22058878,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-02T23:01:02.088Z","updated_at":"2025-05-15T15:08:16.178Z","avatar_url":"https://github.com/donnyyou.png","language":"Shell","funding_links":[],"categories":["Pytorch \u0026 related libraries｜Pytorch \u0026 相关库","Computer Vision","Pytorch \u0026 related libraries","Shell","CV\u0026PyTorch实战"],"sub_categories":["CV｜计算机视觉:","General Purpose CV","CV:"],"readme":"# TorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision\n```\n@misc{you2019torchcv,\n    author = {Ansheng You and Xiangtai Li and Zhen Zhu and Yunhai Tong},\n    title = {TorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision},\n    howpublished = {\\url{https://github.com/donnyyou/torchcv}},\n    year = {2019}\n}\n```\n\nThis repository provides source code for most deep learning based cv problems. We'll do our best to keep this repository up-to-date.  
If you find a problem with this repository, please raise an issue or submit a pull request.\n```diff\n- Semantic Flow for Fast and Accurate Scene Parsing\n- Code and models: https://github.com/lxtGH/SFSegNets\n```\n## Implemented Papers\n\n- [Image Classification](https://github.com/youansheng/torchcv/tree/master/runner/cls)\n    - VGG: Very Deep Convolutional Networks for Large-Scale Image Recognition\n    - ResNet: Deep Residual Learning for Image Recognition\n    - DenseNet: Densely Connected Convolutional Networks\n    - ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices\n    - ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design\n    - Partial Order Pruning: for Best Speed/Accuracy Trade-off in Neural Architecture Search\n\n- [Semantic Segmentation](https://github.com/youansheng/torchcv/tree/master/runner/seg)\n    - DeepLabV3: Rethinking Atrous Convolution for Semantic Image Segmentation\n    - PSPNet: Pyramid Scene Parsing Network\n    - DenseASPP: DenseASPP for Semantic Segmentation in Street Scenes\n    - Asymmetric Non-local Neural Networks for Semantic Segmentation\n    - Semantic Flow for Fast and Accurate Scene Parsing\n    \n- [Object Detection](https://github.com/youansheng/torchcv/tree/master/runner/det)\n    - SSD: Single Shot MultiBox Detector\n    - Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks\n    - YOLOv3: An Incremental Improvement\n    - FPN: Feature Pyramid Networks for Object Detection\n\n- [Pose Estimation](https://github.com/youansheng/torchcv/tree/master/runner/pose)\n    - CPM: Convolutional Pose Machines\n    - OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields\n\n- [Instance Segmentation](https://github.com/youansheng/torchcv/tree/master/runner/seg)\n    - Mask R-CNN\n\n- [Generative Adversarial Networks](https://github.com/youansheng/torchcv/tree/master/runner/gan)\n    - Pix2pix: Image-to-Image Translation with 
Conditional Adversarial Nets\n    - CycleGAN: Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks.\n\n\n## QuickStart with TorchCV\nCurrently only Python 3.x and PyTorch 1.3 are supported.\n```bash\npip3 install -r requirements.txt\ncd lib/exts\nsh make.sh\n```\n\n\n## Performances with TorchCV\nAll the results shown below fully reproduce the papers' results.\n\n#### Image Classification\n- ImageNet (Center Crop Test): 224x224\n\n| Model | Train | Test | Top-1 | Top-5 | BS | Iters | Scripts |\n|:--------|:---------|:------|:------|:------|:------|:------|:------|\n| ResNet50 | train | val | 77.54 | 93.59 | 512 | 30W | [ResNet50](https://github.com/youansheng/torchcv/blob/master/scripts/cls/imagenet/run_ic_res50_imagenet_cls.sh) |\n| ResNet101 | train | val | 78.94 | 94.56 | 512 | 30W | [ResNet101](https://github.com/youansheng/torchcv/blob/master/scripts/cls/imagenet/run_ic_res101_imagenet_cls.sh) |\n| ShuffleNetV2x0.5 | train | val | 60.90 | 82.54 | 1024 | 40W | [ShuffleNetV2x0.5](https://github.com/youansheng/torchcv/blob/master/scripts/cls/imagenet/run_ic_shufflenetv2x0.5_imagenet_cls.sh) |\n| ShuffleNetV2x1.0 | train | val | 69.71 | 88.91 | 1024 | 40W | [ShuffleNetV2x1.0](https://github.com/youansheng/torchcv/blob/master/scripts/cls/imagenet/run_ic_shufflenetv2x1.0_imagenet_cls.sh) |\n| DFNetV1 | train | val | 70.99 | 89.68 | 1024 | 40W | [DFNetV1](https://github.com/youansheng/torchcv/blob/master/scripts/cls/imagenet/run_ic_dfnetv1_imagenet_cls.sh) |\n| DFNetV2 | train | val | 74.22 | 91.61 | 1024 | 40W | [DFNetV2](https://github.com/youansheng/torchcv/blob/master/scripts/cls/imagenet/run_ic_dfnetv2_imagenet_cls.sh) |\n\n#### Semantic Segmentation\n- Cityscapes (Single Scale Whole Image Test): Base LR 0.01, Crop Size 769\n\n| Model | Backbone | Train | Test | mIOU | BS | Iters | Scripts |\n|:--------|:---------|:------|:------|:------|:------|:------|:------|\n| [PSPNet]() | 
[3x3-Res101](https://drive.google.com/open?id=1bUzCKazlh8ElGVYWlABBAb0b0uIqFgtR) | train | val | 78.20 | 8 | 4W | [PSPNet](https://github.com/youansheng/torchcv/blob/master/scripts/seg/cityscapes/run_fs_pspnet_cityscapes_seg.sh) |\n| [DeepLabV3]() | [3x3-Res101](https://drive.google.com/open?id=1bUzCKazlh8ElGVYWlABBAb0b0uIqFgtR) | train | val | 79.13 | 8 | 4W | [DeepLabV3](https://github.com/youansheng/torchcv/blob/master/scripts/seg/cityscapes/run_fs_deeplabv3_cityscapes_seg.sh) |\n\n- ADE20K (Single Scale Whole Image Test): Base LR 0.02, Crop Size 520\n\n| Model | Backbone | Train | Test | mIOU | PixelACC | BS | Iters | Scripts |\n|:--------|:---------|:------|:------|:------|:------|:------|:------|:------|\n| [PSPNet]() | [3x3-Res50](https://drive.google.com/open?id=1zPQLFd9c1yHfkQn5CWBCcEKmjEEqxsWx) | train | val | 41.52 | 80.09 | 16 | 15W | [PSPNet](https://github.com/youansheng/torchcv/blob/master/scripts/seg/ade20k/run_fs_res50_pspnet_ade20k_seg.sh) |\n| [DeepLabv3]() | [3x3-Res50](https://drive.google.com/open?id=1zPQLFd9c1yHfkQn5CWBCcEKmjEEqxsWx) | train | val | 42.16 | 80.36 | 16 | 15W | [DeepLabV3](https://github.com/youansheng/torchcv/blob/master/scripts/seg/ade20k/run_fs_res50_deeplabv3_ade20k_seg.sh) |\n| [PSPNet]() | [3x3-Res101](https://drive.google.com/open?id=1bUzCKazlh8ElGVYWlABBAb0b0uIqFgtR) | train | val | 43.60 | 81.30 | 16 | 15W | [PSPNet](https://github.com/youansheng/torchcv/blob/master/scripts/seg/ade20k/run_fs_res101_pspnet_ade20k_seg.sh) |\n| [DeepLabv3]() | [3x3-Res101](https://drive.google.com/open?id=1bUzCKazlh8ElGVYWlABBAb0b0uIqFgtR) | train | val | 44.13 | 81.42 | 16 | 15W | [DeepLabV3](https://github.com/youansheng/torchcv/blob/master/scripts/seg/ade20k/run_fs_res101_deeplabv3_ade20k_seg.sh) |\n\n#### Object Detection\n- Pascal VOC2007/2012 (Single Scale Test): 20 Classes\n\n| Model | Backbone | Train | Test | mAP | BS | Epochs | Scripts |\n|:--------|:---------|:------|:------|:------|:------|:------|:------|\n| 
[SSD300](https://drive.google.com/open?id=15J5blVyZq7lqCePh-Q8S2pxim3-f_8LP) | [VGG16](https://drive.google.com/open?id=1nM0UwmqR4lIHzmRWvs71jfP_gAekjuKy) | 07+12_trainval | 07_test | 0.786 | 32 | 235 | [SSD300](https://github.com/youansheng/torchcv/blob/master/scripts/det/voc/run_ssd300_vgg16_voc_det.sh) |\n| [SSD512](https://drive.google.com/open?id=1RF5gnqfiyz-EcSFU1OSK7tNuX_VRObVW) | [VGG16](https://drive.google.com/open?id=1nM0UwmqR4lIHzmRWvs71jfP_gAekjuKy) | 07+12_trainval | 07_test | 0.808 | 32 | 235 | [SSD512](https://github.com/youansheng/torchcv/blob/master/scripts/det/voc/run_ssd512_vgg16_voc_det.sh) |\n| [Faster R-CNN](https://drive.google.com/open?id=15SfklRiI1McVWEq9EAceznK-9sxXSQR4) | [VGG16](https://drive.google.com/open?id=1ZL9SS9KRzsDQhMe8kyPQ1LHA60wx_Vcj) | 07_trainval | 07_test | 0.706 | 1 | 15 | [Faster R-CNN](https://github.com/youansheng/torchcv/blob/master/scripts/det/voc/run_fr_vgg16_voc_det.sh) |\n\n#### Pose Estimation\n- OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields\n\n#### Instance Segmentation\n- Mask R-CNN\n\n#### Generative Adversarial Networks\n- Pix2pix\n- CycleGAN\n\n\n## DataSets with TorchCV\nTorchCV defines a dataset format for each task, which you can check in the subdirectories of [data](https://github.com/youansheng/torchcv/tree/master/data). The following is an example dataset directory tree for training semantic segmentation. 
You can preprocess the public datasets with the scripts in the [data/seg/preprocess](https://github.com/youansheng/torchcv/tree/master/data/seg/preprocess) folder.\n```\nDataset\n    train\n        image\n            00001.jpg/png\n            00002.jpg/png\n            ...\n        label\n            00001.png\n            00002.png\n            ...\n    val\n        image\n            00001.jpg/png\n            00002.jpg/png\n            ...\n        label\n            00001.png\n            00002.png\n            ...\n```\n\n\n## Commands with TorchCV\n\nTake PSPNet as an example. (\"tag\" can be any string, including an empty one.)\n- Training\n```bash\ncd scripts/seg/cityscapes/\nbash run_fs_pspnet_cityscapes_seg.sh train tag\n```\n\n- Resume Training\n```bash\ncd scripts/seg/cityscapes/\nbash run_fs_pspnet_cityscapes_seg.sh train tag\n```\n\n- Validate\n```bash\ncd scripts/seg/cityscapes/\nbash run_fs_pspnet_cityscapes_seg.sh val tag\n```\n\n- Testing:\n```bash\ncd scripts/seg/cityscapes/\nbash run_fs_pspnet_cityscapes_seg.sh test tag\n```\n\n## Demos with TorchCV\n\n\u003cdiv align=\"center\"\u003e\n\n\u003cimg src=\"demo/openpose/samples/000000319721_vis.png\" width=\"500px\"/\u003e\n\n\u003cp\u003e Example output of \u003cb\u003eVGG19-OpenPose\u003c/b\u003e\u003c/p\u003e\n\n\u003cimg src=\"demo/openpose/samples/000000475191_vis.png\" width=\"500px\"/\u003e\n\n\u003cp\u003e Example output of \u003cb\u003eVGG19-OpenPose\u003c/b\u003e\u003c/p\u003e\n\n\u003c/div\u003e\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdonnyyou%2Ftorchcv","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdonnyyou%2Ftorchcv","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdonnyyou%2Ftorchcv/lists"}