https://github.com/donnyyou/torchcv

TorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision

Host: GitHub
URL: https://github.com/donnyyou/torchcv
Owner: donnyyou
License: apache-2.0
Created: 2018-10-19T03:38:47.000Z (about 7 years ago)
Default Branch: master
Last Pushed: 2020-11-19T05:40:57.000Z (about 5 years ago)
Last Synced: 2025-04-07T20:11:30.326Z (8 months ago)
Language: Shell
Homepage: https://pytorchcv.com
Size: 28.8 MB
Stars: 2,250
Watchers: 69
Forks: 373
Open Issues: 42
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

Awesome-pytorch-list-CNVersion - TorchCV
awesome-list - TorchCV - A PyTorch-Based Framework for Deep Learning in Computer Vision. (Computer Vision / General Purpose CV)
Awesome-pytorch-list - TorchCV - Based Framework for Deep Learning in Computer Vision. (Pytorch & related libraries / CV:)

README

# TorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision

```
@misc{you2019torchcv,
    author = {Ansheng You and Xiangtai Li and Zhen Zhu and Yunhai Tong},
    title = {TorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision},
    howpublished = {\url{https://github.com/donnyyou/torchcv}},
    year = {2019}
}
```

This repository provides source code for most deep-learning-based computer vision problems. We will do our best to keep it up to date. If you find a problem with this repository, please raise an issue or submit a pull request.

```diff
- Semantic Flow for Fast and Accurate Scene Parsing
- Code and models: https://github.com/lxtGH/SFSegNets
```

## Implemented Papers

- [Image Classification](https://github.com/youansheng/torchcv/tree/master/runner/cls)

    - VGG: Very Deep Convolutional Networks for Large-Scale Image Recognition

    - ResNet: Deep Residual Learning for Image Recognition

    - DenseNet: Densely Connected Convolutional Networks

    - ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices

    - ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design

    - Partial Order Pruning: for Best Speed/Accuracy Trade-off in Neural Architecture Search

- [Semantic Segmentation](https://github.com/youansheng/torchcv/tree/master/runner/seg)

    - DeepLabV3: Rethinking Atrous Convolution for Semantic Image Segmentation

    - PSPNet: Pyramid Scene Parsing Network

    - DenseASPP: DenseASPP for Semantic Segmentation in Street Scenes

    - Asymmetric Non-local Neural Networks for Semantic Segmentation

    - Semantic Flow for Fast and Accurate Scene Parsing

    

- [Object Detection](https://github.com/youansheng/torchcv/tree/master/runner/det)

    - SSD: Single Shot MultiBox Detector

    - Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

    - YOLOv3: An Incremental Improvement

    - FPN: Feature Pyramid Networks for Object Detection

- [Pose Estimation](https://github.com/youansheng/torchcv/tree/master/runner/pose)

    - CPM: Convolutional Pose Machines

    - OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields

- [Instance Segmentation](https://github.com/youansheng/torchcv/tree/master/runner/seg)

    - Mask R-CNN

- [Generative Adversarial Networks](https://github.com/youansheng/torchcv/tree/master/runner/gan)

    - Pix2pix: Image-to-Image Translation with Conditional Adversarial Nets

    - CycleGAN: Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks

## QuickStart with TorchCV

TorchCV currently supports only Python 3.x and PyTorch 1.3.

```bash
# Install the Python dependencies.
pip3 install -r requirements.txt
# Run the build script in lib/exts.
cd lib/exts
sh make.sh
```

## Performances with TorchCV

All results shown below come from TorchCV reimplementations that fully reproduce the results reported in the papers.

#### Image Classification

- ImageNet (Center Crop Test): 224x224

| Model | Train | Test | Top-1 | Top-5 | Batch Size | Iters | Scripts |
|:--------|:---------|:------|:------|:------|:------|:------|:------|
| ResNet50 | train | val | 77.54 | 93.59 | 512 | 300K | [ResNet50](https://github.com/youansheng/torchcv/blob/master/scripts/cls/imagenet/run_ic_res50_imagenet_cls.sh) |
| ResNet101 | train | val | 78.94 | 94.56 | 512 | 300K | [ResNet101](https://github.com/youansheng/torchcv/blob/master/scripts/cls/imagenet/run_ic_res101_imagenet_cls.sh) |
| ShuffleNetV2x0.5 | train | val | 60.90 | 82.54 | 1024 | 400K | [ShuffleNetV2x0.5](https://github.com/youansheng/torchcv/blob/master/scripts/cls/imagenet/run_ic_shufflenetv2x0.5_imagenet_cls.sh) |
| ShuffleNetV2x1.0 | train | val | 69.71 | 88.91 | 1024 | 400K | [ShuffleNetV2x1.0](https://github.com/youansheng/torchcv/blob/master/scripts/cls/imagenet/run_ic_shufflenetv2x1.0_imagenet_cls.sh) |
| DFNetV1 | train | val | 70.99 | 89.68 | 1024 | 400K | [DFNetV1](https://github.com/youansheng/torchcv/blob/master/scripts/cls/imagenet/run_ic_dfnetv1_imagenet_cls.sh) |
| DFNetV2 | train | val | 74.22 | 91.61 | 1024 | 400K | [DFNetV2](https://github.com/youansheng/torchcv/blob/master/scripts/cls/imagenet/run_ic_dfnetv2_imagenet_cls.sh) |
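For readers unfamiliar with the Top-1 and Top-5 columns, the following is a minimal pure-Python sketch of top-k accuracy. The function name `topk_accuracy` and the toy scores are illustrative assumptions; this is not TorchCV's own evaluation code.

```python
from typing import Sequence


def topk_accuracy(scores: Sequence[Sequence[float]],
                  labels: Sequence[int],
                  k: int = 1) -> float:
    """Percentage of samples whose true label is among the k highest scores.

    Hypothetical helper for illustration; not part of TorchCV.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in ranked
    return 100.0 * hits / len(labels)


# Toy example with made-up scores: 3 samples, 4 classes.
scores = [
    [0.1, 0.6, 0.2, 0.1],  # true class 1 is ranked first
    [0.5, 0.1, 0.3, 0.1],  # true class 2 is ranked second
    [0.4, 0.3, 0.2, 0.1],  # true class 3 is ranked last
]
labels = [1, 2, 3]
top1 = topk_accuracy(scores, labels, k=1)
top2 = topk_accuracy(scores, labels, k=2)
print(f"top-1: {top1:.2f}%  top-2: {top2:.2f}%")
```

The table reports the same quantity with k = 1 and k = 5 over the ImageNet validation set.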

#### Semantic Segmentation

- Cityscapes (Single Scale Whole Image Test): Base LR 0.01, Crop Size 769

| Model | Backbone | Train | Test | mIOU | Batch Size | Iters | Scripts |
|:--------|:---------|:------|:------|:------|:------|:------|:------|
| PSPNet | [3x3-Res101](https://drive.google.com/open?id=1bUzCKazlh8ElGVYWlABBAb0b0uIqFgtR) | train | val | 78.20 | 8 | 40K | [PSPNet](https://github.com/youansheng/torchcv/blob/master/scripts/seg/cityscapes/run_fs_pspnet_cityscapes_seg.sh) |
| DeepLabV3 | [3x3-Res101](https://drive.google.com/open?id=1bUzCKazlh8ElGVYWlABBAb0b0uIqFgtR) | train | val | 79.13 | 8 | 40K | [DeepLabV3](https://github.com/youansheng/torchcv/blob/master/scripts/seg/cityscapes/run_fs_deeplabv3_cityscapes_seg.sh) |

- ADE20K (Single Scale Whole Image Test): Base LR 0.02, Crop Size 520

| Model | Backbone | Train | Test | mIOU | PixelACC | Batch Size | Iters | Scripts |
|:--------|:---------|:------|:------|:------|:------|:------|:------|:------|
| PSPNet | [3x3-Res50](https://drive.google.com/open?id=1zPQLFd9c1yHfkQn5CWBCcEKmjEEqxsWx) | train | val | 41.52 | 80.09 | 16 | 150K | [PSPNet](https://github.com/youansheng/torchcv/blob/master/scripts/seg/ade20k/run_fs_res50_pspnet_ade20k_seg.sh) |
| DeepLabV3 | [3x3-Res50](https://drive.google.com/open?id=1zPQLFd9c1yHfkQn5CWBCcEKmjEEqxsWx) | train | val | 42.16 | 80.36 | 16 | 150K | [DeepLabV3](https://github.com/youansheng/torchcv/blob/master/scripts/seg/ade20k/run_fs_res50_deeplabv3_ade20k_seg.sh) |
| PSPNet | [3x3-Res101](https://drive.google.com/open?id=1bUzCKazlh8ElGVYWlABBAb0b0uIqFgtR) | train | val | 43.60 | 81.30 | 16 | 150K | [PSPNet](https://github.com/youansheng/torchcv/blob/master/scripts/seg/ade20k/run_fs_res101_pspnet_ade20k_seg.sh) |
| DeepLabV3 | [3x3-Res101](https://drive.google.com/open?id=1bUzCKazlh8ElGVYWlABBAb0b0uIqFgtR) | train | val | 44.13 | 81.42 | 16 | 150K | [DeepLabV3](https://github.com/youansheng/torchcv/blob/master/scripts/seg/ade20k/run_fs_res101_deeplabv3_ade20k_seg.sh) |
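The mIOU and PixelACC columns can be illustrated with a small confusion-matrix sketch. This is a pure-Python illustration under stated assumptions, not TorchCV's evaluation code: the function names are hypothetical, the `ignore_index` value of 255 is an assumed convention, and classes absent from both prediction and ground truth are skipped when averaging.

```python
from typing import List, Sequence, Tuple


def confusion_matrix(pred: Sequence[int], target: Sequence[int],
                     num_classes: int, ignore_index: int = 255) -> List[List[int]]:
    """Count pixels per (ground-truth class, predicted class) pair.

    ignore_index=255 is an assumed convention for unlabeled pixels.
    """
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixels do not count towards any metric
        matrix[t][p] += 1
    return matrix


def miou_and_pixel_acc(matrix: List[List[int]]) -> Tuple[float, float]:
    """Return (mean IoU, pixel accuracy), both as percentages."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix contains no labeled pixels")
    correct = sum(matrix[c][c] for c in range(n))
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        gt_pixels = sum(matrix[c])                       # row sum
        pred_pixels = sum(matrix[r][c] for r in range(n))  # column sum
        union = gt_pixels + pred_pixels - tp
        if union > 0:  # assumption: skip classes that never occur
            ious.append(tp / union)
    return 100.0 * sum(ious) / len(ious), 100.0 * correct / total


# Toy example: six flattened pixels, three classes, one ignored pixel.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
miou, pixel_acc = miou_and_pixel_acc(confusion_matrix(pred, target, num_classes=3))
print(f"mIoU: {miou:.2f}  pixel accuracy: {pixel_acc:.2f}")
```

Accumulating one confusion matrix over the whole validation set and computing the metrics once at the end is a common design choice, since averaging per-image IoU values gives a different number.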

#### Object Detection

- Pascal VOC2007/2012 (Single Scale Test): 20 Classes

| Model | Backbone | Train | Test | mAP | Batch Size | Epochs | Scripts |
|:--------|:---------|:------|:------|:------|:------|:------|:------|
| [SSD300](https://drive.google.com/open?id=15J5blVyZq7lqCePh-Q8S2pxim3-f_8LP) | [VGG16](https://drive.google.com/open?id=1nM0UwmqR4lIHzmRWvs71jfP_gAekjuKy) | 07+12_trainval | 07_test | 0.786 | 32 | 235 | [SSD300](https://github.com/youansheng/torchcv/blob/master/scripts/det/voc/run_ssd300_vgg16_voc_det.sh) |
| [SSD512](https://drive.google.com/open?id=1RF5gnqfiyz-EcSFU1OSK7tNuX_VRObVW) | [VGG16](https://drive.google.com/open?id=1nM0UwmqR4lIHzmRWvs71jfP_gAekjuKy) | 07+12_trainval | 07_test | 0.808 | 32 | 235 | [SSD512](https://github.com/youansheng/torchcv/blob/master/scripts/det/voc/run_ssd512_vgg16_voc_det.sh) |
| [Faster R-CNN](https://drive.google.com/open?id=15SfklRiI1McVWEq9EAceznK-9sxXSQR4) | [VGG16](https://drive.google.com/open?id=1ZL9SS9KRzsDQhMe8kyPQ1LHA60wx_Vcj) | 07_trainval | 07_test | 0.706 | 1 | 15 | [Faster R-CNN](https://github.com/youansheng/torchcv/blob/master/scripts/det/voc/run_fr_vgg16_voc_det.sh) |
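The mAP column builds on box overlap: a detection is usually counted as correct when its intersection over union (IoU) with a ground-truth box is high enough. The sketch below shows only that building block. The function name `box_iou`, the `(x1, y1, x2, y2)` box layout, and the 0.5 threshold are assumptions for illustration and are not taken from TorchCV's code.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)

# Assumed matching threshold; 0.5 is the value commonly used for Pascal VOC.
MATCH_THRESHOLD = 0.5


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Toy example: two 10x10 boxes that overlap in a 5x5 region.
iou = box_iou((0, 0, 10, 10), (5, 5, 15, 15))
print(f"IoU = {iou:.4f}, match = {iou >= MATCH_THRESHOLD}")
```

A full mAP computation additionally ranks detections by confidence, matches each one to at most one ground-truth box, and averages precision over recall levels and classes.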

#### Pose Estimation

- OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields

#### Instance Segmentation

- Mask R-CNN

#### Generative Adversarial Networks

- Pix2pix

- CycleGAN

## DataSets with TorchCV

TorchCV defines a dataset format for every task; see the subdirectories of [data](https://github.com/youansheng/torchcv/tree/master/data). The following is an example dataset directory tree for training semantic segmentation. You can preprocess the public datasets with the scripts in [data/seg/preprocess](https://github.com/youansheng/torchcv/tree/master/data/seg/preprocess).

```
Dataset
    train
        image
            00001.jpg/png
            00002.jpg/png
            ...
        label
            00001.png
            00002.png
            ...
    val
        image
            00001.jpg/png
            00002.jpg/png
            ...
        label
            00001.png
            00002.png
            ...
```
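Before training, it can help to confirm that a dataset follows this layout. The sketch below is a hypothetical helper, not part of TorchCV: it assumes that an image and its label share the same file stem (as in `00001.jpg` and `00001.png` above) and reports any file without a partner.

```python
import tempfile
from pathlib import Path
from typing import Dict, List

IMAGE_EXTS = {".jpg", ".png"}  # image extensions shown in the tree above


def find_unpaired(dataset_root: Path, split: str) -> Dict[str, List[str]]:
    """List file stems in <root>/<split> that lack an image or a label.

    Hypothetical check; assumes images and labels are paired by file stem.
    """
    image_dir = dataset_root / split / "image"
    label_dir = dataset_root / split / "label"
    images = {p.stem for p in image_dir.iterdir() if p.suffix.lower() in IMAGE_EXTS}
    labels = {p.stem for p in label_dir.iterdir() if p.suffix.lower() == ".png"}
    return {
        "missing_label": sorted(images - labels),
        "missing_image": sorted(labels - images),
    }


# Demo on a throwaway directory that mimics the layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "Dataset"
    for split in ("train", "val"):
        (root / split / "image").mkdir(parents=True)
        (root / split / "label").mkdir(parents=True)
    (root / "train" / "image" / "00001.jpg").touch()
    (root / "train" / "label" / "00001.png").touch()
    (root / "train" / "image" / "00002.png").touch()  # deliberately has no label
    report = find_unpaired(root, "train")
    print(report)
```

Running the same check on both `train` and `val` catches incomplete downloads or preprocessing runs that stopped early.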

## Commands with TorchCV

Take PSPNet as an example. ("tag" can be any string, including an empty one.)

- Training

```bash
cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh train tag
```

- Resume Training

```bash
cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh train tag
```

- Validate

```bash
cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh val tag
```

- Testing

```bash
cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh test tag
```
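When launching several runs from Python, the same invocations can be assembled programmatically. The sketch below is an illustration only: `build_command` and `VALID_PHASES` are hypothetical names, and the accepted phases are limited to the three shown in the commands above.

```python
import shlex
from typing import List

# Phases that appear in the commands above; other phases are not assumed.
VALID_PHASES = ("train", "val", "test")


def build_command(script: str, phase: str, tag: str = "") -> List[str]:
    """Build the argument list for one of the run scripts (hypothetical helper)."""
    if phase not in VALID_PHASES:
        raise ValueError(f"phase must be one of {VALID_PHASES}, got {phase!r}")
    return ["bash", script, phase, tag]


cmd = build_command("run_fs_pspnet_cityscapes_seg.sh", "val", "tag")
print(shlex.join(cmd))

# To actually run it from the repository root (not executed here):
# import subprocess
# subprocess.run(cmd, cwd="scripts/seg/cityscapes", check=True)
```

Passing the arguments as a list avoids shell quoting problems if a tag contains spaces.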

## Demos with TorchCV





Example output of VGG19-OpenPose
