https://github.com/dmlc/gluon-cv

Gluon CV Toolkit
action-recognition computer-vision deep-learning gan gluon image-classification machine-learning mxnet neural-network object-detection person-reid pose-estimation semantic-segmentation

Last synced: 10 months ago

Host: GitHub
URL: https://github.com/dmlc/gluon-cv
Owner: dmlc
License: apache-2.0
Created: 2018-02-26T01:33:21.000Z (about 8 years ago)
Default Branch: master
Last Pushed: 2024-11-25T15:30:52.000Z (over 1 year ago)
Last Synced: 2025-05-13T11:10:08.612Z (10 months ago)
Topics: action-recognition, computer-vision, deep-learning, gan, gluon, image-classification, machine-learning, mxnet, neural-network, object-detection, person-reid, pose-estimation, semantic-segmentation
Language: Python
Homepage: http://gluon-cv.mxnet.io
Size: 37.8 MB
Stars: 5,884
Watchers: 151
Forks: 1,206
Open Issues: 62
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

awesome-python-data-science - gluon-cv - Provides implementations of the state-of-the-art deep learning models in computer vision. MXNet based. (Deep Learning / MXNet)
awesome-list - GluonCV - A high-level computer vision library for PyTorch and MXNet. (Computer Vision / General Purpose CV)
awesome-yolo-object-detection - Gluon CV Toolkit - GluonCV provides implementations of the state-of-the-art (SOTA) deep learning models in computer vision. (Other Versions of YOLO)
awesome-semantic-segmentation-pytorch - gluon-cv
StarryDivineSky - dmlc/gluon-cv
awesome-python-machine-learning-resources - GitHub - 5% open · ⏱️ 11.08.2022 (Image Data and CV)
awesome-seg - gluon-cv
awesome-semantic-understanding-for-aerial-scene - 2019

README

# Gluon CV Toolkit

![Build Status](https://github.com/dmlc/gluon-cv/workflows/Unit%20Test/badge.svg?branch=master&event=push)

[![GitHub license](docs/_static/apache2.svg)](./LICENSE)

[![PyPI](https://img.shields.io/pypi/v/gluoncv.svg)](https://pypi.python.org/pypi/gluoncv)

[![PyPI Pre-release](https://img.shields.io/badge/pypi--prerelease-v0.11.0-ff69b4.svg)](https://pypi.org/project/gluoncv/#history)

[![Downloads](http://pepy.tech/badge/gluoncv)](http://pepy.tech/project/gluoncv)

[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/resnest-split-attention-networks/semantic-segmentation-on-ade20k)](https://paperswithcode.com/sota/semantic-segmentation-on-ade20k?p=resnest-split-attention-networks)

[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/resnest-split-attention-networks/object-detection-on-coco)](https://paperswithcode.com/sota/object-detection-on-coco?p=resnest-split-attention-networks)

[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/resnest-split-attention-networks/instance-segmentation-on-coco)](https://paperswithcode.com/sota/instance-segmentation-on-coco?p=resnest-split-attention-networks)

[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/resnest-split-attention-networks/panoptic-segmentation-on-coco-panoptic)](https://paperswithcode.com/sota/panoptic-segmentation-on-coco-panoptic?p=resnest-split-attention-networks)

[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/resnest-split-attention-networks/image-classification-on-imagenet)](https://paperswithcode.com/sota/image-classification-on-imagenet?p=resnest-split-attention-networks)

| [Installation](https://gluon-cv.mxnet.io/install.html) | [Documentation](https://gluon-cv.mxnet.io) | [Tutorials](https://gluon-cv.mxnet.io/tutorials/index.html) |

GluonCV provides implementations of state-of-the-art (SOTA) deep learning models in computer vision. It is designed for engineers, researchers, and students to quickly prototype products and research ideas based on these models. This toolkit offers five main features:

1. Training scripts to reproduce SOTA results reported in research papers
2. Support for both PyTorch and MXNet
3. A large number of pre-trained models
4. Carefully designed APIs that greatly reduce implementation complexity
5. Community support

Please also check out [AutoGluon](https://github.com/autogluon/autogluon) if you have [image classification](https://auto.gluon.ai/stable/tutorials/multimodal/image_prediction/index.html) or [object detection](https://auto.gluon.ai/stable/tutorials/multimodal/object_detection/index.html) needs. We have built the [MultimodalPredictor](https://auto.gluon.ai/stable/tutorials/multimodal/index.html) with an improved model zoo, including [TIMM](https://github.com/rwightman/pytorch-image-models), [Huggingface](https://huggingface.co/), [MMDetection](https://github.com/open-mmlab/mmdetection) and more. With just a few lines of code, you can train and deploy high-accuracy computer vision models for your application.

# Demo

Watch the HD video on [YouTube](https://www.youtube.com/watch?v=nfpouVAzXt0) or [Bilibili](https://www.bilibili.com/video/av55619231).

# Supported Applications

| Application | Available Models |
|:---:|:---:|
| [Image Classification:](https://gluon-cv.mxnet.io/model_zoo/classification.html) recognize an object in an image. | 50+ models, including ResNet, MobileNet, DenseNet, VGG, ... |
| [Object Detection:](https://gluon-cv.mxnet.io/model_zoo/detection.html) detect multiple objects with their bounding boxes in an image. | Faster RCNN, SSD, Yolo-v3 |
| [Semantic Segmentation:](https://gluon-cv.mxnet.io/model_zoo/segmentation.html#semantic-segmentation) associate each pixel of an image with a categorical label. | FCN, PSP, ICNet, DeepLab-v3, DeepLab-v3+, DANet, FastSCNN |
| [Instance Segmentation:](https://gluon-cv.mxnet.io/model_zoo/segmentation.html#instance-segmentation) detect objects and associate each pixel inside object area with an instance label. | Mask RCNN |
| [Pose Estimation:](https://gluon-cv.mxnet.io/model_zoo/pose.html) detect human pose from images. | Simple Pose |
| [Video Action Recognition:](https://gluon-cv.mxnet.io/model_zoo/action_recognition.html) recognize human actions in a video. | MXNet: TSN, C3D, I3D, I3D_slow, P3D, R3D, R2+1D, Non-local, SlowFast; PyTorch: TSN, I3D, I3D_slow, R2+1D, Non-local, CSN, SlowFast, TPN |
| [Depth Prediction:](https://gluon-cv.mxnet.io/model_zoo/depth.html) predict depth map from images. | Monodepth2 |
| [GAN:](https://github.com/dmlc/gluon-cv/tree/master/scripts/gan) generate visually deceptive images | WGAN, CycleGAN, StyleGAN |
| [Person Re-ID:](https://github.com/dmlc/gluon-cv/tree/master/scripts/re-id/baseline) re-identify pedestrians across scenes | Market1501 baseline |
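
As a rough illustration of how the applications above map onto the model zoo, the sketch below loads a pre-trained network by name through `gluoncv.model_zoo.get_model` (MXNet backend). The model names in `EXAMPLE_MODELS` are examples picked for illustration and should be checked against the linked model zoo pages; `load_pretrained` is a hypothetical helper, not a GluonCV function.

```python
# Example model zoo names per application.
# Assumption: illustrative names only; verify them against the model zoo pages.
EXAMPLE_MODELS = {
    "image_classification": "resnet50_v1b",
    "object_detection": "yolo3_darknet53_voc",
    "instance_segmentation": "mask_rcnn_resnet50_v1b_coco",
    "pose_estimation": "simple_pose_resnet18_v1b",
}


def load_pretrained(application):
    """Load a pre-trained MXNet model for one of the applications above.

    GluonCV is imported lazily, so this module can be imported without it.
    The first call downloads the weights if they are not already cached.
    """
    if application not in EXAMPLE_MODELS:
        raise KeyError(f"unknown application: {application!r}")
    from gluoncv import model_zoo  # requires gluoncv and mxnet to be installed
    return model_zoo.get_model(EXAMPLE_MODELS[application], pretrained=True)
```

Calling `load_pretrained("image_classification")` would return a Gluon network ready for inference, assuming GluonCV and MXNet are installed and the weights can be downloaded.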

# Installation

GluonCV is built on top of MXNet and PyTorch. Depending on the individual model implementation (check the [model zoo](https://gluon-cv.mxnet.io/model_zoo/index.html) for the complete list), you will need to install at least one of these deep learning frameworks. You can also install both for the best coverage.

Please also check the [installation guide](https://cv.gluon.ai/install.html) for help choosing the right installation command for your environment.
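
Because each model depends on one backend, it can help to confirm which frameworks are importable before picking a model. The following is a minimal sketch using only the Python standard library; the helper name `available_backends` is hypothetical and not part of GluonCV.

```python
import importlib.util

# Import names of the two backends GluonCV builds on.
BACKENDS = ("mxnet", "torch")


def available_backends(candidates=BACKENDS):
    """Return the subset of `candidates` that can be located in this environment.

    `find_spec` only locates a top-level package without importing it,
    so the check stays cheap even for large frameworks.
    """
    return [name for name in candidates if importlib.util.find_spec(name) is not None]


found = available_backends()
if found:
    print("Backends found:", ", ".join(found))
else:
    print("No backend found; install MXNet and/or PyTorch first.")
```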

## Installation (MXNet)

GluonCV supports Python 3.6 or later. The easiest way to install is via pip.

### Stable Release

The following commands install the stable version of GluonCV and MXNet:

```bash
pip install gluoncv --upgrade

# native
pip install -U --pre mxnet -f https://dist.mxnet.io/python/mkl

# cuda 10.2
pip install -U --pre mxnet -f https://dist.mxnet.io/python/cu102mkl
```

**The latest stable version of GluonCV is 0.8 and we recommend mxnet 1.6.0/1.7.0**
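
To compare an environment against this recommendation, one option is to read the installed distribution versions with `importlib.metadata` (available from Python 3.8). This is a sketch under assumptions: `installed_version` is a hypothetical helper, and the distribution names `gluoncv` and `mxnet` are taken from the pip commands above (CUDA builds of MXNet may be published under a different distribution name).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string of `distribution`, or None if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Assumption: distribution names match the pip commands in this section.
for name in ("gluoncv", "mxnet"):
    found = installed_version(name)
    print(f"{name}: {found if found is not None else 'not installed'}")
```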

### Nightly Release

You may get access to the latest features and bug fixes with the following commands, which install the nightly build of GluonCV and MXNet:

```bash
pip install gluoncv --pre --upgrade

# native
pip install -U --pre mxnet -f https://dist.mxnet.io/python/mkl

# cuda 10.2
pip install -U --pre mxnet -f https://dist.mxnet.io/python/cu102mkl
```

Multiple pre-built MXNet packages are available. Please refer to [mxnet packages](https://gluon-crash-course.mxnet.io/mxnet_packages.html) if you need more details about MXNet versions.

## Installation (PyTorch)

GluonCV supports Python 3.6 or later. The easiest way to install is via pip.

### Stable Release

The following commands install the stable version of GluonCV and PyTorch:

```bash
pip install gluoncv --upgrade

# native
pip install torch==1.6.0+cpu torchvision==0.7.0+cpu -f https://download.pytorch.org/whl/torch_stable.html

# cuda 10.2
pip install torch==1.6.0 torchvision==0.7.0
```

Multiple pre-built PyTorch packages are available. Please refer to [PyTorch](https://pytorch.org/get-started/previous-versions/) if you need other versions.

**The latest stable version of GluonCV is 0.8 and we recommend PyTorch 1.6.0**

### Nightly Release

You may get access to the latest features and bug fixes with the following commands, which install the nightly build of GluonCV:

```bash
pip install gluoncv --pre --upgrade

# native
pip install --pre torch torchvision torchaudio -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html

# cuda 10.2
pip install --pre torch torchvision torchaudio -f https://download.pytorch.org/whl/nightly/cu102/torch_nightly.html
```

# Docs 📖

GluonCV documentation is available at [our website](https://gluon-cv.mxnet.io/index.html).

# Examples

All tutorials are available at [our website](https://gluon-cv.mxnet.io/index.html)!

- [Image Classification](http://gluon-cv.mxnet.io/build/examples_classification/index.html)

- [Object Detection](http://gluon-cv.mxnet.io/build/examples_detection/index.html)

- [Semantic Segmentation](http://gluon-cv.mxnet.io/build/examples_segmentation/index.html)

- [Instance Segmentation](http://gluon-cv.mxnet.io/build/examples_instance/index.html)

- [Video Action Recognition](https://gluon-cv.mxnet.io/build/examples_action_recognition/index.html)

- [Depth Prediction](https://gluon-cv.mxnet.io/build/examples_depth/index.html)

- [Generative Adversarial Network](https://github.com/dmlc/gluon-cv/tree/master/scripts/gan)

- [Person Re-identification](https://github.com/dmlc/gluon-cv/tree/master/scripts/re-id/)

# Resources

Check out how to use GluonCV for your own research or projects.

- For background knowledge of deep learning or CV, please refer to the open source book [*Dive into Deep Learning*](http://d2l.ai/). If you are new to Gluon, please check out [our 60-minute crash course](http://gluon-crash-course.mxnet.io/).

- To get started quickly, refer to the runnable notebook examples at [Examples](https://gluon-cv.mxnet.io/build/examples_classification/index.html).

- For advanced examples, check out our [Scripts](http://gluon-cv.mxnet.io/master/scripts/index.html).

- For experienced users, check out our [API Notes](https://gluon-cv.mxnet.io/api/data.datasets.html#).

# Citation

If you find our code or models helpful in your research, please cite our papers:

```
@article{gluoncvnlp2020,
  author  = {Jian Guo and He He and Tong He and Leonard Lausen and Mu Li and Haibin Lin and Xingjian Shi and Chenguang Wang and Junyuan Xie and Sheng Zha and Aston Zhang and Hang Zhang and Zhi Zhang and Zhongyue Zhang and Shuai Zheng and Yi Zhu},
  title   = {GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing},
  journal = {Journal of Machine Learning Research},
  year    = {2020},
  volume  = {21},
  number  = {23},
  pages   = {1-7},
  url     = {http://jmlr.org/papers/v21/19-429.html}
}

@article{he2018bag,
  title={Bag of Tricks for Image Classification with Convolutional Neural Networks},
  author={He, Tong and Zhang, Zhi and Zhang, Hang and Zhang, Zhongyue and Xie, Junyuan and Li, Mu},
  journal={arXiv preprint arXiv:1812.01187},
  year={2018}
}

@article{zhang2019bag,
  title={Bag of Freebies for Training Object Detection Neural Networks},
  author={Zhang, Zhi and He, Tong and Zhang, Hang and Zhang, Zhongyue and Xie, Junyuan and Li, Mu},
  journal={arXiv preprint arXiv:1902.04103},
  year={2019}
}

@article{zhang2020resnest,
  title={ResNeSt: Split-Attention Networks},
  author={Zhang, Hang and Wu, Chongruo and Zhang, Zhongyue and Zhu, Yi and Zhang, Zhi and Lin, Haibin and Sun, Yue and He, Tong and Muller, Jonas and Manmatha, R. and Li, Mu and Smola, Alexander},
  journal={arXiv preprint arXiv:2004.08955},
  year={2020}
}
```
