{"id":13444033,"url":"https://github.com/dmlc/gluon-cv","last_synced_at":"2025-05-13T11:10:11.433Z","repository":{"id":37734313,"uuid":"122896249","full_name":"dmlc/gluon-cv","owner":"dmlc","description":"Gluon CV Toolkit","archived":false,"fork":false,"pushed_at":"2024-11-25T15:30:52.000Z","size":39688,"stargazers_count":5884,"open_issues_count":62,"forks_count":1206,"subscribers_count":151,"default_branch":"master","last_synced_at":"2025-05-13T11:10:08.612Z","etag":null,"topics":["action-recognition","computer-vision","deep-learning","gan","gluon","image-classification","machine-learning","mxnet","neural-network","object-detection","person-reid","pose-estimation","semantic-segmentation"],"latest_commit_sha":null,"homepage":"http://gluon-cv.mxnet.io","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dmlc.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2018-02-26T01:33:21.000Z","updated_at":"2025-05-13T04:38:36.000Z","dependencies_parsed_at":"2025-04-22T23:51:12.304Z","dependency_job_id":null,"html_url":"https://github.com/dmlc/gluon-cv","commit_stats":{"total_commits":894,"total_committers":125,"mean_commits":7.152,"dds":0.6845637583892618,"last_synced_commit":"567775619f3b97d47e7c360748912a4fd883ff52"},"previous_names":[],"tags_count":10,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dmlc%2Fgluon-cv","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dmlc%2Fgluon-cv/tags","releases_url":"https://repo
s.ecosyste.ms/api/v1/hosts/GitHub/repositories/dmlc%2Fgluon-cv/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dmlc%2Fgluon-cv/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dmlc","download_url":"https://codeload.github.com/dmlc/gluon-cv/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253929367,"owners_count":21985802,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["action-recognition","computer-vision","deep-learning","gan","gluon","image-classification","machine-learning","mxnet","neural-network","object-detection","person-reid","pose-estimation","semantic-segmentation"],"created_at":"2024-07-31T03:02:17.191Z","updated_at":"2025-05-13T11:10:11.399Z","avatar_url":"https://github.com/dmlc.png","language":"Python","funding_links":[],"categories":["Python","References","Deep Learning","Computer Vision","Code template \u0026 example","Other Versions of YOLO","对象检测_分割","图像数据与CV","Appendix: Object Detection for Natural Scene"],"sub_categories":["Demo","MXNet","General Purpose CV","资源传输下载","Papers"],"readme":"# Gluon CV Toolkit\n\n![Build Status](https://github.com/dmlc/gluon-cv/workflows/Unit%20Test/badge.svg?branch=master\u0026event=push)\n[![GitHub license](docs/_static/apache2.svg)](./LICENSE)\n[![PyPI](https://img.shields.io/pypi/v/gluoncv.svg)](https://pypi.python.org/pypi/gluoncv)\n[![PyPI 
Pre-release](https://img.shields.io/badge/pypi--prerelease-v0.11.0-ff69b4.svg)](https://pypi.org/project/gluoncv/#history)\n[![Downloads](http://pepy.tech/badge/gluoncv)](http://pepy.tech/project/gluoncv)\n\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/resnest-split-attention-networks/semantic-segmentation-on-ade20k)](https://paperswithcode.com/sota/semantic-segmentation-on-ade20k?p=resnest-split-attention-networks)\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/resnest-split-attention-networks/object-detection-on-coco)](https://paperswithcode.com/sota/object-detection-on-coco?p=resnest-split-attention-networks)\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/resnest-split-attention-networks/instance-segmentation-on-coco)](https://paperswithcode.com/sota/instance-segmentation-on-coco?p=resnest-split-attention-networks)\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/resnest-split-attention-networks/panoptic-segmentation-on-coco-panoptic)](https://paperswithcode.com/sota/panoptic-segmentation-on-coco-panoptic?p=resnest-split-attention-networks)\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/resnest-split-attention-networks/image-classification-on-imagenet)](https://paperswithcode.com/sota/image-classification-on-imagenet?p=resnest-split-attention-networks)\n\n| [Installation](https://gluon-cv.mxnet.io/install.html) | [Documentation](https://gluon-cv.mxnet.io) | [Tutorials](https://gluon-cv.mxnet.io/tutorials/index.html) |\n\nGluonCV provides implementations of state-of-the-art (SOTA) deep learning models in computer vision.\n\nIt is designed for engineers, researchers, and\nstudents to quickly prototype products and research ideas based on these\nmodels. This toolkit offers five main features:\n\n1. Training scripts to reproduce SOTA results reported in research papers\n2. 
Support for both PyTorch and MXNet\n3. A large number of pre-trained models\n4. Carefully designed APIs that greatly reduce the implementation complexity\n5. Community support\n\nPlease also check out [AutoGluon](https://github.com/autogluon/autogluon) if you have [image classification](https://auto.gluon.ai/stable/tutorials/multimodal/image_prediction/index.html) or [object detection](https://auto.gluon.ai/stable/tutorials/multimodal/object_detection/index.html) needs. We have built the [MultimodalPredictor](https://auto.gluon.ai/stable/tutorials/multimodal/index.html) with an improved model zoo, including [TIMM](https://github.com/rwightman/pytorch-image-models), [Hugging Face](https://huggingface.co/), [MMDetection](https://github.com/open-mmlab/mmdetection) and more. With just a few lines of code, you can train and deploy high-accuracy computer vision models for your application.\n\n\n# Demo\n\n\u003cdiv align=\"center\"\u003e\n    \u003cimg src=\"docs/_static/short_demo.gif\"\u003e\n\u003c/div\u003e\n\n\u003cbr\u003e\n\nCheck the HD video at [YouTube](https://www.youtube.com/watch?v=nfpouVAzXt0) or [Bilibili](https://www.bilibili.com/video/av55619231).\n\n\n# Supported Applications\n\n| Application  | Illustration  | Available Models |\n|:-----------------------:|:---:|:---:|\n| [Image Classification:](https://gluon-cv.mxnet.io/model_zoo/classification.html) \u003cbr/\u003erecognize an object in an image.  
| \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/classification.html\"\u003e\u003cimg  src=\"docs/_static/image-classification.png\" alt=\"classification\" height=\"200\"/\u003e\u003c/a\u003e  | 50+ models, including \u003cbr/\u003e\u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/classification.html#resnet\"\u003eResNet\u003c/a\u003e, \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/classification.html#mobilenet\"\u003eMobileNet\u003c/a\u003e, \u003cbr/\u003e\u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/classification.html#densenet\"\u003eDenseNet\u003c/a\u003e, \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/classification.html#vgg\"\u003eVGG\u003c/a\u003e, ... |\n| [Object Detection:](https://gluon-cv.mxnet.io/model_zoo/detection.html) \u003cbr/\u003edetect multiple objects with their \u003cbr/\u003e bounding boxes in an image.     |  \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/detection.html\"\u003e\u003cimg src=\"docs/_static/object-detection.png\" alt=\"detection\" height=\"200\"/\u003e\u003c/a\u003e | \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/detection.html#faster-rcnn\"\u003eFaster RCNN\u003c/a\u003e, \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/detection.html#ssd\"\u003eSSD\u003c/a\u003e, \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/detection.html#yolo-v3\"\u003eYolo-v3\u003c/a\u003e |\n| [Semantic Segmentation:](https://gluon-cv.mxnet.io/model_zoo/segmentation.html#semantic-segmentation) \u003cbr/\u003eassociate each pixel of an image \u003cbr/\u003e with a categorical label. 
|  \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/segmentation.html#semantic-segmentation\"\u003e\u003cimg src=\"docs/_static/semantic-segmentation.png\" alt=\"semantic\" height=\"200\"/\u003e\u003c/a\u003e | \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/segmentation.html#semantic-segmentation\"\u003eFCN\u003c/a\u003e, \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/segmentation.html#semantic-segmentation\"\u003ePSP\u003c/a\u003e, \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/segmentation.html#semantic-segmentation\"\u003eICNet\u003c/a\u003e, \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/segmentation.html#semantic-segmentation\"\u003eDeepLab-v3\u003c/a\u003e, \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/segmentation.html#semantic-segmentation\"\u003eDeepLab-v3+\u003c/a\u003e, \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/segmentation.html#semantic-segmentation\"\u003eDANet\u003c/a\u003e, \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/segmentation.html#semantic-segmentation\"\u003eFastSCNN\u003c/a\u003e |\n| [Instance Segmentation:](https://gluon-cv.mxnet.io/model_zoo/segmentation.html#instance-segmentation) \u003cbr/\u003edetect objects and associate \u003cbr/\u003e each pixel inside object area with an \u003cbr/\u003e instance label. | \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/segmentation.html#instance-segmentation\"\u003e\u003cimg src=\"docs/_static/instance-segmentation.png\" alt=\"instance\" height=\"200\"/\u003e\u003c/a\u003e | \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/segmentation.html#instance-segmentation\"\u003eMask RCNN\u003c/a\u003e|\n| [Pose Estimation:](https://gluon-cv.mxnet.io/model_zoo/pose.html) \u003cbr/\u003edetect human pose \u003cbr/\u003e from images. 
| \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/pose.html\"\u003e\u003cimg src=\"docs/_static/pose-estimation.svg\" alt=\"pose\" height=\"200\"/\u003e\u003c/a\u003e | \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/pose.html#simple-pose-with-resnet\"\u003eSimple Pose\u003c/a\u003e|\n| [Video Action Recognition:](https://gluon-cv.mxnet.io/model_zoo/action_recognition.html) \u003cbr/\u003erecognize human actions \u003cbr/\u003e in a video. | \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/action_recognition.html\"\u003e\u003cimg src=\"docs/_static/action-recognition.png\" alt=\"action_recognition\" height=\"200\"/\u003e\u003c/a\u003e | MXNet: \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/action_recognition.html\"\u003eTSN\u003c/a\u003e, \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/action_recognition.html\"\u003eC3D\u003c/a\u003e, \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/action_recognition.html\"\u003eI3D\u003c/a\u003e, \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/action_recognition.html\"\u003eI3D_slow\u003c/a\u003e, \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/action_recognition.html\"\u003eP3D\u003c/a\u003e, \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/action_recognition.html\"\u003eR3D\u003c/a\u003e, \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/action_recognition.html\"\u003eR2+1D\u003c/a\u003e, \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/action_recognition.html\"\u003eNon-local\u003c/a\u003e, \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/action_recognition.html\"\u003eSlowFast\u003c/a\u003e \u003cbr/\u003e PyTorch: \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/action_recognition.html\"\u003eTSN\u003c/a\u003e, \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/action_recognition.html\"\u003eI3D\u003c/a\u003e, \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/action_recognition.html\"\u003eI3D_slow\u003c/a\u003e, \u003ca 
href=\"https://gluon-cv.mxnet.io/model_zoo/action_recognition.html\"\u003eR2+1D\u003c/a\u003e, \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/action_recognition.html\"\u003eNon-local\u003c/a\u003e, \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/action_recognition.html\"\u003eCSN\u003c/a\u003e, \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/action_recognition.html\"\u003eSlowFast\u003c/a\u003e, \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/action_recognition.html\"\u003eTPN\u003c/a\u003e |\n| [Depth Prediction:](https://gluon-cv.mxnet.io/model_zoo/depth.html) \u003cbr/\u003epredict depth map \u003cbr/\u003e from images. | \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/depth.html\"\u003e\u003cimg src=\"docs/_static/depth.png\" alt=\"depth\" height=\"200\"/\u003e\u003c/a\u003e | \u003ca href=\"https://gluon-cv.mxnet.io/model_zoo/depth.html#kitti-dataset\"\u003eMonodepth2\u003c/a\u003e|\n| [GAN:](https://github.com/dmlc/gluon-cv/tree/master/scripts/gan) \u003cbr/\u003egenerate visually deceptive images | \u003ca href=\"https://github.com/dmlc/gluon-cv/tree/master/scripts/gan\"\u003e\u003cimg src=\"https://github.com/dmlc/gluon-cv/raw/master/scripts/gan/wgan/fake_samples_400000.png\" alt=\"lsun\" height=\"200\"/\u003e\u003c/a\u003e | \u003ca href=\"https://github.com/dmlc/gluon-cv/tree/master/scripts/gan/wgan\"\u003eWGAN\u003c/a\u003e, \u003ca href=\"https://github.com/dmlc/gluon-cv/tree/master/scripts/gan/cycle_gan\"\u003eCycleGAN\u003c/a\u003e, \u003ca href=\"https://github.com/dmlc/gluon-cv/tree/master/scripts/gan/stylegan\"\u003eStyleGAN\u003c/a\u003e|\n| [Person Re-ID:](https://github.com/dmlc/gluon-cv/tree/master/scripts/re-id/baseline) \u003cbr/\u003ere-identify pedestrians across scenes | \u003ca href=\"https://github.com/dmlc/gluon-cv/tree/master/scripts/re-id/baseline\"\u003e\u003cimg src=\"https://user-images.githubusercontent.com/3307514/46702937-f4311800-cbd9-11e8-8eeb-c945ec5643fb.png\" alt=\"re-id\" height=\"160\"/\u003e\u003c/a\u003e 
|\u003ca href=\"https://github.com/dmlc/gluon-cv/tree/master/scripts/re-id/baseline\"\u003eMarket1501 baseline\u003c/a\u003e |\n\n# Installation\n\nGluonCV is built on top of MXNet and PyTorch. Depending on the individual model implementation (check the [model zoo](https://gluon-cv.mxnet.io/model_zoo/index.html) for the complete list), you will need to install one of the two deep learning frameworks. Of course, you can always install both for the best coverage.\n\nPlease also check the [installation guide](https://cv.gluon.ai/install.html) to help you choose the right installation command for your environment.\n\n## Installation (MXNet)\n\nGluonCV supports Python 3.6 or later. The easiest way to install is via pip.\n\n### Stable Release\nThe following commands install the stable version of GluonCV and MXNet:\n\n```bash\npip install gluoncv --upgrade\n# native\npip install -U --pre mxnet -f https://dist.mxnet.io/python/mkl\n# cuda 10.2\npip install -U --pre mxnet -f https://dist.mxnet.io/python/cu102mkl\n```\n\n**The latest stable version of GluonCV is 0.8 and we recommend MXNet 1.6.0/1.7.0**\n\n### Nightly Release\n\nYou can get access to the latest features and bug fixes with the following commands, which install the nightly build of GluonCV and MXNet:\n\n```bash\npip install gluoncv --pre --upgrade\n# native\npip install -U --pre mxnet -f https://dist.mxnet.io/python/mkl\n# cuda 10.2\npip install -U --pre mxnet -f https://dist.mxnet.io/python/cu102mkl\n```\n\nMultiple pre-built MXNet packages are available. Please refer to [mxnet packages](https://gluon-crash-course.mxnet.io/mxnet_packages.html) if you need more details about MXNet versions.\n\n\n## Installation (PyTorch)\n\nGluonCV supports Python 3.6 or later. 
The easiest way to install is via pip.\n\n### Stable Release\nThe following commands install the stable version of GluonCV and PyTorch:\n\n```bash\npip install gluoncv --upgrade\n# native\npip install torch==1.6.0+cpu torchvision==0.7.0+cpu -f https://download.pytorch.org/whl/torch_stable.html\n# cuda 10.2\npip install torch==1.6.0 torchvision==0.7.0\n```\nMultiple pre-built PyTorch packages are available. Please refer to [PyTorch](https://pytorch.org/get-started/previous-versions/) if you need other versions.\n\n\n**The latest stable version of GluonCV is 0.8 and we recommend PyTorch 1.6.0**\n\n### Nightly Release\n\nYou can get access to the latest features and bug fixes with the following commands, which install the nightly build of GluonCV:\n\n```bash\npip install gluoncv --pre --upgrade\n# native\npip install --pre torch torchvision torchaudio -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html\n# cuda 10.2\npip install --pre torch torchvision torchaudio -f https://download.pytorch.org/whl/nightly/cu102/torch_nightly.html\n```\n\n\n# Docs 📖\nGluonCV documentation is available at [our website](https://gluon-cv.mxnet.io/index.html).\n\n# Examples\n\nAll tutorials are available at [our website](https://gluon-cv.mxnet.io/index.html)!\n\n- [Image Classification](http://gluon-cv.mxnet.io/build/examples_classification/index.html)\n\n- [Object Detection](http://gluon-cv.mxnet.io/build/examples_detection/index.html)\n\n- [Semantic Segmentation](http://gluon-cv.mxnet.io/build/examples_segmentation/index.html)\n\n- [Instance Segmentation](http://gluon-cv.mxnet.io/build/examples_instance/index.html)\n\n- [Video Action Recognition](https://gluon-cv.mxnet.io/build/examples_action_recognition/index.html)\n\n- [Depth Prediction](https://gluon-cv.mxnet.io/build/examples_depth/index.html)\n\n- [Generative Adversarial Network](https://github.com/dmlc/gluon-cv/tree/master/scripts/gan)\n\n- [Person 
Re-identification](https://github.com/dmlc/gluon-cv/tree/master/scripts/re-id/)\n\n# Resources\n\nCheck out how to use GluonCV for your own research or projects.\n\n- For background knowledge of deep learning or CV, please refer to the open source book [*Dive into Deep Learning*](http://d2l.ai/). If you are new to Gluon, please check out [our 60-minute crash course](http://gluon-crash-course.mxnet.io/).\n- To get started quickly, refer to the runnable notebook examples at [Examples](https://gluon-cv.mxnet.io/build/examples_classification/index.html).\n- For advanced examples, check out our [Scripts](http://gluon-cv.mxnet.io/master/scripts/index.html).\n- For experienced users, check out our [API Notes](https://gluon-cv.mxnet.io/api/data.datasets.html#).\n\n# Citation\n\nIf you find our code or models helpful in your research, please cite our papers:\n\n```\n@article{gluoncvnlp2020,\n  author  = {Jian Guo and He He and Tong He and Leonard Lausen and Mu Li and Haibin Lin and Xingjian Shi and Chenguang Wang and Junyuan Xie and Sheng Zha and Aston Zhang and Hang Zhang and Zhi Zhang and Zhongyue Zhang and Shuai Zheng and Yi Zhu},\n  title   = {GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing},\n  journal = {Journal of Machine Learning Research},\n  year    = {2020},\n  volume  = {21},\n  number  = {23},\n  pages   = {1-7},\n  url     = {http://jmlr.org/papers/v21/19-429.html}\n}\n\n@article{he2018bag,\n  title={Bag of Tricks for Image Classification with Convolutional Neural Networks},\n  author={He, Tong and Zhang, Zhi and Zhang, Hang and Zhang, Zhongyue and Xie, Junyuan and Li, Mu},\n  journal={arXiv preprint arXiv:1812.01187},\n  year={2018}\n}\n\n@article{zhang2019bag,\n  title={Bag of Freebies for Training Object Detection Neural Networks},\n  author={Zhang, Zhi and He, Tong and Zhang, Hang and Zhang, Zhongyue and Xie, Junyuan and Li, Mu},\n  journal={arXiv preprint arXiv:1902.04103},\n  
year={2019}\n}\n\n@article{zhang2020resnest,\n  title={ResNeSt: Split-Attention Networks},\n  author={Zhang, Hang and Wu, Chongruo and Zhang, Zhongyue and Zhu, Yi and Zhang, Zhi and Lin, Haibin and Sun, Yue and He, Tong and Muller, Jonas and Manmatha, R. and Li, Mu and Smola, Alexander},\n  journal={arXiv preprint arXiv:2004.08955},\n  year={2020}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdmlc%2Fgluon-cv","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdmlc%2Fgluon-cv","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdmlc%2Fgluon-cv/lists"}