{"id":13564285,"url":"https://github.com/kdexd/virtex","last_synced_at":"2025-04-05T03:02:06.518Z","repository":{"id":37425125,"uuid":"208883499","full_name":"kdexd/virtex","owner":"kdexd","description":"[CVPR 2021] VirTex: Learning Visual Representations from Textual Annotations","archived":false,"fork":false,"pushed_at":"2024-01-01T21:38:15.000Z","size":3824,"stargazers_count":561,"open_issues_count":7,"forks_count":61,"subscribers_count":13,"default_branch":"master","last_synced_at":"2025-03-29T02:01:42.066Z","etag":null,"topics":["coco-dataset","cvpr2021","image-captioning","model-zoo","pretrained-models"],"latest_commit_sha":null,"homepage":"http://kdexd.xyz/virtex","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kdexd.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2019-09-16T19:42:13.000Z","updated_at":"2025-03-20T06:24:31.000Z","dependencies_parsed_at":"2024-04-20T05:34:05.993Z","dependency_job_id":"f3d6d02d-980b-4c42-8d5e-03bdbf534404","html_url":"https://github.com/kdexd/virtex","commit_stats":null,"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kdexd%2Fvirtex","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kdexd%2Fvirtex/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kdexd%2Fvirtex/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kdexd%2Fvirtex/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kdexd"
,"download_url":"https://codeload.github.com/kdexd/virtex/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247280190,"owners_count":20912966,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["coco-dataset","cvpr2021","image-captioning","model-zoo","pretrained-models"],"created_at":"2024-08-01T13:01:29.201Z","updated_at":"2025-04-05T03:02:06.479Z","avatar_url":"https://github.com/kdexd.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"VirTex: Learning Visual Representations from Textual Annotations\n================================================================\n\n\u003ch4\u003e\nKaran Desai and Justin Johnson\n\u003c/br\u003e\n\u003cspan style=\"font-size: 14pt; color: #555555\"\u003e\nUniversity of Michigan\n\u003c/span\u003e\n\u003c/h4\u003e\n\u003chr\u003e\n\n**CVPR 2021** [arxiv.org/abs/2006.06666][1]\n\n**Model Zoo, Usage Instructions and API docs:** [kdexd.github.io/virtex](https://kdexd.github.io/virtex)\n\nVirTex is a pretraining approach which uses semantically dense captions to\nlearn visual representations. 
We train CNN + Transformers from scratch on\nCOCO Captions, and transfer the CNN to downstream vision tasks including\nimage classification, object detection, and instance segmentation.\nVirTex matches or outperforms models which use ImageNet for pretraining -- \nboth supervised and unsupervised -- despite using up to 10x fewer images.\n\n![virtex-model](docs/_static/system_figure.jpg)\n\n\nGet the pretrained ResNet-50 visual backbone from our best-performing VirTex\nmodel in one line *without any installation*!\n\n```python\nimport torch\n\n# That's it, this one line only requires PyTorch.\nmodel = torch.hub.load(\"kdexd/virtex\", \"resnet50\", pretrained=True)\n```\n\n### Note (For returning users before January 2021):\n\nThe pretrained models in our model zoo have changed from [`v1.0`](https://github.com/kdexd/virtex/releases/tag/v1.0) onwards.\nThey are slightly better tuned than older models, and reproduce the results in our\nCVPR 2021 accepted paper ([arXiv v2](https://arxiv.org/abs/2006.06666v2)). \nSome training and evaluation hyperparameters have changed since [`v0.9`](https://github.com/kdexd/virtex/releases/tag/v0.9).\nPlease refer to [`CHANGELOG.md`](https://github.com/kdexd/virtex/blob/master/CHANGELOG.md) for details.\n\n\nUsage Instructions\n------------------\n\n1. [How to set up this codebase?][2]  \n2. [VirTex Model Zoo][3]  \n3. [How to train your VirTex model?][4]  \n4. 
[How to evaluate on downstream tasks?][5]  \n\nFull documentation is available at [kdexd.github.io/virtex](https://kdexd.github.io/virtex).\n\n\nCitation\n--------\n\nIf you find this code useful, please consider citing:\n\n```text\n@inproceedings{desai2021virtex,\n    title={{VirTex: Learning Visual Representations from Textual Annotations}},\n    author={Karan Desai and Justin Johnson},\n    booktitle={CVPR},\n    year={2021}\n}\n```\n\nAcknowledgments\n---------------\n\nWe thank Harsh Agrawal, Mohamed El Banani, Richard  Higgins, Nilesh Kulkarni\nand Chris Rockwell for helpful discussions and feedback on the paper. We thank\nIshan Misra for discussions regarding PIRL evaluation protocol; Saining Xie for\ndiscussions about replicating iNaturalist evaluation as MoCo; Ross Girshick and\nYuxin Wu for help with Detectron2 model zoo; Georgia Gkioxari for suggesting\nthe Instance Segmentation pretraining task ablation; and Stefan Lee for\nsuggestions on figure aesthetics. We thank Jia Deng for access to extra GPUs\nduring project development; and UMich ARC-TS team for support with GPU cluster\nmanagement. Finally, we thank all the Starbucks outlets in Ann Arbor for many\nhours of free WiFi. This work was partially supported by the Toyota Research\nInstitute (TRI). However, note that this article solely reflects the opinions\nand conclusions of its authors and not TRI or any other Toyota entity.\n\n\n[1]: https://arxiv.org/abs/2006.06666\n[2]: https://kdexd.github.io/virtex/virtex/usage/setup_dependencies.html\n[3]: https://kdexd.github.io/virtex/virtex/usage/model_zoo.html\n[4]: https://kdexd.github.io/virtex/virtex/usage/pretrain.html\n[5]: https://kdexd.github.io/virtex/virtex/usage/downstream.html\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkdexd%2Fvirtex","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkdexd%2Fvirtex","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkdexd%2Fvirtex/lists"}