{"id":18889002,"url":"https://github.com/naver/trex","last_synced_at":"2026-03-06T01:33:48.893Z","repository":{"id":152308658,"uuid":"610194542","full_name":"naver/trex","owner":"naver","description":"PyTorch implementation of the paper \"No reason for no supervision: Improving the generalization of supervised models\"","archived":false,"fork":false,"pushed_at":"2023-03-07T08:01:55.000Z","size":52,"stargazers_count":18,"open_issues_count":0,"forks_count":0,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-08-18T11:02:46.012Z","etag":null,"topics":["computer-vision","representation-learning","supervised-learning","transfer-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/naver.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-03-06T09:33:14.000Z","updated_at":"2025-06-06T17:46:20.000Z","dependencies_parsed_at":null,"dependency_job_id":"5caeb0b9-3a5f-4714-8882-3580b2337e6f","html_url":"https://github.com/naver/trex","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/naver/trex","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/naver%2Ftrex","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/naver%2Ftrex/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/naver%2Ftrex/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/naver%2Ftrex/manifests","owner_url":"https://repos
.ecosyste.ms/api/v1/hosts/GitHub/owners/naver","download_url":"https://codeload.github.com/naver/trex/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/naver%2Ftrex/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":272971410,"owners_count":25024093,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-31T02:00:09.071Z","response_time":79,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","representation-learning","supervised-learning","transfer-learning"],"created_at":"2024-11-08T07:47:00.271Z","updated_at":"2026-03-06T01:33:43.859Z","avatar_url":"https://github.com/naver.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003c!--\ntrex\nCopyright (C) 2023-present NAVER Corp.\nCC BY-NC-SA 4.0\n--\u003e\n\n\u003c!-- omit in toc --\u003e\n# No reason for no supervision: Improved generalization in supervised models\n\n| [Project Website](https://europe.naverlabs.com/t-rex) | [Paper (arXiv)](https://arxiv.org/abs/2206.15369) | [Paper (ICLR 2023 - notable top 25%)](https://openreview.net/forum?id=3Y5Uhf5KgGK) |\n| :-: | :-: | :-: |\n\n\nIn this repository, we provide:\n- Several pretrained t-ReX and t-ReX* models in PyTorch (see [here](#model-zoo)).\n- Code for training our t-ReX and t-ReX* models on the ImageNet-1K dataset in PyTorch (see 
[here](#training-t-rex-models)).\n- Code for running transfer learning evaluations of pretrained models via linear classification over pre-extracted features on 16 downstream datasets (see [here](#transfer-learning-evaluation-suite)).\n\nIf you find this repository useful, please consider citing us:\n```\n@inproceedings{sariyildiz2023improving,\n    title={No Reason for No Supervision: Improved Generalization in Supervised Models},\n    author={Sariyildiz, Mert Bulent and Kalantidis, Yannis and Alahari, Karteek and Larlus, Diane},\n    booktitle={International Conference on Learning Representations},\n    year={2023},\n}\n```\n\n- [Model Zoo](#model-zoo)\n- [Training t-ReX models](#training-t-rex-models)\n  - [Installation](#installation)\n  - [Dataset](#dataset)\n  - [Training commands](#training-commands)\n    - [Commands for t-ReX-OCM models](#commands-for-t-rex-ocm-models)\n    - [Commands for plain t-ReX models](#commands-for-plain-t-rex-models)\n- [Transfer learning evaluation suite](#transfer-learning-evaluation-suite)\n\n\n# Model Zoo\n\nIn the table below, we provide links to several pretrained t-ReX and t-ReX* models.\nThese include the models that produced the results reported in the paper, as well as the models reproduced with the cleaner codebase released in this repo.\nThe transfer performance of these models is averaged over 15 datasets: the 13 transfer datasets we mainly used in the paper, plus two additions, the iNaturalist datasets.\nTo perform transfer evaluations, see the [corresponding section of this readme](#transfer-learning-evaluation-suite).\n\n\u003ctable style=\"border: 1px;\"\u003e\n    \u003ctr\u003e\n        \u003cth\u003eModel\u003c/th\u003e\n        \u003cth style=\"text-align: center\"\u003eResNet50 \u003cbr/\u003e Checkpoint\u003c/th\u003e\n        \u003cth style=\"text-align: center\"\u003eFull \u003cbr/\u003e Checkpoint\u003c/th\u003e\n        \u003cth style=\"text-align: center\"\u003eImageNet-1K 
\u003cbr/\u003e (Top-1 %)\u003c/th\u003e\n        \u003cth style=\"text-align: center\"\u003eAverage Transfer \u003cbr/\u003e (Log odds)\u003c/th\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd colspan=\"5\"\u003e\u003ci\u003eModels reported in the paper\u003c/i\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd\u003e t-ReX \u003c/td\u003e\n        \u003ctd style=\"text-align: center\"\u003e \u003ca href=\"https://download.europe.naverlabs.com/ComputerVision/trex_models/trex.pth\"\u003eLink\u003c/a\u003e \u003c/td\u003e\n        \u003ctd style=\"text-align: center\"\u003e\u003c/td\u003e\n        \u003ctd style=\"text-align: center\"\u003e78.0\u003c/td\u003e\n        \u003ctd style=\"text-align: center\"\u003e1.1704\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd\u003e t-ReX* \u003c/td\u003e\n        \u003ctd style=\"text-align: center\"\u003e \u003ca href=\"https://download.europe.naverlabs.com/ComputerVision/trex_models/trexstar.pth\"\u003eLink\u003c/a\u003e \u003c/td\u003e\n        \u003ctd style=\"text-align: center\"\u003e\u003c/td\u003e\n        \u003ctd style=\"text-align: center\"\u003e80.2\u003c/td\u003e\n        \u003ctd style=\"text-align: center\"\u003e0.8829\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd colspan=\"5\"\u003e\u003ci\u003eModels reproduced with this code base\u003c/i\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd\u003e t-ReX \u003c/td\u003e\n        \u003ctd style=\"text-align: center\"\u003e \u003ca href=\"https://download.europe.naverlabs.com/ComputerVision/trex_models/trex_2.pth\"\u003eLink\u003c/a\u003e \u003c/td\u003e\n        \u003ctd style=\"text-align: center\"\u003e \u003ca href=\"https://download.europe.naverlabs.com/ComputerVision/trex_models/trex_2_checkpoint_full.pth\"\u003eLink\u003c/a\u003e \u003c/td\u003e\n        \u003ctd style=\"text-align: center\"\u003e77.9\u003c/td\u003e\n        \u003ctd 
style=\"text-align: center\"\u003e1.1664\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd\u003e t-ReX* \u003c/td\u003e\n        \u003ctd style=\"text-align: center\"\u003e \u003ca href=\"https://download.europe.naverlabs.com/ComputerVision/trex_models/trexstar_2.pth\"\u003eLink\u003c/a\u003e \u003c/td\u003e\n        \u003ctd style=\"text-align: center\"\u003e \u003ca href=\"https://download.europe.naverlabs.com/ComputerVision/trex_models/trexstar_2_checkpoint_full.pth\"\u003eLink\u003c/a\u003e \u003c/td\u003e\n        \u003ctd style=\"text-align: center\"\u003e80.2\u003c/td\u003e\n        \u003ctd style=\"text-align: center\"\u003e0.8800\u003c/td\u003e\n    \u003c/tr\u003e\n\u003c/table\u003e\n\nFull checkpoints contain separate state dictionaries for the model, the optimizer and the gradient scaler (for mixed precision).\nWe share them for reference.\nThe ResNet50 checkpoints, in contrast, can be loaded directly into a torchvision ResNet50:\n\n```python\nimport torch as th\nfrom torchvision.models import resnet50\n\n# Load the checkpoint on CPU.\nckpt = th.load(\"trex.pth\", map_location=\"cpu\")\nnet = resnet50()\nmsg = net.load_state_dict(ckpt, strict=False)\n# Only the final fully-connected layer is expected to be missing.\nassert msg.missing_keys == [\"fc.weight\", \"fc.bias\"] and msg.unexpected_keys == []\n```\n\n# Training t-ReX models\n\n## Installation\n\nWe developed this code using recent versions of PyTorch, torchvision and TensorBoard.\nWe recommend creating a new conda environment to manage these packages.\n```bash\nconda create -n trex\nconda activate trex\nconda install pytorch=1.13.1 torchvision pytorch-cuda=11.6 -c pytorch -c nvidia\npip install tensorboard\n```\n\n## Dataset\n\nWe train our models on the ILSVRC-2012 dataset (also called ImageNet-1K).\nIt is available on [the ImageNet website](https://image-net.org/download-images.php).\nOnce you download the dataset, make sure that `data_dir=/path/to/imagenet` contains `train` and `val` directories, each with 1000 sub-directories, one per ImageNet-1K class.\n\n## Training commands\n\nBelow, we 
provide commands for training plain t-ReX and t-ReX-OCM models on ImageNet-1K.\nNote that the results we report in the paper were obtained by training for 100 epochs on 4 GPUs, each processing a batch of 64 samples.\nIf you want to use fewer GPUs, increase the batch size, etc., see the arguments of [main.py](./main.py).\n\n### Commands for t-ReX-OCM models\n\nt-ReX-OCM models are defined by Equation-2 of the paper.\n\n\u003cdetails\u003e\n\u003csummary\u003e Command for training a t-ReX-OCM-1 model (named \u003cstrong\u003et-ReX*\u003c/strong\u003e in the paper)\u003c/summary\u003e\n\n```bash\ndata_dir=/path/to/imagenet\noutput_dir=/path/where/to/save/checkpoints\nexport CUDA_VISIBLE_DEVICES=0,1,2,3  # change accordingly the \u003cnproc_per_node\u003e argument below\n\npython -m torch.distributed.launch --nproc_per_node=4 --master_port=12345 main.py  \\\n    --output_dir=${output_dir} \\\n    --data_dir=${data_dir} \\\n    --seed=${RANDOM} \\\n    --pr_hidden_layers=1 \\\n    --mc_global_scale 0.40 1.00 \\\n    --mc_local_scale 0.05 0.40\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e Command for training a t-ReX-OCM-3 model (named \u003cstrong\u003et-ReX\u003c/strong\u003e in the paper)\u003c/summary\u003e\n\n```bash\ndata_dir=/path/to/imagenet\noutput_dir=/path/where/to/save/checkpoints\nexport CUDA_VISIBLE_DEVICES=0,1,2,3  # change accordingly the \u003cnproc_per_node\u003e argument below\n\npython -m torch.distributed.launch --nproc_per_node=4 --master_port=12345 main.py  \\\n    --output_dir=${output_dir} \\\n    --data_dir=${data_dir} \\\n    --seed=${RANDOM} \\\n    --pr_hidden_layers=3 \\\n    --mc_global_scale 0.25 1.00 \\\n    --mc_local_scale 0.05 0.25\n```\n\n\u003c/details\u003e\n\n### Commands for plain t-ReX models\n\nPlain t-ReX models are defined by Equation-1 of the paper.\nCompared to the commands for training OCM models above, we just add the ```--memory_size=0``` argument, which disables the OCM 
part.\n\n\u003cdetails\u003e\n\u003csummary\u003e Command for training a plain t-ReX-1 model\u003c/summary\u003e\n\n```bash\ndata_dir=/path/to/imagenet\noutput_dir=/path/where/to/save/checkpoints\nexport CUDA_VISIBLE_DEVICES=0,1,2,3  # change accordingly the \u003cnproc_per_node\u003e argument below\n\npython -m torch.distributed.launch --nproc_per_node=4 --master_port=12345 main.py  \\\n    --output_dir=${output_dir} \\\n    --data_dir=${data_dir} \\\n    --seed=${RANDOM} \\\n    --pr_hidden_layers=1 \\\n    --mc_global_scale 0.40 1.00 \\\n    --mc_local_scale 0.05 0.40 \\\n    --memory_size=0\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e Command for training a plain t-ReX-3 model\u003c/summary\u003e\n\n```bash\ndata_dir=/path/to/imagenet\noutput_dir=/path/where/to/save/checkpoints\nexport CUDA_VISIBLE_DEVICES=0,1,2,3  # change accordingly the \u003cnproc_per_node\u003e argument below\n\npython -m torch.distributed.launch --nproc_per_node=4 --master_port=12345 main.py  \\\n    --output_dir=${output_dir} \\\n    --data_dir=${data_dir} \\\n    --seed=${RANDOM} \\\n    --pr_hidden_layers=3 \\\n    --mc_global_scale 0.25 1.00 \\\n    --mc_local_scale 0.05 0.25 \\\n    --memory_size=0\n```\n\n\u003c/details\u003e\n\n\n# Transfer learning evaluation suite\n\nWe provide the evaluation code under the [transfer](./transfer) folder.\nPlease navigate there.\n\n\n\u003c!-- omit in toc --\u003e\n# Acknowledgement\nOur implementation builds on several public code repositories such as [DINO](https://github.com/facebookresearch/dino), [MoCo](https://github.com/facebookresearch/moco) and [the PyTorch examples](https://github.com/pytorch/examples).\nWe thank all the authors and developers for making their code 
accessible.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnaver%2Ftrex","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnaver%2Ftrex","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnaver%2Ftrex/lists"}