{"id":13738141,"url":"https://github.com/VITA-Group/CV_LTH_Pre-training","last_synced_at":"2025-05-08T15:32:35.478Z","repository":{"id":107045618,"uuid":"320877063","full_name":"VITA-Group/CV_LTH_Pre-training","owner":"VITA-Group","description":"[CVPR 2021] \"The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models\" Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang,  Michael Carbin, Zhangyang Wang","archived":false,"fork":false,"pushed_at":"2022-12-17T12:07:26.000Z","size":1514,"stargazers_count":69,"open_issues_count":2,"forks_count":14,"subscribers_count":14,"default_branch":"main","last_synced_at":"2025-04-19T18:51:23.370Z","etag":null,"topics":["imagenet-pr","lottery-ticket-hypothesis","moco","pre-training","simclr","simclrv2","transfer","transfer-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/VITA-Group.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2020-12-12T16:45:08.000Z","updated_at":"2025-03-11T09:57:11.000Z","dependencies_parsed_at":null,"dependency_job_id":"d4d1dc97-a353-44f8-8dcd-d7db4e2f80c0","html_url":"https://github.com/VITA-Group/CV_LTH_Pre-training","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VITA-Group%2FCV_LTH_Pre-training","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VITA-Group%2FCV_LTH_Pre-training/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/rep
ositories/VITA-Group%2FCV_LTH_Pre-training/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VITA-Group%2FCV_LTH_Pre-training/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/VITA-Group","download_url":"https://codeload.github.com/VITA-Group/CV_LTH_Pre-training/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253096459,"owners_count":21853604,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["imagenet-pr","lottery-ticket-hypothesis","moco","pre-training","simclr","simclrv2","transfer","transfer-learning"],"created_at":"2024-08-03T03:02:12.275Z","updated_at":"2025-05-08T15:32:30.742Z","avatar_url":"https://github.com/VITA-Group.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models\n\n[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)\n\nCode for the paper [The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models](https://arxiv.org/abs/2012.06908). 
[CVPR 2021]\n\nTianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Michael Carbin, Zhangyang Wang.\n\n\n\n## Overview\n\n*Can we aggressively trim down the complexity of pre-trained models, without damaging their downstream transferability?*\n\n\u003cimg src = \"Figs/Teaser.png\" align = \"center\" width=\"50%\" height=\"60%\"\u003e\n\n\n\n## Transfer Learning for Winning Tickets from Supervised and Self-supervised Pre-training\n\nDownstream classification tasks.\n\n![](Figs/cls.png)\n\nDownstream detection and segmentation tasks.\n\n![](Figs/dense.png)\n\n\n\n## Properties of Pre-training Tickets\n\n\u003cimg src = \"Figs/mask.png\" align = \"center\" width=\"50%\" height=\"60%\"\u003e\u003cimg src = \"Figs/transfer.png\" align = \"center\" width=\"50%\" height=\"60%\"\u003e\n\n\n\n## Reproduce\n\n### Preliminary\n\n#### Required environment:\n\n- pytorch \u003e= 1.5.0\n- torchvision\n\n#### Pre-trained Models\n\nPre-trained models are provided [here](https://www.dropbox.com/sh/uwois7q7b6mfdg4/AAD493jEVwHB9A8RQPFiOeu0a?dl=0).\n\n```python\nimagenet_weight.pt # torchvision standard model\n\nmoco.pt # pretrained MoCo v2 model (contains only encoder_q)\n\nmoco_v2_800ep_pretrain.pth.tar # pretrained MoCo v2 model (contains encoder_q and encoder_k)\n\nsimclr_weight.pt # pretrained SimCLR weights\n```\n\n### Task-Specific Tickets Finding\n\nRemark. 
This section covers ticket finding for both pre-training tasks and downstream tasks.\n\n#### Iterative Magnitude Pruning\n\n##### SimCLR task\n\n```\ncd SimCLR\n# --prun_epoch 10: pruning runs for 1 + 180/10 iterations\npython -u main.py \\\n    [experiment name] \\\n    --gpu 0,1,2,3 \\\n    --epochs 180 \\\n    --prun_epoch 10 \\\n    --prun_percent 0.2 \\\n    --lr 1e-4 \\\n    --arch resnet50 \\\n    --batch_size 256 \\\n    --data [data directory] \\\n    --sim_model [pretrained_simclr_model] \\\n    --save_dir simclr_imp\n```\n\n##### MoCo task\n\n```\ncd MoCo\n# --retrain_epoch 10: pruning runs for 1 + 180/10 iterations\nCUDA_VISIBLE_DEVICES=0,1,2,3 python -u main_moco_imp.py \\\n    [dataset directory] \\\n    --pretrained_path [pretrained_moco_model] \\\n    -a resnet50 \\\n    --batch-size 256 \\\n    --dist-url 'tcp://127.0.0.1:5234' \\\n    --multiprocessing-distributed \\\n    --world-size 1 \\\n    --rank 0 \\\n    --mlp \\\n    --moco-t 0.2 \\\n    --aug-plus \\\n    --cos \\\n    --epochs 180 \\\n    --retrain_epoch 10 \\\n    --save_dir moco_imp\n```\n\n##### Classification task on ImageNet\n\n```\n# --states 19: number of iterative pruning rounds\nCUDA_VISIBLE_DEVICES=0,1,2,3 python -u main_imp_imagenet.py \\\n\t[dataset directory] \\\n\t-a resnet50 \\\n\t--epochs 10 \\\n\t-b 256 \\\n\t--lr 1e-4 \\\n\t--states 19 \\\n\t--save_dir imagenet_imp\n```\n\n##### Classification task on Visda2017\n\n```\n# --prune_type: lt or pt_trans\n# --pre_weight: pretrained weight if prune_type is pt_trans, otherwise None\n# --states 19: number of iterative pruning rounds\nCUDA_VISIBLE_DEVICES=0,1,2,3 python -u main_imp_visda.py \\\n\t[dataset directory] \\\n\t-a resnet50 \\\n\t--epochs 20 \\\n\t-b 256 \\\n\t--lr 0.001 \\\n\t--prune_type lt \\\n\t--pre_weight [pretrained weight] \\\n\t--states 19 \\\n\t--save_dir visda_imp\n```\n\n##### Classification task on small dataset\n\n```\n# --dataset: one of cifar10, cifar100, svhn, fmnist\nCUDA_VISIBLE_DEVICES=0 python -u main_imp_downstream.py \\\n\t--data [dataset directory] \\\n\t--dataset [dataset name] \\\n\t--arch resnet50 \\\n\t--pruning_times 19 \\\n\t--prune_type [lt, pt, rewind_lt, pt_trans] \\\n\t--save_dir 
imp_downstream\n# Optional flags, appended to the command above:\n#   --pretrained [pretrained weight if prune_type==pt_trans]\n#   --random_prune [if using random pruning]\n#   --rewind_epoch [rewind weight epoch if prune_type==rewind_lt]\n```\n\n### Transfer to Downstream Tasks\n\n##### Small datasets (e.g., CIFAR-10, CIFAR-100, SVHN, Fashion-MNIST)\n\n```\n# --dataset: one of cifar10, cifar100, svhn, fmnist\n# --dict_key: dict_key in the pretrained file; None means load all\n# --reverse_mask: optional, pass it to reverse the mask\nCUDA_VISIBLE_DEVICES=0 python -u main_eval_downstream.py \\\n\t--data [dataset directory] \\\n\t--dataset [dataset name] \\\n\t--arch resnet50 \\\n\t--save_dir [save directory] \\\n\t--pretrained [init weight] \\\n\t--dict_key state_dict \\\n\t--mask_dir [mask for ticket] \\\n\t--reverse_mask\n```\n\n##### Visda2017:\n\n```\n# --dict_key: dict_key in the pretrained file; None means load all\n# --reverse_mask: optional, pass it to reverse the mask\nCUDA_VISIBLE_DEVICES=0,1,2,3 python -u main_eval_visda.py \\\n\t[data directory] \\\n\t-a resnet50 \\\n\t--epochs 20 \\\n\t-b 256 \\\n\t--lr 0.001 \\\n\t--save_dir [save directory] \\\n\t--pretrained [init weight] \\\n\t--dict_key state_dict \\\n\t--mask_dir [mask for ticket] \\\n\t--reverse_mask\n```\n\n### Detection and Segmentation Experiments\n\nDetails of YOLOv4 for detection are collected [here](https://github.com/VITA-Group/CV_LTH_Pre-training/blob/main/Detection/README.md).\n\nDetails of DeepLabv3+ for segmentation are collected [here](https://github.com/VITA-Group/CV_LTH_Pre-training/blob/main/Segmentation/README.md).\n\n## Citation\n\n```\n@article{chen2020lottery,\n  title={The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models},\n  author={Chen, Tianlong and Frankle, Jonathan and Chang, Shiyu and Liu, Sijia and Zhang, Yang and Carbin, Michael and Wang, Zhangyang},\n  journal={arXiv preprint arXiv:2012.06908},\n  year={2020}\n}\n```\n\n\n\n## 
Acknowledgement\n\nhttps://github.com/google-research/simclr\n\nhttps://github.com/facebookresearch/moco\n\nhttps://github.com/VainF/DeepLabV3Plus-Pytorch\n\nhttps://github.com/argusswift/YOLOv4-pytorch\n\nhttps://github.com/yczhang1017/SSD_resnet_pytorch/tree/master\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FVITA-Group%2FCV_LTH_Pre-training","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FVITA-Group%2FCV_LTH_Pre-training","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FVITA-Group%2FCV_LTH_Pre-training/lists"}