{"id":20663803,"url":"https://github.com/vita-group/dataefficientlth","last_synced_at":"2025-07-29T03:44:49.302Z","repository":{"id":107045656,"uuid":"548981831","full_name":"VITA-Group/DataEfficientLTH","owner":"VITA-Group","description":"[NeurIPS 2022] \"Sparse Winning Tickets are Data-Efficient Image Recognizers\" by Mukund Varma T, Xuxi Chen, Zhenyu Zhang, Tianlong Chen, Subhashini Venugopalan, Zhangyang Wang","archived":false,"fork":false,"pushed_at":"2022-12-24T09:09:28.000Z","size":31,"stargazers_count":8,"open_issues_count":0,"forks_count":1,"subscribers_count":11,"default_branch":"main","last_synced_at":"2025-04-19T18:51:46.535Z","etag":null,"topics":["data-efficient-learning","sparsity"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/VITA-Group.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-10-10T13:44:07.000Z","updated_at":"2025-01-22T13:21:40.000Z","dependencies_parsed_at":"2023-04-19T22:18:19.633Z","dependency_job_id":null,"html_url":"https://github.com/VITA-Group/DataEfficientLTH","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/VITA-Group/DataEfficientLTH","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VITA-Group%2FDataEfficientLTH","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VITA-Group%2FDataEfficientLTH/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VITA-Group%2FDataEfficientLTH/releases","mani
fests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VITA-Group%2FDataEfficientLTH/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/VITA-Group","download_url":"https://codeload.github.com/VITA-Group/DataEfficientLTH/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/VITA-Group%2FDataEfficientLTH/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267625224,"owners_count":24117582,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-29T02:00:12.549Z","response_time":2574,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-efficient-learning","sparsity"],"created_at":"2024-11-16T19:19:54.912Z","updated_at":"2025-07-29T03:44:49.274Z","avatar_url":"https://github.com/VITA-Group.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Sparse Winning Tickets are Data-Efficient Image Recognizers\n[Mukund Varma T]()\u003csup\u003e1\u003c/sup\u003e,\n[Xuxi Chen](https://xxchen.site/)\u003csup\u003e2\u003c/sup\u003e,\n[Zhenyu Zhang](https://scholar.google.com/citations?user=ZLyJRxoAAAAJ\u0026hl=zh-CN)\u003csup\u003e2\u003c/sup\u003e,\n[Tianlong Chen](https://tianlong-chen.github.io/)\u003csup\u003e2\u003c/sup\u003e,\n[Subhashini Venugopalan](https://vsubhashini.github.io/)\u003csup\u003e3\u003c/sup\u003e,\n[Zhangyang 
Wang](https://vita-group.github.io/)\u003csup\u003e2\u003c/sup\u003e\n\n\u003csup\u003e1\u003c/sup\u003eIndian Institute of Technology Madras, \u003csup\u003e2\u003c/sup\u003eUniversity of Texas at Austin, \u003csup\u003e3\u003c/sup\u003eGoogle Research\n\nAccepted at NeurIPS '22 (Featured Paper)\n\n[Paper](https://openreview.net/forum?id=wfKbtSjHA6F), [Slides](https://docs.google.com/presentation/d/1gVNX23VgFRUR9e_4tHvBlMBXLg6wnMQoa_zWOri1rWM/edit?usp=sharing)\n\n## Abstract\n\nImproving performance of deep networks in data limited regimes has warranted much attention. In this work, we empirically show that “winning tickets” (small subnetworks) obtained via magnitude pruning based on the lottery ticket hypothesis, apart from being sparse are also effective recognizers in data limited regimes. Based on extensive experiments, we find that in low data regimes (datasets of 50-100 examples per class), sparse winning tickets substantially outperform the original dense networks. This approach, when combined with augmentations or fine-tuning from a self-supervised backbone network, shows further improvements in performance by as much as 16% (absolute) on low sample datasets and longtailed classification. Further, sparse winning tickets are more robust to synthetic noise and distribution shifts compared to their dense counterparts. 
Our analysis of winning tickets on small datasets indicates that, though sparse, the networks retain density in the initial layers and their representations are more generalizable.\n\n## Installation\n\n```bash\npip install -r requirements.txt\n```\n\nAdditional datasets must be downloaded and placed in the appropriate directories: [CIFAR10-C](https://zenodo.org/record/2535967#.Y6a9EdJBw1g), [CIFAR10.2](https://github.com/modestyachts/cifar-10.2), [ImageNet (50 images/class)](https://github.com/VIPriors/vipriors-challenges-toolkit), [EuroSAT (50 images/class)](https://github.com/cvjena/deic), [ISIC 2018 (80 images/class)](https://github.com/cvjena/deic), [CLaMM (50 images/class)](https://github.com/cvjena/deic).\n\n## Usage\n\n### Training\n\n```bash\n# run cifar10 with all augmentation strategies, at each data size\nbash run_cifar10.sh sparse 1 imp\nbash run_cifar10.sh sparse 0.5 imp\nbash run_cifar10.sh sparse 0.2 imp\nbash run_cifar10.sh sparse 0.1 imp\nbash run_cifar10.sh sparse 0.02 imp\nbash run_cifar10.sh sparse 0.01 imp\n\n# run other methods on cifar10 subsets\nbash run_cifar10_othermethods.sh\n\n# run cifar100 long_tailed\nbash run_cifar100_longtailed.sh\n\n# run on other datasets\nbash run_otherdsets.sh eurosat_rgb \u003cpath-to-eurosatrgb\u003e\nbash run_otherdsets.sh isic \u003cpath-to-isic\u003e\nbash run_otherdsets.sh clamm \u003cpath-to-clamm\u003e\n```\n\nAdditional scripts can be found [here](scripts/).\n\n### Evaluation\n\nCode to evaluate robustness (synthetic, adversarial, and distribution shifts) can be found [here](helpers.ipynb).\n\n## Cite this work\n\nIf you find our work or code implementation useful for your own research, please cite our paper.\n\n```\n@inproceedings{\n    t2022sparse,\n    title={Sparse Winning Tickets are Data-Efficient Image Recognizers},\n    author={Mukund Varma T and Xuxi Chen and Zhenyu Zhang and Tianlong Chen and Subhashini Venugopalan and Zhangyang Wang},\n    booktitle={Advances in Neural Information Processing Systems},\n 
   editor={Alice H. Oh and Alekh Agarwal and Danielle Belgrave and Kyunghyun Cho},\n    year={2022},\n    url={https://openreview.net/forum?id=wfKbtSjHA6F}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvita-group%2Fdataefficientlth","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvita-group%2Fdataefficientlth","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvita-group%2Fdataefficientlth/lists"}