{"id":13576236,"url":"https://github.com/LeapLabTHU/EfficientTrain","last_synced_at":"2025-04-05T05:31:16.324Z","repository":{"id":64789595,"uuid":"565770100","full_name":"LeapLabTHU/EfficientTrain","owner":"LeapLabTHU","description":"1.5−3.0× lossless training or pre-training speedup. An off-the-shelf, easy-to-implement algorithm for the efficient training of foundation visual backbones.","archived":false,"fork":false,"pushed_at":"2024-08-23T07:38:47.000Z","size":10422,"stargazers_count":206,"open_issues_count":2,"forks_count":9,"subscribers_count":6,"default_branch":"main","last_synced_at":"2024-11-05T12:33:13.732Z","etag":null,"topics":["computer-vision","deep-learning","efficient-training","machine-learning","pytorch"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LeapLabTHU.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-11-14T09:42:24.000Z","updated_at":"2024-10-30T11:56:10.000Z","dependencies_parsed_at":"2024-08-23T08:51:44.665Z","dependency_job_id":null,"html_url":"https://github.com/LeapLabTHU/EfficientTrain","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LeapLabTHU%2FEfficientTrain","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LeapLabTHU%2FEfficientTrain/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LeapLabTHU%2FEfficientTrain/releases","manifests_url":"https://repos.ecos
yste.ms/api/v1/hosts/GitHub/repositories/LeapLabTHU%2FEfficientTrain/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/LeapLabTHU","download_url":"https://codeload.github.com/LeapLabTHU/EfficientTrain/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247294209,"owners_count":20915332,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","deep-learning","efficient-training","machine-learning","pytorch"],"created_at":"2024-08-01T15:01:08.243Z","updated_at":"2025-04-05T05:31:11.304Z","avatar_url":"https://github.com/LeapLabTHU.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# EfficientTrain++ (TPAMI 2024 \u0026 ICCV 2023)\n\nThis repo releases the code and pre-trained models of **EfficientTrain++**, an off-the-shelf, easy-to-implement algorithm for the efficient training of foundation visual backbones.\n\n\n[TPAMI 2024]\n[EfficientTrain++: Generalized Curriculum Learning for Efficient Visual Backbone Training](https://arxiv.org/abs/2405.08768) \\\n[Yulin Wang](https://www.wyl.cool/), [Yang Yue](https://github.com/yueyang2000), [Rui Lu](https://scholar.google.com/citations?user=upMvIv4AAAAJ\u0026hl=zh-CN), [Yizeng Han](https://yizenghan.top/), [Shiji Song](https://scholar.google.com/citations?user=rw6vWdcAAAAJ\u0026hl=zh-CN), and [Gao Huang](http://www.gaohuang.net/)\\\nTsinghua University, BAAI\\\n[[`arXiv`](https://arxiv.org/abs/2405.08768)]\n\n[ICCV 2023]\n[EfficientTrain: Exploring Generalized Curriculum Learning for Training 
Visual Backbones](https://arxiv.org/abs/2211.09703) \\\n[Yulin Wang](https://www.wyl.cool/), [Yang Yue](https://github.com/yueyang2000), [Rui Lu](https://scholar.google.com/citations?user=upMvIv4AAAAJ\u0026hl=zh-CN), [Tianjiao Liu](https://www.semanticscholar.org/author/Tianjiao-Liu/2570085), [Zhao Zhong](https://scholar.google.com/citations?user=igtXP_kAAAAJ\u0026hl=en), [Shiji Song](https://scholar.google.com/citations?user=rw6vWdcAAAAJ\u0026hl=zh-CN), and [Gao Huang](http://www.gaohuang.net/)\\\nTsinghua University, Huawei, BAAI\\\n[[`arXiv`](https://arxiv.org/abs/2211.09703)]\n\n- *Update on 2024.05.14:* I'm highly interested in extending EfficientTrain++ to CLIP-style models, multi-modal large language models, generative models (*e.g.*, diffusion-based or token-based), and advanced visual self-supervised learning methods. I'm always open to discussions and potential collaborations. If you are interested, please kindly send an e-mail to me (wang-yl19@mails.tsinghua.edu.cn).\n\n\n## Overview\n\nWe present a novel curriculum learning approach for the efficient training of foundation visual backbones. Our algorithm, **EfficientTrain++**, is simple, general, yet surprisingly effective. As an off-the-shelf approach, it reduces the training time of various popular models (*e.g.*, ResNet, ConvNeXt, DeiT, PVT, Swin, CSWin, and CAFormer) by **1.5−3.0×** on ImageNet-1K/22K without sacrificing accuracy. It also demonstrates efficacy in self-supervised learning (*e.g.*, MAE).\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"./imgs/overview.png\" width= \"450\"\u003e\n\u003c/p\u003e\n\n\n## Highlights of our work\n- **1.5−3.0×** lossless training or pre-training speedup on ImageNet-1K and ImageNet-22K. Practical efficiency aligns with theoretical performance. Both upstream and downstream performance are not affected.\n- Effective for diverse visual backbones, including ConvNets, isotropic/multi-stage ViTs, and ConvNet-ViT hybrid models. 
For example, ResNet, ConvNeXt, DeiT, PVT, Swin, CSWin, and CAFormer.\n- Dramatically improves the performance of relatively small models (*e.g.*, on ImageNet-1K, DeiT-S: 80.3% -\u003e 81.3%, DeiT-T: 72.5% -\u003e 74.4%).\n- Superior performance across varying training budgets (*e.g.*, training cost of 0 - 300 epochs or more).\n- Applicable to both supervised learning and self-supervised learning (*e.g.*, MAE).\n- Optional techniques optimized for limited CPU/memory capabilities (*e.g.*, when the machine cannot sustain a high data pre-processing speed).\n- Optional techniques optimized for large-scale parallel training (*e.g.*, 16-64 GPUs or more).\n\n## Catalog\n- [x] ImageNet-1K Training Code\n- [x] ImageNet-1K Pre-trained Models \n- [x] ImageNet-22K -\u003e ImageNet-1K Fine-tuning Code\n- [x] ImageNet-22K Pre-trained Models \n- [x] ImageNet-22K -\u003e ImageNet-1K Fine-tuned Models \n\n\n## Installation\nWe support [PyTorch](https://pytorch.org/)\u003e=2.0.0 and [torchvision](https://pytorch.org/vision/stable/index.html)\u003e=0.15.1. Please install them following the official instructions. 
\n\nClone this repo and install the required packages:\n```\ngit clone https://github.com/LeapLabTHU/EfficientTrain\npip install timm==0.4.12 tensorboardX six\n```\nThe instructions for preparing [ImageNet-1K/22K](http://image-net.org/) datasets can be found [here](https://github.com/facebookresearch/ConvNeXt/blob/main/INSTALL.md#dataset-preparation).\n\n## Training\nSee [TRAINING.md](TRAINING.md) for the training instructions.\n\n\n## Pre-trained models \u0026 evaluation \u0026 fine-tuning\nSee [EVAL.md](EVAL.md) for the pre-trained models and the instructions for evaluating or fine-tuning them.\n\n\n## Results\n\n### Supervised learning on ImageNet-1K\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"./imgs/in_1k.png\" width= \"900\"\u003e\n\u003c/p\u003e\n\n\n### ImageNet-22K pre-training\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"./imgs/in_22k.png\" width= \"900\"\u003e\n\u003c/p\u003e\n\n\n### Supervised learning on ImageNet-1K (varying training budgets)\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"./imgs/vary_epoch.png\" width= \"900\"\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"./imgs/300ep.png\" width= \"450\"\u003e\n\u003c/p\u003e\n\n### Object detection and instance segmentation on COCO\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"./imgs/coco.png\" width= \"450\"\u003e\n\u003c/p\u003e\n\n\n### Semantic segmentation on ADE20K\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"./imgs/seg.png\" width= \"450\"\u003e\n\u003c/p\u003e\n\n\n### Self-supervised learning results on top of MAE\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"./imgs/mae.png\" width= \"450\"\u003e\n\u003c/p\u003e\n\n\n## TODO\n\nThis repo is still being updated. 
If you need anything, whether or not it is listed below, please send an e-mail to me (wang-yl19@mails.tsinghua.edu.cn).\n- [ ] A detailed tutorial on how to use this repo to train (customized) models on customized datasets.\n- [ ] ImageNet-22K Training Code\n- [ ] ImageNet-1K Self-supervised Learning Code (EfficientTrain + [MAE](https://arxiv.org/pdf/2111.06377.pdf)) \n- [ ] EfficientTrain + [MAE](https://arxiv.org/pdf/2111.06377.pdf) Pre-trained Models\n\n## Acknowledgments\n\nThis repo is mainly developed on top of [ConvNeXt](https://github.com/facebookresearch/ConvNeXt). We sincerely thank its authors for their efficient and neat codebase. This repo is also built using [DeiT](https://github.com/facebookresearch/deit) and [timm](https://github.com/rwightman/pytorch-image-models).\n\n\n## Citation\nIf you find this work valuable or use our code in your own research, please consider citing us:\n```bibtex\n@article{wang2024EfficientTrain_pp,\n        title = {EfficientTrain++: Generalized Curriculum Learning for Efficient Visual Backbone Training},\n       author = {Wang, Yulin and Yue, Yang and Lu, Rui and Han, Yizeng and Song, Shiji and Huang, Gao},\n      journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},\n         year = {2024},\n          doi = {10.1109/TPAMI.2024.3401036}\n}\n@inproceedings{wang2023EfficientTrain,\n        title = {EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones},\n       author = {Wang, Yulin and Yue, Yang and Lu, Rui and Liu, Tianjiao and Zhong, Zhao and Song, Shiji and Huang, Gao},\n    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},\n         year = 
{2023}\n}\n\n\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FLeapLabTHU%2FEfficientTrain","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FLeapLabTHU%2FEfficientTrain","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FLeapLabTHU%2FEfficientTrain/lists"}