{"id":13737929,"url":"https://github.com/google-research/l2p","last_synced_at":"2025-10-13T03:39:00.442Z","repository":{"id":37672841,"uuid":"434753994","full_name":"google-research/l2p","owner":"google-research","description":"Learning to Prompt (L2P) for Continual Learning @ CVPR22 and DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning @ ECCV22","archived":false,"fork":false,"pushed_at":"2024-07-30T20:49:03.000Z","size":406,"stargazers_count":434,"open_issues_count":8,"forks_count":45,"subscribers_count":8,"default_branch":"main","last_synced_at":"2025-03-29T05:08:52.726Z","etag":null,"topics":["continual-learning","deep-learning","jax"],"latest_commit_sha":null,"homepage":"https://arxiv.org/pdf/2112.08654.pdf","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/google-research.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-12-03T22:50:15.000Z","updated_at":"2025-03-19T12:42:04.000Z","dependencies_parsed_at":"2024-11-24T13:01:53.147Z","dependency_job_id":"64b26d76-3ee6-4d98-b462-05f0a66b9794","html_url":"https://github.com/google-research/l2p","commit_stats":{"total_commits":23,"total_committers":2,"mean_commits":11.5,"dds":0.4347826086956522,"last_synced_commit":"dd8836e6e372df29f03d83bf3dc3a806114e9d8e"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-research%2Fl2p","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-research
%2Fl2p/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-research%2Fl2p/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-research%2Fl2p/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/google-research","download_url":"https://codeload.github.com/google-research/l2p/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247294541,"owners_count":20915340,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["continual-learning","deep-learning","jax"],"created_at":"2024-08-03T03:02:06.236Z","updated_at":"2025-10-13T03:38:55.374Z","avatar_url":"https://github.com/google-research.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Prompt-based Continual Learning Official Jax Implementation\n\nThis codebase contains the implementation of two continual learning methods: \n\n- **[Learning to Prompt for Continual Learning (L2P)](https://arxiv.org/pdf/2112.08654.pdf) (CVPR2022)** [[Google AI Blog]](https://ai.googleblog.com/2022/04/learning-to-prompt-for-continual.html)\n- **[DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning](https://arxiv.org/pdf/2204.04799.pdf) (ECCV2022)**\n\n## Introduction\nL2P is a novel continual learning technique which learns to dynamically prompt a pre-trained model to learn tasks sequentially under different task transitions. 
Different from mainstream rehearsal-based or architecture-based methods, L2P requires neither a rehearsal buffer nor test-time task identity. L2P can be generalized to various continual learning settings including the most challenging and realistic task-agnostic setting. L2P consistently outperforms prior state-of-the-art methods. Surprisingly, L2P achieves competitive results against rehearsal-based methods even without a rehearsal buffer.\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"./l2p_illustration.png\" width=\"850\" height=\"320\"\u003e\n\u003c/p\u003e\nDualPrompt improves L2P by attaching complementary prompts to the pre-trained backbone, and then formulates the objective as learning task-invariant and task-specific \"instructions\". With extensive experimental validation, DualPrompt consistently sets state-of-the-art performance under the challenging class-incremental setting. In particular, DualPrompt outperforms recent advanced continual learning methods with relatively large buffer sizes. We also introduce a more challenging benchmark, Split ImageNet-R, to help generalize rehearsal-free continual learning research.\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"./dualprompt_illustration.png\" width=\"850\" height=\"240\"\u003e\n\u003c/p\u003e\n\n\nCode is written by Zifeng Wang. Acknowledgement to https://github.com/google-research/nested-transformer.\n\nThis is not an officially supported Google product.\n\n## Novel CL benchmark: Split ImageNet-R\nThe Split ImageNet-R benchmark is built upon [ImageNet-R](https://www.tensorflow.org/datasets/catalog/imagenet_r) by dividing the 200 classes into 10 tasks with 20 classes per task; see [libml/input_pipeline.py](libml/input_pipeline.py) for details. 
We believe Split ImageNet-R is of great importance to the continual learning community, for the following reasons:\n\n- Split ImageNet-R contains classes with different styles, which is closer to complicated real-world problems.\n- The significant intra-class diversity poses a great challenge for rehearsal-based methods to work effectively with a small buffer size, thus encouraging the development of more practical, rehearsal-free methods.\n- Pre-trained vision models are useful in practical continual learning. However, their training set usually includes ImageNet. Thus, Split ImageNet-R serves as a relatively fair and challenging benchmark, and an alternative to ImageNet-based benchmarks for continual learning that uses pre-trained models.\n\n## PyTorch Reimplementation\nThe codebase has been reimplemented in PyTorch by Jaeho Lee in [l2p-pytorch](https://github.com/JH-LEE-KR/l2p-pytorch) and [dualprompt-pytorch](https://github.com/JH-LEE-KR/dualprompt-pytorch).\n\n\n## Environment setup\n```\npip install -r requirements.txt\n```\nAfter this, you may need to adjust your jax version according to your cuda driver version so that jax correctly identifies your GPUs (see [this issue](https://github.com/google/jax/issues/5231) for more details).\n\nNote: The codebase has been thoroughly tested in the TPU environment using the newest JAX version. We are currently working on further verifying the GPU environment.\n\n## Dataset preparation\nBefore running experiments for 5-datasets and CORe50, an additional dataset preparation step is required:\n\n1. Download the CORe50 classification benchmark here: https://vlomonaco.github.io/core50/ and download not-mnist here: http://yaroslavvb.com/upload/notMNIST/\n2. Transform them into TFDS-compatible form following the tutorial in https://www.tensorflow.org/datasets/add_dataset\n3. 
Replace the corresponding dataset paths `\"PATH_TO_CORE50\"` and `\"PATH_TO_NOT_MNIST\"` in [libml/input_pipeline.py](libml/input_pipeline.py) with the destination paths from step 2\n\n\n## Getting pretrained ViT model\nThe ViT-B/16 model used in this paper can be downloaded [here](https://storage.googleapis.com/vit_models/imagenet21k/ViT-B_16.npz).\nNote: Our codebase actually supports various sizes of ViTs. If you would like to try variations of ViTs, feel free to change the `config.model_name` in the config files, following the valid options defined in [models/vit.py](models/vit.py).\n\n\n## Instructions on running L2P and DualPrompt\nWe provide configuration files to train and evaluate L2P and DualPrompt on multiple benchmarks in [configs](configs/).\n\n\nTo run L2P on benchmark datasets:\n\n```\npython main.py --my_config configs/$L2P_CONFIG --workdir=./l2p --my_config.init_checkpoint=\u003cViT-saved-path/ViT-B_16.npz\u003e\n```\nwhere `$L2P_CONFIG` can be one of the following: `[cifar100_l2p.py, five_datasets_l2p.py, core50_l2p.py, cifar100_gaussian_l2p.py]`.\n\nNote: we run our experiments using 8 V100 GPUs or 4 TPUs, and we specify a per device batch size of 16 in the config files. This indicates that we use a total batch size of 128.\n\n\nTo run DualPrompt on benchmark datasets:\n\n```\npython main.py --my_config configs/$DUALPROMPT_CONFIG --workdir=./dualprompt --my_config.init_checkpoint=\u003cViT-saved-path/ViT-B_16.npz\u003e\n```\nwhere `$DUALPROMPT_CONFIG` can be one of the following: `[imr_dualprompt.py, cifar100_dualprompt.py]`.\n\n\n\n\n## Visualize results\nWe use TensorBoard to visualize the results. 
For example, if the working directory specified to run L2P is `workdir=./cifar100_l2p`, the command to check the results is as follows:\n\n```\ntensorboard --logdir ./cifar100_l2p\n```\nHere are the important metrics to keep track of, and their corresponding meanings:\n\n| Metric    | Description |\n| ----------- | ----------- |\n| accuracy_n      | Accuracy of the n-th task       |\n| forgetting   | Average forgetting up until the current task       |\n| avg_acc  | Average evaluation accuracy up until the current task      |\n\n\n\n## Cite\n```\n@inproceedings{wang2022learning,\n  title={Learning to prompt for continual learning},\n  author={Wang, Zifeng and Zhang, Zizhao and Lee, Chen-Yu and Zhang, Han and Sun, Ruoxi and Ren, Xiaoqi and Su, Guolong and Perot, Vincent and Dy, Jennifer and Pfister, Tomas},\n  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},\n  pages={139--149},\n  year={2022}\n}\n```\n\n```\n@article{wang2022dualprompt,\n  title={DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning},\n  author={Wang, Zifeng and Zhang, Zizhao and Ebrahimi, Sayna and Sun, Ruoxi and Zhang, Han and Lee, Chen-Yu and Ren, Xiaoqi and Su, Guolong and Perot, Vincent and Dy, Jennifer and others},\n  journal={European Conference on Computer Vision},\n  year={2022}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle-research%2Fl2p","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgoogle-research%2Fl2p","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle-research%2Fl2p/lists"}