{"id":20238727,"url":"https://github.com/yujun-shi/blip","last_synced_at":"2025-04-10T19:36:04.879Z","repository":{"id":41100211,"uuid":"351759812","full_name":"Yujun-Shi/BLIP","owner":"Yujun-Shi","description":"Official Implementation of CVPR2021 paper: Continual Learning via Bit-Level Information Preserving","archived":false,"fork":false,"pushed_at":"2023-01-24T05:46:05.000Z","size":5694,"stargazers_count":38,"open_issues_count":1,"forks_count":2,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-24T17:14:03.991Z","etag":null,"topics":["continual-learning","cvpr2021"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Yujun-Shi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-03-26T11:30:36.000Z","updated_at":"2024-05-31T19:16:00.000Z","dependencies_parsed_at":"2023-02-13T18:16:53.601Z","dependency_job_id":null,"html_url":"https://github.com/Yujun-Shi/BLIP","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Yujun-Shi%2FBLIP","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Yujun-Shi%2FBLIP/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Yujun-Shi%2FBLIP/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Yujun-Shi%2FBLIP/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Yujun-Shi","download_url":"https://codeload.github.com/Yujun-Shi/BLIP/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://githu
b.com","kind":"github","repositories_count":248281414,"owners_count":21077423,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["continual-learning","cvpr2021"],"created_at":"2024-11-14T08:35:31.168Z","updated_at":"2025-04-10T19:36:04.858Z","avatar_url":"https://github.com/Yujun-Shi.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# (CVPR2021) Continual Learning via Bit-Level Information Preserving [ArXiv](https://arxiv.org/pdf/2105.04444.pdf)\n\nThis repo contains the official Implementation of the CVPR2021 paper: Continual Learning via Bit-Level Information Preserving.\n\n\n\n### Abstract\n\nContinual learning tackles the setting of learning different tasks sequentially. Despite the lots of previous solutions, most of them still suffer significant forgetting or expensive memory cost. In this work, targeted at these problems, we first study the continual learning process through the lens of information theory and observe that forgetting of a model stems from the loss of information gain on its parameters from the previous tasks when learning a new task. From this viewpoint, we then propose a novel continual learning approach called **B**it-**L**evel **I**nformation **P**reserving (**BLIP**) that preserves the information gain on model parameters through updating the parameters at the bit level, which can be conveniently implemented with parameter quantization. 
More specifically, BLIP first trains a neural network with weight quantization on the new incoming task and then estimates information gain on each parameter provided by the task data to determine the bits to be frozen to prevent forgetting. We conduct extensive experiments ranging from classification tasks to reinforcement learning tasks, and the results show that our method produces better or on par results comparing to previous state-of-the-arts. Indeed, BLIP achieves close to zero forgetting while only requiring constant memory overheads throughout continual learning.\n\n\n\n### Authors\n\nYujun Shi (***LV Lab***), Li Yuan (***LV Lab***), Yunpeng Chen (***YITU Technology***), Jiashi Feng (***LV Lab***)\n\n\n\n### Graphical Illustration\n\n![graphical_illustration](./pictures/graphical_illustration.png)\n\nWe illustrate our method with a simple scenario: ***one single parameter*** quantized to 10 bits. $\\theta_{t}$ denotes the parameter after learning tasks $1$ to $t$, and $\\theta_{0}$ is its randomly initialized value before training on any task. $IG_{t}$ denotes the information gain on $\\theta$ after learning task $t$. The bit representation of $\\theta$ after learning each task is shown in the figure. Bits are ordered from most significant (higher bit positions) to least significant (lower bit positions). Frozen bits are filled with color; the remaining bits are free. After learning each task, the information gain is calculated and $\\lceil IG_{t} \\rceil$ bits are then frozen in the bit representation. 
By repeating this process, the information on previous tasks can be preserved, enabling continual learning for neural networks.\n\n\n\n### Experiment Results\n\nFor **numerical results and ablation studies**, please check our paper.\n\nHere, we render and compare agents trained with EWC and BLIP in different environments.\n\nBelow is a visualization of sequentially learning the first 3 Atari games in our setup (i.e., kung fu master -- boxing -- james bond).\n\n**The GIF in the i-th row and j-th column shows how well the agent performs on the j-th task after learning the first i tasks.**\n\nAs can be seen, with EWC the agent's performance on previous tasks degrades drastically after learning new tasks, while the agent trained with BLIP still performs quite well. (This effect is most pronounced for task 1.)\n\n\n\n***EWC***\n\n\u003cimg align=\"left\" width=\"140\" height=\"140\" src=\"./pictures/ewc_task_0_0.gif\"\u003e\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\n\n\u003cimg align=\"left\" width=\"140\" height=\"140\" src=\"./pictures/ewc_task_1_0.gif\"\u003e\u003cimg align=\"left\" width=\"140\" height=\"140\" src=\"./pictures/ewc_task_1_1.gif\"\u003e\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\n\u003cimg align=\"left\" width=\"140\" height=\"140\" src=\"./pictures/ewc_task_2_0.gif\"\u003e\u003cimg align=\"left\" width=\"140\" height=\"140\" src=\"./pictures/ewc_task_2_1.gif\"\u003e\u003cimg align=\"left\" width=\"140\" height=\"140\" src=\"./pictures/ewc_task_2_2.gif\"\u003e\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\n\n\n\n***BLIP***\n\n\u003cimg align=\"left\" width=\"140\" height=\"140\" src=\"./pictures/blip_task_0_0.gif\"\u003e\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\n\n\u003cimg align=\"left\" width=\"140\" height=\"140\" 
src=\"./pictures/blip_task_1_0.gif\"\u003e\u003cimg align=\"left\" width=\"140\" height=\"140\" src=\"./pictures/blip_task_1_1.gif\"\u003e\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\n\u003cimg align=\"left\" width=\"140\" height=\"140\" src=\"./pictures/blip_task_2_0.gif\"\u003e\u003cimg align=\"left\" width=\"140\" height=\"140\" src=\"./pictures/blip_task_2_1.gif\"\u003e\u003cimg align=\"left\" width=\"140\" height=\"140\" src=\"./pictures/blip_task_2_2.gif\"\u003e\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\n\n\n\n### Citation\n\nIf you find our repo/paper helpful, please consider citing our work :)\n```\n@article{shi2021continual,\n  title={Continual Learning via Bit-Level Information Preserving},\n  author={Shi, Yujun and Yuan, Li and Chen, Yunpeng and Feng, Jiashi},\n  journal={arXiv preprint arXiv:2105.04444},\n  year={2021}\n}\n```\n\n\n### Prerequisites\n\n* pytorch \u003e= 1.3.1\n* gym (*required by RL, no need if you only run image classifications*)\n* baselines (*required by RL, no need if you only run image classifications*)\n\n\n\n\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\n\n### Image Classifications (Besides mini-ImageNet)\n\nUnder the folder of ***ImageClassification/src***:\n\nTo run BLIP with MNIST-5:\n\n```sh\npython run_blip.py --approach blip --experiment mnist5 --lr 0.01 --sbatch 64 --F-prior 1e-15 --nepochs 200\n```\n\nTo run BLIP with PMNIST:\n\n```sh\npython run_blip.py --approach blip --experiment pmnist --lr 0.01 --sbatch 64 --F-prior 1e-15 --nepochs 200\n```\n\nTo run BLIP with Alternating Cifar10/100:\n\n```sh\npython run_blip.py --experiment cifar --lr 0.05 --sbatch 32 --F-prior 5e-16 --mul 2\n```\n\nTo run BLIP with Sequence of 5 datasets:\n\n```sh\npython run_blip.py --experiment mixture5 --lr 0.05 --sbatch 32 --F-prior 5e-17 --mul 0.8 --seed 0\n```\n\nAll datasets will be automatically downloaded and processed under 
***ImageClassification/data***\n\n\n\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\n### Image Classification (mini-ImageNet)\n\nUnder the folder ***miniImageNetClassification/src***:\n\nThe following two steps are needed to run the experiment:\n\n* ##### Step 1: Prepare Data\n\nFirst, [download](https://github.com/Yujun-Shi/BLIP/releases/download/initial/MI_raw.zip) the zipped file and extract it under the folder ***miniImageNetClassification/src/data***\n\nThen, under ***miniImageNetClassification/src/data***, execute the following to obtain data split:\n\n```sh\npython generate_train_test_split.sh\n```\n\nAfter executing the file, two files named \"train.pkl\" and \"test.pkl\" will be generated. These are the data files and will be loaded for training/testing.\n\n\n\n* ##### Step 2: run shell command\n\nUnder the folder ***miniImageNetClassification/src***:\n\nTo run BLIP with AlexNet, use:\n\n```sh\npython run_blip.py --F-prior 5e-16 --lr 0.01 --momentum 0.0 --mul 1 --sbatch 32 --seed 0 --ntasks 20 --arch alexnet\n```\n\nTo run BLIP with ResNet-18, use:\n\n```sh\npython run_blip.py --F-prior 1e-16 --lr 0.01 --momentum 0.0 --mul 1.5 --sbatch 32 --seed 0 --ntasks 20 --arch resnet\n```\n\nTo run Baseline methods with AlexNet, use:\n\n```sh\npython run_baselines.py --lr 0.01 --approach \u003cbaseline-method\u003e --momentum 0.0 --mul 1 --sbatch 32 --seed 0 --ntasks 20 --arch alexnet\n```\n\nwhere \\\u003cbaseline-method\\\u003e should be replaced by the name of baseline methods (e.g., sgd, sgd-frozen, lwf, imm-mode, ewc).\n\n\n\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\n### RL (sequence of 6 Atari games)\n\nUnder the folder ***RL/src***\n\nTo run our method BLIP, use:\n\n```sh\n./run_blip.sh\n```\n\nTo run online EWC, use:\n\n```sh\n./run_ewc.sh\n```\n\nTo run plain fine-tuning, use:\n\n```shell\n./run_ft.sh\n```\n\n\n\u003cbr/\u003e\u003cbr/\u003e\u003cbr/\u003e\n## Contact\n\nYujun Shi (shi.yujun@u.nus.edu)\n\n\n\n\n\n## Acknowledgements\n\nOur code is inspired 
by the following repos: [HAT](https://github.com/joansj/hat), [ACL](https://github.com/facebookresearch/Adversarial-Continual-Learning), [UCL](https://github.com/csm9493/UCL), [pytorch-a2c-ppo-acktr-gail](https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyujun-shi%2Fblip","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyujun-shi%2Fblip","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyujun-shi%2Fblip/lists"}