{"id":16388241,"url":"https://github.com/tlesort/scole-scaling-continual-learning","last_synced_at":"2025-10-26T11:31:55.694Z","repository":{"id":176531044,"uuid":"655328844","full_name":"TLESORT/SCoLe-SCaling-Continual-Learning","owner":"TLESORT","description":"Official Code for \"Challenging Common Assumptions about Catastrophic Forgetting and Knowledge Accumulation\", CoLLas 2023.","archived":false,"fork":false,"pushed_at":"2023-07-06T15:58:58.000Z","size":1972,"stargazers_count":5,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-01-31T19:05:34.807Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/TLESORT.png","metadata":{"files":{"readme":"Readme.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-06-18T15:12:59.000Z","updated_at":"2024-07-22T17:20:26.000Z","dependencies_parsed_at":"2023-07-11T14:46:56.955Z","dependency_job_id":null,"html_url":"https://github.com/TLESORT/SCoLe-SCaling-Continual-Learning","commit_stats":null,"previous_names":["tlesort/scole","tlesort/scole-scaling-continual-learning"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TLESORT%2FSCoLe-SCaling-Continual-Learning","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TLESORT%2FSCoLe-SCaling-Continual-Learning/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TLESORT%2FSCoLe-SCaling-Continual-Learning/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TLESORT%
2FSCoLe-SCaling-Continual-Learning/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/TLESORT","download_url":"https://codeload.github.com/TLESORT/SCoLe-SCaling-Continual-Learning/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":238319573,"owners_count":19452360,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-11T04:28:40.448Z","updated_at":"2025-10-26T11:31:55.169Z","avatar_url":"https://github.com/TLESORT.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# Official Code for \"Challenging Common Assumptions about Catastrophic Forgetting and Knowledge Accumulation\", CoLLas 2023\n\n\n## Abstract\n\nBuilding learning agents that can progressively learn and accumulate knowledge is the core goal\nof the continual learning (CL) research field. Unfortunately, training a model on new data usually\ncompromises the performance on past data. In the CL literature, this effect is referred to as catastrophic\nforgetting (CF). CF has been largely studied, and a plethora of methods have been proposed to address\nit on short sequences of non-overlapping tasks. In such setups, CF always leads to a quick and\nsignificant drop in performance in past tasks. Nevertheless, despite CF, recent work showed that\nSGD training on linear models accumulates knowledge in a CL regression setup. This phenomenon\nbecomes especially visible when tasks reoccur. 
We might then wonder if DNNs trained with SGD or\nany standard gradient-based optimization accumulate knowledge in such a way. Such phenomena\nwould have interesting consequences for applying DNNs to real continual scenarios. Indeed, standard\ngradient-based optimization methods are significantly less computationally expensive than existing\nCL algorithms. In this paper, we study the progressive knowledge accumulation (KA) in DNNs trained\nwith gradient-based algorithms in long sequences of tasks with data re-occurrence. We propose a new\nframework, SCoLe (Scaling Continual Learning), to investigate KA and discover that catastrophic\nforgetting has a limited effect on DNNs trained with SGD. When trained on long sequences with data\nsparsely re-occurring, the overall accuracy improves, which might be counter-intuitive given the CF\nphenomenon. We empirically investigate KA in DNNs under various data occurrence frequencies\nand propose simple and scalable strategies to increase knowledge accumulation in DNNs.\n\n## Main Contributions\n\nThe main contribution of this work is to show precisely that the effect of catastrophic forgetting is limited on deep neural networks (DNNs)\nand that it does not prevent knowledge accumulation. \nSecondly, it proposes an evaluation framework (SCoLe) to study the knowledge accumulation in DNNs at scale.\n\nWe hope that this benchmark will help to design continual algorithms that could be efficient and deployable.\n\n\n## SCoLe\n\nSCoLe (Scaling Continual Learning) is a continual learning framework for generating long sequences of tasks with various\nfrequencies of tasks and classes. 
It is designed to study the knowledge accumulation capability of learning algorithms.\nThe scenario is generated from a fixed dataset, then each task is generated online by randomly selecting a subset of classes\nor data points.\n\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"./Images/scole.png\" width=\"800\" alt=\"Illustration of SCoLe scenario\"\u003e\n\u003c/p\u003e\n\nBy training on long sequences of automatically generated tasks, we can visualize progress (knowledge accumulation)\nby plotting the accuracy on the test set composed of all classes.\n\n\nIn the paper, we show that knowledge accumulation consistently happens on various datasets and architectures, which means that catastrophic forgetting is consistently limited in its effect on the whole model.\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"./Images/KA_datasets.png\" width=\"400\" alt=\"Knowledge Accumulation on Various Datasets through Long Sequences of Tasks.\"\u003e\n\u003c/p\u003e\n\n\n\nTo analyse further, we can control the probability of sampling classes to control their frequency of appearance in the sequence of tasks.\nAs such, we can visualize knowledge accumulation with respect to classes' frequency of appearance:\n\nWhen all classes are sampled with the same frequency (balanced distribution):\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"./Images/balanced.png\" width=\"400\" alt=\" Illustration of Results with Balanced Distribution of Classes within the Scenario.\"\u003e\n\u003c/p\u003e\n\n\nor when classes are sampled with different frequencies (unbalanced distribution):\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"./Images/unbalanced.png\" width=\"400\" alt=\" Illustration of Results with Unbalanced Distribution of Classes within the Scenario.\"\u003e\n\u003c/p\u003e\n\nThe influence of various design choices, such as hyper-parameters,\ncan then be evaluated to determine which combination leads to the best knowledge accumulation.\n\n## Installation\n```bash\npip install -r 
requirement.txt\n```\n\n\n## Examples of runs\n\nHere are some example runs to play with the code base.\nIf you are looking for the exact configuration to reproduce a figure of the paper, do not hesitate to contact us.\n\nOne run with MNIST, 2 classes per task, 1 epoch per task, 500 tasks, using group masking with the Adam optimizer. (estimated duration ~20mins)\n```bash\npython main.py --wandb_api_key $YOUR_WANDB_API_KEY --classes_per_task=2 --dataset=MNIST --masking=group --momentum=0 --nb_epochs=1 --num_classes=10 --num_tasks=500 --optim=Adam\n```\n\nOne run with 100 classes of TinyImagenet with random perturbation (severity 1), imbalanced class distribution by factor 2 (param: entropy_decrease), 1 epoch per task, 10 classes per task, 2500 tasks, using group masking with SGD and no momentum. (estimated duration ~15hrs)\n```bash\npython main.py --wandb_api_key $YOUR_WANDB_API_KEY --class_acc=True --classes_per_task=10 --dataset=Tiny --entropy_decrease=2 --lr=0.01 --masking=group --momentum=0 --nb_epochs=1 --num_classes=100 --num_tasks=2500 --optim=SGD --rand_transform=perturbations --severity=1\n```\n  \nOne run with the CIFAR100 dataset and a wide resnet with width factor 2 (param: wrn_width_factor; use the pretrained_model parameter together with reinit_model):\n```bash\npython main.py --wandb_api_key $YOUR_WANDB_API_KEY --classes_per_task 2 --dataset CIFAR100 --masking group --momentum 0 --nb_epochs 1 --pretrained_model wrn --wrn_width_factor 2 --num_classes 10 --num_tasks 500 --optim Adam --reinit_model 1\n```\n\nSame as the TinyImagenet run above, with frequency replay (by default, frequency replay applies to classes with a frequency between low_frequency=0.01 and high_frequency=0.1). (estimated duration ~16hrs)\n```bash\npython main.py --wandb_api_key $YOUR_WANDB_API_KEY --class_acc=True --classes_per_task=10 --dataset=Tiny --entropy_decrease=2 --lr=0.01 --masking=group --momentum=0 --nb_epochs=1 --num_classes=100 --num_tasks=2500 --optim=SGD --rand_transform=perturbations 
--replay=frequency --severity=1\n```\n\n\n\n\n\n### Citing the Paper\n\n```bibtex\n@inproceedings{Lesort2023Challenging,\n  title = \"{Challenging Common Assumptions about Catastrophic Forgetting and Knowledge Accumulation}\",\n  author={{Lesort}, Timoth{\\'e}e and {Ostapenko}, Oleksiy and {Rodr{\\'\\i}guez}, Pau and {Misra}, Diganta and {Rifat Arefin}, Md and {Charlin}, Laurent and {Rish}, Irina},\n  booktitle={Conference on Lifelong Learning Agents},\n  year={2023},\n  organization={PMLR}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftlesort%2Fscole-scaling-continual-learning","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftlesort%2Fscole-scaling-continual-learning","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftlesort%2Fscole-scaling-continual-learning/lists"}