{"id":13737952,"url":"https://github.com/aimagelab/mammoth","last_synced_at":"2025-05-14T21:07:15.972Z","repository":{"id":50905933,"uuid":"301440607","full_name":"aimagelab/mammoth","owner":"aimagelab","description":"An Extendible (General) Continual Learning Framework based on Pytorch - official codebase of Dark Experience for General Continual Learning","archived":false,"fork":false,"pushed_at":"2025-04-12T09:17:41.000Z","size":10175,"stargazers_count":677,"open_issues_count":7,"forks_count":117,"subscribers_count":14,"default_branch":"master","last_synced_at":"2025-05-14T10:38:56.044Z","etag":null,"topics":["continual-learning","dark-experience-replay","deep-learning","der","experience-replay","knowledge-distillation","neurips2020","pytorch"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aimagelab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2020-10-05T14:41:51.000Z","updated_at":"2025-05-13T18:11:08.000Z","dependencies_parsed_at":"2024-02-03T22:21:28.235Z","dependency_job_id":"811a31ea-ad35-448d-952c-80d2d94a0b03","html_url":"https://github.com/aimagelab/mammoth","commit_stats":{"total_commits":296,"total_committers":14,"mean_commits":"21.142857142857142","dds":"0.32770270270270274","last_synced_commit":"3d6fc4b0645734e5d8c416293efadc44f3032382"},"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aimagelab%2Fmammoth","tags_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/repositories/aimagelab%2Fmammoth/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aimagelab%2Fmammoth/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aimagelab%2Fmammoth/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aimagelab","download_url":"https://codeload.github.com/aimagelab/mammoth/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254227612,"owners_count":22035669,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["continual-learning","dark-experience-replay","deep-learning","der","experience-replay","knowledge-distillation","neurips2020","pytorch"],"created_at":"2024-08-03T03:02:06.894Z","updated_at":"2025-05-14T21:07:10.952Z","avatar_url":"https://github.com/aimagelab.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003c!-- \u003cimg width=\"230\" height=\"230\" src=\"docs/_static/logo.png\" alt=\"logo\"\u003e --\u003e\n  \u003cimg width=\"1000\" height=\"200\" src=\"docs/_static/mammoth_banner.svg\" alt=\"logo\"\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cimg alt=\"GitHub commit activity\" src=\"https://img.shields.io/github/commit-activity/m/aimagelab/mammoth\"\u003e\n  \u003ca href=\"https://aimagelab.github.io/mammoth/index.html\"\u003e\u003cimg alt=\"Documentation\" 
src=\"https://img.shields.io/badge/docs-mammoth-blue?style=flat\u0026logo=readthedocs\"\u003e\u003c/a\u003e\n  \u003cimg alt=\"GitHub stars\" src=\"https://img.shields.io/github/stars/aimagelab/mammoth?style=social\"\u003e\n  \u003cimg alt=\"PyTorch\" src=\"https://img.shields.io/badge/PyTorch-%23EE4C2C.svg?\u0026logo=PyTorch\u0026logoColor=white\"\u003e\n\u003c/p\u003e\n\n# 🦣 Mammoth - A PyTorch Framework for Benchmarking Continual Learning\n\n_Mammoth_ is built to streamline the development and benchmarking of continual learning research. With **more than 60 methods and 20 datasets**, it includes the most complete list of competitors and benchmarks for research purposes.\n\nThe core idea of Mammoth is that it is designed to be modular, easy to extend, and - most importantly - _easy to debug_.\n\nWith Mammoth, nothing is set in stone. You can easily add new models, datasets, training strategies, or functionalities.\n\n## 📚 Documentation\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://aimagelab.github.io/mammoth/\"\u003e\n    \u003cem style=\"display: inline-block; margin-top: 8px; font-size: 16px; color: #4B73C9; background-color: #f8f9fa; padding: 8px 16px; border-radius: 0 0 8px 8px; border: 1px solid #4B73C9; border-top: none; box-shadow: 0 2px 5px rgba(0,0,0,0.1);\"\u003eCheck out our guides on using Mammoth for continual learning research\u003c/em\u003e\n    \u003cbr/\u003e\n    \u003cimg src=\"https://img.shields.io/badge/Documentation-📚-4B73C9?style=for-the-badge\u0026logo=gitbook\u0026logoColor=white\" alt=\"Documentation\" height=\"40\"\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n## ⚙️ Setup\n\n- 📥 Install with `pip install -r requirements.txt` or run it directly with `uv run python main.py ...`\n  \u003e **Note**: PyTorch version \u003e= 2.1.0 is required for scaled_dot_product_attention. 
If you cannot support this requirement, uncomment the lines 136-139 under `scaled_dot_product_attention` in `backbone/vit.py`.\n- 🚀 Use `main.py` or `./utils/main.py` to run experiments.\n- 🧩 New models can be added to the `models/` folder.\n- 📊 New datasets can be added to the `datasets/` folder.\n\n## 🧪 Examples\n\n### Run a model\n\nThe following command will run the model `derpp` on the dataset `seq-cifar100` with a buffer of 500 samples and some random hyperparameters for _lr_, _alpha_, and _beta_:\n```bash\npython main.py --model derpp --dataset seq-cifar100 --alpha 0.5 --beta 0.5 --lr 0.001 --buffer_size 500\n```\n\nTo run the model with the best hyperparameters, use the `--model_config=best` argument:\n```bash\npython main.py --model derpp --dataset seq-cifar100 --model_config best\n```\n\n \u003e NOTE: the `--model_config` argument will look for a file `\u003cmodel_name\u003e.yaml` in the `models/configs/` folder. This file should contain the hyperparameters for the best configuration of the model. You can find more information in [the documentation](https://aimagelab.github.io/mammoth/models/model_arguments.html#model-configurations-and-best-arguments).\n\n### Build a new model\n\nSee the [documentation](https://aimagelab.github.io/mammoth/models/build_a_model.html) for a detailed guide on how to create a new model.\n\n### Build a new dataset\n\nSee the [documentation](https://aimagelab.github.io/mammoth/datasets/build_a_dataset.html) for a detailed guide on how to create a new dataset.\n\n\n## 🗺️ Update Roadmap\n\nAll the code is under active development. 
Here are some of the features we are working on:\n\n- 🧠 **New models**: We are continuously working on adding new models to the repository.\n- 🔄 **New training modalities**: New training regimes, such as *regression*, *segmentation*, *detection*, etc.\n- 📊 **Openly accessible result dashboard**: The ideal would be a dashboard to visualize the results of all the models in both their respective settings (to prove their reproducibility) and in a general setting (to compare them). *This may take some time, since compute is not free.*\n\nAll the new additions will try to preserve the current structure of the repository, making it easy to add new functionalities with a simple merge.\n\n## 🧠 Models\n\nMammoth currently supports **more than 60** models, with new releases covering the main competitors in the literature.\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eClick to expand model list\u003c/b\u003e\u003c/summary\u003e\n\n- AttriCLIP: A Non-Incremental Learner for Incremental Knowledge Learning (AttriCLIP): `attriclip`.\n- Bias Correction (BiC): `bic`.\n- CaSpeR-IL (on DER++, X-DER with RPC, iCaRL, and ER-ACE): `derpp_casper`, `xder_rpc_casper`, `icarl_casper`, `er_ace_casper`.\n- CODA-Prompt: COntinual Decomposed Attention-based Prompting for Rehearsal-Free Continual Learning (CODA-Prompt) - _Requires_ `pip install timm==0.9.8`: `coda-prompt`.\n- Continual Contrastive Interpolation Consistency (CCIC) - _Requires_ `pip install kornia`: `ccic`.\n- Continual Generative training for Incremental prompt-Learning (CGIL): `cgil`.\n- Contrastive Language-Image Pre-Training (CLIP): `clip` (*static* method with no learning).\n- CSCCT (on DER++, X-DER with RPC, iCaRL, and ER-ACE): `derpp_cscct`, `xder_rpc_cscct`, `icarl_cscct`, `er_ace_cscct`.\n- Dark Experience for General Continual Learning: a Strong, Simple Baseline (DER \u0026 DER++): `der` and `derpp`.\n- DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning (DualPrompt) - _Requires_ `pip install 
timm==0.9.8`: `dualprompt`.\n- Efficient Lifelong Learning with A-GEM (A-GEM, A-GEM-R - A-GEM with reservoir buffer): `agem`, `agem_r`.\n- Experience Replay (ER): `er`.\n- Experience Replay with Asymmetric Cross-Entropy (ER-ACE): `er_ace`.\n- eXtended-DER (X-DER): `xder` (full version), `xder_ce` (X-DER with CE), `xder_rpc` (X-DER with RPC).\n- Function Distance Regularization (FDR): `fdr`.\n- Generating Instance-level Prompts for Rehearsal-free Continual Learning (DAP): `dap`.\n- Gradient Episodic Memory (GEM) - _Unavailable on windows_: `gem`.\n- Greedy gradient-based Sample Selection (GSS): `gss`.\n- Greedy Sampler and Dumb Learner (GDumb): `gdumb`.\n- Hindsight Anchor Learning (HAL): `hal`.\n- Image-aware Decoder Enhanced à la Flamingo with Interleaved Cross-attentionS (IDEFICS): `idefics` (*static* method with no learning).\n- Incremental Classifier and Representation Learning (iCaRL): `icarl`.\n- Joint training for the General Continual setting: `joint_gcl` (_only for General Continual_).\n- Large Language and Vision Assistant (LLAVA): `llava` (*static* method with no learning).\n- Learning a Unified Classifier Incrementally via Rebalancing (LUCIR): `lucir`.\n- Learning to Prompt (L2P) - _Requires_ `pip install timm==0.9.8`: `l2p`.\n- Learning without Forgetting (LwF): `lwf`.\n- Learning without Forgetting adapted for Multi-Class classification (LwF.MC): `lwf_mc` (from the iCaRL paper).\n- Learning without Shortcuts (LwS): `lws`.\n- LiDER (on DER++, iCaRL, GDumb, and ER-ACE): `derpp_lider`, `icarl_lider`, `gdumb_lider`, `er_ace_lider`.\n- May the Forgetting Be with You: Alternate Replay for Learning with Noisy Labels (AER \u0026 ABS): `er_ace_aer_abs`.\n- Meta-Experience Replay (MER): `mer`.\n- Mixture-of-Experts Adapters (MoE Adapters): `moe_adapters`.\n- Online Continual Learning on a Contaminated Data Stream with Blurry Task Boundaries (PuriDivER): `puridiver`.\n- online Elastic Weight Consolidation (oEWC): `ewc_on`.\n- Progressive Neural Networks (PNN): 
`pnn`.\n- Random Projections and Pre-trained Models for Continual Learning (RanPAC): `ranpac`.\n- Regular Polytope Classifier (RPC): `rpc`.\n- Rethinking Experience Replay: a Bag of Tricks for Continual Learning (ER-ACE with tricks): `er_ace_tricks`.\n- Semantic Two-level Additive Residual Prompt (STAR-Prompt): `starprompt`. Also includes the first-stage only (`first_stage_starprompt`) and second-stage only (`second_stage_starprompt`) versions.\n- SLCA: Slow Learner with Classifier Alignment for Continual Learning on a Pre-trained Model (SLCA) - _Requires_ `pip install timm==0.9.8`: `slca`.\n- Slow Learner with Classifier Alignment (SLCA): `slca`.\n- Synaptic Intelligence (SI): `si`.\n- Transfer without Forgetting (TwF): `twf`.\n- ZSCL: Zero-Shot Continual Learning: `zscl`.\n\u003c/details\u003e\n\n## 📊 Datasets\n\n**NOTE**: Datasets are automatically downloaded in `data/`.  \n- This can be changed by changing the `base_path` function in `utils/conf.py` or using the `--base_path` argument.  
\n- The `data/` folder should not be tracked by _git_ and is created automatically if missing.\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eClick to expand dataset list\u003c/b\u003e\u003c/summary\u003e\n\nMammoth currently includes **23** datasets, covering *toy classification problems* (different versions of MNIST), *standard natural-image domains* (CIFAR, Imagenet-R, TinyImagenet, MIT-67), *fine-grained classification domains* (Cars-196, CUB-200), *aerial domains* (EuroSAT-RGB, Resisc45), and *medical domains* (CropDisease, ISIC, ChestX).\n\n- Sequential MNIST (_Class-IL / Task-IL_): `seq-mnist`.\n- Permuted MNIST (_Domain-IL_): `perm-mnist`.\n- Rotated MNIST (_Domain-IL_): `rot-mnist`.\n- MNIST-360 (_General Continual Learning_): `mnist-360`.\n- Sequential CIFAR-10 (_Class-IL / Task-IL_): `seq-cifar10`.\n- Sequential CIFAR-10 resized 224x224 (ViT version) (_Class-IL / Task-IL_): `seq-cifar10-224`.\n- Sequential CIFAR-10 resized 224x224 (ResNet50 version) (_Class-IL / Task-IL_): `seq-cifar10-224-rs`.\n- Sequential Tiny ImageNet (_Class-IL / Task-IL_): `seq-tinyimg`.\n- Sequential Tiny ImageNet resized 32x32 (_Class-IL / Task-IL_): `seq-tinyimg-r`.\n- Sequential CIFAR-100 (_Class-IL / Task-IL_): `seq-cifar100`.\n- Sequential CIFAR-100 resized 224x224 (ViT version) (_Class-IL / Task-IL_): `seq-cifar100-224`.\n- Sequential CIFAR-100 resized 224x224 (ResNet50 version) (_Class-IL / Task-IL_): `seq-cifar100-224-rs`.\n- Sequential CUB-200 (_Class-IL / Task-IL_): `seq-cub200`.\n- Sequential ImageNet-R (_Class-IL / Task-IL_): `seq-imagenet-r`.\n- Sequential Cars-196 (_Class-IL / Task-IL_): `seq-cars196`.\n- Sequential RESISC45 (_Class-IL / Task-IL_): `seq-resisc45`.\n- Sequential EuroSAT-RGB (_Class-IL / Task-IL_): `seq-eurosat-rgb`.\n- Sequential ISIC (_Class-IL / Task-IL_): `seq-isic`.\n- Sequential ChestX (_Class-IL / Task-IL_): `seq-chestx`.\n- Sequential MIT-67 (_Class-IL / Task-IL_): `seq-mit67`.\n- Sequential CropDisease (_Class-IL / Task-IL_): 
`seq-cropdisease`.\n- Sequential CelebA (_Biased-Class-IL_): `seq-celeba`. *This dataset is multi-label (i.e., trains with binary cross-entropy)*\n\u003c/details\u003e\n\n## 📝 Citing the library\n\n```bibtex\n@article{boschini2022class,\n  title={Class-Incremental Continual Learning into the eXtended DER-verse},\n  author={Boschini, Matteo and Bonicelli, Lorenzo and Buzzega, Pietro and Porrello, Angelo and Calderara, Simone},\n  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},\n  year={2022},\n  publisher={IEEE}\n}\n\n@inproceedings{buzzega2020dark,\n author = {Buzzega, Pietro and Boschini, Matteo and Porrello, Angelo and Abati, Davide and Calderara, Simone},\n booktitle = {Advances in Neural Information Processing Systems},\n editor = {H. Larochelle and M. Ranzato and R. Hadsell and M. F. Balcan and H. Lin},\n pages = {15920--15930},\n publisher = {Curran Associates, Inc.},\n title = {Dark Experience for General Continual Learning: a Strong, Simple Baseline},\n volume = {33},\n year = {2020}\n}\n```\n\n## 🔬 On the reproducibility of Mammoth\nWe take great pride and care in the reproducibility of the models in Mammoth and we are committed to providing the community with the most accurate results possible. To this end, we provide a _REPRODUCIBILITY.md_ file in the repository that contains the results of the models in Mammoth.\n\nThe performance of each model is evaluated on the same dataset used in the paper and we report in _REPRODUCIBILITY.md_ the list of models that have been verified. We also provide the exact command used to train the model (in most cases, it follows the form `python main.py --model \u003cmodel-name\u003e --dataset \u003cdataset-name\u003e --model_config best`).\n\nWe encourage the community to report any issues with the reproducibility of the models in Mammoth. 
If you find any issues, please open an issue in the GitHub repository or contact us directly.\n\n**Disclaimer**: Since there are many models in Mammoth (and some of them predate PyTorch), the process of filling the _REPRODUCIBILITY.md_ file is ongoing. We are working hard to fill the file with the results of all models in Mammoth. If you need the results of a specific model, please open an issue in the GitHub repository or contact us directly.\n\n\u003e Does this mean that the models that are not in the _REPRODUCIBILITY.md_ file do not reproduce?\n\nNo! It means that we have not yet found the appropriate dataset and hyperparameters to fill the file with the results of that model. We are working hard to fill the file with the results of all models in Mammoth. If you need the results of a specific model, please open an issue in the GitHub repository or contact us directly.\n\n## 🤝 Contributing\nPull requests are welcome!\n\n\u003ca href=\"https://github.com/aimagelab/mammoth/graphs/contributors\"\u003e \u003cimg src=\"https://contrib.rocks/image?repo=aimagelab/mammoth\" /\u003e \u003c/a\u003e\n\nPlease use autopep8 with parameters:\n\n```\n--aggressive\n--max-line-length=200\n--ignore=E402\n```\n\n## Previous versions\n\nIf you're interested in a version of this repo that only includes the original code for _\"Dark Experience for General Continual Learning: a Strong, Simple Baseline\"_ or _\"Class-Incremental Continual Learning into the eXtended DER-verse\"_, please use the following tags:\n\n`neurips2020` for DER (NeurIPS 2020).  \n`tpami2023` for X-DER (TPAMI 2022).","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faimagelab%2Fmammoth","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faimagelab%2Fmammoth","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faimagelab%2Fmammoth/lists"}