[Spotlight ICLR 2023 paper] Continual evaluation for lifelong learning with neural networks, identifying the stability gap.
- Host: GitHub
- URL: https://github.com/Mattdl/ContinualEvaluation
- Owner: Mattdl
- License: mit
- Created: 2022-05-26T13:53:57.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-04-02T09:40:41.000Z (over 1 year ago)
- Last Synced: 2024-04-20T17:00:11.370Z (7 months ago)
- Topics: avalanche, continual-evaluation, continual-learning, deep-learning, lifelong-learning, machine-learning, pytorch, stability-gap
- Language: Python
- Homepage: https://openreview.net/forum?id=Zy350cRstc6
- Size: 498 KB
- Stars: 21
- Watchers: 1
- Forks: 4
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# Continual Evaluation for Lifelong Learning
[//]: # (**What is this repo?**)
This is a PyTorch- and Avalanche-based repository that enables fine-grained continual evaluation in continual learning, as opposed to the standard evaluation only at task transitions.
It is the main codebase for the *Spotlight ICLR 2023 paper*:
["Continual evaluation for lifelong learning: Identifying the stability gap"](https://openreview.net/forum?id=Zy350cRstc6).**Why continual evaluation?**
Using continual evaluation, [our work](https://arxiv.org/abs/2205.13452) finds a *stability gap*, where representative continual learning
methods falter in maintaining robust performance during the learning process.
Measuring continual worst-case performance is important for deploying continual learners in the real world, especially in safety-critical applications and for real-world actuators.

**Main features of this repo:**
- **Continual Eval**: Per-iteration evaluation, including test-set subsampling and adjustable evaluation periodicity
- **Continual evaluation metrics**:
- **New metrics**: Worst-case Accuracy (WC-ACC), Average Minimum Accuracy (Min-ACC), Windowed-Forgetting (WF), Windowed-Plasticity (WP); see the sketch below this list.
- **Existing metrics**: Learning Curve Area (LCA), Average Forgetting (FORG), Average Accuracy (ACC).
- **Extensive tracking**: Track all stats of your continual learning model, e.g. per-iteration feature drift and gradient norms.
- **7 Continual Learning benchmarks** based on: MNIST, CIFAR10, Mini-Imagenet, Mini-DomainNet, PermutedMNIST, RotatedMNIST, Digits

*Project Status*: Codebase delivered as is, no support available.
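To make the new metrics concrete, here is a minimal sketch of how Min-ACC and WC-ACC can be computed from a per-iteration accuracy log, following the definitions in the paper. It is illustrative only: the helper names are hypothetical, and the repository's implementations in [src/eval/continual_eval_metrics.py](src/eval/continual_eval_metrics.py) are the reference.

```python
# Illustrative sketch (not the repo's implementation).
# acc_log[t] holds the accuracies on task t measured at every evaluation step
# AFTER task t was learned; cur_acc is the accuracy on the task currently being learned.
from typing import Dict, List


def min_acc(acc_log: Dict[int, List[float]]) -> float:
    """Average Minimum Accuracy (Min-ACC) over previously learned tasks."""
    if not acc_log:
        return 0.0
    return sum(min(accs) for accs in acc_log.values()) / len(acc_log)


def wc_acc(cur_acc: float, acc_log: Dict[int, List[float]], k: int) -> float:
    """Worst-case Accuracy (WC-ACC) while learning the k-th task:
    trades off current-task accuracy against Min-ACC on past tasks."""
    return (1.0 / k) * cur_acc + (1.0 - 1.0 / k) * min_acc(acc_log)


# Toy example: two previously learned tasks, currently learning task 3.
acc_log = {1: [0.95, 0.60, 0.90], 2: [0.88, 0.70]}
print(min_acc(acc_log))          # (0.60 + 0.70) / 2 = 0.65
print(wc_acc(0.80, acc_log, 3))  # 1/3 * 0.80 + 2/3 * 0.65 = 0.70
```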
## Setup
This code uses
- Python 3.8
- [Avalanche](https://github.com/ContinualAI/avalanche) 0.1.0 (beta)
- PyTorch 1.8.1

To set up your environment, you can use the [install script](./install_script.sh), which automatically creates an Anaconda environment for you.
The script defines the default versions used for the paper.
    ./install_script.sh

You can also define your own conda environment with the [environment.yml](environment.yml) file.
    conda env create -n CLEVAL_ENV -f environment.yml python=3.8
    conda activate CLEVAL_ENV
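Once the environment is active, a quick sanity check of the pinned versions can look like the sketch below (assuming both packages expose a `__version__` attribute; otherwise `pip show torch avalanche` gives the same information):

```python
# Illustrative environment check, not part of the repo.
import sys

import torch
import avalanche

print("Python:", sys.version.split()[0])    # expected: 3.8.x
print("PyTorch:", torch.__version__)        # expected: 1.8.1
print("Avalanche:", avalanche.__version__)  # expected: 0.1.0
```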
## Reproducing results

All configs for the experiments can be found in [reproduce/configs](reproduce/configs).
The yaml config files enable a structured way to pass arguments to the python script (a sketch of this pattern is given at the end of this section). To reproduce an experiment, simply run *./reproduce/run.sh* and pass the yaml filename. For example:
    ./reproduce/run.sh splitmnist_ER.yaml

Note that we didn't run with a deterministic cuDNN backend for computational efficiency, which might result in small deviations in results.
We average all results over 5 initialization seeds; these can be run at once (with `n_seeds=5`) or with a specific seed defined per run (e.g. `seed=0`).
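The sketch below shows the generic pattern of mapping a yaml config onto script arguments. It is not the repository's actual loader in [main.py](main.py); only `seed` and `n_seeds` are keys mentioned in this README, the rest is hypothetical.

```python
# Generic yaml-to-arguments pattern (illustrative; the repo's parsing may differ).
import argparse

import yaml  # PyYAML

parser = argparse.ArgumentParser()
parser.add_argument("--config", default="reproduce/configs/splitmnist_ER.yaml")
parser.add_argument("--seed", type=int, default=0)
parser.add_argument("--n_seeds", type=int, default=1)
args = parser.parse_args()

with open(args.config) as f:
    cfg = yaml.safe_load(f)  # e.g. {"seed": 0, "n_seeds": 5, ...}

for key, value in cfg.items():
    setattr(args, key, value)  # yaml entries override the argparse defaults
```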
## Continual Evaluation Implementation

Continual evaluation is integrated into the Avalanche continual learning flow.
For Avalanche documentation, see [here](https://avalanche.continualai.org/).

- [src/eval/continual_eval.py](src/eval/continual_eval.py): Introduces the Continual Evaluation tracking flow after training iterations in Avalanche's `after_training_iteration`.
The additional phases are defined as:

        # Standard Avalanche CL flow
        ...
        - before_training_iteration
        - after_training_iteration
        - before_tracking          # BEGIN Integrated Continual Evaluation
        - before_tracking_step
        - before_tracking_batch
        - after_tracking_batch
        - after_tracking_step
        - after_tracking           # END
        - after_training_epoch
        ...
- [src/eval/continual_eval_metrics.py](src/eval/continual_eval_metrics.py): Contains all the continual evaluation metrics, which all inherit from `TrackerPluginMetric`.
This plugin class defines all the Continual Evaluation tracking phases, each of which a metric can override as appropriate.
- [main.py](main.py) passes a list of plugins to Avalanche's `EvaluationPlugin`, with `ContinualEvaluationPhasePlugin` first so that the Continual Evaluation metric states are updated on `after_training_iteration`.
Next in the list, the metric plugins are passed to `EvaluationPlugin` so that `after_training_iteration` logs the metrics on each iteration; a minimal sketch of this wiring follows.
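A minimal sketch of this wiring, assuming the repo's plugin and metric classes can be constructed as shown (their actual constructor signatures in [main.py](main.py) and [src/eval](src/eval) are the reference, and `MinAccTracker` is a hypothetical metric):

```python
# Illustrative wiring only; constructor arguments of the repo-specific classes are assumptions.
from avalanche.logging import InteractiveLogger
from avalanche.training.plugins import EvaluationPlugin

from src.eval.continual_eval import ContinualEvaluationPhasePlugin
from src.eval.continual_eval_metrics import TrackerPluginMetric


class MinAccTracker(TrackerPluginMetric):
    """Hypothetical metric hooking into the tracking phases listed above."""

    def after_tracking_step(self, strategy):
        # update the per-task minimum accuracy here (details omitted)
        pass


eval_plugin = EvaluationPlugin(
    ContinualEvaluationPhasePlugin(),  # first: updates metric states on after_training_iteration
    MinAccTracker(),                   # metric plugins follow and are logged every iteration
    loggers=[InteractiveLogger()],
)
```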
## Visualize results

We support both TensorBoard and WandB.
To view results with TensorBoard, run:

    tensorboard --logdir=OUTPUT_DIR
## Citing and license
Please consider citing us when using this repo:

    @inproceedings{
      delange2023continual,
      title={Continual evaluation for lifelong learning: Identifying the stability gap},
      author={Matthias De Lange and Gido M van de Ven and Tinne Tuytelaars},
      booktitle={The Eleventh International Conference on Learning Representations},
      year={2023},
      url={https://openreview.net/forum?id=Zy350cRstc6}
    }
Code is available under the MIT license: a short and simple permissive license with conditions only requiring preservation of copyright and license notices.
See [LICENSE](LICENSE) for the full license.