{"id":24615779,"url":"https://github.com/google-deepmind/dm_nevis","last_synced_at":"2025-05-07T02:24:59.575Z","repository":{"id":65542584,"uuid":"562884571","full_name":"google-deepmind/dm_nevis","owner":"google-deepmind","description":"NEVIS'22: Benchmarking the next generation of never-ending learners","archived":false,"fork":false,"pushed_at":"2022-12-13T14:47:35.000Z","size":399,"stargazers_count":99,"open_issues_count":0,"forks_count":6,"subscribers_count":10,"default_branch":"master","last_synced_at":"2025-01-24T11:47:01.592Z","etag":null,"topics":["benchmark","continual-learning","efficient-learning","jax","pytorch"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2211.11747","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/google-deepmind.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-11-07T13:13:01.000Z","updated_at":"2025-01-22T09:57:45.000Z","dependencies_parsed_at":"2023-01-28T13:03:07.904Z","dependency_job_id":null,"html_url":"https://github.com/google-deepmind/dm_nevis","commit_stats":null,"previous_names":["google-deepmind/dm_nevis","deepmind/dm_nevis"],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Fdm_nevis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Fdm_nevis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Fdm_nevis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Fdm_nevis/manifests","owner_url":"ht
tps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/google-deepmind","download_url":"https://codeload.github.com/google-deepmind/dm_nevis/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":235510118,"owners_count":19001653,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmark","continual-learning","efficient-learning","jax","pytorch"],"created_at":"2025-01-24T22:14:28.145Z","updated_at":"2025-01-24T22:14:28.853Z","avatar_url":"https://github.com/google-deepmind.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\n# 🏝️ NEVIS'22\n\n[![Paper](https://img.shields.io/badge/arXiv-2211.11747-red)](https://arxiv.org/abs/2211.11747)\n[![Blog](https://img.shields.io/badge/blog-link-blue)](https://www.deepmind.com/blog/benchmarking-the-next-generation-of-never-ending-learners)\n\n\u003c/div\u003e\n\nNEVIS’22 is a benchmark for measuring the performance of algorithms in the field\nof continual learning. Please see the accompanying [paper] for more details.\n\nWithin this Python package, we provide three components,\n\n1.  Library code to download and post-process datasets that are not available\n    within [tfds], so that the stream used in the [paper] can be replicated.\n2.  A package to combine the NEVIS’22 datasets into a *stream*, and robustly\n    evaluate learners using the evaluation protocol proposed in the NEVIS’22\n    [paper].\n3.  Baseline learners implemented in JAX and PyTorch. 
The JAX learners are\n    identical to the learners used for the figures in the [paper]; the PyTorch\n    learners are provided as examples.\n\nNEVIS’22 is composed of 106 tasks chronologically sorted and extracted from\npublications randomly sampled from online proceedings of major computer vision\nconferences over the past three decades. Each task is a supervised\nclassification task, which is the best understood setting in machine\nlearning. The challenge is how to automatically transfer knowledge across\nrelated tasks in order to achieve higher performance or be more efficient on\nthe next task.\n\nBy construction, NEVIS’22 is reproducible, diverse and at a scale sufficiently\nlarge to test state-of-the-art learning algorithms. The task selection process\ndoes not favor any particular approach, but merely tracks what the computer\nvision community has deemed interesting over time. NEVIS’22 is not just about\ndata; it is also about the methodology used to train and evaluate learners. We\nevaluate learners in terms of their ability to learn future tasks, as measured\nby their trade-off between error rate and compute, measured in the number of\nfloating-point operations. In NEVIS’22, achieving a lower error rate is not\nsufficient by itself if it comes at an unreasonable computational cost. Instead, we\nincentivise models that are both accurate and efficient.\n\nYou can read more about NEVIS'22 in our [paper] and our [blog post].\n\n## 0. Dependencies\n\nPlease follow these steps and read sections 1 and 2 in detail before launching\nanything.\n\n-   Our datasets use the Tensorflow(-datasets) API. Our JAX learners use\n    TensorFlow and JAX, and our PyTorch Learners use PyTorch. 
Each (datasets,\n    jax learners, and pytorch learners) has its own `requirements.txt` that\n    you are welcome to install with `pip` and a Python version above 3.8.\n\n-   It is also possible to run the code directly using the provided Dockerfiles.\n    See [here](https://docs.docker.com/get-docker/) for installing Docker.\n\n-   Some datasets are downloaded from Kaggle. See the\n    [Kaggle website](https://www.kaggle.com/docs/api) for configuring your\n    credentials, and place them in the folder `~/.kaggle`.\n\n## 1. Replicating the NEVIS'22 stream\n\nIn NEVIS'22, we train and evaluate on *streams*. Each stream is a sequence of\ndatasets. Some streams have a large number of datasets, up to 106, allowing us\nto evaluate Large-Scale Continual Learning.\n\nThere are three different sources for datasets in NEVIS'22:\n\n1.  **Datasets on Tensorflow-Datasets (TFDS)**: they will be downloaded\n    automatically when needed\n\n2.  **Custom dataset downloaders**: you need the `./build_dataset.sh` script\n\n3.  **Manual dataset download**: you need to download data yourself\n\nNote that we do not host or distribute these datasets; instead we provide URLs\nto their original source to help you download them at your own risk. We do not\nvouch for their quality or fairness, or claim that you have license to use the\ndataset. It is your responsibility to determine whether you have permission to\nuse the dataset under the dataset's license. If you're a dataset owner and wish\nto update any part of it (description, citation, etc.), or do not want your\ndataset URL to be included in this library, please get in touch through a GitHub\nissue. Thanks for your contribution to the ML community!\n\nWe do our best to keep dataset URLs up to date. If a dataset doesn't download,\nplease contact the dataset owners and open an issue to let us know. If a dataset\ndoesn't get fixed by the owners, we will remove it from our benchmark.\n\n### 1.1. 
TFDS Datasets\n\nDifferent streams are available; each is made of a sequence of datasets. When\niterating over the datasets of a stream, TFDS datasets are automatically downloaded\non the fly if they don't exist.\n\n### 1.2. Custom Dataset Downloaders\n\nMany datasets implemented in Nevis can be automatically downloaded. This has to\nbe done in advance of training with the script `./build_dataset.sh`:\n\n```bash\n$ ./build_dataset.sh -h\nUsage:\n        -d \u003cDATASET_NAME\u003e | Dataset name\n        -s \u003cSTREAM_NAME\u003e  | Stream variant name (FULL|SHORT|TINY|DEBUG|...)\n        -b                | Build docker before running\n        -e                | Develop mode where code is mounted\n        -h                | Help message\n```\n\nIf running for the first time, pass the option `-b` alongside other commands to\nbuild the **docker** (`nevis-data`). The develop mode is useful if you need to\nchange the codebase (e.g. for adding a new dataset) and need to debug quickly\nwithout having to rebuild the docker every time (you still need to build the\ndocker in develop mode: `-b -e`).\n\nSee the enum `NEVISStreamVariant` in `dm_nevis/streams/nevis_stream.py` for the\nfull list of downloadable streams.\n\nSome datasets are downloaded from Kaggle. See the\n[Kaggle website](https://www.kaggle.com/docs/api) for how to configure your\ncredentials, and place them in the folder `~/.kaggle`.\n\n### 1.3. Manual Download\n\nImageNet is a TFDS Dataset but it needs to be downloaded manually. Please check\nthe [instructions](https://www.tensorflow.org/datasets/catalog/imagenet2012).\n\nFor reference, TFDS looks for datasets in the directory defined by the\nenvironment variable `TFDS_DATA_DIR`.\n\n## 2. Experiments\n\nEach experiment consists of training a model on a stream of multiple datasets.\nThus, launching an experiment trains a model on each dataset. We provide two main\nparadigms of learners: *independent* and *finetune from previous*. 
In the\nformer, we create a new randomly initialized model for each dataset. In the\nlatter, a model is initialized for the first dataset of the stream, and tuned\nsequentially for all datasets.\n\nTo launch an experiment, run:\n\n```\n./launch_local.sh \u003cX\u003e example\n```\n\nHere `\u003cX\u003e` is the framework to use (`jax` or `torch`), and the second argument is the\nconfig to use.\n\nNote that for the torch version, if you want to run on GPU instead of CPU, you\nneed to provide the GPU id with `--device \u003cGPU_ID\u003e`. By default, the code\nuses the id `-1` to denote the CPU.\n\n## Output directory for metrics\n\nBy default the metrics computed by `experiments_\u003cX\u003e/metrics/nevis_metrics.py`\nwill be written in `./nevis_output_dir`.\n\nYou can specify a different path by overriding the environment variable\n`NEVIS_DEFAULT_OUTPUT_DIR`.\n\n## Metrics visualization with TensorBoard\n\nThe TensorBoard events file will be saved to `~/nevis/tensorboard`. Each run\nwill create a folder below this directory named with the date and time when the\nrun was launched.\n\nTensorBoard can be launched with the following command:\n\n`tensorboard --logdir=~/nevis/tensorboard`\n\nYou will need to have `tensorboard` installed outside the docker using\n\n```bash\npip install tensorboard\n```\n\nThe TensorBoard dashboard shows the following groups of plots:\n\n-   `benchmark_metrics` contains metrics from prediction events across the stream,\n    where the x-axis is the index (0-based) of the most recent training event.\n-   `train_event_\u003ci\u003e` contains training and validation metrics for the training event\n    with index `i`.\n\n## 3. 
Example\n\nLet's take an example learner (it always returns zeros) that we will \"train\" on the\nDEBUG stream, made of the Pascal VOC 2007 and Coil100 datasets.\n\nPascal VOC 2007 is a TFDS dataset so it will be automatically downloaded when\nneeded.\n\nFirst, we download the Coil100 dataset:\n\n```bash\n./build_dataset.sh -e -b -s debug\n```\n\nNote that since the DEBUG stream only downloads Coil100, we could also have used\n`-d coil100` instead of `-s debug`. As you can see in the script\n`build_dataset.sh`, we download data to `~/nevis`. You can change this directory\nby overriding the environment variable `LOCAL_DIR` in the script.\n\nThen, we launch the example learner:\n\n```bash\n./launch_local.sh jax example\n```\n\nNote that the stream `DEBUG` is already specified in the config\n`./experiments_jax/config/example.py`.\n\n## 4. Baselines\n\nWe provide several baselines, defined in the `learners/` directory with configurations\nin the `configs/` directory. Note that the same approach might have multiple configurations.\n\nAs a reminder, to run configuration `configs/X.py`, run `./launch_local.sh jax X.py`.\n\nWe provide the following baselines:\n- **Independent**, in `configs/finetuning_ind.py`, where each dataset is learned by an independent model.\n- **Previous**, in `configs/finetuning_prev.py`, where we learn each dataset sequentially and initialize its parameters from the parameter vector learned on the previous task.\n- **Dynamic**, in `configs/finetuning_dknn.py`, where the initialization of task T is chosen among the models which have been trained on a dataset most similar to the current dataset. This baseline performs hyperparameter tuning while learning the task, following the protocol described in our tech report.\n\n\nVariants are also proposed, such as cheaper configurations in `configs/cheap_finetuning_dknn.py` which use a smaller net and fewer trials of hyper-parameter search. 
These are the best entry points for people who have access to only one or a few GPUs.\n\n\nIt is also possible to run a pretrained model on the Nevis stream. First, train\nyour own pretrained model. For example, to pretrain on ImageNet, run the configuration `configs/pretrain_imagenet.py`. Collect the resulting checkpoint (the configuration file specifies where it is saved).\nThen, use this checkpoint for `configs/finetuning_ind_pretrained.py`.\n\n## 5. Code paths\n\nThe code is structured as follows:\n\n```bash\n|--- dm_nevis/\n|    |--- benchmarker/\n|    |--- datasets_storage/\n|    |--- streams/\n|--- experiments_jax/\n|    |--- launch.py\n|    |--- experiment.py\n|    |--- configs/\n|    |--- learners/\n|    |--- metrics/\n|    |--- environment/\n|    |--- training/\n|--- experiments_torch/\n|    |--- launch.py\n|    |--- experiment.py\n|    |--- configs/\n|    |--- learners/\n|    |--- metrics/\n|    |--- environment/\n|    |--- training/\n```\n\n`dm_nevis/` is the library of the benchmark, containing the `benchmarker/`\nlibrary, which implements the evaluation protocol used in the [paper].\n`datasets_storage/` is a package to support the downloading and preparation of\ndatasets, and `streams/` is a package defining different streams.\n\nThere are two directories containing baseline model implementations, one for jax\n(`experiments_jax`), and one for pytorch (`experiments_torch`). 
In each,\n`launch.py` is the Docker entrypoint, `experiment.py` is the module where all\nthe execution happens, `configs/` provides the hyperparameters for each learner,\n`learners/` implements the learners (note: in some cases, there are different\nconfigs for the same learner), `metrics/` implements the metrics used in\nNEVIS'22, `environment/` provides the logger and checkpointer, and `training/`\nprovides learner-agnostic utilities such as the heads, the backbone, but also a\nflops counter for example.\n\n# Contact\n\nIf you wish to contact us, please raise a GitHub issue.\n\nIf you are using the NEVIS'22 benchmark, please cite the following paper,\n\n```bibtex\n@article{bornschein2022nevis,\n  author={Bornschein, J\\\"org and Galashov, Alexandre and Hemsley, Ross and Rannen-Triki, Amal and Chen, Yutian and Chaudhry, Arslan and He, Xu Owen and Douillard, Arthur and Caccia, Massimo and Feng, Qixuang and Shen, Jiajun and Rebuffi, Sylvestre-Alvise and Stacpoole, Kitty and de las Casas, Diego and Hawkins, Will and Lazaridou, Angeliki and Teh, Yee Whye and Rusu, Andrei A. and Pascanu, Razvan and Ranzato, Marc'Aurelio},\n  title={Nevis\\'22: A Stream of 100 Tasks Sampled from 30 Years of Computer Vision Research},\n  journal={CoRR},\n  volume={abs/2211.11747},\n  year={2022},\n  url={https://arxiv.org/abs/2211.11747},\n  eprinttype={arXiv}\n}\n```\n\n[paper]: https://arxiv.org/abs/2211.11747\n[blog post]: https://www.deepmind.com/blog/benchmarking-the-next-generation-of-never-ending-learners\n[tfds]: https://www.tensorflow.org/datasets/api_docs/python/tfds\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle-deepmind%2Fdm_nevis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgoogle-deepmind%2Fdm_nevis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle-deepmind%2Fdm_nevis/lists"}