https://github.com/zaccharieramzi/shine

Code for the DEQ experiments of the ICLR 2022 spotlight "SHINE: SHaring the INverse Estimate from the forward pass for bi-level optimization and implicit models"
bi-level-optimization deep-learning implicit-models quasi-newton-method


# MDEQ - SHINE

This is the second part of the code for the paper "SHINE: SHaring the INverse Estimate from the forward pass for bi-level optimization and implicit models", submitted to ICLR 2022.
The first part of the code, which reproduces the bi-level optimization experiments, is available [here](https://github.com/zaccharieramzi/hoag/tree/shine).
This source code reproduces the experiments on multiscale DEQs, i.e. Figure 3 and Figure E.2 in the Appendix.
This repo is based on the original [mdeq repo](https://github.com/locuslab/mdeq) by @jerrybai1995.

## General instructions

You need Python 3.7 or above to run this code.
This code will only run on a computer equipped with a GPU.
You can then install the requirements with: `pip install -r requirements.txt`.
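
Before installing, the two requirements above can be checked with a short script. This is a minimal sketch, not part of the repository: `check_environment` is a hypothetical helper, and it assumes an NVIDIA GPU whose driver puts the `nvidia-smi` tool on the `PATH`, which is only a heuristic for GPU availability.

```python
import shutil
import sys


def check_environment(min_python=(3, 7)):
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []
    # The README asks for Python 3.7 or above.
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        problems.append("Python %s or above is required" % wanted)
    # Assumption: an NVIDIA driver installation exposes `nvidia-smi` on the PATH.
    if shutil.which("nvidia-smi") is None:
        problems.append("nvidia-smi not found on PATH; no NVIDIA GPU driver detected")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

An empty result does not guarantee that the deep learning framework itself can see the GPU; it only rules out the most common setup problems.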

## Reproducing Figure 3, DEQ

You can reproduce Figure 3 of the paper with the following sequence of scripts:
```
# cifar
python paper_trainings.py
python paper_backward_times.py
# imagenet
python paper_trainings.py --dataset imagenet --n_runs 1 --refines 0,5,None
python paper_backward_times.py --dataset imagenet
python paper_plot.py
```

You can further indicate how many GPUs to use in each script with the `--n_gpus` option (default for training is 4).
Each script lists its other options with `--help`.
Beware:
- each CIFAR training takes 11 to 15 hours (100 of them by default)
- each ImageNet training takes 3 to 7 days (6 of them by default)
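
To get a feel for the total budget, the figures in the list above can be multiplied out. This is a back-of-the-envelope sketch that assumes the trainings run one after another on a single machine with the default counts; `total_hours` is a helper introduced here for illustration.

```python
def total_hours(n_trainings, low, high, unit_hours=1):
    """Return the (low, high) total duration in hours for n_trainings runs."""
    return n_trainings * low * unit_hours, n_trainings * high * unit_hours


# Durations and counts are taken from the list above.
cifar = total_hours(100, 11, 15)                 # hours per training
imagenet = total_hours(6, 3, 7, unit_hours=24)   # days per training, converted to hours

low = cifar[0] + imagenet[0]
high = cifar[1] + imagenet[1]
print(f"CIFAR:    {cifar[0]} to {cifar[1]} hours")
print(f"ImageNet: {imagenet[0]} to {imagenet[1]} hours")
print(f"Total:    {low} to {high} hours ({low / 24:.1f} to {high / 24:.1f} days)")
```

Under these assumptions the full default reproduction adds up to between 1532 and 2508 hours of sequential training, which is why running the trainings in parallel matters.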

For a practical reproduction, you might want to run those on an HPC cluster (i.e. change lines 56-60 to work with e.g. [submitit](https://github.com/facebookincubator/submitit)).
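
As a rough illustration of what such a change could look like, the sketch below submits one cluster job per training run with submitit. It is not the repository's code: `train_one` is a hypothetical placeholder for a single training, and the log folder, timeout, and GPU count are assumed values to adapt to your cluster.

```python
def make_executor(log_dir="submitit_logs", gpus_per_node=1, timeout_min=16 * 60):
    """Build a submitit executor; submitit is imported lazily so that the
    rest of this sketch can be read and tested without it installed."""
    import submitit

    executor = submitit.AutoExecutor(folder=log_dir)
    # Assumed resources: 16 hours covers the 11 to 15 hours of one CIFAR training.
    executor.update_parameters(timeout_min=timeout_min, gpus_per_node=gpus_per_node)
    return executor


def submit_runs(executor, train_fn, run_ids):
    """Submit one job per run id and return the job handles in order."""
    return [executor.submit(train_fn, run_id) for run_id in run_ids]


def train_one(run_id):
    """Hypothetical placeholder: replace with a call that performs one training."""
    return "run %d finished" % run_id
```

On a cluster, `jobs = submit_runs(make_executor(), train_one, range(100))` would queue the runs, and `[job.result() for job in jobs]` would block until they finish. ImageNet trainings would need a much longer `timeout_min`, given the 3 to 7 days quoted above.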
For a quick test, you can lower the `--n_runs` (the number of repetitions for the error bars) and `--n_refines` (the number of points on the Pareto curve) options.

You can also run only the CIFAR trainings by skipping the ImageNet ones.
The figure will still be generated.
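
The Figure 3 sequence, with or without the ImageNet steps, can also be driven from a small Python script. This is a sketch under assumptions: `figure3_commands` and `run_all` are hypothetical helpers, the script names and flags are the ones quoted in this README, and passing `--n_gpus` only to the training and timing scripts (not to `paper_plot.py`) is a guess.

```python
import shlex
import subprocess
import sys


def figure3_commands(include_imagenet=True, n_gpus=None):
    """Return the Figure 3 commands, in order, as argument lists."""
    py = sys.executable
    gpu_args = ["--n_gpus", str(n_gpus)] if n_gpus is not None else []
    commands = [
        [py, "paper_trainings.py"] + gpu_args,
        [py, "paper_backward_times.py"] + gpu_args,
    ]
    if include_imagenet:
        commands.append(
            [py, "paper_trainings.py", "--dataset", "imagenet",
             "--n_runs", "1", "--refines", "0,5,None"] + gpu_args
        )
        commands.append(
            [py, "paper_backward_times.py", "--dataset", "imagenet"] + gpu_args
        )
    # The plotting script always runs last.
    commands.append([py, "paper_plot.py"])
    return commands


def run_all(commands, dry_run=True):
    """Print each command and, unless dry_run is set, run it, stopping on failure."""
    for cmd in commands:
        print(" ".join(shlex.quote(arg) for arg in cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Calling `run_all(figure3_commands(include_imagenet=False))` prints the CIFAR-only sequence without launching anything; pass `dry_run=False` to actually run it.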

## Reproducing Figure E.2, Quality of the inversion using OPA in DEQs

You can reproduce Figure E.2 of the paper with the following script:

```
python mdeq_lib/tests/modules/adj_broyden_correl.py
```

This should take about 15 minutes to run on a single GPU.