# DMControl Generalization Benchmark
This code base is a fork-in-progress of [Hansen & Wang 2020](https://arxiv.org/abs/2011.13389)'s DMControl Generalization Benchmark.

The [./custom_vendor](./custom_vendor) folder contains the benchmark for generalization in continuous control from pixels, based on [DMControl](https://github.com/deepmind/dm_control). The installation guide for these custom libraries can be found [there](./custom_vendor).

## Algorithms

This repository contains implementations of the following papers in a unified framework:

- [SODA (Hansen and Wang, 2020)](https://arxiv.org/abs/2011.13389)
- [PAD (Hansen et al., 2020)](https://arxiv.org/abs/2007.04309)
- [RAD (Laskin et al., 2020)](https://arxiv.org/abs/2004.14990)
- [CURL (Srinivas et al., 2020)](https://arxiv.org/abs/2004.04136)
- [SAC (Haarnoja et al., 2018)](https://arxiv.org/abs/1812.05905)

using standardized architectures and hyperparameters wherever applicable.

## Setup

We assume that you have access to a GPU with CUDA >=9.2 support. All dependencies can then be installed with the following commands:

```bash
conda env create -f conda.yml
conda activate dmcgen
```
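
As a quick sanity check, you can confirm from inside the environment that the GPU and CUDA toolkit are visible. The snippet below assumes the conda environment installs PyTorch (the SAC implementation referenced in the acknowledgements is PyTorch-based):

```bash
# Optional sanity check: verify the GPU driver is reachable and that PyTorch
# (assumed to be installed via conda.yml) can see a CUDA device.
nvidia-smi
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```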

### Datasets Needed for Envs

Part of this repository relies on external datasets. SODA uses the [Places](http://places2.csail.mit.edu/download.html) dataset for data augmentation, which can be downloaded by running

```bash
wget http://data.csail.mit.edu/places/places365/places365standard_easyformat.tar
```

You should familiarize yourself with [their terms](http://places2.csail.mit.edu/download.html) before downloading. After downloading and extracting the data, add your dataset directory to the `data_dirs` list in `src/augmentations.py`.

> If wget is being too slow, you can try to use `axel`, which parallelizes the download.
>
> ```bash
> sudo apt-get install axel
> axel http://data.csail.mit.edu/places/places365/places365standard_easyformat.tar
> ```
>
> If you do not have sudo rights, you can install axel from source (or use a precompiled binary) [here](https://github.com/axel-download-accelerator/axel).
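
Once the archive has finished downloading, extraction might look like the sketch below; the target directory is only an example, and whichever path you choose is what belongs in the `data_dirs` list mentioned above.

```bash
# Example only: extract Places365 to a directory of your choice, then add
# that directory to the `data_dirs` list in src/augmentations.py.
mkdir -p ~/datasets
tar -xf places365standard_easyformat.tar -C ~/datasets
```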

The `video_easy` environment was proposed in [PAD](https://github.com/nicklashansen/policy-adaptation-during-deployment), and the `video_hard` environment uses a subset of the [RealEstate10K](https://google.github.io/realestate10k/) dataset for background rendering. All test environments (including video files) are included in this repository, namely in the `src/env/` directory.

From the repository root, run:

```bash
cd custom_vendor && make
```

Remember to set the `DMCGEN_DATA` environment variable to point to this data folder, e.g. `export DMCGEN_DATA=$PWD/custom_vendor/data` when run from the repository root.
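
If you want this variable available in every shell, one option (assuming bash, and that you run the command from the repository root) is to append it to your shell profile:

```bash
# Sketch: persist DMCGEN_DATA across sessions. Run from the repository root,
# or adjust the path to point at your checkout.
echo "export DMCGEN_DATA=$PWD/custom_vendor/data" >> ~/.bashrc
source ~/.bashrc
```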

## Training & Evaluation

The `scripts` directory contains training and evaluation bash scripts for all the included algorithms. Alternatively, you can call the Python scripts directly, e.g. for training, run

```bash
python3 src/train.py \
--algorithm soda \
--aux_lr 3e-4 \
--seed 0
```

to run SODA on the default task, `walker_walk`. This should give you an output of the form:

```
Working directory: logs/walker_walk/soda/0
Evaluating: logs/walker_walk/soda/0
| eval | S: 0 | ER: 26.2285 | ERTEST: 25.3730
| train | E: 1 | S: 250 | D: 70.1 s | R: 0.0000 | ALOSS: 0.0000 | CLOSS: 0.0000 | AUXLOSS: 0.0000
```
where `ER` and `ERTEST` correspond to the average return in the training and test environments, respectively. You can select the test environment used in evaluation with the `--eval_mode` argument, which accepts one of `(train, color_easy, color_hard, video_easy, video_hard)`.
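
For example, training SODA while evaluating generalization on the `video_easy` environment uses the same flags as above with `--eval_mode` added:

```bash
python3 src/train.py \
--algorithm soda \
--aux_lr 3e-4 \
--eval_mode video_easy \
--seed 0
```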

## Results

SODA demonstrates significantly improved generalization over previous methods, exhibits stable training, and has sample efficiency comparable to the baseline SAC. The average return of SODA and the baselines in the `train` and `color_hard` environments is shown below.

![soda curves results](figures/results_curves.png)

We also provide a full comparison of the SODA, PAD, RAD, and CURL methods on all four test environments. Results for `video_easy` and `color_hard` are shown below:

![soda table results](figures/results_table.png)

See [our paper](https://arxiv.org/abs/2011.13389) for more results.

## Acknowledgements

We want to thank the numerous researchers and engineers whose work this implementation builds on. This benchmark is a product of our work on [SODA](https://arxiv.org/abs/2011.13389) and [PAD](https://arxiv.org/abs/2007.04309). Our SAC implementation is based on [this repository](https://github.com/denisyarats/pytorch_sac_ae), the original DMControl is available [here](https://github.com/deepmind/dm_control), and the gym wrapper for it is available [here](https://github.com/denisyarats/dmc2gym). The PAD, RAD, and CURL baselines are based on their official implementations, available [here](https://github.com/nicklashansen/policy-adaptation-during-deployment), [here](https://github.com/MishaLaskin/rad), and [here](https://github.com/MishaLaskin/curl), respectively.