# Reactive Exploration to Cope with Non-Stationarity in Lifelong Reinforcement Learning
Christian Steinparz**2**, Thomas Schmied**1**, Fabian Paischer**1**, Marius-Constantin Dinu**1,3**, Vihang Patil**1**, Angela Bitto-Nemling**1,4**, Hamid Eghbal-Zadeh**1**, Sepp Hochreiter**1,4**

**1**ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria\
**2**Visual Data Science Lab, Institute of Computer Graphics, Johannes Kepler University Linz, Austria\
**3**Dynatrace Research, Linz, Austria\
**4**Institute of Advanced Research in Artificial Intelligence (IARAI), Vienna, Austria

This repository contains the source code for our paper **"Reactive Exploration to Cope with Non-Stationarity in Lifelong Reinforcement Learning"** accepted at [CoLLAs 2022](https://lifelong-ml.cc/).

![Reactive exploration](./img/reactexp1.png)

## Installation
[![Python 3.9](https://img.shields.io/badge/Python-3.9-blue.svg)](https://www.python.org/downloads/release/python-390/)
[![Pytorch](https://img.shields.io/badge/PyTorch-1.11-red.svg)](https://pytorch.org/get-started/previous-versions/)
![Licence](https://img.shields.io/github/license/ml-jku/reactive-exploration)

First, clone the repository and create the conda environment from the repository root (using either the Linux or the Windows config file):
```
conda env create -f environment_linux.yaml
conda activate reactive_exploration
```
Then follow the Jelly-Bean-World [installation instructions](https://github.com/eaplatanios/jelly-bean-world#installation-instructions). We use the following version:
```
git clone https://github.com/eaplatanios/jelly-bean-world.git
cd jelly-bean-world
git checkout 9bb16780e72d9d871384f9bcefd3b4e029a7b0ef
git submodule update --init --recursive
cd api/python
python setup.py install
```

From the project root directory, initialize and update the submodule for ICM+PPO:
```
git submodule update --init --recursive
```
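
As an optional sanity check, the following snippet verifies that the core dependencies import cleanly. It is a minimal sketch and assumes the Jelly-Bean-World API installs under the package name `jbw` and that PyTorch is provided by the conda environment:
```python
# Optional sanity check (not part of the repository).
# Assumes the Jelly-Bean-World API installs as `jbw` and PyTorch as `torch`.
import jbw
import torch

print("jbw imported from:", jbw.__file__)
print("PyTorch version:", torch.__version__)
```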

## Running experiments
This codebase relies on [Hydra](https://github.com/facebookresearch/hydra), which configures experiments via `.yaml` files.
Hydra automatically creates the log folder structure for a given run, as specified in the respective `config.yaml` file.

### Running single experiments
By default, Hydra uses the configuration in `configs/config.yaml`. This file defines how Hydra generates the directory structure for executed experiments (under the `hydra` block) and contains the default parameters, referencing the respective default parameter files under the `defaults` block.
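
For orientation, a Hydra entry point typically looks like the sketch below; the decorator arguments mirror the `configs/config.yaml` layout described above, though the repository's actual `main.py` may differ in detail.
```python
# Illustrative sketch of a Hydra entry point; not the repository's actual main.py.
import hydra
from omegaconf import DictConfig, OmegaConf

@hydra.main(config_path="configs", config_name="config")
def main(cfg: DictConfig) -> None:
    # cfg holds the composed configuration: the defaults from config.yaml
    # (e.g. agent_params and env_params) with command-line overrides applied.
    print(OmegaConf.to_yaml(cfg))

if __name__ == "__main__":
    main()
```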

By default, `main.py` trains a PPO+ICM agent on the Colour-Swap task from the paper, referencing the environment configuration `configs/env_params/colour_swap.yaml` and the agent configuration `configs/agent_params/icm.yaml`.
To execute this configuration, run:
```
python main.py
```

For other agent and environment configurations, see the files in `configs/`. For instance, to execute PPO+ICM on the Rotation task, run:
```
python main.py env_params=rotator
```

Experiment logs are synced to [wandb](https://github.com/wandb/client). To execute experiments without wandb logging, run:
```
python main.py use_wandb=False
```
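
Internally, such a flag is commonly gated as in the following hypothetical sketch; the repository's actual logging setup may differ:
```python
# Hypothetical sketch of gating logging on a use_wandb flag;
# the repository's actual logging code may look different.
import wandb
from omegaconf import DictConfig, OmegaConf

def setup_logging(cfg: DictConfig) -> None:
    if cfg.use_wandb:
        # Log the fully resolved config alongside the run.
        wandb.init(project="reactive-exploration",
                   config=OmegaConf.to_container(cfg, resolve=True))
```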

### Running multiple experiments
All hyperparameters specified in the `.yaml` configuration files can be manipulated from the commandline.
For example, to execute ICM and RND on the Colour-Swap task and the Rotation task (`configs/env_params/rotator.yaml`) with 5 seeds each, launch a Hydra multirun (`-m`), which sweeps over the Cartesian product of the comma-separated values (here 2 × 2 × 5 = 20 runs):
```
python main.py -m agent_params=icm,rnd env_params=colour_swap,rotator seed=1,2,3,4,5
```

## Citation

This paper has been accepted at the Conference on Lifelong Learning Agents (CoLLAs) 2022. Until the conference proceedings are available, we recommend the following citation:

```bib
@misc{steinparz2022reactiveexp,
  title={Reactive Exploration to Cope with Non-Stationarity in Lifelong Reinforcement Learning},
  author={Steinparz, Christian and Schmied, Thomas and Paischer, Fabian and Dinu, Marius-Constantin and Patil, Vihang and Bitto-Nemling, Angela and Eghbal-Zadeh, Hamid and Hochreiter, Sepp},
  journal={arXiv preprint, accepted to Conference on Lifelong Learning Agents 2022},
  year={2022},
  eprint={X}
}
```