https://github.com/rmst/rtrl

PyTorch implementation of our paper Real-Time Reinforcement Learning (NeurIPS 2019)


# Real-Time Reinforcement Learning

This repo accompanies our paper "Real-Time Reinforcement Learning" (https://arxiv.org/abs/1911.04448).





*Figures: Traditional Reinforcement Learning and Real-Time Reinforcement Learning.*

### Getting Started
This repo can be pip-installed via
```bash
pip install git+https://github.com/rmst/rtrl.git
```

To train an RTAC agent on the basic `Pendulum-v0` task run
```bash
python -m rtrl run rtrl:RtacTraining Env.id=Pendulum-v0
```

### MuJoCo Experiments
To install MuJoCo, follow the instructions at [openai/gym](https://github.com/openai/gym) or have a look at [our Dockerfile](https://github.com/rmst/rtrl/blob/master/docker/gym/Dockerfile). The following environments were used in the paper.

![MuJoCo](resources/mujoco_horizontal.png)

To train an RTAC agent on `HalfCheetah-v2` run
```bash
python -m rtrl run rtrl:RtacTraining Env.id=HalfCheetah-v2
```

To train a SAC agent on `Ant-v2` with a real-time wrapper (i.e. RTMDP in the paper) run
```bash
python -m rtrl run rtrl:SacTraining Env.id=Ant-v2 Env.real_time=True
```
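
The idea behind a real-time wrapper can be sketched in a few lines of plain Python. This is only an illustration of the RTMDP construction as we read it from the paper, not the code used in this repo: the environment advances with the action chosen one step earlier, and the observation is extended with the action that is still pending. `CounterEnv`, `RealTimeWrapper`, and `default_action` are hypothetical names introduced for this example.

```python
class CounterEnv:
    """Hypothetical toy environment: the state is the running sum of applied actions."""

    def reset(self):
        self.total = 0
        return self.total

    def step(self, action):
        self.total += action
        reward = float(action)
        done = False
        return self.total, reward, done, {}


class RealTimeWrapper:
    """Sketch of a real-time wrapper (assumed design, not the rtrl implementation)."""

    def __init__(self, env, default_action):
        self.env = env
        self.default_action = default_action  # applied during the very first step

    def reset(self):
        self.pending = self.default_action
        # The observation carries the action that will be applied next.
        return (self.env.reset(), self.pending)

    def step(self, action):
        # The environment advances with the action chosen one step earlier.
        obs, reward, done, info = self.env.step(self.pending)
        # The new action only takes effect on the following step.
        self.pending = action
        return (obs, self.pending), reward, done, info
```

With `env = RealTimeWrapper(CounterEnv(), default_action=0)`, the first call `env.step(5)` still applies the default action `0`, and the `5` only reaches the underlying environment on the second call.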

### Avenue Experiments
Avenue [(Ibrahim et al., 2019)](https://github.com/elementai/avenue) can be pip-installed via
```bash
pip install git+https://github.com/elementai/avenue.git
```

To train an RTAC agent to drive on a race track (right video) run
```bash
python -m rtrl run rtrl:RtacAvenueTraining Env.id=RaceSolo-v0
```
Note that this is resource-intensive, especially in terms of memory (16 GB or more).

### Storing Stats
`python -m rtrl run` only prints stats to stdout. To also save them to disk, use `run-fs` with an output directory instead.
```bash
python -m rtrl run-fs experiment-1 rtrl:RtacTraining Env.id=Pendulum-v0
```
Stats are generated and printed every `round` but only saved to disk every `epoch`. They are saved as pickled pandas DataFrames in `experiment-1/stats`.
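
To inspect the saved stats outside of a run, something like the following sketch may help. It uses only the standard `pickle` module and makes no assumption about whether `experiment-1/stats` is a single file or a directory of files, since the layout is not spelled out here. `load_pickles` is a hypothetical helper, not part of `rtrl`. Unpickling DataFrames requires pandas to be installed, and pickle files should only be loaded from sources you trust.

```python
import pickle
from pathlib import Path


def load_pickles(path):
    """Unpickle `path` if it is a file, or every file directly inside it if it is a directory.

    Hypothetical helper for illustration; returns a dict mapping file name to loaded object.
    """
    path = Path(path)
    if path.is_file():
        files = [path]
    else:
        files = sorted(p for p in path.iterdir() if p.is_file())

    loaded = {}
    for file in files:
        with open(file, "rb") as f:
            loaded[file.name] = pickle.load(f)
    return loaded
```

For example, `load_pickles('experiment-1/stats')` would return a dict whose values are the unpickled objects, which can then be explored with the usual pandas tools such as `DataFrame.tail()`.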

### Checkpointing
This repo supports checkpointing. Every `epoch` the whole run object (e.g. instances of `rtrl.training:Training`) is pickled to disk and reloaded. This is to ensure reproducibility.

You can manually load and inspect pickled run instances with the standard `pickle:load` or the more convenient `rtrl:load`. For example, to look at the first transition in a SAC agent's replay memory run
```python
import rtrl
run = rtrl.load('experiment-1/state')
print(run.agent.memory[0])
```
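
The pickle-and-reload cycle described above can be sketched generically as follows. This is a minimal illustration under our own assumptions, not the checkpointing code in this repo: `checkpoint` is a hypothetical function, and writing to a temporary file before renaming is a design choice made for this example so that an interrupted write cannot corrupt an existing checkpoint.

```python
import os
import pickle
import tempfile


def checkpoint(run, path):
    """Pickle `run` to `path`, then return the copy reloaded from disk.

    Hypothetical sketch: `run` can be any picklable object.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory first.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(run, f)
    # Renaming within one directory replaces the old checkpoint in a single step.
    os.replace(tmp_path, path)
    # Continue from the reloaded copy rather than the in-memory original.
    with open(path, "rb") as f:
        return pickle.load(f)
```

Continuing from the returned copy rather than the original object means that whatever state survives pickling is exactly the state a later resume would start from.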