https://github.com/toshikwa/wappo.pytorch

PyTorch implementation of Wasserstein Adversarial Proximal Policy Optimization (WAPPO).

Host: GitHub
URL: https://github.com/toshikwa/wappo.pytorch
Owner: toshikwa
License: mit
Created: 2020-06-10T04:50:43.000Z (about 5 years ago)
Default Branch: master
Last Pushed: 2022-06-22T02:21:12.000Z (about 3 years ago)
Last Synced: 2025-04-22T22:18:34.063Z (3 months ago)
Language: Python
Homepage:
Size: 109 KB
Stars: 6
Watchers: 4
Forks: 0
Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE

# WAPPO in PyTorch
This is a PyTorch implementation of Wasserstein Adversarial Proximal Policy Optimization (WAPPO)[[1]](#references). I tried to make it easy for readers to understand the algorithm. Please let me know if you have any questions.

## Setup
If you are using Anaconda, first create a virtual environment.

```bash
conda create -n wappo python=3.8 -y
conda activate wappo
```

You can install the Python libraries using pip.

```bash
pip install --upgrade pip
pip install -r requirements.txt
```

If you are using a CUDA version other than 10.2, you need to install the PyTorch build that matches your CUDA version. See the [instructions](https://pytorch.org/get-started/locally/) for more details.
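To see which CUDA version your installed PyTorch build targets, a quick check along these lines may help. This is a minimal sketch and not part of the repository; `describe_torch_cuda` is a hypothetical helper name, and it relies only on `torch.version.cuda` and `torch.cuda.is_available()`.

```python
def describe_torch_cuda():
    """Return a short summary of the installed PyTorch build and its CUDA support."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"
    # torch.version.cuda is None for CPU-only builds.
    built_for = torch.version.cuda or "CPU only"
    return (
        f"PyTorch {torch.__version__}, built for CUDA: {built_for}, "
        f"GPU available: {torch.cuda.is_available()}"
    )


print(describe_torch_cuda())
```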

## Example

### VisualCartpole

I trained WAPPO and PPO on `cartpole-visual-v1` as shown below. Following the WAPPO paper, results are averaged over 5 trials. The graph below corresponds to Figure 2 in the paper. The source and target tasks in my experiment are also shown below.

Note that I changed some hyperparameters from the paper. I set `rollout_length` to 128 instead of 256, and `num_initial_blocks` to 2 instead of 1. Please refer to `config/cartpole.yaml` for details.

```bash
python train.py --cuda --wappo --env_id cartpole-visual-v1 --config config/cartpole.yaml --trial 0
```
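Since results are averaged over 5 trials, one way to generate the command for every trial is sketched below. It is an assumption, not something stated in this README, that the trials are indexed 0 to 4 via `--trial` and that omitting `--wappo` trains the PPO baseline. `build_train_command` is a hypothetical helper and not part of the repository.

```python
import shlex


def build_train_command(trial, wappo=True, cuda=True,
                        env_id="cartpole-visual-v1",
                        config="config/cartpole.yaml"):
    """Build the argument list for one training run."""
    cmd = ["python", "train.py"]
    if cuda:
        cmd.append("--cuda")
    if wappo:
        # Assumption: leaving out --wappo trains the PPO baseline.
        cmd.append("--wappo")
    cmd += ["--env_id", env_id, "--config", config, "--trial", str(trial)]
    return cmd


# Assumption: the 5 trials use indices 0..4.
for trial in range(5):
    print(shlex.join(build_train_command(trial)))
```

Each printed line can be run in a shell, or the argument list can be passed to `subprocess.run` directly.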

[Images: results graph and source/target tasks]
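Averaging results over trials, as described in this section, can be sketched as follows. This is an illustration only: `average_over_trials` is a hypothetical helper, and it assumes every trial logs its returns at the same evaluation steps, which the repository may handle differently.

```python
from statistics import mean, pstdev


def average_over_trials(returns_per_trial):
    """Return a (mean, std) pair per step, computed across trials.

    returns_per_trial: list of equal-length lists, one list of returns per trial.
    """
    lengths = {len(r) for r in returns_per_trial}
    if len(lengths) != 1:
        raise ValueError("all trials must log the same number of steps")
    # zip(*...) groups the values recorded at the same step across trials.
    return [(mean(step), pstdev(step)) for step in zip(*returns_per_trial)]
```

The population standard deviation (`pstdev`) is used here as one reasonable choice for a shaded band around the mean curve; a standard error or a confidence interval would work equally well.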

## References
[[1]](https://arxiv.org/abs/2006.03465) Roy, Josh, and George Konidaris. "Visual Transfer for Reinforcement Learning via Wasserstein Domain Confusion."
