Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/toshikwa/wappo.pytorch
PyTorch implementation of Wasserstein Adversarial Proximal Policy Optimization(WAPPO).
https://github.com/toshikwa/wappo.pytorch
Last synced: about 2 months ago
JSON representation
PyTorch implementation of Wasserstein Adversarial Proximal Policy Optimization(WAPPO).
- Host: GitHub
- URL: https://github.com/toshikwa/wappo.pytorch
- Owner: toshikwa
- License: mit
- Created: 2020-06-10T04:50:43.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2022-06-22T02:21:12.000Z (over 2 years ago)
- Last Synced: 2024-04-16T00:21:36.919Z (5 months ago)
- Language: Python
- Homepage:
- Size: 109 KB
- Stars: 6
- Watchers: 4
- Forks: 0
- Open Issues: 2
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# WAPPO in PyTorch
This is a PyTorch implementation of Wasserstein Adversarial Proximal Policy Optimization (WAPPO)[[1]](#references). I tried to make it easy for readers to understand the algorithm. Please let me know if you have any questions.## Setup
If you are using Anaconda, first create the virtual environment.```bash
conda create -n wappo python=3.8 -y
conda activate wappo
```You can install Python liblaries using pip.
```bash
pip install --upgrade pip
pip install -r requirements.txt
```If you're using other than CUDA 10.2, you need to install PyTorch for the proper version of CUDA. See [instructions](https://pytorch.org/get-started/locally/) for more details.
## Example
### VisualCartpole
I trained WAPPO and PPO on `cartpole-visual-v1` as below. Following the WAPPO paper, results are averaged over 5 trials. a graph below corresponds to Figure 2 in the paper. Source and target tasks in my experiment are also shown below.
Note that I changed some hyperparameters from the paper. I set 128 for `rollout_length` instead of 256, and 2 for `num_initial_blocks` instead of 1. Please refer to `config/cartpole.yaml` for details.
```bash
python train.py --cuda --wappo --env_id cartpole-visual-v1 --config config/cartpole.yaml --trial 0
```
## References
[[1]](https://arxiv.org/abs/2006.03465) Roy, Josh, and George Konidaris. "Visual Transfer for Reinforcement Learning via Wasserstein Domain Confusion."