
# Parallel Q-Learning (PQL)
This repository provides an implementation of the paper "Parallel *Q*-Learning: Scaling Off-policy Reinforcement Learning under Massively Parallel Simulation".

- [Installation](#installation)
  - [Install :zap: PQL](#install_pql)
  - [Install Isaac Gym](#install_isaac)
- [System Requirements](#requirements)
- [Usage](#usage)
  - [Train with :zap: PQL](#usage_pql)
  - [Baselines](#usage_baselines)
  - [Logging](#usage_logging)
  - [Saving and Loading](#usage_saving_loading)
- [Citation](#citation)
- [Acknowledgement](#acknowledgement)

## Installation

### Install :zap: PQL

1. Clone the repository:

```bash
git clone [email protected]:geyang/pql.git
cd pql
```

2. Create the Conda environment and install the dependencies:

```bash
./create_conda_env_pql.sh
pip install -e .
```
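The setup script appears to create a named Conda environment (the Isaac Gym section below references an environment called `pql`), so make sure it is activated before the editable install. A quick sanity check, assuming the environment and module are both named `pql` after the repository:

```bash
# Activate the environment created by the setup script
# (the name "pql" is assumed, matching the path used in the Isaac Gym section)
conda activate pql

# Verify that the editable install is importable
# (the module name "pql" is an assumption based on the repository name)
python -c "import pql"
```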

### Install Isaac Gym

> **Note**
> The original paper uses Isaac Gym Preview 3 and the task configs from commit ca7a4fb762f9581e39cc2aab644f18a83d6ab0ba of IsaacGymEnvs.

1. Download and install Isaac Gym Preview 4 from https://developer.nvidia.com/isaac-gym

2. Unzip the file:
```bash
tar -xf IsaacGym_Preview_4_Package.tar.gz
```

3. Install Isaac Gym:
```bash
cd isaacgym/python
pip install -e . --no-deps
```

4. Install IsaacGymEnvs:

```bash
git clone https://github.com/NVIDIA-Omniverse/IsaacGymEnvs.git
cd IsaacGymEnvs
pip install -e . --no-deps
```

5. Export `LD_LIBRARY_PATH` — Isaac Gym's compiled Python bindings link against the Conda environment's `libpython`, so the environment's `lib/` directory must be on the loader path:

```bash
export LD_LIBRARY_PATH=$(conda info --base)/envs/pql/lib/:$LD_LIBRARY_PATH
```

## System Requirements
> **Warning**
> Note that wall-clock efficiency highly depends on the GPU type and will decrease with smaller/fewer GPUs (check Section 4.4 in the paper).

Isaac Gym requires an NVIDIA GPU. To train with the default configuration, we recommend a GPU with at least 10GB of VRAM. For smaller GPUs, you can decrease the number of parallel environments (`cfg.num_envs`), the batch size (`cfg.algo.batch_size`), the replay buffer capacity (`cfg.algo.memory_size`), and so on. :zap: PQL can run on one, two, or three GPUs (set the GPU IDs via `cfg.p_learner_gpu` and `cfg.v_learner_gpu`; the Isaac Gym environments run on `GPU:0` by default).
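Assuming the `cfg.*` keys above map to Hydra-style command-line overrides with the `cfg.` prefix dropped (as in the training commands below), a run tuned down for a smaller GPU might look like this — the values are illustrative, not recommendations from the paper:

```bash
# Shrink the memory footprint for a smaller GPU (illustrative values)
python train_pql.py task=ShadowHand \
    num_envs=2048 \
    algo.batch_size=4096 \
    algo.memory_size=1000000

# Spread PQL across three GPUs: the Isaac Gym environments stay on GPU:0
# by default, the policy learner moves to GPU:1, the value learner to GPU:2
python train_pql.py task=ShadowHand p_learner_gpu=1 v_learner_gpu=2
```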

## Usage

### Train with :zap: PQL

Run :zap: PQL on Shadow Hand task. A full list of tasks in Isaac Gym is available [here](https://github.com/NVIDIA-Omniverse/IsaacGymEnvs/blob/main/docs/rl_examples.md).

```bash
python train_pql.py task=ShadowHand
```

Run :zap: PQL-D (with distributional RL)

```bash
python train_pql.py task=ShadowHand algo.distl=True algo.cri_class=DistributionalDoubleQ
```

### Baselines

Run DDPG baseline

```bash
python train_baselines.py algo=ddpg_algo task=ShadowHand
```

Run SAC baseline

```bash
python train_baselines.py algo=sac_algo task=ShadowHand
```

Run PPO baseline

```bash
python train_baselines.py algo=ppo_algo task=ShadowHand isaac_param=True
```

### Logging

We use ML-Logger for experiment logging.

1. Create your profile at https://dash.ml/profile
2. Set up your account in the terminal:

```bash
export ML_LOGGER_USER=
export ML_LOGGER_ROOT=
```
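For example, with a hypothetical dash.ml username and logging root (substitute your own values):

```bash
# Hypothetical values -- replace with your own dash.ml username and logging root
export ML_LOGGER_USER=your-username
export ML_LOGGER_ROOT=https://api.dash.ml
```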

### Saving and Loading

Checkpoints are automatically saved as pickle files.

To load and visualize a trained policy, run:

```bash
python visualize.py task=ShadowHand headless=False num_envs=10 artifact=$team-name$/$project-name$/$run-id$/$version$
```