
# Parallel Q-Learning (PQL)
This repository provides an implementation of the paper "Parallel *Q*-Learning: Scaling Off-policy Reinforcement Learning under Massively Parallel Simulation".

- [Installation](#installation)
  - [Install :zap: PQL](#install_pql)
  - [Install Isaac Gym](#install_isaac)
- [System Requirements](#requirements)
- [Usage](#usage)
  - [Train with :zap: PQL](#usage_pql)
  - [Baselines](#usage_baselines)
  - [Logging](#usage_logging)
  - [Saving and Loading](#usage_saving_loading)
- [Citation](#citation)
- [Acknowledgement](#acknowledgement)

## Installation

### Install :zap: PQL

1. Clone the repository:

```bash
git clone [email protected]:geyang/pql.git
cd pql
```

2. Create the Conda environment and install the dependencies:

```bash
./create_conda_env_pql.sh
pip install -e .
```
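The setup script appears to create a named Conda environment (the Isaac Gym section below references an environment called `pql`), so make sure it is activated before the editable install. A quick sanity check, assuming the environment and module are both named `pql` after the repository:

```bash
# Activate the environment created by the setup script
# (the name "pql" is assumed, matching the path used in the Isaac Gym section)
conda activate pql

# Verify that the editable install is importable
# (the module name "pql" is an assumption based on the repository name)
python -c "import pql"
```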

### Install Isaac Gym

> **Note**
> The original paper uses Isaac Gym Preview 3 and the task configs from commit ca7a4fb762f9581e39cc2aab644f18a83d6ab0ba of IsaacGymEnvs.

1. Download and install Isaac Gym Preview 4 from https://developer.nvidia.com/isaac-gym

2. Unzip the file:
```bash
tar -xf IsaacGym_Preview_4_Package.tar.gz
```

3. Install Isaac Gym:
```bash
cd isaacgym/python
pip install -e . --no-deps
```

4. Install IsaacGymEnvs:

```bash
git clone https://github.com/NVIDIA-Omniverse/IsaacGymEnvs.git
cd IsaacGymEnvs
pip install -e . --no-deps
```

5. Export `LD_LIBRARY_PATH` — Isaac Gym's compiled Python bindings link against the Conda environment's `libpython`, so the environment's `lib/` directory must be on the loader path:

```bash
export LD_LIBRARY_PATH=$(conda info --base)/envs/pql/lib/:$LD_LIBRARY_PATH
```

## System Requirements
> **Warning**
> Note that wall-clock efficiency highly depends on the GPU type and will decrease with smaller/fewer GPUs (check Section 4.4 in the paper).

Isaac Gym requires an NVIDIA GPU. To train with the default configuration, we recommend a GPU with at least 10GB of VRAM. For smaller GPUs, you can decrease the number of parallel environments (`cfg.num_envs`), the batch size (`cfg.algo.batch_size`), the replay buffer capacity (`cfg.algo.memory_size`), and so on. :zap: PQL can run on one, two, or three GPUs (set the GPU IDs via `cfg.p_learner_gpu` and `cfg.v_learner_gpu`; the Isaac Gym environments run on `GPU:0` by default).
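Assuming the `cfg.*` keys above map to Hydra-style command-line overrides with the `cfg.` prefix dropped (as in the training commands below), a run tuned down for a smaller GPU might look like this — the values are illustrative, not recommendations from the paper:

```bash
# Shrink the memory footprint for a smaller GPU (illustrative values)
python train_pql.py task=ShadowHand \
    num_envs=2048 \
    algo.batch_size=4096 \
    algo.memory_size=1000000

# Spread PQL across three GPUs: the Isaac Gym environments stay on GPU:0
# by default, the policy learner moves to GPU:1, the value learner to GPU:2
python train_pql.py task=ShadowHand p_learner_gpu=1 v_learner_gpu=2
```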

## Usage

### Train with :zap: PQL

Run :zap: PQL on Shadow Hand task. A full list of tasks in Isaac Gym is available [here](https://github.com/NVIDIA-Omniverse/IsaacGymEnvs/blob/main/docs/rl_examples.md).

```bash
python train_pql.py task=ShadowHand
```

Run :zap: PQL-D (with distributional RL)

```bash
python train_pql.py task=ShadowHand algo.distl=True algo.cri_class=DistributionalDoubleQ
```

### Baselines

Run DDPG baseline

```bash
python train_baselines.py algo=ddpg_algo task=ShadowHand
```

Run SAC baseline

```bash
python train_baselines.py algo=sac_algo task=ShadowHand
```

Run PPO baseline

```bash
python train_baselines.py algo=ppo_algo task=ShadowHand isaac_param=True
```

### Logging

We use ML-Logger for experiment logging.

1. Create your profile at https://dash.ml/profile
2. Set up your account in the terminal:

```bash
export ML_LOGGER_USER=
export ML_LOGGER_ROOT=
```
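For example, with a hypothetical dash.ml username and logging root (substitute your own values):

```bash
# Hypothetical values -- replace with your own dash.ml username and logging root
export ML_LOGGER_USER=your-username
export ML_LOGGER_ROOT=https://api.dash.ml
```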

### Saving and Loading

Checkpoints are automatically saved as pickle files.

To load and visualize a trained policy, run:

```bash
python visualize.py task=ShadowHand headless=False num_envs=10 artifact=$team-name$/$project-name$/$run-id$/$version$
```