https://github.com/RLE-Foundation/RLeXplore

RLeXplore provides stable baselines of exploration methods in reinforcement learning, such as intrinsic curiosity module (ICM), random network distillation (RND) and rewarding impact-driven exploration (RIDE).
baselines efficient-algorithm exploration-strategy gym machine-learning pybullet pytorch reinforcement-learning robotics toolbox

Host: GitHub
URL: https://github.com/RLE-Foundation/RLeXplore
Owner: RLE-Foundation
License: MIT
Created: 2022-09-19T11:38:16.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2024-09-29T12:55:16.000Z (7 months ago)
Last Synced: 2024-10-06T08:45:50.255Z (7 months ago)
Topics: baselines, efficient-algorithm, exploration-strategy, gym, machine-learning, pybullet, pytorch, reinforcement-learning, robotics, toolbox
Language: Jupyter Notebook
Homepage: https://docs.rllte.dev/
Size: 16.1 MB
Stars: 352
Watchers: 3
Forks: 16
Open Issues: 14
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

awesome-production-machine-learning - RLeXplore - RLeXplore provides stable baselines of exploration methods in reinforcement learning. (Industry Strength RL)

README

        










## RLeXplore: Accelerating Research in Intrinsically-Motivated Reinforcement Learning



**RLeXplore** is a unified, highly modularized, plug-and-play toolkit that currently provides high-quality and reliable implementations of eight representative intrinsic reward algorithms. Comparing intrinsic reward algorithms has been difficult because of confounding factors such as differing implementations, optimization strategies, and evaluation methodologies. RLeXplore addresses this by providing unified and standardized procedures for constructing, computing, and optimizing intrinsic reward modules.

The workflow of RLeXplore is illustrated as follows:







## Table of Contents

- [Installation](#installation)

- [Module List](#module-list)

- [Tutorials](#tutorials)

- [Benchmark Results](#benchmark-results)

- [Cite Us](#cite-us)

## Installation

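- with pip (recommended)

Open a terminal, create and activate a conda environment, and install **rllte** with `pip`:

``` shell
# Create an isolated environment and switch to it before installing
conda create -n rllte python=3.8
conda activate rllte
pip install rllte-core
```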

- with git

Open a terminal and clone the repository from [GitHub](https://github.com/RLE-Foundation/rllte) with `git`:

``` sh
git clone https://github.com/RLE-Foundation/rllte.git
# Install in editable mode from inside the cloned repository
cd rllte
pip install -e .
```

You can now import the intrinsic reward modules:

``` python
# Import the modules you need; the other entries in the Module List are imported the same way.
from rllte.xplore.reward import ICM, RIDE
```
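As a quick sanity check after either install route, the snippet below tests whether the `rllte` package can be found by the current interpreter. It uses only the standard library; the helper name `is_installed` is hypothetical and not part of RLLTE.

``` python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if the top-level `package` can be located by this interpreter."""
    # find_spec returns None when a top-level package cannot be found.
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    print("rllte importable:", is_installed("rllte"))
```

If this prints `False`, the most likely cause is that the install ran in a different environment from the one running the script.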

## Module List

| **Type** | **Modules** |
|---|---|
| Count-based | [PseudoCounts](https://arxiv.org/pdf/2002.06038), [RND](https://arxiv.org/pdf/1810.12894.pdf), [E3B](https://proceedings.neurips.cc/paper_files/paper/2022/file/f4f79698d48bdc1a6dec20583724182b-Paper-Conference.pdf) |
| Curiosity-driven | [ICM](http://proceedings.mlr.press/v70/pathak17a/pathak17a.pdf), [Disagreement](https://arxiv.org/pdf/1906.04161.pdf), [RIDE](https://arxiv.org/pdf/2002.12292) |
| Memory-based | [NGU](https://arxiv.org/pdf/2002.06038) |
| Information theory-based | [RE3](http://proceedings.mlr.press/v139/seo21a/seo21a.pdf) |
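To give a feel for the count-based family, here is a minimal sketch of a tabular visitation bonus, r = 1/sqrt(N(s)), where N(s) is the number of times state s has been seen. This is a textbook illustration only, not the implementation of PseudoCounts, RND, or E3B in RLeXplore; the class name `CountBonus` and its `reward` method are hypothetical.

``` python
import math
from collections import Counter
from typing import Hashable


class CountBonus:
    """Tabular count-based exploration bonus (illustrative only)."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def reward(self, state: Hashable) -> float:
        # Count the visit first so the bonus is always finite.
        self.counts[state] += 1
        return 1.0 / math.sqrt(self.counts[state])


if __name__ == "__main__":
    bonus = CountBonus()
    for s in ["a", "a", "b", "a", "a"]:
        print(s, round(bonus.reward(s), 3))
```

The bonus shrinks as a state is revisited, so rarely seen states stay attractive. Exact counts only work for small discrete state spaces, which is why methods for high-dimensional observations use learned approximations instead.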

## Tutorials

Follow the links below to open the tutorial notebooks and scripts:

0. [Quick Start](./0%20quick_start.ipynb)

1. [RLeXplore with RLLTE](./1%20rlexplore_with_rllte.ipynb)

2. [RLeXplore with Stable-Baselines3](./2%20rlexplore_with_sb3.ipynb)

3. [RLeXplore with CleanRL](./3%20rlexplore_with_cleanrl.py)

4. [Exploring Hybrid Intrinsic Rewards](./4%20hybrid_intrinsic_rewards.ipynb)

5. [Custom Intrinsic Rewards](./5%20custom_intrinsic_reward.ipynb)
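As background for the tutorials on hybrid and custom rewards: intrinsic rewards are commonly added to the environment reward with a scaling coefficient. The sketch below shows that weighted sum in plain Python. The function name `mix_rewards` and the `betas` coefficients are assumptions made for illustration; they are not taken from the notebooks, which may combine rewards differently.

``` python
from typing import List, Sequence


def mix_rewards(
    extrinsic: Sequence[float],
    intrinsics: Sequence[Sequence[float]],
    betas: Sequence[float],
) -> List[float]:
    """Return extrinsic[t] + sum_k betas[k] * intrinsics[k][t] for each step t."""
    if len(intrinsics) != len(betas):
        raise ValueError("need one coefficient per intrinsic reward stream")
    mixed = list(extrinsic)
    for beta, stream in zip(betas, intrinsics):
        if len(stream) != len(mixed):
            raise ValueError("all reward streams must have the same length")
        for t, r_int in enumerate(stream):
            mixed[t] += beta * r_int
    return mixed


if __name__ == "__main__":
    # Two steps, two intrinsic streams weighted by 0.5 and 0.25.
    print(mix_rewards([1.0, 0.0], [[0.5, 0.5], [2.0, 4.0]], [0.5, 0.25]))
```

Keeping the coefficients explicit makes it easy to ablate a single intrinsic signal by setting its weight to zero.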

## Benchmark Results

We have published a space using Weights & Biases (W&B) to store reusable experiment results on recognized benchmarks. The space link is: [RLeXplore's W&B Space](https://wandb.ai/yuanmingqi/RLeXplore/reportlist).







- `RLLTE's PPO+RLeXplore` on *SuperMarioBros*:







- `RLLTE's PPO+RLeXplore` on *MiniGrid*:

  + DoorKey-16×16

  


  

  


  + KeyCorridorS8R5, KeyCorridorS9R6, KeyCorridorS10R7, MultiRoom-N7-S8, MultiRoom-N10-S10, MultiRoom-N12-S10, Dynamic-Obstacles-16x16, and LockedRoom

  


  

  


- `RLLTE's PPO+RLeXplore` on *Procgen-Maze*:

  + Number of levels=1

  


  

  


  + Number of levels=200

  


  

  


- `RLLTE's PPO+RLeXplore` on five hard-exploration tasks of *ALE*:

| **Algorithm** | **Gravitar** | **MontezumaRevenge** | **PrivateEye** | **Seaquest** | **Venture** |
|:-------------:|:------------:|:--------------------:|:--------------:|:------------:|:-----------:|
| Extrinsic     |  **1060.19** |         42.83        |      88.37     |    942.37    |    391.73   |
| Disagreement  |    689.12    |         0.00         |      33.23     |    6577.03   |    468.43   |
| E3B           |    503.43    |         0.50         |      66.23     |  **8690.65** |     0.80    |
| ICM           |    194.71    |         31.14        |     -27.50     |    2626.13   |     0.54    |
| PseudoCounts  |    295.49    |         0.00         |   **1076.74**  |    668.96    |     1.03    |
| RE3           |    130.00    |         2.68         |     312.72     |    864.60    |     0.06    |
| RIDE          |    452.53    |         0.00         |      -1.40     |    1024.39   |    404.81   |
| RND           |    835.57    |      **160.22**      |      45.85     |    5989.06   |  **544.73** |
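The bolded entries mark the highest score in each column. As a small worked example, the sketch below recovers them from the table values; the names `GAMES`, `SCORES`, and `best_per_game` are illustrative and do not come from the repository.

``` python
from typing import Dict, List, Tuple

# Scores copied from the ALE table (one list per algorithm, ordered as in GAMES).
GAMES = ["Gravitar", "MontezumaRevenge", "PrivateEye", "Seaquest", "Venture"]
SCORES: Dict[str, List[float]] = {
    "Extrinsic":    [1060.19, 42.83, 88.37, 942.37, 391.73],
    "Disagreement": [689.12, 0.00, 33.23, 6577.03, 468.43],
    "E3B":          [503.43, 0.50, 66.23, 8690.65, 0.80],
    "ICM":          [194.71, 31.14, -27.50, 2626.13, 0.54],
    "PseudoCounts": [295.49, 0.00, 1076.74, 668.96, 1.03],
    "RE3":          [130.00, 2.68, 312.72, 864.60, 0.06],
    "RIDE":         [452.53, 0.00, -1.40, 1024.39, 404.81],
    "RND":          [835.57, 160.22, 45.85, 5989.06, 544.73],
}


def best_per_game(
    scores: Dict[str, List[float]], games: List[str]
) -> Dict[str, Tuple[str, float]]:
    """Map each game to the (algorithm, score) pair with the highest score."""
    best = {}
    for col, game in enumerate(games):
        algo = max(scores, key=lambda name: scores[name][col])
        best[game] = (algo, scores[algo][col])
    return best


if __name__ == "__main__":
    for game, (algo, score) in best_per_game(SCORES, GAMES).items():
        print(f"{game}: {algo} ({score})")
```

No single algorithm is best on all five tasks: RND leads on two, while Extrinsic, PseudoCounts, and E3B each lead on one.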

- `CleanRL's PPO+RLeXplore's RND` on *Montezuma's Revenge*:







- `RLLTE's SAC+RLeXplore` on *Ant-UMaze*:







## Cite Us

To cite this repository in publications:

``` bib
@article{yuan_roger2024rlexplore,
  title={RLeXplore: Accelerating Research in Intrinsically-Motivated Reinforcement Learning},
  author={Yuan, Mingqi and Castanyer, Roger Creus and Li, Bo and Jin, Xin and Berseth, Glen and Zeng, Wenjun},
  journal={arXiv preprint arXiv:2405.19548},
  year={2024}
}
```
