{"id":14081270,"url":"https://github.com/RLE-Foundation/RLeXplore","last_synced_at":"2025-07-30T19:32:44.682Z","repository":{"id":59869963,"uuid":"538472312","full_name":"RLE-Foundation/RLeXplore","owner":"RLE-Foundation","description":"RLeXplore provides stable baselines of exploration methods in reinforcement learning, such as intrinsic curiosity module (ICM), random network distillation (RND) and rewarding impact-driven exploration (RIDE).","archived":false,"fork":false,"pushed_at":"2024-09-29T12:55:16.000Z","size":16892,"stargazers_count":352,"open_issues_count":14,"forks_count":16,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-10-06T08:45:50.255Z","etag":null,"topics":["baselines","efficient-algorithm","exploration-strategy","gym","machine-learning","pybullet","pytorch","reinforcement-learning","robotics","toolbox"],"latest_commit_sha":null,"homepage":"https://docs.rllte.dev/","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/RLE-Foundation.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-09-19T11:38:16.000Z","updated_at":"2024-10-02T14:23:20.000Z","dependencies_parsed_at":"2024-02-27T21:27:57.401Z","dependency_job_id":"cab4fd66-5353-4255-b9a8-e3564495ed8b","html_url":"https://github.com/RLE-Foundation/RLeXplore","commit_stats":{"total_commits":56,"total_committers":2,"mean_commits":28.0,"dds":0.0535714285714286,"last_synced_commit":"4140bd560206b87a9d73c73dd2aba016cfadd210"},"previous_names":["yuanmingqi/rlexplore","yuanmingqi/rl-exploration-baselines"],"tags_count":0,"tem
plate":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RLE-Foundation%2FRLeXplore","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RLE-Foundation%2FRLeXplore/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RLE-Foundation%2FRLeXplore/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RLE-Foundation%2FRLeXplore/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/RLE-Foundation","download_url":"https://codeload.github.com/RLE-Foundation/RLeXplore/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":228178912,"owners_count":17881108,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["baselines","efficient-algorithm","exploration-strategy","gym","machine-learning","pybullet","pytorch","reinforcement-learning","robotics","toolbox"],"created_at":"2024-08-13T13:00:37.039Z","updated_at":"2025-07-30T19:32:44.673Z","avatar_url":"https://github.com/RLE-Foundation.png","language":"Jupyter Notebook","funding_links":[],"categories":["Industry Strength RL"],"sub_categories":[],"readme":"\u003cdiv align=center\u003e\n\u003cbr\u003e\n\u003cimg src='./assets/logo.png' style=\"width: 70%\"\u003e\n\u003cbr\u003e\n\n## RLeXplore: Accelerating Research in Intrinsically-Motivated Reinforcement Learning\n\u003c/div\u003e\n\n**RLeXplore** is a unified, highly-modularized and plug-and-play toolkit that currently provides high-quality and reliable implementations of eight 
representative intrinsic reward algorithms. Comparing intrinsic reward algorithms has been challenging because of confounding factors such as distinct implementations, optimization strategies, and evaluation methodologies. RLeXplore is therefore designed to provide unified and standardized procedures for constructing, computing, and optimizing intrinsic reward modules.\n\nThe workflow of RLeXplore is illustrated as follows:\n\u003cdiv align=center\u003e\n\u003cimg src='./assets/workflow.png' style=\"width: 100%\"\u003e\n\u003c/div\u003e\n\n# Table of Contents\n- [Installation](#installation)\n- [Module List](#module-list)\n- [Tutorials](#tutorials)\n- [Benchmark Results](#benchmark-results)\n- [Cite Us](#cite-us)\n\n# Installation\n- with pip `recommended`\n\nOpen a terminal, create and activate a conda environment, and install **rllte** with `pip`:\n``` shell\nconda create -n rllte python=3.8\nconda activate rllte\npip install rllte-core\n```\n\n- with git\n\nOpen a terminal, clone the repository from [GitHub](https://github.com/RLE-Foundation/rllte) with `git`, and install it from the repository root:\n``` sh\ngit clone https://github.com/RLE-Foundation/rllte.git\ncd rllte\npip install -e .\n```\n\nYou can then import the intrinsic reward modules:\n``` python\nfrom rllte.xplore.reward import ICM, RIDE, ...\n```\n\n## Module List\n| **Type** \t| **Modules** \t|\n|---\t|---\t|\n| Count-based \t| [PseudoCounts](https://arxiv.org/pdf/2002.06038), [RND](https://arxiv.org/pdf/1810.12894.pdf), [E3B](https://proceedings.neurips.cc/paper_files/paper/2022/file/f4f79698d48bdc1a6dec20583724182b-Paper-Conference.pdf) \t|\n| Curiosity-driven \t| [ICM](http://proceedings.mlr.press/v70/pathak17a/pathak17a.pdf), [Disagreement](https://arxiv.org/pdf/1906.04161.pdf), [RIDE](https://arxiv.org/pdf/2002.12292) \t|\n| Memory-based \t| [NGU](https://arxiv.org/pdf/2002.06038) \t|\n| Information theory-based \t| [RE3](http://proceedings.mlr.press/v139/seo21a/seo21a.pdf) \t|\n\n## Tutorials\nThe following links open the tutorial notebooks and scripts:\n\n0. 
[Quick Start](./0%20quick_start.ipynb)\n1. [RLeXplore with RLLTE](./1%20rlexplore_with_rllte.ipynb)\n2. [RLeXplore with Stable-Baselines3](./2%20rlexplore_with_sb3.ipynb)\n3. [RLeXplore with CleanRL](./3%20rlexplore_with_cleanrl.py)\n4. [Exploring Hybrid Intrinsic Rewards](./4%20hybrid_intrinsic_rewards.ipynb)\n5. [Custom Intrinsic Rewards](./5%20custom_intrinsic_reward.ipynb)\n\n## Benchmark Results\nWe have published a space using Weights \u0026 Biases (W\u0026B) to store reusable experiment results on recognized benchmarks. The space link is: [RLeXplore's W\u0026B Space](https://wandb.ai/yuanmingqi/RLeXplore/reportlist).\n\n\u003cdiv align=center\u003e\n\u003cimg src='./assets/wandb.png' style=\"width: 75%\"\u003e\n\u003c/div\u003e\n\n\n- `RLLTE's PPO+RLeXplore` on *SuperMarioBros*:\n\n\u003cdiv align=center\u003e\n\u003cimg src='./assets/smb.png' style=\"width: 100%\"\u003e\n\u003c/div\u003e\n\n- `RLLTE's PPO+RLeXplore` on *MiniGrid*:\n\n  + DoorKey-16×16\n  \u003cdiv align=center\u003e\n  \u003cimg src='./assets/mgd.png' style=\"width: 100%\"\u003e\n  \u003c/div\u003e\n\n  + KeyCorridorS8R5, KeyCorridorS9R6, KeyCorridorS10R7, MultiRoom-N7-S8, MultiRoom-N10-S10, MultiRoom-N12-S10, Dynamic-Obstacles-16x16, and LockedRoom\n  \u003cdiv align=center\u003e\n  \u003cimg src='./assets/mg_hard.png' style=\"width: 100%\"\u003e\n  \u003c/div\u003e\n\n- `RLLTE's PPO+RLeXplore` on *Procgen-Maze*:\n\n  + Number of levels=1\n  \u003cdiv align=center\u003e\n  \u003cimg src='./assets/procgen_1maze.png' style=\"width: 100%\"\u003e\n  \u003c/div\u003e\n\n  + Number of levels=200\n  \u003cdiv align=center\u003e\n  \u003cimg src='./assets/procgen_allmaze.png' style=\"width: 100%\"\u003e\n  \u003c/div\u003e\n\n- `RLLTE's PPO+RLeXplore` on five hard-exploration tasks of *ALE*:\n\n| **Algorithm** | **Gravitar** | **MontezumaRevenge** | **PrivateEye** | **Seaquest** | **Venture** 
|\n|:-------------:|:------------:|:--------------------:|:--------------:|:------------:|:-----------:|\n| Extrinsic     |  **1060.19** |         42.83        |      88.37     |    942.37    |    391.73   |\n| Disagreement  |    689.12    |         0.00         |      33.23     |    6577.03   |    468.43   |\n| E3B           |    503.43    |         0.50         |      66.23     |  **8690.65** |     0.80    |\n| ICM           |    194.71    |         31.14        |     -27.50     |    2626.13   |     0.54    |\n| PseudoCounts  |    295.49    |         0.00         |   **1076.74**  |    668.96    |     1.03    |\n| RE3           |    130.00    |         2.68         |     312.72     |    864.60    |     0.06    |\n| RIDE          |    452.53    |         0.00         |      -1.40     |    1024.39   |    404.81   |\n| RND           |    835.57    |      **160.22**      |      45.85     |    5989.06   |  **544.73** |\n\n- `CleanRL's PPO+RLeXplore's RND` on *Montezuma's Revenge*:\n\n\u003cdiv align=center\u003e\n\u003cimg src='./assets/atari_curves.png' style=\"width: 70%\"\u003e\n\u003c/div\u003e\n\n\n- `RLLTE's SAC+RLeXplore` on *Ant-UMaze*:\n\n\u003cdiv align=center\u003e\n\u003cimg src='./assets/sac_ant.png' style=\"width: 70%\"\u003e\n\u003c/div\u003e\n\n## Cite Us\nTo cite this repository in publications:\n\n``` bib\n@article{yuan_roger2025rlexplore,\n  title={RLeXplore: Accelerating Research in Intrinsically-Motivated Reinforcement Learning},\n  author={Yuan, Mingqi and Castanyer, Roger Creus and Li, Bo and Jin, Xin and Berseth, Glen and Zeng, Wenjun},\n  journal={Transactions on Machine Learning Research},\n  issn={2835-8856},\n  year={2025},\n  url={https://openreview.net/forum?id=B9BHjTN4z6},\n  
note={}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FRLE-Foundation%2FRLeXplore","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FRLE-Foundation%2FRLeXplore","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FRLE-Foundation%2FRLeXplore/lists"}