Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-exploration-rl
A curated list of awesome exploration RL resources (continually updated)
https://github.com/opendilab/awesome-exploration-rl
Last synced: 1 day ago
-
A Taxonomy of Exploration RL Methods
-
Papers
-
Classic Exploration RL Papers
- VizDoom
- An Empirical Evaluation of Thompson Sampling
- Using Confidence Bounds for Exploitation-Exploration Trade-offs
- How can we define intrinsic motivation?
- A Contextual-Bandit Approach to Personalized News Article Recommendation
- A Tutorial on Thompson Sampling
- Large-Scale Study of Curiosity-Driven Learning
- Episodic Curiosity through Reachability
- Self-Supervised Exploration via Disagreement
- EMI: Exploration with Mutual Information
- Optimistic Exploration even with a Pessimistic Initialisation
- RIDE: Rewarding Impact-Driven Exploration for Procedurally-Generated Environments
- Neural Contextual Bandits with UCB-based Exploration
- MNIST
- Rank the Episodes: A Simple Approach for Exploration in Procedurally-Generated Environments
- First return then explore
- #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning
- Atari
- rllab
- Unifying Count-Based Exploration and Intrinsic Motivation
- Deep Exploration via Bootstrapped DQN
- VIME: Variational information maximizing exploration
- (More) Efficient Reinforcement Learning via Posterior Sampling
-
ICML 2023
- Guiding Pretraining in Reinforcement Learning with Large Language Models
- A Study of Global and Episodic Bonuses for Exploration in Contextual MDPs
- Curiosity in Hindsight: Intrinsic Exploration in Stochastic Environments
- Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition
- Reparameterized Policy Learning for Multimodal Trajectory Optimization
- Fast Rates for Maximum Entropy Exploration
- Do Embodied Agents Dream of Pixelated Sheep?: Embodied Decision Making using Language Guided World Modelling
- Cell-Free Latent Go-Explore
- Go Beyond Imagination: Maximizing Episodic Reachability with World Models
- Efficient Online Reinforcement Learning with Offline Data
- Anti-Exploration by Random Network Distillation
- The Impact of Exploration on Convergence and Performance of Multi-Agent Q-Learning Dynamics
- An Adaptive Entropy-Regularization Framework for Multi-Agent Reinforcement Learning
- Lazy Agents: A New Perspective on Solving Sparse Reward Problem in Multi-agent Reinforcement Learning
- Automatic Intrinsic Reward Shaping for Exploration in Deep Reinforcement Learning
- LESSON: Learning to Integrate Exploration Strategies for Reinforcement Learning via an Option Framework
- Flipping Coins to Estimate Pseudocounts for Exploration in Reinforcement Learning
-
ICLR 2023
- The Role of Coverage in Online Reinforcement Learning
- Learnable Behavior Control: Breaking Atari Human World Records via Sample-Efficient Behavior Selection
- Planning Goals for Exploration
- Pink Noise Is All You Need: Colored Noise Exploration in Deep Reinforcement Learning
- Learning About Progress From Experts
- DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems
- Does Zero-Shot Reinforcement Learning Exist?
- Human-level Atari 200x faster
- Learning Achievement Structure for Structured Exploration in Domains with Sparse Reward
- Safe Exploration Incurs Nearly No Additional Sample Complexity for Reward-Free RL
- Latent State Marginalization as a Low-cost Approach to Improving Exploration
- Revisiting Curiosity for Exploration in Procedurally Generated Environments
- MoDem: Accelerating Visual Model-Based Reinforcement Learning with Demonstrations
- Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies with One Objective
- EUCLID: Towards Efficient Unsupervised Reinforcement Learning with Multi-choice Dynamics Model
- Guarded Policy Optimization with Imperfect Online Demonstrations
- NetHack
- Near-optimal Policy Identification in Active Reinforcement Learning
-
NeurIPS 2023
- On the Importance of Exploration for Generalization in Reinforcement Learning
- Monte Carlo Tree Search with Boltzmann Exploration
- Breadcrumbs to the Goal: Supervised Goal Selection from Human-in-the-Loop Feedback
- MIMEx: Intrinsic Rewards from Masked Input Modeling
- Accelerating Exploration with Unlabeled Prior Data
- Pitfall of Optimism: Distributional Reinforcement Learning by Randomizing Risk Criterion
- CQM: Curriculum Reinforcement Learning with a Quantized World Model
- Safe Exploration in Reinforcement Learning: A Generalized Formulation and Algorithms
- Successor-Predecessor Intrinsic Exploration
- Accelerating Reinforcement Learning with Value-Conditional State Entropy Exploration
- ELDEN: Exploration via Local Dependencies
- On the Convergence and Sample Complexity Analysis of Deep Q-Networks with ε-Greedy Exploration
- Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration
-
ICML 2022
- Ant Maze
- From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses
- The Importance of Non-Markovianity in Maximum State Entropy Exploration
- Phasic Self-Imitative Reduction for Sparse-Reward Goal-Conditioned Reinforcement Learning
- Thompson Sampling for (Combinatorial) Pure Exploration
- Near-Optimal Algorithms for Autonomous Exploration and Multi-Goal Stochastic Shortest Path
- Safe Exploration for Efficient Policy Evaluation and Comparison
-
NeurIPS 2022
- Redeeming Intrinsic Rewards via Constrained Optimization
- You Only Live Once: Single-Life Reinforcement Learning via Learned Reward Shaping
- Curious Exploration via Structured World Models Yields Zero-Shot Object Manipulation
- Model-based Lifelong Reinforcement Learning with Bayesian Exploration
- On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL
- DOPE: Doubly Optimistic and Pessimistic Exploration for Safe Reinforcement Learning
- Bayesian Optimistic Optimization: Optimistic Exploration for Model-based Reinforcement Learning
- Active Exploration for Inverse Reinforcement Learning
- Exploration-Guided Reward Shaping for Reinforcement Learning under Sparse Rewards
- Monte Carlo Augmented Actor-Critic for Sparse Reward Deep Reinforcement Learning from Suboptimal Demonstrations
- Incentivizing Combinatorial Bandit Exploration
-
ICLR 2022
- Learning Long-Term Reward Redistribution via Randomized Return Decomposition
- The Information Geometry of Unsupervised Reinforcement Learning
- When should agents explore?
- Learning more skills through optimistic exploration
- Reinforcement Learning with Sparse Rewards using Guidance from Offline Demonstration
- Generative Planning for Temporally Coordinated Exploration in Reinforcement Learning
- Learning Altruistic Behaviours in Reinforcement Learning without External Rewards
- blackjack
- Anti-Concentrated Confidence Bonuses for Scalable Exploration
- Lipschitz-constrained Unsupervised Skill Discovery
- LIGS: Learnable Intrinsic-Reward Generation Selection for Multi-Agent Learning
- Multi-Stage Episodic Control for Strategic Exploration in Text Games
- On the Convergence of the Monte Carlo Exploring Starts Algorithm for Reinforcement Learning
- Jericho
- multi-armed bandit
- foraging
-
NeurIPS 2021
- Interesting Object, Curious Agent: Learning Task-Agnostic Exploration
- Tactical Optimism and Pessimism for Deep Reinforcement Learning
- Which Mutual-Information Representation Learning Objectives are Sufficient for Control?
- On the Theory of Reinforcement Learning with Once-per-Episode Feedback
- MADE: Exploration via Maximizing Deviation from Explored Regions
- Adversarial Intrinsic Motivation for Reinforcement Learning
- Information Directed Reward Learning for Reinforcement Learning
- Dynamic Bottleneck for Robust Self-Supervised Exploration
- Hierarchical Skills for Efficient Exploration
- Exploration-Exploitation in Multi-Agent Competition: Convergence with Bounded Rationality
- NovelD: A Simple yet Effective Exploration Criterion
- Episodic Multi-agent Reinforcement Learning with Curiosity-driven Exploration
- Learning Diverse Policies in MOBA Games via Macro-Goals
- Honor of Kings
- CIC: Contrastive Intrinsic Control for Unsupervised Skill Discovery
- URLB
- MiniGrid
- MuJoCo
- Predator-Prey
-
ICLR 2024
- A Theoretical Explanation of Deep RL Performance in Stochastic Environments
- DrM: Mastering Visual Reinforcement Learning through Dormant Ratio Minimization
- METRA: Scalable Unsupervised RL with Metric-Aware Abstraction
- Text2Reward: Reward Shaping with Language Models for Reinforcement Learning
- Pre-Training Goal-based Models for Sample-Efficient Reinforcement Learning
- Efficient Episodic Memory Utilization of Cooperative Multi-Agent Reinforcement Learning
- Simple Hierarchical Planning with Diffusion
- Sample Efficient Myopic Exploration Through Multitask Reinforcement Learning with Diverse Tasks
- PAE: Reinforcement Learning from External Knowledge for Efficient Exploration
- In-context Exploration-Exploitation for Reinforcement Learning
- Transformers as Decision Makers: Provable In-Context Reinforcement Learning via Supervised Pretraining
- Learning to Act without Actions
- Consciousness-Inspired Spatio-Temporal Abstractions for Better Generalization in Reinforcement Learning
- Unlocking the Power of Representations in Long-term Novelty-based Exploration
-
ICML 2024
- Scalable Online Exploration via Coverability
- Just Cluster It: An Approach for Exploration in High-Dimensions using Clustering and Pre-Trained Representations
- Provably Efficient Long-Horizon Exploration in Monte Carlo Tree Search through State Occupancy Regularization
- Efficient Exploration for LLMs
- Constrained Ensemble Exploration for Unsupervised Skill Discovery
- Random Latent Exploration for Deep Reinforcement Learning
- Exploration and Anti-Exploration with Distributional Random Network Distillation
- Breadth-First Exploration on Adaptive Grid for Reinforcement Learning
- Uncertainty-Aware Reward-Free Exploration with General Function Approximation
- Bayesian Exploration Networks
- Geometric Active Exploration in Markov Decision Processes: the Benefit of Abstraction
- Fast Peer Adaptation with Context-aware Exploration
- Individual Contributions as Intrinsic Exploration Scaffolds for Multi-agent Reinforcement Learning
-
Sub Categories
Keywords
- reinforcement-learning: 3
- gym: 2
- artificial-intelligence: 1
- deep-learning: 1
- machine-learning: 1
- mujoco: 1
- neural-networks: 1
- physics-simulation: 1
- c: 1
- game: 1
- roguelike: 1
- gridworld-environment: 1
- api: 1
- gymnasium: 1
- multi-agent-reinforcement-learning: 1
- multiagent-reinforcement-learning: 1
- interactive-fiction: 1
- text-based-adventure: 1