Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-model-based-RL
A curated list of awesome model-based RL resources (continually updated)
https://github.com/opendilab/awesome-model-based-RL
Papers
NeurIPS 2021
- Self-Consistent Models and Values
- Proper Value Equivalence
- Model-Based Reinforcement Learning via Imagination with Derived Memory
- On Effective Scheduling of Model-based Reinforcement Learning
- Safe Reinforcement Learning by Imagining the Near Future
- MobILE: Model-Based Imitation Learning From Observation Alone
- Model-Based Episodic Memory Induces Dynamic Hybrid Controls
- A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning
- Mastering Atari Games with Limited Data
- Online and Offline Reinforcement Learning by Planning with a Learned Model
- MOPO: Model-based Offline Policy Optimization
- RoMA: Robust Model Adaptation for Offline Model-based Optimization
- Offline Reinforcement Learning with Reverse Model-based Imagination
- Offline Model-based Adaptable Policy Learning
- Weighted model estimation for offline model-based reinforcement learning
- Reward-Free Model-Based Reinforcement Learning with Linear Function Approximation
- Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature
- Discovering and Achieving Goals via World Models
- four rooms
- walker, quadruped, bins, kitchen
- 2D maze navigation, [miniworld](https://github.com/maximecb/gym-miniworld)
- mujoco
- COMBO: Conservative Offline Model-Based Policy Optimization
NeurIPS 2023
- Efficient Exploration in Continuous-time Model-based Reinforcement Learning
- Action Inference by Maximising Evidence: Zero-Shot Imitation from Observation with World Models
- STORM: Efficient Stochastic Transformer based World Models for Reinforcement Learning
- Optimal Exploration for Model-Based RL in Nonlinear Systems
- board games, [gobigger](https://github.com/opendilab/GoBigger)
- RePo: Resilient Model-Based Reinforcement Learning by Regularizing Posterior Predictability
- LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios
- Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning
- MoVie: Visual Model-Based Policy Adaptation for View Generalization
- Model-Based Reparameterization Policy Gradient Methods: Theory and Practical Algorithms
- Leveraging Pre-trained Large Language Models to Construct and Utilize World Models for Model-based Task Planning
- Model-Based Control with Sparse Neural Dynamics
- affine dynamics system
- State2Explanation: Concept-Based Explanations to Benefit Agent Learning and User Understanding
- Facing Off World Model Backbones: RNNs, Transformers, and S4
- Pre-training Contextualized World Models with In-the-wild Videos for Reinforcement Learning
- Conformal Prediction for Uncertainty-Aware Planning with Diffusion Dynamics Model
- Describe, Explain, Plan and Select: Interactive Planning with LLMs Enables Open-World Multi-Task Agents
- minecraft
- Large Language Models as Commonsense Knowledge for Large-Scale Task Planning
NeurIPS 2022
- When to Update Your Model: Constrained Model-based Reinforcement Learning
- Data-Driven Model-Based Optimization via Invariant Representation Learning
- Learning to Attack Federated Learning: A Model-based Reinforcement Learning Attack Framework
- Model-based Lifelong Reinforcement Learning with Bayesian Exploration
- Model-Based Imitation Learning for Urban Driving
- Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm
- Bidirectional Learning for Offline Infinite-width Model-based Optimization
- A Unified Framework for Alternating Offline Model Training and Policy Learning
- Model-Based Offline Reinforcement Learning with Pessimism-Modulated Dynamics Belief
- Double Check Your State Before Trusting It: Confidence-Aware Bidirectional Offline Model-Based Imagination
- Model-Based Opponent Modeling
- Mingling Foresight with Imagination: Model-Based Cooperative Multi-Agent Reinforcement Learning
- MoCoDA: Model-based Counterfactual Data Augmentation
- 2D Navigation, [Sweep](https://github.com/spitis/mrl/blob/master/envs/customfetch/custom_fetch.py#L1699)
- Plan To Predict: Learning an Uncertainty-Foreseeing Model For Model-Based Reinforcement Learning
- Joint Model-Policy Optimization of a Lower Bound for Model-Based RL
- gridworld, [ROBEL manipulation](https://github.com/google-research/robel)
- RAMBO-RL: Robust Adversarial Model-Based Offline Reinforcement Learning
- Conservative Dual Policy Optimization for Efficient Model-Based Reinforcement Learning
- Bayesian Optimistic Optimization: Optimistic Exploration for Model-based Reinforcement Learning
- Model-based RL with Optimistic Posterior Sampling: Structural Conditions and Sample Complexity
- Exponential Family Model-Based Reinforcement Learning via Score Matching
- Deep Hierarchical Planning from Pixels
- Continuous MDP Homomorphisms and Homomorphic Policy Gradient
- CARLA
- mpe, football
- safety gym
- StarCraft II, football, [Multi-Agent Discrete MuJoCo](https://github.com/schroederdewitt/multiagent_mujoco)
ICML 2023
- Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels
- Reparameterized Policy Learning for Multimodal Trajectory Optimization
- Live in the Moment: Learning Dynamics Model Adapted to Evolving Policy
- Predictable MDP Abstraction for Unsupervised Model-Based RL
- Investigating the Role of Model-Based Learning in Exploration and Transfer
- The Virtues of Laziness in Model-based RL: A Unified Objective and Algorithms
- Helicopter, WideTree, Linear Dynamical System, Maze
- The Benefits of Model-Based Generalization in Reinforcement Learning
- ProcMaze, ButtonGrid, PanFlute
- STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning
- DeepSea
- Model-based Reinforcement Learning with Scalable Composite Policy Gradient Estimators
- Reinforcement Learning with History Dependent Dynamic Contexts
- MovieLens dataset
- Model-Bellman Inconsistency for Model-based Offline Reinforcement Learning
- Simplified Temporal Consistency Reinforcement Learning
- Curious Replay for Model-based Adaptation
- On Many-Actions Policy Gradient
- Posterior Sampling for Deep Reinforcement Learning
- Model-based Offline Reinforcement Learning with Count-based Conservatism
- Meta-World
- Crafter
ICLR 2023
- Transformers are Sample-Efficient World Models
- Conservative Bayesian Model-Based Value Expansion for Offline Policy Optimization
- User-Interactive Offline Reinforcement Learning
- CLARE: Conservative Model-Based Reward Learning for Offline Inverse Reinforcement Learning
- Efficient Offline Policy Optimization with a Learned Model
- Efficient Planning in a Compact Latent Action Space
- Is Model Ensemble Necessary? Model-based RL via a Single Model with Lipschitz Regularized Value Function
- MoDem: Accelerating Visual Model-Based Reinforcement Learning with Demonstrations
- Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies with One Objective
- The Benefits of Model-Based Generalization in Reinforcement Learning
- Diminishing Return of Value Expansion Methods in Model-Based Reinforcement Learning
- Planning Goals for Exploration
- Making Better Decision by Directly Planning in Continuous Control
- Latent Variable Representation for Reinforcement Learning
- SpeedyZero: Mastering Atari with Limited Data and Time
- Transformer-based World Models Are Happy With 100k Interactions
- On the Feasibility of Cross-Task Transfer with Model-Based Reinforcement Learning
- Become a Proficient Player with Limited Data through Watching Pure Videos
- EUCLID: Towards Efficient Unsupervised Reinforcement Learning with Multi-choice Dynamics Model
- Choreographer: Learning and Adapting Skills in Imagination
- brax
- URLB benchmark
- adroit, [meta-world](https://github.com/rlworkgroup/metaworld), [deepmind control suite](https://github.com/deepmind/dm_control)
ICLR 2024
- Differentiable Trajectory Optimization as a Policy Class for Reinforcement and Imitation Learning
- DMBP: Diffusion model based predictor for robust offline reinforcement learning against state observation perturbations
- TD-MPC2: Scalable, Robust World Models for Continuous Control
- Robust Model Based Reinforcement Learning Using L1 Adaptive Control
- Mastering Memory Tasks with World Models
- Learning Hierarchical World Models with Adaptive Temporal Abstractions from Discrete Latent Dynamics
- COPlanner: Plan to Roll Out Conservatively but to Explore Optimistically for Model-Based RL
- Efficient Multi-agent Reinforcement Learning by Planning
- MAMBA: an Effective World Model Approach for Meta-Reinforcement Learning
- Point Robot Navigation, Escape Room
- Reward-Consistent Dynamics Models are Strongly Generalizable for Offline Reinforcement Learning
- DreamSmooth: Improving Model-based Reinforcement Learning via Reward Smoothing
- Informed POMDP: Leveraging Additional Information in Model-Based RL
- varying mountain hike
- Privileged Sensing Scaffolds Reinforcement Learning
- bsuite
- gymnasium robotics
- MiniHack
- smac
- robodesk
- Policy Rehearsing: Training Generalizable Policies for Reinforcement Learning
- Efficient Dynamics Modeling in Interactive Environments with Koopman Theory
- Combining Spatial and Temporal Abstraction in Planning for Better Generalization
- Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion
- NuScenes
Classic Model-Based RL Papers
- PILCO: A Model-Based and Data-Efficient Approach to Policy Search
- Learning Complex Neural Network Policies with Trajectory Optimization
- Learning Continuous Control Policies by Stochastic Value Gradients
- Value Prediction Network
- Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion
- Recurrent World Models Facilitate Policy Evolution
- Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models
- When to Trust Your Model: Model-Based Policy Optimization
- Algorithmic Framework for Model-based Deep Reinforcement Learning with Theoretical Guarantees
- Model-Ensemble Trust-Region Policy Optimization
- Dream to Control: Learning Behaviors by Latent Imagination
- Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model
- Exploring Model-based Planning with Policy Networks
- Dyna, an integrated architecture for learning, planning, and reacting
ICLR 2021
- d4rl dataset
- RL Unplugged (RLU)
- Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization
- Control-Aware Representations for Model-based Reinforcement Learning
- Mastering Atari with Discrete World Models
- Model-Based Visual Planning with Self-Supervised Functional Distances
- Model-Based Offline Planning
- Offline Model-Based Optimization via Normalized Maximum Likelihood Estimation
- On the role of planning in model-based deep reinforcement learning
- Representation Balancing Offline Model-based Reinforcement Learning
- Model-based micro-data reinforcement learning: what are the crucial model properties and which model to choose?
ICML 2022
- DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations
- Denoised MDPs: Learning World Models Better Than the World Itself
- Model-based Meta Reinforcement Learning using Graph Structured Surrogate Models and Amortized Policy Search
- Towards Adaptive Model-Based Reinforcement Learning
- Efficient Model-based Multi-agent Reinforcement Learning via Optimistic Equilibrium Computation
- Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning
- Design-Bench: Benchmarks for Data-Driven Offline Model-Based Optimization
- Temporal Difference Learning for Model Predictive Control
- Design-Bench Benchmark Tasks
- GridWorldLoCA, ReacherLoCA, MountaincarLoCA
- SMART
ICLR 2022
- Revisiting Design Choices in Offline Model Based Reinforcement Learning
- Value Gradient weighted Model-Based Reinforcement Learning
- Planning in Stochastic Environments with a Learned Model
- Policy improvement by planning with Gumbel
- Model-Based Offline Meta-Reinforcement Learning with Regularization
- On-Policy Model Errors in Reinforcement Learning
- A Relational Intervention Approach for Unsupervised Dynamics Generalization in Model-Based Reinforcement Learning
- Information Prioritization through Empowerment in Visual Model-based RL
- Transfer RL across Observation Feature Spaces via Model-Based Regularization
- Learning State Representations via Retracing in Reinforcement Learning
- Model-augmented Prioritized Experience Replay
- Evaluating Model-Based Planning and Planner Amortization for Continuous Control
- Gradient Information Matters in Policy Optimization by Back-propagating through Model
- Pareto Policy Pool for Model-based Offline Reinforcement Learning
- Pessimistic Model-based Offline Reinforcement Learning under Partial Coverage
- Know Thyself: Transferable Visual Control Policies Through Robot-Awareness
- pybullet
ICML 2021
- sparse metaworld tasks
- Conservative Objective Models for Effective Offline Model-Based Optimization
- Continuous-Time Model-Based Reinforcement Learning
- Model-Based Reinforcement Learning via Latent-Space Collocation
- Model-Free and Model-Based Policy Evaluation when Causality is Uncertain
- Muesli: Combining Improvements in Policy Optimization
- Vector Quantized Models for Planning
- chess datasets
- PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration
- Temporal Predictive Coding For Model-Based Planning In Latent Space
- Model-based Reinforcement Learning for Continuous Control with Posterior Sampling
- A Sharp Analysis of Model-based Reinforcement Learning with Self-Play
- ope-tools
- pendulum, cartPole and acrobot
- design-bench
Other
- Mastering Diverse Domains through World Models
- Sample-Efficient Learning to Solve a Real-World Labyrinth Game Using Data-Augmented Model-Based Reinforcement Learning
- Theoretically Guaranteed Policy Improvement Distilled from Model-Based Planning
- World Models via Policy-Guided Trajectory Diffusion
- Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization
- Masked Trajectory Models for Prediction, Representation, and Control
ICML 2024
- rlbench
- HarmonyDream: Task Harmonization Inside World Models
- 3D-VLA: A 3D Vision-Language-Action Generative World Model
- CompeteAI: Understanding the Competition Behaviors in Large Language Model-based Agents
- Model-based Reinforcement Learning for Parameterized Action Spaces
- platform, goal, hard goal, catch point, hard move
- Learning Latent Dynamic Robust Representations for World Models
- AD3: Implicit Action is the Key for World Models to Distinguish the Diverse Visual Distractors
- Hieros: Hierarchical Imagination on Structured State Space Sequence World Models
- Improving Token-Based World Models with Parallel Observation Prediction
- Do Transformer World Models Give Better Policy Gradients?
- Dr. Strategy: Model-Based Generalist Agents with Strategic Dreaming
- Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption
- Model-based Reinforcement Learning for Confounded POMDPs
NeurIPS 2024
- d4rl
- Deterministic Uncertainty Propagation for Improved Model-Based Offline Reinforcement Learning
- WorldCoder, a Model-Based LLM Agent: Building World Models by Writing Code and Interacting with the Environment
- sokoban, Minigrid, [alfworld](https://github.com/alfworld/alfworld)
- The Edge-of-Reach Problem in Offline Model-Based Reinforcement Learning
- BECAUSE: Bilinear Causal Representation for Generalizable Offline Model-based Reinforcement Learning
- list, Minigrid, [crash](https://github.com/Farama-Foundation/HighwayEnv)
- Model-Based Transfer Learning for Contextual Reinforcement Learning
- Rethinking Model-based, Policy-based, and Value-based Reinforcement Learning via the Lens of Representation Complexity
A Taxonomy of Model-Based RL Algorithms
Tutorial
Codebase
Keywords
- reinforcement-learning (10)
- mujoco (4)
- python (4)
- robotics (3)
- simulation (3)
- model-based-reinforcement-learning (2)
- environment (2)
- machine-learning (2)
- starcraft-ii (2)
- autonomous-driving (1)
- minecraft (1)
- deep-learning (1)
- artificial-intelligence (1)
- pybullet (1)
- openai-gym (1)
- multiagent-systems (1)
- benchmark (1)
- gymnasium (1)
- d4rl (1)
- paper (1)
- multi-task (1)
- meta-rl (1)
- benchmark-environments (1)
- physics-simulation (1)
- sokoban (1)
- openai (1)
- gym (1)
- starcraft-ii-replays (1)
- deepmind (1)
- blizzard-api (1)
- smac (1)
- self-play (1)
- reinforcement-learning-algorithms (1)
- r2d2 (1)
- pytorch-rl (1)
- offline-rl (1)
- multiagent-reinforcement-learning (1)
- minigrid (1)
- inverse-reinforcement-learning (1)
- impala (1)
- imitation-learning (1)
- exploration-exploitation (1)
- drl (1)
- distributed-system (1)
- distributed-reinforcement-learning (1)
- atari (1)
- simulator (1)
- jax (1)