https://github.com/knightnemo/Awesome-World-Models

A Curated List of Awesome Works in World Modeling, Aiming to Serve as a One-stop Resource for Researchers, Practitioners, and Enthusiasts Interested in World Modeling.

# 🌍 Awesome World Models

[![Awesome](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/sindresorhus/awesome) [![GitHub stars](https://img.shields.io/github/stars/knightnemo/Awesome-World-Models?style=social)](https://github.com/knightnemo/Awesome-World-Models/stargazers) [![License](https://img.shields.io/badge/License-CC0_1.0-blue.svg)](LICENSE.txt) [![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](CONTRIBUTING.md)

**📜 A Curated List of Amazing Works in World Modeling, spanning applications in Embodied AI, Autonomous Driving, Natural Language Processing and Agents.**
*Based on [Awesome-World-Model-for-Autonomous-Driving](https://github.com/LMD0311/Awesome-World-Model) and [Awesome-World-Model-for-Robotics](https://github.com/leofan90/Awesome-World-Models)*.



*Photo Credit: [Gemini-Nano-Banana🍌](https://aistudio.google.com/models/gemini-2-5-flash-image)*.

---

## 🚩 News & Updates
_Major updates and announcements are shown below. Scroll for full timeline._

🗺️ **[2025-10] Enhanced Visual Navigation** — Introduced badge system for papers! All entries now display [![arXiv](https://img.shields.io/badge/arXiv-Paper-b31b1b.svg)](#) [![Website](https://img.shields.io/badge/Website-Link-blue)](#) [![Code](https://img.shields.io/badge/Code-GitHub-green)](#) for quick access to resources.

🔥 **[2025-10] Repository Launch** — Awesome World Models is now live! We're building a comprehensive collection spanning Embodied AI, Autonomous Driving, NLP, and more. See [CONTRIBUTING.md](CONTRIBUTING.md) for how to contribute.

💡 **[Ongoing] Community Contributions Welcome** — Help us maintain the most up-to-date world models resource! Submit papers via PR or contact us at [email](mailto:siqiaohuang981@gmail.com).

⭐ **[Ongoing] Support This Project** — If you find this useful, please [cite](#citation) our work and give us a star. Share with your research community!

---
## Overview

- 🎯 [Aim of the project](#aim-of-the-project)
- 📚 [Definition of World Models](#definition-of-world-models)
- 📖 [Surveys of World Models](#surveys-of-world-models)
- 🎮 [World Models for Game Simulation](#world-models-for-game-simulation)
- 🚗 [World Models for Autonomous Driving](#world-models-for-autonomous-driving)
- 🤖 [World Models for Embodied AI](#world-models-for-embodied-ai)
- 🔬 [World Models for Science](#world-models-for-science)
- 💭 [Positions on World Models](#positions-on-world-models)
- 📐 [Theory & World Models Explainability](#theory--world-models-explainability)
- 🛠️ [General Approaches to World Models](#general-approaches-to-world-models)
- 📊 [Evaluating World Models](#evaluating-world-models)
- 🙏 [Acknowledgements](#acknowledgements)
- 📝 [Citation](#citation)

---

## Aim of the Project

World Models have become a hot topic in both research and industry, attracting unprecedented attention from the AI community and beyond. However, due to the **interdisciplinary nature** of the field (_and because the term "world model" simply sounds amazing_), the concept has been used with varying definitions across different domains.



This repository aims to:

- 🔍 **Organize** the rapidly growing body of world model research across multiple application domains
- 🗺️ **Provide** a minimalist map of how world models are utilized in different fields (Embodied AI, Autonomous Driving, NLP, etc.)
- 🤝 **Bridge** the gap between different communities working on world models with varying perspectives
- 📚 **Serve** as a one-stop resource for researchers, practitioners, and enthusiasts interested in world modeling
- 🚀 **Track** the latest developments and breakthroughs in this exciting field

Whether you're a researcher looking for related work, a practitioner seeking implementation references, or simply curious about world models, we hope this curated list helps you navigate the landscape!

---

## Definition of World Models
Although the scope of the term "world model" has broadened repeatedly, it is widely accepted that the concept originates from these two works:
* [⭐️] **World Models**, "World Models". [![arXiv](https://img.shields.io/badge/arXiv-1803.10122-b31b1b.svg)](https://arxiv.org/abs/1803.10122) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://worldmodels.github.io/)
* [⭐️] **Yann LeCun's Position Paper**, "A Path Towards Autonomous Machine Intelligence". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/pdf?id=BZ5a1r-kVsf)

Other notable blog posts and commentary on world models include:
- [⭐️] **Towards Video World Models**, "Towards Video World Models". [![Blog](https://img.shields.io/badge/Blog-Link-orange)](https://www.xunhuang.me/blogs/world_model.html)
- **Status of World Models in 2025**, "Beyond the Hype: How I See World Models Evolving in 2025". [![Blog](https://img.shields.io/badge/Blog-Link-orange)](https://knightnemo.github.io/blog/posts/wm_2025/)
- [⭐️] **Jim Fan's tweet**. [![Blog](https://img.shields.io/badge/Blog-Link-orange)](https://x.com/DrJimFan/status/1709947595525951787)

---
## Surveys of World Models

### 1. World Models and Video Generation:
- [⭐️] **Is Sora a World Simulator**, "Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond". [![arXiv](https://img.shields.io/badge/arXiv-2405.03520-b31b1b.svg)](https://arxiv.org/abs/2405.03520) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://github.com/GigaAI-research/General-World-Models-Survey)
- **Physics Cognition in Video Generation**, "Exploring the Evolution of Physics Cognition in Video Generation: A Survey". [![arXiv](https://img.shields.io/badge/arXiv-2503.21765-b31b1b.svg)](https://arxiv.org/abs/2503.21765) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://github.com/minnie-lin/Awesome-Physics-Cognition-based-Video-Generation)

### 2. World Models and 3D Generation:
- [⭐️] **3D and 4D World Modeling: A Survey**, "3D and 4D World Modeling: A Survey". [![arXiv](https://img.shields.io/badge/arXiv-2509.07996-b31b1b.svg)](https://arxiv.org/abs/2509.07996)
- [⭐️] **Understanding World or Predicting Future?**, "Understanding World or Predicting Future? A Comprehensive Survey of World Models". [![arXiv](https://img.shields.io/badge/arXiv-2411.14499-b31b1b.svg)](https://arxiv.org/abs/2411.14499)
- **From 2D to 3D Cognition**, "From 2D to 3D Cognition: A Brief Survey of General World Models". [![arXiv](https://img.shields.io/badge/arXiv-2506.20134-b31b1b.svg)](https://arxiv.org/abs/2506.20134)

### 3. World Models and Embodied Artificial Intelligence:
- [⭐️] **World Models for Embodied AI**, "A Comprehensive Survey on World Models for Embodied AI". [![arXiv](https://img.shields.io/badge/arXiv-2510.16732-b31b1b.svg)](https://arxiv.org/abs/2510.16732) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://github.com/Li-Zn-H/AwesomeWorldModels)
- **World Models and Physical Simulation**, "A Survey: Learning Embodied Intelligence from Physical Simulators and World Models". [![arXiv](https://img.shields.io/badge/arXiv-2507.00917-b31b1b.svg)](https://arxiv.org/abs/2507.00917) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://github.com/NJU3DV-LoongGroup/Embodied-World-Models-Survey)
- **Embodied AI Agents: Modeling the World**, "Embodied AI Agents: Modeling the World". [![arXiv](https://img.shields.io/badge/arXiv-2506.22355-b31b1b.svg)](https://arxiv.org/abs/2506.22355)
- **Aligning Cyber Space with Physical World**, "Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI". [![arXiv](https://img.shields.io/badge/arXiv-2407.06886-b31b1b.svg)](https://arxiv.org/abs/2407.06886) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://github.com/HCPLab-SYSU/Embodied_AI_Paper_List)

### 4. World Models for Autonomous Driving:
- [⭐️] **A Survey of World Models for Autonomous Driving**, "A Survey of World Models for Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2501.11260-b31b1b.svg)](https://arxiv.org/abs/2501.11260)
- **World Models for Autonomous Driving: An Initial Survey**, "World Models for Autonomous Driving: An Initial Survey". [![arXiv](https://img.shields.io/badge/arXiv-2403.02622-b31b1b.svg)](https://arxiv.org/abs/2403.02622)
- **Interplay Between Video Generation and World Models in Autonomous Driving**, "Exploring the Interplay Between Video Generation and World Models in Autonomous Driving: A Survey". [![arXiv](https://img.shields.io/badge/arXiv-2411.02914-b31b1b.svg)](https://arxiv.org/abs/2411.02914)

### 5. Other Good Surveys:
- **From Masks to Worlds**, "From Masks to Worlds: A Hitchhiker's Guide to World Models". [![arXiv](https://img.shields.io/badge/arXiv-2510.20668-b31b1b.svg)](https://arxiv.org/abs/2510.20668) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://github.com/M-E-AGI-Lab/Awesome-World-Models)
- **The Safety Challenge of World Models**, "The Safety Challenge of World Models for Embodied AI Agents: A Review". [![arXiv](https://img.shields.io/badge/arXiv-2510.05865-b31b1b.svg)](https://arxiv.org/abs/2510.05865)
- **World Models in AI: Like a Child**, "World Models in Artificial Intelligence: Sensing, Learning, and Reasoning Like a Child". [![arXiv](https://img.shields.io/badge/arXiv-2503.15168-b31b1b.svg)](https://arxiv.org/abs/2503.15168)
- **World Model Safety**, "World Models: The Safety Perspective". [![arXiv](https://img.shields.io/badge/arXiv-2411.07690-b31b1b.svg)](https://arxiv.org/abs/2411.07690)
- **Model-based reinforcement learning**, "A survey on model-based reinforcement learning". [![Website](https://img.shields.io/badge/Website-Link-blue)](https://link.springer.com/article/10.1007/s11432-022-3696-5)

---

## World Models for Game Simulation
Pixel Space:
- [⭐️] **GameNGen**, "Diffusion Models Are Real-Time Game Engines". [![arXiv](https://img.shields.io/badge/arXiv-2408.14837-b31b1b.svg)](https://arxiv.org/abs/2408.14837)
- [⭐️] **DIAMOND**, "Diffusion for World Modeling: Visual Details Matter in Atari". [![arXiv](https://img.shields.io/badge/arXiv-2405.12399-b31b1b.svg)](https://arxiv.org/abs/2405.12399) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/eloialonso/diamond)
- **MineWorld**, "MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft". [![arXiv](https://img.shields.io/badge/arXiv-2504.07257-b31b1b.svg)](https://arxiv.org/abs/2504.07257) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://aka.ms/mineworld)
- **Oasis**, "Oasis: A Universe in a Transformer". [![Website](https://img.shields.io/badge/Website-Link-blue)](https://oasis-model.github.io/)
- **AnimeGamer**, "AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction". [![arXiv](https://img.shields.io/badge/arXiv-2504.01014-b31b1b.svg)](http://arxiv.org/abs/2504.01014) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://howe125.github.io/AnimeGamer.github.io/)
- [⭐️] **Matrix-Game**, "Matrix-Game: Interactive World Foundation Model". [![arXiv](https://img.shields.io/badge/arXiv-2506.18701-b31b1b.svg)](https://arxiv.org/abs/2506.18701) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/SkyworkAI/Matrix-Game)
- [⭐️] **Matrix-Game 2.0**, "Matrix-Game 2.0: An Open-Source, Real-Time, and Streaming Interactive World Model". [![arXiv](https://img.shields.io/badge/arXiv-2508.13009-b31b1b.svg)](https://arxiv.org/abs/2508.13009) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://matrix-game-v2.github.io/)
- **RealPlay**, "From Virtual Games to Real-World Play". [![arXiv](https://img.shields.io/badge/arXiv-2506.18901-b31b1b.svg)](https://arxiv.org/abs/2506.18901) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://wenqsun.github.io/RealPlay/) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/wenqsun/Real-Play)
- **GameFactory**, "GameFactory: Creating New Games with Generative Interactive Videos". [![arXiv](https://img.shields.io/badge/arXiv-2501.08325-b31b1b.svg)](http://arxiv.org/abs/2501.08325) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://yujiwen.github.io/gamefactory/) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/KwaiVGI/GameFactory)
- **WORLDMEM**, "Worldmem: Long-term Consistent World Simulation with Memory". [![arXiv](https://img.shields.io/badge/arXiv-2504.12369-b31b1b.svg)](http://arxiv.org/abs/2504.12369) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://xizaoqu.github.io/worldmem/) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/xizaoqu/WorldMem)

3D Mesh Space:
- [⭐️] **HunyuanWorld 1.0**, "HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels". [![arXiv](https://img.shields.io/badge/arXiv-2507.21809-b31b1b.svg)](https://arxiv.org/abs/2507.21809) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://3d-models.hunyuan.tencent.com/world/) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/Tencent-Hunyuan/HunyuanWorld-1.0)
- [⭐️] **Matrix-3D**, "Matrix-3D: Omnidirectional Explorable 3D World Generation". [![arXiv](https://img.shields.io/badge/arXiv-2508.08086-b31b1b.svg)](https://arxiv.org/abs/2508.08086) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://matrix-3d.github.io)

---
## World Models for Autonomous Driving
_Refer to https://github.com/LMD0311/Awesome-World-Model for full list._

> [!NOTE]
> 📢 [Call for Maintenance] The repo creator is not an expert in autonomous driving, so this list is brief and unclassified. We welcome community help in making this section cleaner and better organized.

- [⭐️] **Cosmos-Drive-Dreams**, "Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models". [![arXiv](https://img.shields.io/badge/arXiv-2506.09042-b31b1b.svg)](https://arxiv.org/abs/2506.09042) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://research.nvidia.com/labs/toronto-ai/cosmos_drive_dreams)
- [⭐️] **GAIA-2**, "GAIA-2: A Controllable Multi-View Generative World Model for Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2503.20523-b31b1b.svg)](https://arxiv.org/abs/2503.20523) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://wayve.ai/thinking/gaia-2)
- **Copilot4D**, "Copilot4D: Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion". [![arXiv](https://img.shields.io/badge/arXiv-2311.01017-b31b1b.svg)](https://arxiv.org/abs/2311.01017)
- **OmniNWM**, "OmniNWM: Omniscient Driving Navigation World Models". [![arXiv](https://img.shields.io/badge/arXiv-2510.18313-b31b1b.svg)](https://arxiv.org/abs/2510.18313) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://arlo0o.github.io/OmniNWM/)
- **GAIA-1**, "Introducing GAIA-1: A Cutting-Edge Generative AI Model for Autonomy". [![arXiv](https://img.shields.io/badge/arXiv-2309.17080-b31b1b.svg)](https://arxiv.org/abs/2309.17080) [![Blog](https://img.shields.io/badge/Blog-Link-orange)](https://wayve.ai/thinking/introducing-gaia1/)
* **PWM**, "From Forecasting to Planning: Policy World Model for Collaborative State-Action Prediction". [![arXiv](https://img.shields.io/badge/arXiv-2510.19654-b31b1b.svg)](https://arxiv.org/abs/2510.19654) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/6550Zhao/Policy-World-Model)
* **Dream4Drive**, "Rethinking Driving World Model as Synthetic Data Generator for Perception Tasks". [![arXiv](https://img.shields.io/badge/arXiv-2510.19195-b31b1b.svg)](https://arxiv.org/abs/2510.19195) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://wm-research.github.io/Dream4Drive/)
* **SparseWorld**, "SparseWorld: A Flexible, Adaptive, and Efficient 4D Occupancy World Model Powered by Sparse and Dynamic Queries". [![arXiv](https://img.shields.io/badge/arXiv-2510.17482-b31b1b.svg)](https://arxiv.org/abs/2510.17482) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/MSunDYY/SparseWorld)
* **DriveVLA-W0**: "DriveVLA-W0: World Models Amplify Data Scaling Law in Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2510.12796-b31b1b.svg)](https://arxiv.org/abs/2510.12796) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/BraveGroup/DriveVLA-W0)
* "Enhancing Physical Consistency in Lightweight World Models". [![arXiv](https://img.shields.io/badge/arXiv-2509.12437-b31b1b.svg)](https://arxiv.org/abs/2509.12437)
* **IRL-VLA**: "IRL-VLA: Training an Vision-Language-Action Policy via Reward World Model". [![arXiv](https://img.shields.io/badge/arXiv-2508.06571-b31b1b.svg)](https://arxiv.org/abs/2508.06571)
* **LiDARCrafter**: "LiDARCrafter: Dynamic 4D World Modeling from LiDAR Sequences". [![arXiv](https://img.shields.io/badge/arXiv-2508.03692-b31b1b.svg)](https://arxiv.org/abs/2508.03692) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://lidarcrafter.github.io) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/lidarcrafter/toolkit)
* **FASTopoWM**: "FASTopoWM: Fast-Slow Lane Segment Topology Reasoning with Latent World Models". [![arXiv](https://img.shields.io/badge/arXiv-2507.23325-b31b1b.svg)](https://arxiv.org/abs/2507.23325) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/YimingYang23/FASTopoWM)
* **Orbis**: "Orbis: Overcoming Challenges of Long-Horizon Prediction in Driving World Models". [![arXiv](https://img.shields.io/badge/arXiv-2507.13162-b31b1b.svg)](https://arxiv.org/abs/2507.13162) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://lmb-freiburg.github.io/orbis.github.io/)
* "World Model-Based End-to-End Scene Generation for Accident Anticipation in Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2507.12762-b31b1b.svg)](https://arxiv.org/abs/2507.12762)
* **NRSeg**: "NRSeg: Noise-Resilient Learning for BEV Semantic Segmentation via Driving World Models". [![arXiv](https://img.shields.io/badge/arXiv-2507.04002-b31b1b.svg)](https://arxiv.org/abs/2507.04002) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/lynn-yu/NRSeg)
* **World4Drive**: "World4Drive: End-to-End Autonomous Driving via Intention-aware Physical Latent World Model". [![arXiv](https://img.shields.io/badge/arXiv-2507.00603-b31b1b.svg)](https://arxiv.org/abs/2507.00603) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/ucaszyp/World4Drive)
* **Epona**: "Epona: Autoregressive Diffusion World Model for Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2506.24113-b31b1b.svg)](https://arxiv.org/abs/2506.24113) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://kevin-thu.github.io/Epona/)
* "Towards foundational LiDAR world models with efficient latent flow matching". [![arXiv](https://img.shields.io/badge/arXiv-2506.23434-b31b1b.svg)](https://arxiv.org/abs/2506.23434)
* **SceneDiffuser++**: "SceneDiffuser++: City-Scale Traffic Simulation via a Generative World Model". [![arXiv](https://img.shields.io/badge/arXiv-2506.21976-b31b1b.svg)](https://arxiv.org/abs/2506.21976)
* **COME**: "COME: Adding Scene-Centric Forecasting Control to Occupancy World Model". [![arXiv](https://img.shields.io/badge/arXiv-2506.13260-b31b1b.svg)](https://arxiv.org/abs/2506.13260) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/synsin0/COME)
* **STAGE**: "STAGE: A Stream-Centric Generative World Model for Long-Horizon Driving-Scene Simulation". [![arXiv](https://img.shields.io/badge/arXiv-2506.13138-b31b1b.svg)](https://arxiv.org/abs/2506.13138)
* **ReSim**: "ReSim: Reliable World Simulation for Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2506.09981-b31b1b.svg)](https://arxiv.org/abs/2506.09981) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/OpenDriveLab/ReSim) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://opendrivelab.com/ReSim)
* "Ego-centric Learning of Communicative World Models for Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2506.08149-b31b1b.svg)](https://arxiv.org/abs/2506.08149)
* **Dreamland**: "Dreamland: Controllable World Creation with Simulator and Generative Models". [![arXiv](https://img.shields.io/badge/arXiv-2506.08006-b31b1b.svg)](https://arxiv.org/abs/2506.08006) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://metadriverse.github.io/dreamland/)
* **LongDWM**: "LongDWM: Cross-Granularity Distillation for Building a Long-Term Driving World Model". [![arXiv](https://img.shields.io/badge/arXiv-2506.01546-b31b1b.svg)](https://arxiv.org/abs/2506.01546) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://wang-xiaodong1899.github.io/longdwm/)
* **GeoDrive**: "GeoDrive: 3D Geometry-Informed Driving World Model with Precise Action Control". [![arXiv](https://img.shields.io/badge/arXiv-2505.22421-b31b1b.svg)](https://arxiv.org/abs/2505.22421) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/antonioo-c/GeoDrive)
* **FutureSightDrive**: "FutureSightDrive: Thinking Visually with Spatio-Temporal CoT for Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2505.17685-b31b1b.svg)](https://arxiv.org/abs/2505.17685) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/MIV-XJTU/FSDrive)
* **Raw2Drive**: "Raw2Drive: Reinforcement Learning with Aligned World Models for End-to-End Autonomous Driving (in CARLA v2)". [![arXiv](https://img.shields.io/badge/arXiv-2505.16394-b31b1b.svg)](https://arxiv.org/abs/2505.16394)
* **VL-SAFE**: "VL-SAFE: Vision-Language Guided Safety-Aware Reinforcement Learning with World Models for Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2505.16377-b31b1b.svg)](https://arxiv.org/abs/2505.16377) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://ys-qu.github.io/vlsafe-website/)
* **PosePilot**: "PosePilot: Steering Camera Pose for Generative World Models with Self-supervised Depth". [![arXiv](https://img.shields.io/badge/arXiv-2505.01729-b31b1b.svg)](https://arxiv.org/abs/2505.01729)
* "World Model-Based Learning for Long-Term Age of Information Minimization in Vehicular Networks". [![arXiv](https://img.shields.io/badge/arXiv-2505.01712-b31b1b.svg)](https://arxiv.org/abs/2505.01712)
* "Learning to Drive from a World Model". [![arXiv](https://img.shields.io/badge/arXiv-2504.19077-b31b1b.svg)](https://arxiv.org/abs/2504.19077)
* **DriVerse**: "DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompting and Motion Alignment". [![arXiv](https://img.shields.io/badge/arXiv-2504.18576-b31b1b.svg)](https://arxiv.org/abs/2504.18576)
* "End-to-End Driving with Online Trajectory Evaluation via BEV World Model". [![arXiv](https://img.shields.io/badge/arXiv-2504.01941-b31b1b.svg)](https://arxiv.org/abs/2504.01941) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/liyingyanUCAS/WoTE)
* "Knowledge Graphs as World Models for Semantic Material-Aware Obstacle Handling in Autonomous Vehicles". [![arXiv](https://img.shields.io/badge/arXiv-2503.21232-b31b1b.svg)](https://arxiv.org/abs/2503.21232)
* **MiLA**: "MiLA: Multi-view Intensive-fidelity Long-term Video Generation World Model for Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2503.15875-b31b1b.svg)](https://arxiv.org/abs/2503.15875) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://github.com/xiaomi-mlab/mila.github.io)
* **SimWorld**: "SimWorld: A Unified Benchmark for Simulator-Conditioned Scene Generation via World Model". [![arXiv](https://img.shields.io/badge/arXiv-2503.13952-b31b1b.svg)](https://arxiv.org/abs/2503.13952) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://github.com/Li-Zn-H/SimWorld)
* **UniFuture**: "Seeing the Future, Perceiving the Future: A Unified Driving World Model for Future Generation and Perception". [![arXiv](https://img.shields.io/badge/arXiv-2503.13587-b31b1b.svg)](https://arxiv.org/abs/2503.13587) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://github.com/dk-liang/UniFuture)
* **EOT-WM**: "Other Vehicle Trajectories Are Also Needed: A Driving World Model Unifies Ego-Other Vehicle Trajectories in Video Latent Space". [![arXiv](https://img.shields.io/badge/arXiv-2503.09215-b31b1b.svg)](https://arxiv.org/abs/2503.09215)
* "Temporal Triplane Transformers as Occupancy World Models". [![arXiv](https://img.shields.io/badge/arXiv-2503.07338-b31b1b.svg)](https://arxiv.org/abs/2503.07338)
* **InDRiVE**: "InDRiVE: Intrinsic Disagreement based Reinforcement for Vehicle Exploration through Curiosity Driven Generalized World Model". [![arXiv](https://img.shields.io/badge/arXiv-2503.05573-b31b1b.svg)](https://arxiv.org/abs/2503.05573)
* **MaskGWM**: "MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction". [![arXiv](https://img.shields.io/badge/arXiv-2502.11663-b31b1b.svg)](https://arxiv.org/abs/2502.11663)
* **Dream to Drive**: "Dream to Drive: Model-Based Vehicle Control Using Analytic World Models". [![arXiv](https://img.shields.io/badge/arXiv-2502.10012-b31b1b.svg)](https://arxiv.org/abs/2502.10012)
* "Semi-Supervised Vision-Centric 3D Occupancy World Model for Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2502.07309-b31b1b.svg)](https://arxiv.org/abs/2502.07309)
* "Dream to Drive with Predictive Individual World Model". [![arXiv](https://img.shields.io/badge/arXiv-2501.16733-b31b1b.svg)](https://arxiv.org/abs/2501.16733) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/gaoyinfeng/PIWM)
* **HERMES**: "HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation". [![arXiv](https://img.shields.io/badge/arXiv-2501.14729-b31b1b.svg)](https://arxiv.org/abs/2501.14729)
* **AdaWM**: "AdaWM: Adaptive World Model based Planning for Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2501.13072-b31b1b.svg)](https://arxiv.org/abs/2501.13072)
* **AD-L-JEPA**: "AD-L-JEPA: Self-Supervised Spatial World Models with Joint Embedding Predictive Architecture for Autonomous Driving with LiDAR Data". [![arXiv](https://img.shields.io/badge/arXiv-2501.04969-b31b1b.svg)](https://arxiv.org/abs/2501.04969)
* **DrivingWorld**: "DrivingWorld: Constructing World Model for Autonomous Driving via Video GPT". [![arXiv](https://img.shields.io/badge/arXiv-2412.19505-b31b1b.svg)](https://arxiv.org/abs/2412.19505) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/YvanYin/DrivingWorld) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://huxiaotaostasy.github.io/DrivingWorld/index.html)
* **DrivingGPT**: "DrivingGPT: Unifying Driving World Modeling and Planning with Multi-modal Autoregressive Transformers". [![arXiv](https://img.shields.io/badge/arXiv-2412.18607-b31b1b.svg)](https://arxiv.org/abs/2412.18607) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://rogerchern.github.io/DrivingGPT/)
* "An Efficient Occupancy World Model via Decoupled Dynamic Flow and Image-assisted Training". [![arXiv](https://img.shields.io/badge/arXiv-2412.13772-b31b1b.svg)](https://arxiv.org/abs/2412.13772)
* **GEM**: "GEM: A Generalizable Ego-Vision Multimodal World Model for Fine-Grained Ego-Motion, Object Dynamics, and Scene Composition Control". [![arXiv](https://img.shields.io/badge/arXiv-2412.11198-b31b1b.svg)](https://arxiv.org/abs/2412.11198) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://vita-epfl.github.io/GEM.github.io/)
* **GaussianWorld**: "GaussianWorld: Gaussian World Model for Streaming 3D Occupancy Prediction". [![arXiv](https://img.shields.io/badge/arXiv-2412.04380-b31b1b.svg)](https://arxiv.org/abs/2412.04380) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/zuosc19/GaussianWorld)
* **Doe-1**: "Doe-1: Closed-Loop Autonomous Driving with Large World Model". [![arXiv](https://img.shields.io/badge/arXiv-2412.09627-b31b1b.svg)](https://arxiv.org/abs/2412.09627) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://wzzheng.net/Doe/) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/wzzheng/Doe)
* "Physical Informed Driving World Model". [![arXiv](https://img.shields.io/badge/arXiv-2412.08410-b31b1b.svg)](https://arxiv.org/abs/2412.08410) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://metadrivescape.github.io/papers_project/DrivePhysica/page.html)
* **InfiniCube**: "InfiniCube: Unbounded and Controllable Dynamic 3D Driving Scene Generation with World-Guided Video Models". [![arXiv](https://img.shields.io/badge/arXiv-2412.03934-b31b1b.svg)](https://arxiv.org/abs/2412.03934) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://research.nvidia.com/labs/toronto-ai/infinicube/)
* **InfinityDrive**: "InfinityDrive: Breaking Time Limits in Driving World Models". [![arXiv](https://img.shields.io/badge/arXiv-2412.01522-b31b1b.svg)](https://arxiv.org/abs/2412.01522) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://metadrivescape.github.io/papers_project/InfinityDrive/page.html)
* **ReconDreamer**: "ReconDreamer: Crafting World Models for Driving Scene Reconstruction via Online Restoration". [![arXiv](https://img.shields.io/badge/arXiv-2411.19548-b31b1b.svg)](https://arxiv.org/abs/2411.19548) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://recondreamer.github.io/)
* **Imagine-2-Drive**: "Imagine-2-Drive: High-Fidelity World Modeling in CARLA for Autonomous Vehicles". [![arXiv](https://img.shields.io/badge/arXiv-2411.10171-b31b1b.svg)](https://arxiv.org/abs/2411.10171) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://anantagrg.github.io/Imagine-2-Drive.github.io/)
* **DynamicCity**: "DynamicCity: Large-Scale 4D Occupancy Generation from Dynamic Scenes". [![arXiv](https://img.shields.io/badge/arXiv-2410.18084-b31b1b.svg)](https://arxiv.org/abs/2410.18084) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://dynamic-city.github.io) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/3DTopia/DynamicCity)
* **DriveDreamer4D**: "World Models Are Effective Data Machines for 4D Driving Scene Representation". [![arXiv](https://img.shields.io/badge/arXiv-2410.13571-b31b1b.svg)](https://arxiv.org/abs/2410.13571) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://drivedreamer4d.github.io/)
* **DOME**: "Taming Diffusion Model into High-Fidelity Controllable Occupancy World Model". [![arXiv](https://img.shields.io/badge/arXiv-2410.10429-b31b1b.svg)](https://arxiv.org/abs/2410.10429) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://gusongen.github.io/DOME)
* **SSR**: "Does End-to-End Autonomous Driving Really Need Perception Tasks?". [![arXiv](https://img.shields.io/badge/arXiv-2409.18341-b31b1b.svg)](https://arxiv.org/abs/2409.18341) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/PeidongLi/SSR)
* "Mitigating Covariate Shift in Imitation Learning for Autonomous Vehicles Using Latent Space Generative World Models". [![arXiv](https://img.shields.io/badge/arXiv-2409.16663-b31b1b.svg)](https://arxiv.org/abs/2409.16663)
* **LatentDriver**: "Learning Multiple Probabilistic Decisions from Latent World Model in Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2409.15730-b31b1b.svg)](https://arxiv.org/abs/2409.15730) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/Sephirex-X/LatentDriver)
* **RenderWorld**: "World Model with Self-Supervised 3D Label". [![arXiv](https://img.shields.io/badge/arXiv-2409.11356-b31b1b.svg)](https://arxiv.org/abs/2409.11356)
* **OccLLaMA**: "An Occupancy-Language-Action Generative World Model for Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2409.03272-b31b1b.svg)](https://arxiv.org/abs/2409.03272)
* **DriveGenVLM**: "Real-world Video Generation for Vision Language Model based Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2408.16647-b31b1b.svg)](https://arxiv.org/abs/2408.16647)
* **Drive-OccWorld**: "Driving in the Occupancy World: Vision-Centric 4D Occupancy Forecasting and Planning via World Models for Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2408.14197-b31b1b.svg)](https://arxiv.org/abs/2408.14197)
* **CarFormer**: "Self-Driving with Learned Object-Centric Representations". [![arXiv](https://img.shields.io/badge/arXiv-2407.15843-b31b1b.svg)](https://arxiv.org/abs/2407.15843) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://kuis-ai.github.io/CarFormer/)
* **BEVWorld**: "A Multimodal World Model for Autonomous Driving via Unified BEV Latent Space". [![arXiv](https://img.shields.io/badge/arXiv-2407.05679-b31b1b.svg)](https://arxiv.org/abs/2407.05679) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/zympsyche/BevWorld)
* **TOKEN**: "Tokenize the World into Object-level Knowledge to Address Long-tail Events in Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2407.00959-b31b1b.svg)](https://arxiv.org/abs/2407.00959)
* **UMAD**: "Unsupervised Mask-Level Anomaly Detection for Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2406.06370-b31b1b.svg)](https://arxiv.org/abs/2406.06370)
* **SimGen**: "Simulator-conditioned Driving Scene Generation". [![arXiv](https://img.shields.io/badge/arXiv-2406.09386-b31b1b.svg)](https://arxiv.org/abs/2406.09386) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://metadriverse.github.io/simgen/)
* **AdaptiveDriver**: "Planning with Adaptive World Models for Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2406.10714-b31b1b.svg)](https://arxiv.org/abs/2406.10714) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://arunbalajeev.github.io/world_models_planning/world_model_paper.html)
* **UnO**: "Unsupervised Occupancy Fields for Perception and Forecasting". [![arXiv](https://img.shields.io/badge/arXiv-2406.08691-b31b1b.svg)](https://arxiv.org/abs/2406.08691) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://waabi.ai/research/uno)
* **LAW**: "Enhancing End-to-End Autonomous Driving with Latent World Model". [![arXiv](https://img.shields.io/badge/arXiv-2406.08481-b31b1b.svg)](https://arxiv.org/abs/2406.08481) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/BraveGroup/LAW)
* **Delphi**: "Unleashing Generalization of End-to-End Autonomous Driving with Controllable Long Video Generation". [![arXiv](https://img.shields.io/badge/arXiv-2406.01349-b31b1b.svg)](https://arxiv.org/abs/2406.01349) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/westlake-autolab/Delphi)
* **OccSora**: "4D Occupancy Generation Models as World Simulators for Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2405.20337-b31b1b.svg)](https://arxiv.org/abs/2405.20337) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/wzzheng/OccSora)
* **MagicDrive3D**: "Controllable 3D Generation for Any-View Rendering in Street Scenes". [![arXiv](https://img.shields.io/badge/arXiv-2405.14475-b31b1b.svg)](https://arxiv.org/abs/2405.14475) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://gaoruiyuan.com/magicdrive3d/)
* **Vista**: "A Generalizable Driving World Model with High Fidelity and Versatile Controllability". [![arXiv](https://img.shields.io/badge/arXiv-2405.17398-b31b1b.svg)](https://arxiv.org/abs/2405.17398) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/OpenDriveLab/Vista)
* **CarDreamer**: "Open-Source Learning Platform for World Model based Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2405.09111-b31b1b.svg)](https://arxiv.org/abs/2405.09111) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/ucd-dare/CarDreamer)
* **DriveSim**: "Probing Multimodal LLMs as World Models for Driving". [![arXiv](https://img.shields.io/badge/arXiv-2405.05956-b31b1b.svg)](https://arxiv.org/abs/2405.05956) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/sreeramsa/DriveSim)
* **DriveWorld**: "4D Pre-trained Scene Understanding via World Models for Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2405.04390-b31b1b.svg)](https://arxiv.org/abs/2405.04390)
* **LidarDM**: "Generative LiDAR Simulation in a Generated World". [![arXiv](https://img.shields.io/badge/arXiv-2404.02903-b31b1b.svg)](https://arxiv.org/abs/2404.02903) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/vzyrianov/lidardm)
* **SubjectDrive**: "Scaling Generative Data in Autonomous Driving via Subject Control". [![arXiv](https://img.shields.io/badge/arXiv-2403.19438-b31b1b.svg)](https://arxiv.org/abs/2403.19438) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://subjectdrive.github.io/)
* **DriveDreamer-2**: "LLM-Enhanced World Models for Diverse Driving Video Generation". [![arXiv](https://img.shields.io/badge/arXiv-2403.06845-b31b1b.svg)](https://arxiv.org/abs/2403.06845) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://drivedreamer2.github.io/)
* **Think2Drive**: "Efficient Reinforcement Learning by Thinking in Latent World Model for Quasi-Realistic Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2402.16720-b31b1b.svg)](https://arxiv.org/abs/2402.16720)
* **MARL-CCE**: "Modelling Competitive Behaviors in Autonomous Driving Under Generative World Model". [![arXiv](https://img.shields.io/badge/arXiv-Paper-b31b1b.svg)](https://www.ecva.net/papers/eccv_2024/papers_ECCV/papers/05085.pdf) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/qiaoguanren/MARL-CCE)
* **GenAD**: "Generalized Predictive Model for Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2403.09630-b31b1b.svg)](https://arxiv.org/abs/2403.09630) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://github.com/OpenDriveLab/DriveAGI?tab=readme-ov-file#genad-dataset-opendv-youtube)
* **GenAD**: "Generative End-to-End Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2402.11502-b31b1b.svg)](https://arxiv.org/abs/2402.11502) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/wzzheng/GenAD)
* **NeMo**: "Neural Volumetric World Models for Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-Paper-b31b1b.svg)](https://www.ecva.net/papers/eccv_2024/papers_ECCV/papers/02571.pdf)
* **ViDAR**: "Visual Point Cloud Forecasting enables Scalable Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2312.17655-b31b1b.svg)](https://arxiv.org/abs/2312.17655) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/OpenDriveLab/ViDAR)
* **Drive-WM**: "Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2311.17918-b31b1b.svg)](https://arxiv.org/abs/2311.17918) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/BraveGroup/Drive-WM)
* **Cam4DOcc**: "Benchmark for Camera-Only 4D Occupancy Forecasting in Autonomous Driving Applications". [![arXiv](https://img.shields.io/badge/arXiv-2311.17663-b31b1b.svg)](https://arxiv.org/abs/2311.17663) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/haomo-ai/Cam4DOcc)
* **Panacea**: "Panoramic and Controllable Video Generation for Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2311.16813-b31b1b.svg)](https://arxiv.org/abs/2311.16813) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://panacea-ad.github.io/)
* **OccWorld**: "Learning a 3D Occupancy World Model for Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2311.16038-b31b1b.svg)](https://arxiv.org/abs/2311.16038) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/wzzheng/OccWorld)

* **DrivingDiffusion**: "Layout-Guided multi-view driving scene video generation with latent diffusion model". [![arXiv](https://img.shields.io/badge/arXiv-2310.07771-b31b1b.svg)](https://arxiv.org/abs/2310.07771) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/shalfun/DrivingDiffusion)
* **SafeDreamer**: "Safe Reinforcement Learning with World Models". [![arXiv](https://img.shields.io/badge/arXiv-Paper-b31b1b.svg)](https://openreview.net/forum?id=tsE5HLYtYg) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/PKU-Alignment/SafeDreamer)
* **MagicDrive**: "Street View Generation with Diverse 3D Geometry Control". [![arXiv](https://img.shields.io/badge/arXiv-2310.02601-b31b1b.svg)](https://arxiv.org/abs/2310.02601) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/cure-lab/MagicDrive)
* **DriveDreamer**: "Towards Real-world-driven World Models for Autonomous Driving". [![arXiv](https://img.shields.io/badge/arXiv-2309.09777-b31b1b.svg)](https://arxiv.org/abs/2309.09777) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/JeffWang987/DriveDreamer)
* **SEM2**: "Enhance Sample Efficiency and Robustness of End-to-end Urban Autonomous Driving via Semantic Masked World Model". [![arXiv](https://img.shields.io/badge/arXiv-Paper-b31b1b.svg)](https://ieeexplore.ieee.org/abstract/document/10538211/)

* **COMPARATIVE STUDY OF WORLD MODELS**: "COMPARATIVE STUDY OF WORLD MODELS, NVAE-BASED HIERARCHICAL MODELS, AND NOISYNET-AUGMENTED MODELS IN CARRACING-V2". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=ICLR.cc/2025/Workshop/World_Models#tab-accept) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://sites.google.com/view/worldmodel-iclr2025/accepted-papers)

* **Knowledge Graphs as World Models**: "Knowledge Graphs as World Models for Material-Aware Obstacle Handling in Autonomous Vehicles". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=ICLR.cc/2025/Workshop/World_Models#tab-accept) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://sites.google.com/view/worldmodel-iclr2025/accepted-papers)

* **Uncertainty Modeling**: "Uncertainty Modeling in Autonomous Vehicle Trajectory Prediction: A Comprehensive Survey". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=ICML.cc/2025/Workshop/World_Models#tab-accept) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://worldmodelbench.github.io/)

* **Divide and Merge**: "Divide and Merge: Motion and Semantic Learning in End-to-End Autonomous Driving". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=NeurIPS.cc/2025/Workshop/EWM#tab-accept-oral) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodied-world-models.github.io/)

* **RDAR**: "RDAR: Reward-Driven Agent Relevance Estimation for Autonomous Driving". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=NeurIPS.cc/2025/Workshop/EWM#tab-accept-oral) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodied-world-models.github.io/)

## World Models for Embodied AI
### 1. Foundation Embodied World Models
- [⭐️] **Genie Envisioner**: "Genie Envisioner: A Unified World Foundation Platform for Robotic Manipulation". [![arXiv](https://img.shields.io/badge/arXiv-2508.05635-b31b1b.svg)](https://arxiv.org/abs/2508.05635) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://genie-envisioner.github.io/)
- [⭐️] **WoW**, "WoW: Towards a World omniscient World model Through Embodied Interaction". [![arXiv](https://img.shields.io/badge/arXiv-2509.22642-b31b1b.svg)](https://arxiv.org/abs/2509.22642) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://wow-world-model.github.io) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/wow-world-model/wow-world-model)
- **UnifoLM-WMA-0**, "UnifoLM-WMA-0: A World-Model-Action (WMA) Framework under UnifoLM Family". [![Website](https://img.shields.io/badge/Website-Link-blue)](https://unigen-x.github.io/unifolm-world-model-action.github.io/) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/unitreerobotics/unifolm-world-model-action/tree/main)
- [⭐️] **iVideoGPT**, "iVideoGPT: Interactive VideoGPTs are Scalable World Models". [![arXiv](https://img.shields.io/badge/arXiv-2405.15223-b31b1b.svg)](https://arxiv.org/abs/2405.15223) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://thuml.github.io/iVideoGPT/)
* **Direct Robot Configuration Space Construction**: "Direct Robot Configuration Space Construction using Convolutional Encoder-Decoders". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=ICML.cc/2025/Workshop/World_Models#tab-accept) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://physical-world-modeling.github.io/)

* **ViPRA**: "ViPRA: Video Prediction for Robot Actions". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=NeurIPS.cc/2025/Workshop/EWM#tab-accept-oral) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodied-world-models.github.io/)

* **ROPES**: "ROPES: Robotic Pose Estimation via Score-based Causal Representation Learning". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=NeurIPS.cc/2025/Workshop/EWM#tab-accept-oral) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodied-world-models.github.io/)

### 2. World Models for Manipulation
- [⭐️] **FLARE**, "FLARE: Robot Learning with Implicit World Modeling". [![arXiv](https://img.shields.io/badge/arXiv-2505.15659-b31b1b.svg)](http://arxiv.org/abs/2505.15659) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://research.nvidia.com/labs/gear/flare/)
- [⭐️] **EnerVerse**, "EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation". [![arXiv](https://img.shields.io/badge/arXiv-2501.01895-b31b1b.svg)](http://arxiv.org/abs/2501.01895) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://sites.google.com/view/enerverse)
- [⭐️] **AgiBot-World**, "AgiBot World Colosseo: A Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems". [![arXiv](https://img.shields.io/badge/arXiv-2503.06669-b31b1b.svg)](https://arxiv.org/abs/2503.06669) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://agibot-world.com/) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/OpenDriveLab/AgiBot-World)
- [⭐️] **DyWA**: "DyWA: Dynamics-adaptive World Action Model for Generalizable Non-prehensile Manipulation". [![arXiv](https://img.shields.io/badge/arXiv-2503.16806-b31b1b.svg)](https://arxiv.org/abs/2503.16806) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://pku-epic.github.io/DyWA/)
- [⭐️] **TesserAct**, "TesserAct: Learning 4D Embodied World Models". [![arXiv](https://img.shields.io/badge/arXiv-2504.20995-b31b1b.svg)](https://arxiv.org/abs/2504.20995) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://tesseractworld.github.io/)
- [⭐️] **DreamGen**: "DreamGen: Unlocking Generalization in Robot Learning through Video World Models". [![arXiv](https://img.shields.io/badge/arXiv-2505.12705-b31b1b.svg)](https://arxiv.org/abs/2505.12705) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/nvidia/GR00T-dreams)
- [⭐️] **HiP**, "Compositional Foundation Models for Hierarchical Planning". [![arXiv](https://img.shields.io/badge/arXiv-2309.08587-b31b1b.svg)](http://arxiv.org/abs/2309.08587) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://hierarchical-planning-foundation-model.github.io/)
- **PAR**: "Physical Autoregressive Model for Robotic Manipulation without Action Pretraining". [![arXiv](https://img.shields.io/badge/arXiv-2508.09822-b31b1b.svg)](https://arxiv.org/abs/2508.09822) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://songzijian1999.github.io/PAR_ProjectPage/)
- **iMoWM**: "iMoWM: Taming Interactive Multi-Modal World Model for Robotic Manipulation". [![arXiv](https://img.shields.io/badge/arXiv-2510.07313-b31b1b.svg)](https://arxiv.org/abs/2510.07313) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://xingyoujun.github.io/imowm/)
- **WristWorld**: "WristWorld: Generating Wrist-Views via 4D World Models for Robotic Manipulation". [![arXiv](https://img.shields.io/badge/arXiv-2510.07313-b31b1b.svg)](https://arxiv.org/abs/2510.07313)
- "A Recipe for Efficient Sim-to-Real Transfer in Manipulation with Online Imitation-Pretrained World Models". [![arXiv](https://img.shields.io/badge/arXiv-2510.02538-b31b1b.svg)](https://arxiv.org/abs/2510.02538)
- **EMMA**: "EMMA: Generalizing Real-World Robot Manipulation via Generative Visual Transfer". [![arXiv](https://img.shields.io/badge/arXiv-2509.22407-b31b1b.svg)](https://arxiv.org/abs/2509.22407)
- **PhysTwin**, "PhysTwin: Physics-Informed Reconstruction and Simulation of Deformable Objects from Videos". [![arXiv](https://img.shields.io/badge/arXiv-2503.17973-b31b1b.svg)](http://arxiv.org/abs/2503.17973) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://jianghanxiao.github.io/phystwin-web/) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/Jianghanxiao/PhysTwin)
- [⭐️] **KeyWorld**: "KeyWorld: Key Frame Reasoning Enables Effective and Efficient World Models". [![arXiv](https://img.shields.io/badge/arXiv-2509.21027-b31b1b.svg)](https://arxiv.org/abs/2509.21027)
- **World4RL**: "World4RL: Diffusion World Models for Policy Refinement with Reinforcement Learning for Robotic Manipulation". [![arXiv](https://img.shields.io/badge/arXiv-2509.19080-b31b1b.svg)](https://arxiv.org/abs/2509.19080)
- [⭐️] **SAMPO**: "SAMPO: Scale-wise Autoregression with Motion PrOmpt for generative world models". [![arXiv](https://img.shields.io/badge/arXiv-2509.15536-b31b1b.svg)](https://arxiv.org/abs/2509.15536)
- **PhysicalAgent**: "PhysicalAgent: Towards General Cognitive Robotics with Foundation World Models". [![arXiv](https://img.shields.io/badge/arXiv-2509.13903-b31b1b.svg)](https://arxiv.org/abs/2509.13903)
- "Empowering Multi-Robot Cooperation via Sequential World Models". [![arXiv](https://img.shields.io/badge/arXiv-2509.13095-b31b1b.svg)](https://arxiv.org/abs/2509.13095)
- [⭐️] "Learning Primitive Embodied World Models: Towards Scalable Robotic Learning". [![arXiv](https://img.shields.io/badge/arXiv-2508.20840-b31b1b.svg)](https://arxiv.org/pdf/2508.20840) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://qiaosun22.github.io/PrimitiveWorld/)
- [⭐️] **GWM**: "GWM: Towards Scalable Gaussian World Models for Robotic Manipulation". [![arXiv](https://img.shields.io/badge/arXiv-2508.17600-b31b1b.svg)](https://arxiv.org/abs/2508.17600) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://gaussian-world-model.github.io/)
- [⭐️] **Flow-as-Action**, "Latent Policy Steering with Embodiment-Agnostic Pretrained World Models". [![arXiv](https://img.shields.io/badge/arXiv-2507.13340-b31b1b.svg)](https://arxiv.org/abs/2507.13340)
- **EmbodieDreamer**: "EmbodieDreamer: Advancing Real2Sim2Real Transfer for Policy Training via Embodied World Modeling". [![arXiv](https://img.shields.io/badge/arXiv-2507.05198-b31b1b.svg)](https://arxiv.org/pdf/2507.05198) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodiedreamer.github.io/)
- **RoboScape**: "RoboScape: Physics-informed Embodied World Model". [![arXiv](https://img.shields.io/badge/arXiv-2506.23135-b31b1b.svg)](https://arxiv.org/abs/2506.23135) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/tsinghua-fib-lab/RoboScape)
- **FWM**, "Factored World Models for Zero-Shot Generalization in Robotic Manipulation". [![arXiv](https://img.shields.io/badge/arXiv-2202.05333-b31b1b.svg)](http://arxiv.org/abs/2202.05333)
- [⭐️] **ParticleFormer**: "ParticleFormer: A 3D Point Cloud World Model for Multi-Object, Multi-Material Robotic Manipulation". [![arXiv](https://img.shields.io/badge/arXiv-2506.23126-b31b1b.svg)](https://arxiv.org/abs/2506.23126) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://particleformer.github.io/)
- **ManiGaussian++**: "ManiGaussian++: General Robotic Bimanual Manipulation with Hierarchical Gaussian World Model". [![arXiv](https://img.shields.io/badge/arXiv-2506.19842-b31b1b.svg)](https://arxiv.org/abs/2506.19842) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/April-Yz/ManiGaussian_Bimanual)
- **ReOI**: "Reimagination with Test-time Observation Interventions: Distractor-Robust World Model Predictions for Visual Model Predictive Control". [![arXiv](https://img.shields.io/badge/arXiv-2506.16565-b31b1b.svg)](https://arxiv.org/abs/2506.16565)
- **GAF**: "GAF: Gaussian Action Field as a Dynamic World Model for Robotic Manipulation". [![arXiv](https://img.shields.io/badge/arXiv-2506.14135-b31b1b.svg)](https://arxiv.org/abs/2506.14135) [![Website](https://img.shields.io/badge/Website-Link-blue)](http://chaiying1.github.io/GAF.github.io/project_page/)
- "Prompting with the Future: Open-World Model Predictive Control with Interactive Digital Twins". [![arXiv](https://img.shields.io/badge/arXiv-2506.13761-b31b1b.svg)](https://arxiv.org/abs/2506.13761) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://prompting-with-the-future.github.io/)
- "Time-Aware World Model for Adaptive Prediction and Control". [![arXiv](https://img.shields.io/badge/arXiv-2506.08441-b31b1b.svg)](https://arxiv.org/abs/2506.08441)
- [⭐️] **3DFlowAction**: "3DFlowAction: Learning Cross-Embodiment Manipulation from 3D Flow World Model". [![arXiv](https://img.shields.io/badge/arXiv-2506.06199-b31b1b.svg)](https://arxiv.org/abs/2506.06199)
- [⭐️] **ORV**: "ORV: 4D Occupancy-centric Robot Video Generation". [![arXiv](https://img.shields.io/badge/arXiv-2506.03079-b31b1b.svg)](https://arxiv.org/abs/2506.03079) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/OrangeSodahub/ORV) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://orangesodahub.github.io/ORV/)
- [⭐️] **WoMAP**: "WoMAP: World Models For Embodied Open-Vocabulary Object Localization". [![arXiv](https://img.shields.io/badge/arXiv-2506.01600-b31b1b.svg)](https://arxiv.org/abs/2506.01600)
- "Sparse Imagination for Efficient Visual World Model Planning". [![arXiv](https://img.shields.io/badge/arXiv-2506.01392-b31b1b.svg)](https://arxiv.org/abs/2506.01392)
- [⭐️] **OSVI-WM**: "OSVI-WM: One-Shot Visual Imitation for Unseen Tasks using World-Model-Guided Trajectory Generation". [![arXiv](https://img.shields.io/badge/arXiv-2505.20425-b31b1b.svg)](https://arxiv.org/abs/2505.20425)
- [⭐️] **LaDi-WM**: "LaDi-WM: A Latent Diffusion-based World Model for Predictive Manipulation". [![arXiv](https://img.shields.io/badge/arXiv-2505.11528-b31b1b.svg)](https://arxiv.org/abs/2505.11528)
- **FlowDreamer**: "FlowDreamer: A RGB-D World Model with Flow-based Motion Representations for Robot Manipulation". [![arXiv](https://img.shields.io/badge/arXiv-2505.10075-b31b1b.svg)](https://arxiv.org/abs/2505.10075) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://sharinka0715.github.io/FlowDreamer/)
- **PIN-WM**: "PIN-WM: Learning Physics-INformed World Models for Non-Prehensile Manipulation". [![arXiv](https://img.shields.io/badge/arXiv-2504.16693-b31b1b.svg)](https://arxiv.org/abs/2504.16693)
- **RoboMaster**, "Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control". [![arXiv](https://img.shields.io/badge/arXiv-2506.01943-b31b1b.svg)](http://arxiv.org/abs/2506.01943) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://fuxiao0719.github.io/projects/robomaster/) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/KwaiVGI/RoboMaster)
- **ManipDreamer**: "ManipDreamer: Boosting Robotic Manipulation World Model with Action Tree and Visual Guidance". [![arXiv](https://img.shields.io/badge/arXiv-2504.16464-b31b1b.svg)](https://arxiv.org/abs/2504.16464)
- [⭐️] **AdaWorld**: "AdaWorld: Learning Adaptable World Models with Latent Actions". [![arXiv](https://img.shields.io/badge/arXiv-2503.18938-b31b1b.svg)](https://arxiv.org/abs/2503.18938) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://adaptable-world-model.github.io/)
- "Towards Suturing World Models: Learning Predictive Models for Robotic Surgical Tasks". [![arXiv](https://img.shields.io/badge/arXiv-2503.12531-b31b1b.svg)](https://arxiv.org/abs/2503.12531) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://mkturkcan.github.io/suturingmodels/)
- [⭐️] **EVA**: "EVA: An Embodied World Model for Future Video Anticipation". [![arXiv](https://img.shields.io/badge/arXiv-2410.15461-b31b1b.svg)](https://arxiv.org/abs/2410.15461) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://sites.google.com/view/eva-publi)
- "Representing Positional Information in Generative World Models for Object Manipulation". [![arXiv](https://img.shields.io/badge/arXiv-2409.12005-b31b1b.svg)](https://arxiv.org/abs/2409.12005)
- **DexSim2Real$^2$**: "DexSim2Real$^2$: Building Explicit World Model for Precise Articulated Object Dexterous Manipulation". [![arXiv](https://img.shields.io/badge/arXiv-2409.08750-b31b1b.svg)](https://arxiv.org/abs/2409.08750)
- "Physically Embodied Gaussian Splatting: A Realtime Correctable World Model for Robotics". [![arXiv](https://img.shields.io/badge/arXiv-2406.10788-b31b1b.svg)](https://arxiv.org/abs/2406.10788) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodied-gaussians.github.io/)
- [⭐️] **LUMOS**: "LUMOS: Language-Conditioned Imitation Learning with World Models". [![arXiv](https://img.shields.io/badge/arXiv-2503.10370-b31b1b.svg)](https://arxiv.org/abs/2503.10370) [![Website](https://img.shields.io/badge/Website-Link-blue)](http://lumos.cs.uni-freiburg.de/)
- [⭐️] "Object-Centric World Model for Language-Guided Manipulation". [![arXiv](https://img.shields.io/badge/arXiv-2503.06170-b31b1b.svg)](https://arxiv.org/abs/2503.06170)
- [⭐️] **DEMO^3**: "Multi-Stage Manipulation with Demonstration-Augmented Reward, Policy, and World Model Learning". [![arXiv](https://img.shields.io/badge/arXiv-2503.01837-b31b1b.svg)](https://arxiv.org/abs/2503.01837) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://adrialopezescoriza.github.io/demo3/)
- "Strengthening Generative Robot Policies through Predictive World Modeling". [![arXiv](https://img.shields.io/badge/arXiv-2502.00622-b31b1b.svg)](https://arxiv.org/abs/2502.00622) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://computationalrobotics.seas.harvard.edu/GPC)
- **RoboHorizon**: "RoboHorizon: An LLM-Assisted Multi-View World Model for Long-Horizon Robotic Manipulation". [![arXiv](https://img.shields.io/badge/arXiv-2501.06605-b31b1b.svg)](https://arxiv.org/abs/2501.06605)
- **Dream to Manipulate**: "Dream to Manipulate: Compositional World Models Empowering Robot Imitation Learning with Imagination". [![arXiv](https://img.shields.io/badge/arXiv-2412.14957-b31b1b.svg)](https://arxiv.org/abs/2412.14957) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://leobarcellona.github.io/DreamToManipulate/)
- [⭐️] **RoboDreamer**: "RoboDreamer: Learning Compositional World Models for Robot Imagination". [![arXiv](https://img.shields.io/badge/arXiv-2404.12377-b31b1b.svg)](https://arxiv.org/abs/2404.12377) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://robovideo.github.io/)
- **ManiGaussian**: "ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation". [![arXiv](https://img.shields.io/badge/arXiv-2403.08321-b31b1b.svg)](https://arxiv.org/abs/2403.08321) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://guanxinglu.github.io/ManiGaussian/)
- [⭐️] **WHALE**: "WHALE: Towards Generalizable and Scalable World Models for Embodied Decision-making". [![arXiv](https://img.shields.io/badge/arXiv-2411.05619-b31b1b.svg)](https://arxiv.org/abs/2411.05619)
- [⭐️] **VisualPredicator**: "VisualPredicator: Learning Abstract World Models with Neuro-Symbolic Predicates for Robot Planning". [![arXiv](https://img.shields.io/badge/arXiv-2410.23156-b31b1b.svg)](https://arxiv.org/abs/2410.23156)
- [⭐️] "Multi-Task Interactive Robot Fleet Learning with Visual World Models". [![arXiv](https://img.shields.io/badge/arXiv-2410.22689-b31b1b.svg)](https://arxiv.org/abs/2410.22689) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://ut-austin-rpl.github.io/sirius-fleet/)
- **PIVOT-R**: "PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation". [![arXiv](https://img.shields.io/badge/arXiv-2410.10394-b31b1b.svg)](https://arxiv.org/pdf/2410.10394)
- **Video2Action**, "Grounding Video Models to Actions through Goal Conditioned Exploration". [![arXiv](https://img.shields.io/badge/arXiv-2411.07223-b31b1b.svg)](http://arxiv.org/abs/2411.07223) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://video-to-action.github.io/) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/video-to-action/video-to-action-release)
- **Diffuser**, "Planning with Diffusion for Flexible Behavior Synthesis". [![arXiv](https://img.shields.io/badge/arXiv-2205.09991-b31b1b.svg)](http://arxiv.org/abs/2205.09991)
- **Decision Diffuser**, "Is Conditional Generative Modeling all you need for Decision-Making?". [![arXiv](https://img.shields.io/badge/arXiv-2211.15657-b31b1b.svg)](http://arxiv.org/abs/2211.15657)
- **Potential Based Diffusion Motion Planning**, "Potential Based Diffusion Motion Planning". [![arXiv](https://img.shields.io/badge/arXiv-2407.06169-b31b1b.svg)](http://arxiv.org/abs/2407.06169)
* **GRIM**: "GRIM: Task-Oriented Grasping with Conditioning on Generative Examples". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=ICML.cc/2025/Workshop/World_Models#tab-accept) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://physical-world-modeling.github.io/)

* **World4Omni**: "World4Omni: A Zero-Shot Framework from Image Generation World Model to Robotic Manipulation". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=ICML.cc/2025/Workshop/World_Models#tab-accept) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://physical-world-modeling.github.io/)

* **In-Context Policy Iteration**: "In-Context Policy Iteration for Dynamic Manipulation". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=NeurIPS.cc/2025/Workshop/EWM#tab-accept-oral) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodied-world-models.github.io/)

* **HDFlow**: "HDFlow: Hierarchical Diffusion-Flow Planning for Long-horizon Robotic Assembly". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=NeurIPS.cc/2025/Workshop/EWM#tab-accept-oral) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodied-world-models.github.io/)

* **Mobile Manipulation with Active Inference**: "Mobile Manipulation with Active Inference for Long-Horizon Rearrangement Tasks". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=NeurIPS.cc/2025/Workshop/EWM#tab-accept-oral) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodied-world-models.github.io/)

### 3. World Models for Navigation
- [⭐️] **NWM**, "Navigation World Models". [![arXiv](https://img.shields.io/badge/arXiv-2412.03572-b31b1b.svg)](https://arxiv.org/abs/2412.03572) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://www.amirbar.net/nwm/)
- [⭐️] **MindJourney**: "MindJourney: Test-Time Scaling with World Models for Spatial Reasoning". [![arXiv](https://img.shields.io/badge/arXiv-2507.12508-b31b1b.svg)](https://arxiv.org/abs/2507.12508) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://umass-embodied-agi.github.io/MindJourney)
* **Test-Time Scaling**: "Test-Time Scaling with World Models for Spatial Reasoning". [![arXiv](https://img.shields.io/badge/arXiv-2507.12508-b31b1b.svg)](https://arxiv.org/abs/2507.12508) [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=NeurIPS.cc/2025/Workshop/EWM#tab-accept-oral) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://umass-embodied-agi.github.io/MindJourney/)

* **Scaling Inference-Time Search**: "Scaling Inference-Time Search with Vision Value Model for Improved Visual Comprehension". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=ICLR.cc/2025/Workshop/World_Models#tab-accept) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://sites.google.com/view/worldmodel-iclr2025/accepted-papers)

* **FalconWing**: "FalconWing: An Ultra-Light Fixed-Wing Platform for Indoor Aerial Applications". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=NeurIPS.cc/2025/Workshop/EWM#tab-accept-oral) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodied-world-models.github.io/)

* **Foundation Models as World Models**: "Foundation Models as World Models: A Foundational Study in Text-Based GridWorlds". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=NeurIPS.cc/2025/Workshop/EWM#tab-accept-oral) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodied-world-models.github.io/)

* **Geosteering Through the Lens of Decision Transformers**: "Geosteering Through the Lens of Decision Transformers: Toward Embodied Sequence Decision-Making". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=NeurIPS.cc/2025/Workshop/EWM#tab-accept-oral) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodied-world-models.github.io/)

* **Latent Weight Diffusion**: "Latent Weight Diffusion: Generating reactive policies instead of trajectories". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=NeurIPS.cc/2025/Workshop/EWM#tab-accept-oral) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodied-world-models.github.io/)

* **Abstract Sim2Real**: "Abstract Sim2Real through Approximate Information States". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=NeurIPS.cc/2025/Workshop/EWM#tab-accept-oral) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodied-world-models.github.io/)

* **FLAM**: "FLAM: Scaling Latent Action Models with Factorization". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=NeurIPS.cc/2025/Workshop/EWM#tab-accept-oral) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodied-world-models.github.io/)

- **NavMorph**: "NavMorph: A Self-Evolving World Model for Vision-and-Language Navigation in Continuous Environments". [![arXiv](https://img.shields.io/badge/arXiv-2506.23468-b31b1b.svg)](https://arxiv.org/abs/2506.23468) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/Feliciaxyao/NavMorph)
- **Unified World Models**: "Unified World Models: Memory-Augmented Planning and Foresight for Visual Navigation". [![arXiv](https://img.shields.io/badge/arXiv-2510.08713-b31b1b.svg)](https://arxiv.org/abs/2510.08713) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/F1y1113/UniWM)
- **RECON**, "Rapid Exploration for Open-World Navigation with Latent Goal Models". [![arXiv](https://img.shields.io/badge/arXiv-2104.05859-b31b1b.svg)](http://arxiv.org/abs/2104.05859) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://sites.google.com/view/recon-robot)
- **WMNav**: "WMNav: Integrating Vision-Language Models into World Models for Object Goal Navigation". [![arXiv](https://img.shields.io/badge/arXiv-2503.02247-b31b1b.svg)](https://arxiv.org/abs/2503.02247) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://b0b8k1ng.github.io/WMNav/)
- **NaVi-WM**, "Deductive Chain-of-Thought Augmented Socially-aware Robot Navigation World Model". [![arXiv](https://img.shields.io/badge/arXiv-2510.23509-b31b1b.svg)](https://arxiv.org/abs/2510.23509) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://sites.google.com/view/NaviWM)
- **AIF**, "Deep Active Inference with Diffusion Policy and Multiple Timescale World Model for Real-World Exploration and Navigation". [![arXiv](https://img.shields.io/badge/arXiv-2510.23258-b31b1b.svg)](https://arxiv.org/abs/2510.23258)
- "Kinodynamic Motion Planning for Mobile Robot Navigation across Inconsistent World Models". [![arXiv](https://img.shields.io/badge/arXiv-2509.26339-b31b1b.svg)](https://arxiv.org/abs/2509.26339)
- "World Model Implanting for Test-time Adaptation of Embodied Agents". [![arXiv](https://img.shields.io/badge/arXiv-2509.03956-b31b1b.svg)](https://arxiv.org/abs/2509.03956)
- "Imaginative World Modeling with Scene Graphs for Embodied Agent Navigation". [![arXiv](https://img.shields.io/badge/arXiv-2508.06990-b31b1b.svg)](https://arxiv.org/abs/2508.06990)
- [⭐️] **Persistent Embodied World Models**, "Learning 3D Persistent Embodied World Models". [![arXiv](https://img.shields.io/badge/arXiv-2505.05495-b31b1b.svg)](https://arxiv.org/abs/2505.05495)
- "Perspective-Shifted Neuro-Symbolic World Models: A Framework for Socially-Aware Robot Navigation". [![arXiv](https://img.shields.io/badge/arXiv-2503.20425-b31b1b.svg)](https://arxiv.org/abs/2503.20425)
- **X-MOBILITY**: "X-MOBILITY: End-To-End Generalizable Navigation via World Modeling". [![arXiv](https://img.shields.io/badge/arXiv-2410.17491-b31b1b.svg)](https://arxiv.org/abs/2410.17491)
- **MWM**, "Masked World Models for Visual Control". [![arXiv](https://img.shields.io/badge/arXiv-2206.14244-b31b1b.svg)](http://arxiv.org/abs/2206.14244) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://sites.google.com/view/mwm-rl) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/younggyoseo/MWM)

### 4. World Models for Locomotion
Locomotion:
- [⭐️] **Ego-VCP**, "Ego-Vision World Model for Humanoid Contact Planning". [![arXiv](https://img.shields.io/badge/arXiv-2510.11682-b31b1b.svg)](https://arxiv.org/abs/2510.11682) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://ego-vcp.github.io/) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/HybridRobotics/Ego-VCP)
- [⭐️] **RWM-O**, "Offline Robotic World Model: Learning Robotic Policies without a Physics Simulator". [![arXiv](https://img.shields.io/badge/arXiv-2504.16680-b31b1b.svg)](https://arxiv.org/abs/2504.16680)
- [⭐️] **DWL**: "Advancing Humanoid Locomotion: Mastering Challenging Terrains with Denoising World Model Learning". [![arXiv](https://img.shields.io/badge/arXiv-2408.14472-b31b1b.svg)](https://arxiv.org/abs/2408.14472)
- **HRSSM**: "Learning Latent Dynamic Robust Representations for World Models". [![arXiv](https://img.shields.io/badge/arXiv-2405.06263-b31b1b.svg)](https://arxiv.org/abs/2405.06263) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/bit1029public/HRSSM)
- **WMP**: "World Model-based Perception for Visual Legged Locomotion". [![arXiv](https://img.shields.io/badge/arXiv-2409.16784-b31b1b.svg)](https://arxiv.org/abs/2409.16784) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://wmp-loco.github.io/)
- **TrajWorld**, "Trajectory World Models for Heterogeneous Environments". [![arXiv](https://img.shields.io/badge/arXiv-2502.01366-b31b1b.svg)](https://arxiv.org/abs/2502.01366) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/thuml/TrajWorld)
- **Puppeteer**: "Hierarchical World Models as Visual Whole-Body Humanoid Controllers". [![arXiv](https://img.shields.io/badge/arXiv-2405.18418-b31b1b.svg)](https://arxiv.org/abs/2405.18418) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://nicklashansen.com/rlpuppeteer)
- **ProTerrain**: "ProTerrain: Probabilistic Physics-Informed Rough Terrain World Modeling". [![arXiv](https://img.shields.io/badge/arXiv-2510.19364-b31b1b.svg)](https://arxiv.org/abs/2510.19364)
- **Occupancy World Model**, "Occupancy World Model for Robots". [![arXiv](https://img.shields.io/badge/arXiv-2505.05512-b31b1b.svg)](https://arxiv.org/abs/2505.05512)
- [⭐️] "Accelerating Model-Based Reinforcement Learning with State-Space World Models". [![arXiv](https://img.shields.io/badge/arXiv-2502.20168-b31b1b.svg)](https://arxiv.org/abs/2502.20168)
- [⭐️] "Learning Humanoid Locomotion with World Model Reconstruction". [![arXiv](https://img.shields.io/badge/arXiv-2502.16230-b31b1b.svg)](https://arxiv.org/abs/2502.16230)
- [⭐️] **Robotic World Model**: "Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics". [![arXiv](https://img.shields.io/badge/arXiv-2501.10100-b31b1b.svg)](https://arxiv.org/abs/2501.10100)

Loco-Manipulation:
- [⭐️] **1X World Model**, "1X World Model". [![Blog](https://img.shields.io/badge/Blog-Link-orange)](https://www.1x.tech/discover/1x-world-model)
- [⭐️] **GR00T-Dreams**, "Dream Come True — NVIDIA Isaac GR00T-Dreams Advances Robot Training With Synthetic Data and Neural Simulation". [![Blog](https://img.shields.io/badge/Blog-Link-orange)](https://blogs.nvidia.com/blog/nvidia-gtc-washington-dc-2025-news/#gr00t-dreams)
- **Humanoid World Models**: "Humanoid World Models: Open World Foundation Models for Humanoid Robotics". [![arXiv](https://img.shields.io/badge/arXiv-2506.01182-b31b1b.svg)](https://arxiv.org/abs/2506.01182)
- **EgoAgent**, "EgoAgent: A Joint Predictive Agent Model in Egocentric Worlds". [![arXiv](https://img.shields.io/badge/arXiv-2502.05857-b31b1b.svg)](https://arxiv.org/abs/2502.05857)
- **D^2PO**, "World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning". [![arXiv](https://img.shields.io/badge/arXiv-2503.10480-b31b1b.svg)](https://arxiv.org/abs/2503.10480)
- **COMBO**: "COMBO: Compositional World Models for Embodied Multi-Agent Cooperation". [![arXiv](https://img.shields.io/badge/arXiv-2404.10775-b31b1b.svg)](https://arxiv.org/abs/2404.10775) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://vis-www.cs.umass.edu/combo/) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/UMass-Foundation-Model/COMBO)
* **Scalable Humanoid Whole-Body Control**: "Scalable Humanoid Whole-Body Control via Differentiable Neural Network Dynamics". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=ICLR.cc/2025/Workshop/World_Models#tab-accept) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://sites.google.com/view/worldmodel-iclr2025/accepted-papers)

* **HuWo**: "HuWo: Building Physical Interaction World Models for Humanoid Robot Locomotion". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=ICLR.cc/2025/Workshop/World_Models#tab-accept) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://sites.google.com/view/worldmodel-iclr2025/accepted-papers)

* **Bridging the Sim-to-Real Gap**: "Bridging the Sim-to-Real Gap in Humanoid Dynamics via Learned Nonlinear Operators". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=NeurIPS.cc/2025/Workshop/EWM#tab-accept-oral) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodied-world-models.github.io/)

### 5. World Models x VLAs
Unifying World Models and VLAs in one model:
- [⭐️] **CoT-VLA**: "CoT-VLA: Visual Chain-of-Thought Reasoning for Vision-Language-Action Models". [![arXiv](https://img.shields.io/badge/arXiv-2503.22020-b31b1b.svg)](https://arxiv.org/abs/2503.22020) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://cot-vla.github.io/)
- [⭐️] **UP-VLA**, "UP-VLA: A Unified Understanding and Prediction Model for Embodied Agent". [![arXiv](https://img.shields.io/badge/arXiv-2501.18867-b31b1b.svg)](https://arxiv.org/abs/2501.18867) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/CladernyJorn/UP-VLA)
- [⭐️] **VPP**, "Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations". [![arXiv](https://img.shields.io/badge/arXiv-2412.14803-b31b1b.svg)](https://arxiv.org/abs/2412.14803) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://video-prediction-policy.github.io)
- [⭐️] **FLARE**: "FLARE: Robot Learning with Implicit World Modeling". [![arXiv](https://img.shields.io/badge/arXiv-2505.15659-b31b1b.svg)](https://arxiv.org/abs/2505.15659) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/NVIDIA/Isaac-GR00T) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://research.nvidia.com/labs/gear/flare)
- [⭐️] **MinD**: "MinD: Unified Visual Imagination and Control via Hierarchical World Models". [![arXiv](https://img.shields.io/badge/arXiv-2506.18897-b31b1b.svg)](https://arxiv.org/abs/2506.18897) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://manipulate-in-dream.github.io/)
- [⭐️] **DreamVLA**, "DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge". [![arXiv](https://img.shields.io/badge/arXiv-2507.04447-b31b1b.svg)](https://arxiv.org/abs/2507.04447) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/Zhangwenyao1/DreamVLA) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://zhangwenyao1.github.io/DreamVLA/)
- [⭐️] **WorldVLA**: "WorldVLA: Towards Autoregressive Action World Model". [![arXiv](https://img.shields.io/badge/arXiv-2506.21539-b31b1b.svg)](https://arxiv.org/abs/2506.21539) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/alibaba-damo-academy/WorldVLA)
- **3D-VLA**: "3D-VLA: A 3D Vision-Language-Action Generative World Model". [![arXiv](https://img.shields.io/badge/arXiv-2403.09631-b31b1b.svg)](https://arxiv.org/abs/2403.09631)
- **LAWM**: "Latent Action Pretraining Through World Modeling". [![arXiv](https://img.shields.io/badge/arXiv-2509.18428-b31b1b.svg)](https://arxiv.org/abs/2509.18428) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/baheytharwat/lawm)
- [⭐️] **UniVLA**: "UniVLA: Unified Vision-Language-Action Model". [![arXiv](https://img.shields.io/badge/arXiv-2506.19850-b31b1b.svg)](https://arxiv.org/abs/2506.19850) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://robertwyq.github.io/univla.github.)
- [⭐️] **dVLA**, "dVLA: Diffusion Vision-Language-Action Model with Multimodal Chain-of-Thought". [![arXiv](https://img.shields.io/badge/arXiv-2509.25681-b31b1b.svg)](https://arxiv.org/abs/2509.25681)
- [⭐️] **Vidar**, "Vidar: Embodied Video Diffusion Model for Generalist Manipulation". [![arXiv](https://img.shields.io/badge/arXiv-2507.12898-b31b1b.svg)](https://arxiv.org/pdf/2507.12898)
- [⭐️] **UD-VLA**, "Unified Diffusion VLA: Vision-Language-Action Model via Joint Discrete Denoising Diffusion Process". [![arXiv](https://img.shields.io/badge/arXiv-2511.01718-b31b1b.svg)](https://arxiv.org/abs/2511.01718) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/OpenHelix-Team/UD-VLA) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://irpn-eai.github.io/UD-VLA.github.io/)
- **Goal-VLA**: "Goal-VLA: Image-Generative VLMs as Object-Centric World Models Empowering Zero-shot Robot Manipulation". [![arXiv](https://img.shields.io/badge/arXiv-2506.23919-b31b1b.svg)](https://arxiv.org/abs/2506.23919) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://nus-lins-lab.github.io/goalvlaweb/)

Combining World Models and VLAs:
- [⭐️] **Ctrl-World**: "Ctrl-World: A Controllable Generative World Model for Robot Manipulation". [![arXiv](https://img.shields.io/badge/arXiv-2510.10125-b31b1b.svg)](https://arxiv.org/pdf/2510.10125) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://ctrl-world.github.io/) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/Robert-gyj/Ctrl-World)
- **VLA-RFT**: "VLA-RFT: Vision-Language-Action Reinforcement Fine-tuning with Verified Rewards in World Simulators". [![arXiv](https://img.shields.io/badge/arXiv-2510.00406-b31b1b.svg)](https://arxiv.org/abs/2510.00406)
- **World-Env**: "World-Env: Leveraging World Model as a Virtual Environment for VLA Post-Training". [![arXiv](https://img.shields.io/badge/arXiv-2509.24948-b31b1b.svg)](https://arxiv.org/abs/2509.24948)
- [⭐️] **Self-Improving Embodied Foundation Models**, "Self-Improving Embodied Foundation Models". [![arXiv](https://img.shields.io/badge/arXiv-2509.15155-b31b1b.svg)](https://arxiv.org/abs/2509.15155)
- **GigaBrain-0**, "GigaBrain-0: A World Model-Powered Vision-Language-Action Model". [![arXiv](https://img.shields.io/badge/arXiv-2510.19430-b31b1b.svg)](https://arxiv.org/abs/2510.19430) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://gigabrain0.github.io/)
* **NinA**: "NinA: Normalizing Flows in Action. Training VLA Models with Normalizing Flows". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=NeurIPS.cc/2025/Workshop/EWM#tab-accept-oral) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodied-world-models.github.io/)

* **Ada-Diffuser**: "Ada-Diffuser: Latent-Aware Adaptive Diffusion for Decision-Making". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=NeurIPS.cc/2025/Workshop/EWM#tab-accept-oral) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodied-world-models.github.io/)

* **Steering Diffusion Policies**: "Steering Diffusion Policies with Value-Guided Denoising". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=NeurIPS.cc/2025/Workshop/EWM#tab-accept-oral) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodied-world-models.github.io/)

* **SPUR**: "SPUR: Scaling Reward Learning from Human Demonstrations". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=NeurIPS.cc/2025/Workshop/EWM#tab-accept-oral) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodied-world-models.github.io/)

* **A Smooth Sea Never Made a Skilled SAILOR**: "A Smooth Sea Never Made a Skilled SAILOR: Robust Imitation via Learning to Search". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=NeurIPS.cc/2025/Workshop/EWM#tab-accept-oral) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodied-world-models.github.io/)

* **RADI**: "RADI: LLMs as World Models for Robotic Action Decomposition and Imagination". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=ICLR.cc/2025/Workshop/World_Models#tab-accept) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://sites.google.com/view/worldmodel-iclr2025/accepted-papers)
- **WMPO**: "WMPO: World Model-based Policy Optimization for Vision-Language-Action Models". [![arXiv](https://img.shields.io/badge/arXiv-2511.09515-b31b1b.svg)](https://arxiv.org/abs/2511.09515) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://wm-po.github.io)

### 6. World Models x Policy Learning
This subsection covers general policy learning methods for embodied intelligence that leverage world models.
- [⭐️] **UWM**, "Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets". [![arXiv](https://img.shields.io/badge/arXiv-2504.02792-b31b1b.svg)](https://arxiv.org/abs/2504.02792) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://weirdlabuw.github.io/uwm/)
- [⭐️] **UVA**, "Unified Video Action Model". [![arXiv](https://img.shields.io/badge/arXiv-2503.00200-b31b1b.svg)](https://arxiv.org/abs/2503.00200) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://unified-video-action-model.github.io/) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/ShuangLI59/unified_video_action)
- **DiWA**, "DiWA: Diffusion Policy Adaptation with World Models". [![arXiv](https://img.shields.io/badge/arXiv-2508.03645-b31b1b.svg)](https://arxiv.org/abs/2508.03645) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://diwa.cs.uni-freiburg.de)
- [⭐️] **Dreamerv4**, "Training Agents Inside of Scalable World Models". [![arXiv](https://img.shields.io/badge/arXiv-2509.24527-b31b1b.svg)](https://arxiv.org/abs/2509.24527) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://danijar.com/project/dreamer4/)
* **Latent Action Learning Requires Supervision**: "Latent Action Learning Requires Supervision in the Presence of Distractors". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=ICLR.cc/2025/Workshop/World_Models#tab-accept) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://sites.google.com/view/worldmodel-iclr2025/accepted-papers)

* **Beyond Experience**: "Beyond Experience: Fictive Learning as an Inherent Advantage of World Models". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=NeurIPS.cc/2025/Workshop/EWM#tab-accept-oral) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodied-world-models.github.io/)

* **Robotic World Model**: "Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=NeurIPS.cc/2025/Workshop/EWM#tab-accept-oral) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodied-world-models.github.io/)

* **Sim-to-Real Contact-Rich Pivoting**: "Sim-to-Real Contact-Rich Pivoting via Optimization-Guided RL with Vision and Touch". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=NeurIPS.cc/2025/Workshop/EWM#tab-accept-oral) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodied-world-models.github.io/)

* **Hierarchical Task Environments**: "Hierarchical Task Environments as the Next Frontier for Embodied World Models in Robot Soccer". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=NeurIPS.cc/2025/Workshop/EWM#tab-accept-oral) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://embodied-world-models.github.io/)

### 7. World Models for Policy Evaluation
Real-world policy evaluation is expensive and noisy. The promise of world models is that, by accurately capturing environment dynamics, they can serve as surrogate evaluation environments whose results correlate highly with policy performance in the real world. Before world models, simulators filled that role:
- [⭐️] **Simpler**, "Evaluating Real-World Robot Manipulation Policies in Simulation". [![arXiv](https://img.shields.io/badge/arXiv-2405.05941-b31b1b.svg)](https://arxiv.org/abs/2405.05941) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/simpler-env/SimplerEnv)

Policy evaluation with world models:
- [⭐️] **WorldGym**, "WorldGym: Evaluating Robot Policies in a World Model". [![arXiv](https://img.shields.io/badge/arXiv-2506.00613-b31b1b.svg)](https://arxiv.org/abs/2506.00613) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://world-model-eval.github.io)
- [⭐️] **WorldEval**: "WorldEval: World Model as Real-World Robot Policies Evaluator". [![arXiv](https://img.shields.io/badge/arXiv-2505.19017-b31b1b.svg)](https://arxiv.org/abs/2505.19017) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://worldeval.github.io)
- [⭐️] **WoW!**: "WOW!: World Models in a Closed-Loop World". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/pdf/e6aed49462d9e080633e727436cc95a0a8d61c57.pdf) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://wow202509.github.io/WOW_project_page/)
- **Cosmos-Surg-dVRK**: "Cosmos-Surg-dVRK: World Foundation Model-based Automated Online Evaluation of Surgical Robot Policy Learning". [![arXiv](https://img.shields.io/badge/arXiv-2510.16240-b31b1b.svg)](https://arxiv.org/abs/2510.16240)
---

## World Models for Science
Natural Science:

- [⭐️] **CellFlux**, "CellFlux: Simulating Cellular Morphology Changes via Flow Matching". [![arXiv](https://img.shields.io/badge/arXiv-2502.09775-b31b1b.svg)](https://arxiv.org/abs/2502.09775) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://yuhui-zh15.github.io/CellFlux/)
- **CheXWorld**, "CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning". [![arXiv](https://img.shields.io/badge/arXiv-2504.13820-b31b1b.svg)](http://arxiv.org/abs/2504.13820) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/LeapLabTHU/CheXWorld)
- **EchoWorld**: "EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance". [![arXiv](https://img.shields.io/badge/arXiv-2504.13065-b31b1b.svg)](https://arxiv.org/abs/2504.13065) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/LeapLabTHU/EchoWorld)
- **ODesign**, "ODesign: A World Model for Biomolecular Interaction Design". [![arXiv](https://img.shields.io/badge/arXiv-2510.22304-b31b1b.svg)](https://arxiv.org/pdf/2510.22304) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://odesign.lglab.ac.cn)
- [⭐️] **SFP**, "Spatiotemporal Forecasting as Planning: A Model-Based Reinforcement Learning Approach with Generative World Models". [![arXiv](https://img.shields.io/badge/arXiv-2510.04020-b31b1b.svg)](https://arxiv.org/abs/2510.04020)
- **Xray2Xray**, "Xray2Xray: World Model from Chest X-rays with Volumetric Context". [![arXiv](https://img.shields.io/badge/arXiv-2506.19055-b31b1b.svg)](https://arxiv.org/abs/2506.19055)
- [⭐️] **Medical World Model**: "Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning". [![arXiv](https://img.shields.io/badge/arXiv-2506.02327-b31b1b.svg)](https://arxiv.org/abs/2506.02327)
- **Surgical Vision World Model**, "Surgical Vision World Model". [![arXiv](https://img.shields.io/badge/arXiv-2503.02904-b31b1b.svg)](https://arxiv.org/abs/2503.02904)

Social Science:
- **Social World Models**, "Social World Models". [![arXiv](https://img.shields.io/badge/arXiv-2509.00559-b31b1b.svg)](https://arxiv.org/abs/2509.00559)
- "Social World Model-Augmented Mechanism Design Policy Learning". [![arXiv](https://img.shields.io/badge/arXiv-2510.19270-b31b1b.svg)](https://arxiv.org/abs/2510.19270)
- **SocioVerse**, "SocioVerse: A World Model for Social Simulation Powered by LLM Agents and A Pool of 10 Million Real-World Users". [![arXiv](https://img.shields.io/badge/arXiv-2504.10157-b31b1b.svg)](http://arxiv.org/abs/2504.10157) [![Code](https://img.shields.io/badge/Code-GitHub-green)](https://github.com/FudanDISC/SocioVerse)

* **Effectively Designing 2-Dimensional Sequence Models**: "Effectively Designing 2-Dimensional Sequence Models for Multivariate Time Series". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=ICLR.cc/2025/Workshop/World_Models#tab-accept) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://sites.google.com/view/worldmodel-iclr2025/accepted-papers)

* **A Virtual Reality-Integrated System**: "A Virtual Reality-Integrated System for Behavioral Analysis in Neurological Decline". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=ICLR.cc/2025/Workshop/World_Models#tab-accept) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://sites.google.com/view/worldmodel-iclr2025/accepted-papers)

* **TwinMarket**: "TwinMarket: A Scalable Behavioral and Social Simulation for Financial Markets". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=ICLR.cc/2025/Workshop/World_Models#tab-accept) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://sites.google.com/view/worldmodel-iclr2025/accepted-papers)

* **Latent Representation Encoding**: "Latent Representation Encoding and Multimodal Biomarkers for Post-Stroke Speech Assessment". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=ICLR.cc/2025/Workshop/World_Models#tab-accept) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://sites.google.com/view/worldmodel-iclr2025/accepted-papers)

* **Reconstructing Dynamics**: "Reconstructing Dynamics from Steady Spatial Patterns with Partial Observations". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=ICLR.cc/2025/Workshop/World_Models#tab-accept) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://sites.google.com/view/worldmodel-iclr2025/accepted-papers)

* **SP: Learning Physics from Sparse Observations**: "SP: Learning Physics from Sparse Observations — Three Pitfalls of PDE-Constrained Diffusion Models". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=ICML.cc/2025/Workshop/World_Models#tab-accept) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://physical-world-modeling.github.io/)

* **SP: Continuous Autoregressive Generation**: "SP: Continuous Autoregressive Generation with Mixture of Gaussians". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=ICML.cc/2025/Workshop/World_Models#tab-accept) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://physical-world-modeling.github.io/)

* **EquiReg**: "EquiReg: Symmetry-Driven Regularization for Physically Grounded Diffusion-based Inverse Solvers". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=ICML.cc/2025/Workshop/World_Models#tab-accept) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://physical-world-modeling.github.io/)

* **Neural Modular World Model**: "Neural Modular World Model". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=ICML.cc/2025/Workshop/World_Models#tab-accept) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://physical-world-modeling.github.io/)

* **Bidding for Influence**: "Bidding for Influence: Auction-Driven Diffusion Image Generation". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=ICML.cc/2025/Workshop/World_Models#tab-accept) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://physical-world-modeling.github.io/)

* **PINT**: "PINT: Physics-Informed Neural Time Series Models with Applications to Long-term Inference on WeatherBench 2m-Temperature Data". [![OpenReview](https://img.shields.io/badge/OpenReview-Paper-8E44AD.svg)](https://openreview.net/group?id=ICLR.cc/2025/Workshop/World_Models#tab-accept) [![Website](https://img.shields.io/badge/Website-Link-blue)](https://sites.google.com/view/worldmodel-iclr2025/accepted-papers)