# Awesome-Physics-Cognition-based-Video-Generation

A comprehensive collection of papers on physical cognition in video generation, with links to code and project websites.

https://github.com/minnie-lin/Awesome-Physics-Cognition-based-Video-Generation
## Table of Contents

- [Surveys](#surveys)
- [Basic Schematic Perception for Generation](#basic-schematic-perception-for-generation)
- [Passive Cognition of Physical Knowledge for Generation](#passive-cognition-of-physical-knowledge-for-generation)
- [Benchmarks and Metrics](#benchmarks-and-metrics)
- [Active Cognition for World Simulation](#active-cognition-for-world-simulation)
## Surveys

| Paper | Code | Website | Venue / Date |
|---|---|---|---|
| [A Survey of Interactive Generative Video](https://arxiv.org/abs/2504.21853) | - | - | Apr., 2025 |
| [Grounding Creativity in Physics: A Brief Survey of Physical Priors in AIGC](https://arxiv.org/abs/2502.07007) | - | - | Feb., 2025 |
| [Generative Physical AI in Vision: A Survey](https://arxiv.org/abs/2501.10928) | [Code](https://github.com/BestJunYu/Awesome-Physics-aware-Generation) | - | Jan., 2025 |
| [Digital Gene: Learning about the Physical World through Analytic Concepts](https://arxiv.org/abs/2504.04170) | - | - | Apr., 2025 |
| [Simulating the Real World: A Unified Survey of Multimodal Generative Models](https://arxiv.org/abs/2503.04641) | [Code](https://github.com/ALEEEHU/World-Simulator) | - | Mar., 2025 |
| [Physics-Informed Computer Vision: A Review and Perspectives](https://arxiv.org/abs/2305.18035) | - | - | ACM Computing Surveys, 2024 |
## Basic Schematic Perception for Generation

| Paper | Code | Website | Venue / Date |
|---|---|---|---|
| [ReVision: High-Quality, Low-Cost Video Generation with Explicit 3D Physics Modeling for Complex Motion and Interaction](https://arxiv.org/abs/2504.21855) | - | [Project](https://revision-video.github.io/) | Apr., 2025 |
| [Motion Prompting: Controlling Video Generation with Motion Trajectories](https://arxiv.org/abs/2412.02700) | - | [Project](https://motion-prompting.github.io/) | CVPR, 2025 (Oral) |
| [MotionDirector: Motion Customization of Text-to-Video Diffusion Models](https://arxiv.org/abs/2310.08465) | [Code](https://github.com/showlab/MotionDirector) | [Project](https://showlab.github.io/MotionDirector/) | ECCV, 2024 (Oral) |
| [MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model](https://arxiv.org/abs/2405.20222) | [Code](https://github.com/MyNiuuu/MOFA-Video) | [Project](https://myniuuu.github.io/MOFA_Video/) | ECCV, 2024 |
| [DragAnything: Motion Control for Anything using Entity Representation](https://arxiv.org/abs/2403.07420) | [Code](https://github.com/showlab/DragAnything) | [Project](https://weijiawu.github.io/draganything_page/) | ECCV, 2024 |
| [TC4D: Trajectory-Conditioned Text-to-4D Generation](https://arxiv.org/abs/2403.17920) | [Code](https://github.com/sherwinbahmani/tc4d) | [Project](https://sherwinbahmani.github.io/tc4d/) | ECCV, 2024 |
| [Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation](https://arxiv.org/abs/2503.24379) | [Code](https://github.com/ChocoWu/Any2Caption) | [Project](https://sqwu.top/Any2Cap/) | Mar., 2025 |
| [Towards Physical Understanding in Video Generation: A 3D Point Regularization Approach](https://arxiv.org/abs/2502.03639) | - | [Project](https://snap-research.github.io/PointVidGen/) | Feb., 2025 |
| [SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation](https://arxiv.org/abs/2411.04989) | [Code](https://github.com/Kmcode1/SG-I2V) | [Project](https://kmcode1.github.io/Projects/SG-I2V/) | ICLR, 2025 |
| [TrackGo: A Flexible and Efficient Method for Controllable Video Generation](https://arxiv.org/abs/2408.11475) | - | - | AAAI, 2025 |
| [3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation](https://arxiv.org/abs/2412.07759) | [Code](https://github.com/KwaiVGI/3DTrajMaster) | [Project](https://fuxiao0719.github.io/projects/3dtrajmaster/) | ICLR, 2025 |
| [Generative Photography: Scene-Consistent Camera Control for Realistic Text-to-Image Synthesis](https://arxiv.org/abs/2412.02168) | [Code](https://github.com/pandayuanyu/generative-photography) | [Project](https://generative-photography.github.io/project/) | CVPR, 2025 |
| [Lux Post Facto: Learning Portrait Performance Relighting with Conditional Video Diffusion and a Hybrid Dataset](https://arxiv.org/abs/2503.14485) | - | [Project](https://www.eyelinestudios.com/research/luxpostfacto.html) | CVPR, 2025 |
| [Identity-Preserving Text-to-Video Generation by Frequency Decomposition](https://arxiv.org/abs/2411.17440) | [Code](https://github.com/PKU-YuanGroup/ConsisID) | [Project](https://pku-yuangroup.github.io/ConsisID/) | CVPR, 2025 |
| [Motion Modes: What Could Happen Next?](https://arxiv.org/abs/2412.00148) | - | [Project](https://motionmodes.github.io/) | CVPR, 2025 |
| [Spectral Motion Alignment for Video Motion Transfer using Diffusion Models](https://arxiv.org/abs/2403.15249) | [Code](https://github.com/geonyeong-park/Spectral-Motion-Alignment) | [Project](https://geonyeong-park.github.io/spectral-motion-alignment/) | AAAI, 2025 |
| [Video Creation by Demonstration](https://arxiv.org/abs/2412.09551) | - | [Project](https://delta-diffusion.github.io/) | Dec., 2024 |
| [InterDyn: Controllable Interactive Dynamics with Video Diffusion Models](https://arxiv.org/abs/2412.11785) | - | [Project](https://interdyn.is.tue.mpg.de/) | Dec., 2024 |
| [LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis](https://arxiv.org/abs/2412.15214) | [Code](https://github.com/qiuyu96/LeviTor) | [Project](https://ppetrichor.github.io/levitor.github.io/) | Dec., 2024 |
| [GenLit: Reformulating Single-Image Relighting as Video Generation](https://arxiv.org/abs/2412.11224) | - | [Project](https://genlit-probingi2v.github.io/) | Dec., 2024 |
| [Motion Dreamer: Realizing Physically Coherent Video Generation through Scene-Aware Motion Reasoning](https://arxiv.org/abs/2412.00547) | [Code](https://github.com/EnVision-Research/MotionDreamer) | [Project](https://envision-research.github.io/MotionDreamer/) | Nov., 2024 |
| [AnimateAnything: Consistent and Controllable Animation for Video Generation](https://arxiv.org/abs/2411.10836) | [Code](https://github.com/yu-shaonian/AnimateAnything) | [Project](https://yu-shaonian.github.io/Animate_Anything/) | Nov., 2024 |
| [InTraGen: Trajectory-controlled Video Generation for Object Interactions](https://arxiv.org/abs/2411.16804) | [Code](https://github.com/insait-institute/InTraGen) | - | Nov., 2024 |
| [DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control](https://arxiv.org/abs/2410.13830) | - | [Project](https://dreamvideo2.github.io/) | Oct., 2024 |
| [LumiSculpt: A Consistency Lighting Control Network for Video Generation](https://arxiv.org/abs/2410.22979) | - | - | Oct., 2024 |
| [Tora: Trajectory-oriented Diffusion Transformer for Video Generation](https://arxiv.org/abs/2407.21705) | [Code](https://github.com/alibaba/Tora) | [Project](https://ali-videoai.github.io/tora_video/) | Jul., 2024 |
| [UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation](https://arxiv.org/abs/2406.01188) | [Code](https://github.com/ali-vilab/UniAnimate) | [Project](https://unianimate.github.io/) | Jun., 2024 |
| [Image Conductor: Precision Control for Interactive Video Synthesis](https://arxiv.org/abs/2406.15339) | [Code](https://github.com/liyaowei-stu/ImageConductor) | [Project](https://liyaowei-stu.github.io/project/ImageConductor/) | Jun., 2024 |
| [Motion Inversion for Video Customization](https://arxiv.org/abs/2403.20193) | [Code](https://github.com/EnVision-Research/MotionInversion) | [Project](https://wileewang.github.io/MotionInversion/) | Mar., 2024 |
| [VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models](https://arxiv.org/abs/2312.00845) | [Code](https://github.com/HyeonHo99/Video-Motion-Customization) | [Project](https://video-motion-customization.github.io/) | CVPR, 2024 |
| [FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis](https://arxiv.org/abs/2312.17681) | - | [Project](https://jeff-liangf.github.io/projects/flowvid/) | CVPR, 2024 (Highlight) |
| [Generative Image Dynamics](https://arxiv.org/abs/2309.07906) | [Code](https://github.com/fltwr/generative-image-dynamics) | [Project](https://generative-dynamics.github.io/) | CVPR, 2024 (Best Paper Award) |
| [MotionCtrl: A Unified and Flexible Motion Controller for Video Generation](https://arxiv.org/abs/2312.03641) | [Code](https://github.com/TencentARC/MotionCtrl) | [Project](https://wzhouxiff.github.io/projects/MotionCtrl/) | SIGGRAPH, 2024 |
| [Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling](https://arxiv.org/abs/2401.15977) | [Code](https://github.com/G-U-N/Motion-I2V) | [Project](https://xiaoyushi97.github.io/Motion-I2V/) | SIGGRAPH, 2024 |
| [FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing](https://arxiv.org/abs/2310.05922) | [Code](https://github.com/yrcong/flatten) | [Project](https://flatten-video-editing.github.io/) | ICLR, 2024 |
| [Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation](https://arxiv.org/abs/2311.17117) | [Code](https://github.com/HumanAIGC/AnimateAnyone) | [Project](https://humanaigc.github.io/animate-anyone/) | CVPR, 2024 |
| [Direct-a-Video: Customized Video Generation with User-Directed Camera Movement and Object Motion](https://arxiv.org/abs/2402.03162) | [Code](https://github.com/ysy31415/direct_a_video) | [Project](https://direct-a-video.github.io/) | SIGGRAPH, 2024 |
| [Control-A-Video: Controllable Text-to-Video Diffusion Models with Motion Prior and Reward Feedback Learning](https://arxiv.org/abs/2305.13840) | [Code](https://github.com/Weifeng-Chen/control-a-video) | [Project](https://controlavideo.github.io/) | May, 2023 |
| [VideoComposer: Compositional Video Synthesis with Motion Controllability](https://arxiv.org/abs/2306.02018) | [Code](https://github.com/ali-vilab/videocomposer) | [Project](https://videocomposer.github.io/) | NeurIPS, 2023 |
| [Adding Conditional Control to Text-to-Image Diffusion Models](https://arxiv.org/abs/2302.05543) | [Code](https://github.com/lllyasviel/ControlNet) | - | ICCV, 2023 (Best Paper Award) |
## Passive Cognition of Physical Knowledge for Generation

| Paper | Code | Website | Venue / Date |
|---|---|---|---|
| VLIPP: Towards Physically Plausible Video Generation with Vision and Language Informed Physical Prior | [Code](https://github.com/Madaoer/VLIPP) | [Project](https://madaoer.github.io/projects/physically_plausible_video_generation/) | Mar., 2025 |
| [PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation](https://arxiv.org/abs/2404.13026) | [Code](https://github.com/a1600012888/PhysDreamer) | [Project](https://physdreamer.github.io/) | ECCV, 2024 (Oral) |
| [Feature Splatting: Language-Driven Physics-Based Scene Synthesis and Editing](https://arxiv.org/abs/2404.01223) | [Code](https://github.com/vuer-ai/feature-splatting-inria) | [Project](https://feature-splatting.github.io/) | ECCV, 2024 |
| [Articulated Kinematics Distillation from Video Diffusion Models](https://arxiv.org/abs/2504.01204) | - | [Project](https://research.nvidia.com/labs/dir/akd/) | CVPR, 2025 |
| [RainyGS: Efficient Rain Synthesis with Physically-Based Gaussian Splatting](https://arxiv.org/abs/2503.21442) | - | [Project](https://pku-vcl-geometry.github.io/RainyGS/) | CVPR, 2025 |
| [PhysGen3D: Crafting a Miniature Interactive World from a Single Image](https://arxiv.org/abs/2503.20746) | [Code](https://github.com/by-luckk/PhysGen3D) | [Project](https://by-luckk.github.io/PhysGen3D/) | CVPR, 2025 |
| [AccidentSim: Generating Physically Realistic Vehicle Collision Videos from Real-World Accident Reports](https://arxiv.org/abs/2503.20654) | - | [Project](https://accidentsim.github.io/) | Mar., 2025 |
| [Synthetic Video Enhances Physical Fidelity in Video Synthesis](https://arxiv.org/abs/2503.20822) | - | [Project](https://kevinz8866.github.io/simulation/) | Mar., 2025 |
| [PhysTwin: Physics-Informed Reconstruction and Simulation of Deformable Objects from Videos](https://arxiv.org/abs/2503.17973) | [Code](https://github.com/Jianghanxiao/PhysTwin) | [Project](https://jianghanxiao.github.io/phystwin-web/) | Mar., 2025 |
| [PhysAnimator: Physics-Guided Generative Cartoon Animation](https://arxiv.org/abs/2501.16550) | - | - | Jan., 2025 |
| [OmniPhysGS: 3D Constitutive Gaussians for General Physics-Based Dynamics Generation](https://arxiv.org/abs/2501.18982) | - | - | ICLR, 2025 |
| [Unleashing the Potential of Multi-modal Foundation Models and Video Diffusion for 4D Dynamic Physical Scene Simulation](https://arxiv.org/abs/2411.14423) | - | [Project](https://zhuomanliu.github.io/PhysFlow/) | CVPR, 2025 |
| [AutoVFX: Physically Realistic Video Editing from Natural Language Instructions](https://arxiv.org/abs/2411.02394) | [Code](https://github.com/haoyuhsu/autovfx) | [Project](https://haoyuhsu.github.io/autovfx-website/) | 3DV, 2025 |
| [Gaussian Splashing: Unified Particles for Versatile Motion Synthesis and Rendering](https://arxiv.org/abs/2401.15318) | - | [Project](https://gaussiansplashing.github.io/) | CVPR, 2025 |
| [FluidNexus: 3D Fluid Reconstruction and Prediction from a Single Video](https://arxiv.org/abs/2503.04720) | - | [Project](https://yuegao.me/FluidNexus/) | CVPR, 2025 |
| [DreamPhysics: Learning Physical Properties of Dynamic 3D Gaussians with Video Diffusion Priors](https://arxiv.org/abs/2406.01476) | [Code](https://github.com/tyhuang0428/DreamPhysics) | - | AAAI, 2025 |
| [GauSim: Registering Elastic Objects into Digital World by Gaussian Simulator](https://arxiv.org/abs/2412.17804) | - | [Project](https://www.mmlab-ntu.com/project/gausim/index.html) | Dec., 2024 |
| [GaussianProperty: Integrating Physical Properties to 3D Gaussians with LMMs](https://arxiv.org/abs/2412.11258) | [Code](https://github.com/xxlbigbrother/Gaussian-Property) | [Project](https://gaussian-property.github.io/) | Dec., 2024 |
| [Phys4DGen: A Physics-Driven Framework for Controllable and Efficient 4D Content Generation from a Single Image](https://arxiv.org/abs/2411.16800) | - | [Project](https://jiajinglin.github.io/Phys4DGen/) | Nov., 2024 |
| [Automated 3D Physical Simulation of Open-world Scene with Gaussian Splatting](https://arxiv.org/abs/2411.12789) | - | [Project](https://sim-gs.github.io/) | Nov., 2024 |
| [Enhancing Sketch Animation: Text-to-Video Diffusion Models with Temporal Consistency and Rigidity Constraints](https://arxiv.org/abs/2411.19381) | - | - | Nov., 2024 |
| [PhysMotion: Physics-Grounded Dynamics From a Single Image](https://arxiv.org/abs/2411.17189) | - | [Project](https://supertan0204.github.io/physmotion_website/) | Nov., 2024 |
| [Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis](https://arxiv.org/abs/2410.07155) | [Code](https://github.com/YangLing0818/Trans4D) | - | Oct., 2024 |
| [Phy124: Fast Physics-Driven 4D Content Generation from a Single Image](https://arxiv.org/abs/2409.07179) | - | - | Sep., 2024 |
| [Kubrick: Multimodal Agent Collaborations for Synthetic Video Generation](https://arxiv.org/abs/2408.10453) | - | - | Aug., 2024 |
| [Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion](https://arxiv.org/abs/2406.04338) | [Code](https://github.com/liuff19/Physics3D) | [Project](https://liuff19.github.io/Physics3D/) | Jun., 2024 |
| [Sync4D: Video Guided Controllable Dynamics for Physics-Based 4D Generation](https://arxiv.org/abs/2405.16849) | - | [Project](https://sync4dphys.github.io/) | May, 2024 |
| [ElastoGen: 4D Generative Elastodynamics](https://arxiv.org/abs/2405.15056) | - | [Project](https://anunrulybunny.github.io/elastogen/) | May, 2024 |
| [MotionCraft: Physics-based Zero-Shot Video Generation](https://arxiv.org/abs/2405.13557) | [Code](https://github.com/mezzelfo/MotionCraft) | [Project](https://mezzelfo.github.io/MotionCraft/) | NeurIPS, 2024 |
| [PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation](https://arxiv.org/abs/2409.18964) | [Code](https://github.com/stevenlsw/physgen) | [Project](https://stevenlsw.github.io/physgen/) | ECCV, 2024 |
| [Video2Game: Real-time, Interactive, Realistic and Browser-Compatible Environment from a Single Video](https://arxiv.org/abs/2404.09833) | [Code](https://github.com/video2game/video2game) | [Project](https://video2game.github.io/) | CVPR, 2024 |
| [PIE-NeRF: Physics-based Interactive Elastodynamics with NeRF](https://arxiv.org/abs/2311.13099) | [Code](https://github.com/FYTalon/pienerf) | [Project](https://fytalon.github.io/pienerf/) | CVPR, 2024 |
| [VR-GS: A Physical Dynamics-Aware Interactive Gaussian Splatting System in Virtual Reality](https://arxiv.org/abs/2401.16663) | - | [Project](https://yingjiang96.github.io/VR-GS/) | SIGGRAPH, 2024 |
| [PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics](https://arxiv.org/abs/2311.12198) | [Code](https://github.com/XPandora/PhysGaussian) | [Project](https://xpandora.github.io/PhysGaussian/) | CVPR, 2024 |
| [Neural Material Adaptor for Visual Grounding of Intrinsic Dynamics](https://arxiv.org/abs/2410.08257) | [Code](https://github.com/XJay18/NeuMA) | [Project](https://xjay18.github.io/projects/neuma.html) | NeurIPS, 2024 |
| [Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis](https://arxiv.org/abs/2308.09713) | [Code](https://github.com/JonathonLuiten/Dynamic3DGaussians) | [Project](https://dynamic3dgaussians.github.io/) | 3DV, 2024 |
| [LLM-grounded Video Diffusion Models](https://arxiv.org/abs/2309.17444) | [Code](https://github.com/TonyLianLong/LLM-groundedVideoDiffusion) | [Project](https://llm-grounded-video-diffusion.github.io/) | ICLR, 2024 |
| [Compositional 3D-aware Video Generation with LLM Director](https://arxiv.org/abs/2409.00558) | - | [Project](https://www.microsoft.com/en-us/research/project/compositional-3d-aware-video-generation/) | NeurIPS, 2024 |
| [GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning](https://arxiv.org/abs/2311.12631) | [Code](https://github.com/jiaxilv/GPT4Motion) | [Project](https://gpt4motion.github.io/) | CVPR Workshop, 2024 |
| [DeformGS: Scene Flow in Highly Deformable Scenes for Deformable Object Manipulation](https://arxiv.org/abs/2312.00583) | [Code](https://github.com/momentum-robotics-lab/deformgs) | [Project](https://deformgs.github.io/) | WAFR, 2024 |
| [Learning Neural Constitutive Laws From Motion Observations for Generalizable PDE Dynamics](https://arxiv.org/abs/2304.14369) | [Code](https://github.com/PingchuanMa/NCLaw) | [Project](https://sites.google.com/view/nclaw) | ICML, 2023 |
| [PAC-NeRF: Physics Augmented Continuum Neural Radiance Fields for Geometry-Agnostic System Identification](https://arxiv.org/abs/2303.05512) | [Code](https://github.com/xuan-li/PAC-NeRF) | [Project](https://sites.google.com/view/PAC-NeRF) | ICLR, 2023 (Spotlight) |
| [C-Drag: Chain-of-Thought Driven Motion Controller for Video Generation](https://arxiv.org/abs/2502.19868) | [Code](https://github.com/WesLee88524/C-Drag-Official-Repo) | - | Feb., 2025 |
| [Reasoning Physical Video Generation with Diffusion Timestep Tokens via Reinforcement Learning](https://arxiv.org/abs/2504.15932) | - | - | Apr., 2025 |
| [Teaching Video Diffusion Model with Latent Physical Phenomenon Knowledge](https://arxiv.org/abs/2411.11343) | [Code](https://github.com/caoql98/TVML) | [Project](https://qinglongcao.xyz/TVML-Diffusion.github.io/) | Nov., 2024 |
## Benchmarks and Metrics

| Paper | Code | Website | Venue / Date |
|---|---|---|---|
| [T2VPhysBench: A First-Principles Benchmark for Physical Consistency in Text-to-Video Generation](https://arxiv.org/abs/2505.00337) | - | - | May, 2025 |
| [Direct Motion Models for Assessing Generated Videos](https://arxiv.org/abs/2505.00209) | [Code](https://github.com/google-deepmind/tapnet) | [Project](https://trajan-paper.github.io/) | Apr., 2025 |
| [Morpheus: Benchmarking Physical Reasoning of Video Generative Models with Real Physical Experiments](https://arxiv.org/abs/2504.02918) | - | - | Apr., 2025 |
| [WorldScore: A Unified Evaluation Benchmark for World Generation](https://arxiv.org/abs/2504.00983) | [Code](https://github.com/haoyi-duan/WorldScore) | [Project](https://haoyi-duan.github.io/WorldScore/) | Apr., 2025 |
| [HOIGen-1M: A Large-scale Dataset for Human-Object Interaction Video Generation](https://arxiv.org/abs/2503.23715) | - | [Project](https://liuqi-creat.github.io/HOIGen.github.io/) | CVPR, 2025 |
| [Cognitive Science-Inspired Evaluation of Core Capabilities for Object Understanding in AI](https://arxiv.org/abs/2503.21668) | - | - | Mar., 2025 |
| [Impossible Videos](https://arxiv.org/abs/2503.14378) | [Code](https://github.com/showlab/Impossible-Videos) | [Project](https://showlab.github.io/Impossible-Videos/) | Mar., 2025 |
| [VideoPhy-2: A Challenging Action-Centric Physical Commonsense Evaluation in Video Generation](https://arxiv.org/abs/2503.06800) | [Code](https://github.com/Hritikbansal/videophy) | [Project](https://videophy2.github.io/) | Mar., 2025 |
| [A Physical Coherence Benchmark for Evaluating Video Generation Models via Optical Flow-Guided Frame Prediction](https://arxiv.org/abs/2502.05503) | [Code](https://github.com/Jeckinchen/PhyCoBench) | - | Feb., 2025 |
| [ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation](https://arxiv.org/abs/2406.18522) | [Code](https://github.com/PKU-YuanGroup/ChronoMagic-Bench) | [Project](https://pku-yuangroup.github.io/ChronoMagic-Bench/) | NeurIPS, 2024 (Spotlight) |
| [What You See Is What Matters: A Novel Visual and Physics-Based Metric for Evaluating Video Generation Quality](https://arxiv.org/abs/2411.13609) | - | - | Nov., 2024 |
| [Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation](https://arxiv.org/abs/2410.05363) | [Code](https://github.com/OpenGVLab/PhyGenBench) | [Project](https://phygenbench123.github.io/) | Oct., 2024 |
| [WorldSimBench: Towards Video Generation Models as World Simulators](https://arxiv.org/abs/2410.18072) | - | [Project](https://iranqin.github.io/WorldSimBench.github.io/) | Oct., 2024 |
| [PhyBench: A Physical Commonsense Benchmark for Evaluating Text-to-Image Models](https://arxiv.org/abs/2406.11802) | - | - | Jun., 2024 |
| [VideoPhy: Evaluating Physical Commonsense for Video Generation](https://arxiv.org/abs/2406.03520) | [Code](https://github.com/Hritikbansal/videophy) | [Project](https://videophy.github.io/) | Jun., 2024 |
| [VideoCon: Robust Video-Language Alignment via Contrast Captions](https://arxiv.org/abs/2311.10111) | [Code](https://github.com/Hritikbansal/videocon) | [Project](https://video-con.github.io/) | CVPR, 2024 |
| [Physion++: Evaluating Physical Scene Understanding That Requires Online Inference of Different Physical Properties](https://arxiv.org/abs/2306.15668) | - | [Project](https://dingmyu.github.io/physion_v2/) | NeurIPS, 2023 |
| [CRAFT: A Benchmark for Causal Reasoning About Forces and Interactions](https://arxiv.org/abs/2012.04293) | [Code](https://github.com/hucvl/craft) | [Project](https://sites.google.com/view/craft-benchmark) | ACL, 2022 |
| [Physion: Evaluating Physical Prediction from Vision in Humans and Machines](https://arxiv.org/abs/2106.08261) | [Code](https://github.com/cogtoolslab/physics-benchmarking-neurips2021) | [Project](https://physion-benchmark.github.io/) | NeurIPS, 2021 |
| [PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop](https://arxiv.org/abs/2503.09595) | [Code](https://github.com/vision-x-nyu/pisa-experiments) | - | Mar., 2025 |
| [Do Generative Video Models Learn Physical Principles from Watching Videos?](https://arxiv.org/abs/2501.09038) | [Code](https://github.com/google-deepmind/physics-IQ-benchmark) | [Project](https://physics-iq.github.io/) | Jan., 2025 |
| [LLMPhy: Complex Physical Reasoning Using Large Language Models and World Models](https://arxiv.org/abs/2411.08027) | - | - | Nov., 2024 |
| [WISA: World Simulator Assistant for Physics-Aware Text-to-Video Generation](https://arxiv.org/abs/2503.08153) | [Code](https://github.com/360CVGroup/WISA) | [Project](https://360cvgroup.github.io/WISA/) | Mar., 2025 |
## Active Cognition for World Simulation

| Paper | Code | Website | Venue / Date |
|---|---|---|---|
| [Cosmos World Foundation Model Platform for Physical AI](https://arxiv.org/abs/2501.03575) | [Code](https://github.com/nvidia-cosmos/cosmos-predict1) | [Project](https://www.nvidia.com/en-us/ai/cosmos/) | Jan., 2025 |
| [Aether: Geometric-Aware Unified World Modeling](https://arxiv.org/abs/2503.18945) | [Code](https://github.com/OpenRobotLab/Aether) | [Project](https://aether-world.github.io/) | Mar., 2025 |
| [AdaWorld: Learning Adaptable World Models with Latent Actions](https://arxiv.org/abs/2503.18938) | [Code](https://github.com/Little-Podi/AdaWorld) | [Project](https://adaptable-world-model.github.io/) | Mar., 2025 |
| [IPO: Iterative Preference Optimization for Text-to-Video Generation](https://arxiv.org/abs/2502.02088) | [Code](https://github.com/SAIS-FUXI/IPO) | - | Feb., 2025 |
| [Improving Video Generation with Human Feedback](https://arxiv.org/abs/2501.13918) | [Code](https://github.com/KwaiVGI/VideoAlign) | [Project](https://gongyeliu.github.io/videoalign/) | Jan., 2025 |
| [PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation](https://arxiv.org/abs/2412.00596) | [Code](https://github.com/pittisl/PhyT2V) | - | CVPR, 2025 |
| [Dream to Manipulate: Compositional World Models Empowering Robot Imitation Learning with Imagination](https://arxiv.org/abs/2412.14957) | [Code](https://github.com/leobarcellona/drema_code) | [Project](https://dreamtomanipulate.github.io/) | ICLR, 2025 |
| [MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators](https://arxiv.org/abs/2404.05014) | [Code](https://github.com/PKU-YuanGroup/MagicTime) | [Project](https://pku-yuangroup.github.io/MagicTime/) | TPAMI, 2025 |
| [ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation](https://arxiv.org/abs/2403.08321) | - | [Project](https://guanxinglu.github.io/ManiGaussian/) | ECCV, 2024 |
| [Improving Dynamic Object Interactions in Text-to-Video Generation with AI Feedback](https://arxiv.org/abs/2412.02617) | - | [Project](https://sites.google.com/view/aif-dynamic-t2v/) | Dec., 2024 |
| [Physical Informed Driving World Model](https://arxiv.org/abs/2412.08410) | - | [Project](https://metadrivescape.github.io/papers_project/DrivePhysica/page.html) | Dec., 2024 |
| [How Far is Video Generation from World Model: A Physical Law Perspective](https://arxiv.org/abs/2411.02385) | [Code](https://github.com/phyworld/phyworld) | [Project](https://phyworld.github.io/) | Nov., 2024 |
| [VideoAgent: Self-Improving Video Generation](https://arxiv.org/abs/2410.10076) | [Code](https://github.com/Video-as-Agent/VideoAgent) | [Project](https://video-as-agent.github.io/) | Oct., 2024 |
| [Gen-Drive: Enhancing Diffusion Generative Driving Policies with Reward Modeling and Reinforcement Learning Fine-Tuning](https://arxiv.org/abs/2410.05582) | - | [Project](https://mczhi.github.io/GenDrive/) | Oct., 2024 |
| [DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation](https://arxiv.org/abs/2410.13571) | [Code](https://github.com/GigaAI-research/DriveDreamer4D) | [Project](https://drivedreamer4d.github.io/) | Oct., 2024 |
| [Open-Sora: Democratizing Efficient Video Production for All](https://arxiv.org/abs/2412.20404) | [Code](https://github.com/hpcaitech/Open-Sora) | [Project](https://hpcaitech.github.io/Open-Sora/) | Dec., 2024 |
| [Imagen 3](https://arxiv.org/abs/2408.07009) | - | [Project](https://deepmind.google/technologies/imagen-3/) | Aug., 2024 |
| [Genie: Generative Interactive Environments](https://arxiv.org/abs/2402.15391) | - | [Project](https://sites.google.com/view/genie-2024/?pli=1) | Feb., 2024 |
| [WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens](https://arxiv.org/abs/2401.09985) | [Code](https://github.com/JeffWang987/WorldDreamer) | [Project](https://world-dreamer.github.io/) | Jan., 2024 |
| [Learning Interactive Real-World Simulators](https://arxiv.org/abs/2310.06114) | - | [Project](https://universal-simulator.github.io/unisim/) | ICLR, 2024 (Outstanding Paper Award) |
| Physically Embodied Gaussian Splatting: A Visually Learnt and Physically Grounded 3D Representation for Robotics | - | [Project](https://embodied-gaussians.github.io/) | CoRL, 2024 |
| [GAIA-1: A Generative World Model for Autonomous Driving](https://arxiv.org/abs/2309.17080) | - | [Project](https://wayve.ai/thinking/introducing-gaia1/) | Sep., 2023 |
| [Science-T2I: Addressing Scientific Illusions in Image Synthesis](https://arxiv.org/abs/2504.13129) | [Code](https://github.com/Jialuo-Li/Science-T2I) | [Project](https://jialuo-li.github.io/Science-T2I-Web/) | CVPR, 2025 |
| [MirrorVerse: Pushing Diffusion Models to Realistically Reflect the World](https://arxiv.org/abs/2504.15397) | - | [Project](https://mirror-verse.github.io/) | CVPR, 2025 |