Awesome-Robotics-Diffusion
(In progress) A curated list of recent robot learning papers incorporating diffusion models for robotics tasks.
https://github.com/showlab/Awesome-Robotics-Diffusion
-
Task Objectives and Applications
- Spatial-Temporal Graph Diffusion Policy with Kinematic Modeling for Bimanual Robotic Manipulation
- Imitating Human Behaviour with Diffusion Models
- Memory-Consistent Neural Networks for Imitation Learning
- EDMP: Ensemble-of-costs-guided Diffusion for Motion Planning
- Differentiable Robot Rendering
- Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning
- RoLD: Robot Latent Diffusion for Multi-task Policy Modeling
- UMI on Legs: Making Manipulation Policies Mobile with Manipulation-Centric Whole-body Controllers
- Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning
- Waypoint-Based Imitation Learning for Robotic Manipulation
- Generalizable Humanoid Manipulation with Improved 3D Diffusion Policies
- Learning score-based grasping primitive for human-assisting dexterous grasping
- DexDiffuser: Generating Dexterous Grasps with Diffusion Models
- Dexterous Functional Pre-Grasp Manipulation with Diffusion Policy
- Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion
-
Robot Learning Utilizing Diffusion Model Properties
- Adaptive Online Replanning with Diffusion Models
- Diffusion reward: Learning rewards via conditional video diffusion
-
Diffusion as Policy
- Safe Flow Matching: Robot Motion Planning with Control Barrier Functions
- Consistency policy: Accelerated visuomotor policies via consistency distillation
- Equivariant Diffusion Policy
- Goal-conditioned imitation learning using score-based diffusion policies
- Diffskill: Improving Reinforcement Learning Through Diffusion-Based Skill Denoiser for Robotic Manipulation (Knowledge-Based Systems 2024)
- ChainedDiffuser: Unifying Trajectory Diffusion and Keypose Prediction for Robotic Manipulation
- SE(3)-DiffusionFields: Learning cost functions for joint grasp and motion optimization through diffusion
- Diffusion policy: Visuomotor policy learning via action diffusion
- Diffusion-VLA: Scaling Robot Foundation Models via Unified Diffusion and Autoregression
- Vision-Language-Affordance-based Robot Manipulation with Flow Matching
- RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
- EquiBot: SIM(3)-Equivariant Diffusion Policy for Generalizable and Data Efficient Learning
- SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution
- Hierarchical Diffusion Policy for Kinematics-Aware Multi-Task Robotic Manipulation
- XSkill: Cross Embodiment Skill Discovery
- Generative Skill Chaining: Long-Horizon Skill Planning with Diffusion Models
- 3D Diffuser Actor: Policy Diffusion with 3D Scene Representations
- 3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations
- RISE: 3D Perception Makes Real-World Robot Imitation Simple and Effective
- GenDP: 3D Semantic Fields for Category-Level Generalizable Diffusion Policy
- Track2Act: Predicting Point Tracks from Internet Videos enables Generalizable Robot Manipulation
- PlayFusion: Skill Acquisition via Diffusion from Language-Annotated Play
- Composable Part-Based Manipulation
- Shelving, stacking, hanging: Relational pose diffusion for multi-modal rearrangement
- ReorientDiff: Diffusion model based reorientation for object manipulation
- Planning-Guided Diffusion Policy Learning for Generalizable Contact-Rich Bimanual Manipulation
- Scaling Up and Distilling Down: Language-Guided Robot Skill Acquisition
-
Benchmarks
- Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning
- RLBench: The Robot Learning Benchmark & Learning Environment
- LIBERO: Benchmarking Knowledge Transfer in Lifelong Robot Learning
- BridgeData V2: A Dataset for Robot Learning at Scale
- CALVIN: A Benchmark for Language-conditioned Policy Learning for Long-horizon Robot Manipulation Tasks
- DexMV: Imitation Learning for Dexterous Manipulation from Human Videos
- Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations
- Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning
- Bridge data: Boosting generalization of robotic skills with cross-domain datasets
-
Diffusion Policy
-
Diffusion as Synthesizer
- This&That: Language-Gesture Controlled Video Generation for Robot Planning
- RoboDreamer: Learning Compositional World Models for Robot Imagination
- Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation
- Autonomous Improvement of Instruction Following Skills via Foundation Models
- DALL-E-Bot: Introducing Web-Scale Diffusion Models to Robotics (IEEE Robotics and Automation Letters 2023)
- Learning Universal Policies via Text-Guided Video Generation
- Diffusion Meets DAgger: Supercharging Eye-in-hand Imitation Learning
- CACTI: A framework for scalable multi-task multi-scene visual imitation learning
- GenAug: Retargeting behaviors to unseen situations via Generative Augmentation
- Learning to Act from Actionless Videos through Dense Correspondences
- UniSim: Learning Interactive Real-World Simulators
- Compositional Foundation Models for Hierarchical Planning
- Video language planning
- Dreamitate: Real-World Visuomotor Policy Learning via Video Generation
- ARDuP: Active Region Video Diffusion for Universal Policies
-
Diffusion Generation Models in Robot Learning
-