Diffusion models for robotic manipulation: a survey

R. Wolf, Yitian Shi, Sheng Liu, Rania Rayyes

Year: 2025
Citations: 16

Abstract

Diffusion generative models have demonstrated remarkable success in visual domains such as image and video generation. They have also recently emerged as a promising approach in robotics, especially in robot manipulations. Diffusion models leverage a probabilistic framework, and they stand out with their ability to model multi-modal distributions and their robustness to high-dimensional input and output spaces. This survey provides a comprehensive review of state-of-the-art diffusion models in robotic manipulation, including grasp learning, trajectory planning, and data augmentation. Diffusion models for scene and image augmentation lie at the intersection of robotics and computer vision for vision-based tasks to enhance generalizability and data scarcity. This paper also presents the two main frameworks of diffusion models and their integration with imitation learning and reinforcement learning. In addition, it discusses the common architectures and benchmarks and points out the challenges and advantages of current state-of-the-art diffusion-based methods.

Keywords

Leverage (statistics)GRASPRobustness (evolution)RoboticsGeneralizability theoryProbabilistic logicReinforcement learningRobot

Diffusion models for robotic manipulation: a survey

Abstract

Keywords

Related papers

Artificial intelligence: a modern approach

Are we ready for autonomous driving? The KITTI vision benchmark suite

Self-Organizing Maps

Vision meets robotics: The KITTI dataset