Real-Time 3D Motion Prediction for Human-Robot Collaboration via Bayesian-Optimized Diffusion Models

Sibo Tian, Minghui Zheng, Xiao Liang

Year: 2025
Citations: 2

Abstract

Human motion prediction is a cornerstone of human-robot collaboration (HRC), as robots need to infer future movements of human workers based on observed motion cues to proactively plan their actions, ensuring safety in close collaboration scenarios. The diffusion model has demonstrated remarkable performance in predicting high-quality human motion with reasonable diversity, but it suffers from a slow generative process that requires many model inference passes, hindering real-world applications. To enable real-time prediction, we propose training a one-step multi-layer perceptron-based (MLP-based) diffusion model using knowledge distillation and Bayesian optimization. Our method consists of two steps. First, we distill a pretrained diffusion-based motion predictor, TransFusion, directly into a one-step diffusion model with the same denoiser architecture. Then, to further reduce the inference time, we remove the computationally expensive components from the original denoiser and use knowledge distillation once again to distill the obtained one-step diffusion model into an even smaller model based solely on MLPs. Bayesian optimization is used to tune the hyperparameters for training the smaller diffusion model. To demonstrate the effectiveness of our model, we design a human-robot collaborative desktop disassembly task. The results showcase our model’s capability to forecast multiple realistic human motions in real time, addressing the uncertainty and multi-modality of human motion. Additional experimental studies are conducted on benchmark datasets to ensure fair comparisons with existing works, highlighting that our model significantly improves inference speed, achieving real-time prediction without noticeable degradation in performance. The project page is available at https://github.com/sibotian96/SwiftDiff.
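
The distillation idea in the abstract can be illustrated with a toy sketch. This is not the authors' TransFusion or SwiftDiff code: it assumes a scalar "motion" value instead of 3D poses, a hand-written multi-step teacher, a linear student in place of an MLP, and plain stochastic gradient descent with a fixed learning rate instead of Bayesian-optimized hyperparameters. All function names are hypothetical. The student learns to reproduce, in a single evaluation, what the teacher produces after many denoising steps from the same noise and conditioning input.

```python
import random


def teacher_denoise_step(x, cond):
    # Hypothetical denoiser: move x 10% of the way toward an assumed
    # conditional mean, 2 * cond + 1.
    return x + 0.1 * ((2.0 * cond + 1.0) - x)


def teacher_sample(cond, noise, num_steps=20):
    # Multi-step generation: start from noise and denoise repeatedly.
    x = noise
    for _ in range(num_steps):
        x = teacher_denoise_step(x, cond)
    return x


def student_sample(params, cond, noise):
    # One-step student: a single linear map of (cond, noise).
    w_cond, w_noise, bias = params
    return w_cond * cond + w_noise * noise + bias


def distill_one_step(num_iters=5000, lr=0.05, seed=0):
    # Fit the student to the teacher's final output with SGD on the
    # squared error, sampling a fresh (cond, noise) pair each iteration.
    rng = random.Random(seed)
    w_cond, w_noise, bias = 0.0, 0.0, 0.0
    for _ in range(num_iters):
        cond = rng.uniform(-1.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        target = teacher_sample(cond, noise)
        err = student_sample((w_cond, w_noise, bias), cond, noise) - target
        w_cond -= lr * err * cond
        w_noise -= lr * err * noise
        bias -= lr * err
    return w_cond, w_noise, bias


if __name__ == "__main__":
    params = distill_one_step()
    cond, noise = 0.5, -1.0
    print("teacher (20 steps):", teacher_sample(cond, noise))
    print("student (1 step):  ", student_sample(params, cond, noise))
```

Because the student receives the same noise as the teacher, different noise draws still yield different predictions, which is how a one-step model can keep the diversity of the multi-step sampler. In this toy setting the teacher's output happens to be exactly linear in its inputs, so the linear student can match it closely; a real motion predictor would need a more expressive student.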

Keywords

Inference, Benchmark (surveying), Hyperparameter, Motion (physics), Process (computing), Bayesian optimization, Bayesian inference, Bayesian probability, Motion capture
