Human-to-Robot Interaction: Learning from Video Demonstration for Robot Imitation
Thanh Nguyen Canh, Thanh-Tuan Tran, Haolan Zhang, Ziyan Gao, Nak Young Chong, Xiem HoangVan
- 发表年份
- 2026
- 访问权限
- 开放获取
摘要
Learning from Demonstration (LfD) offers a promising paradigm for robot skill acquisition. Recent approaches attempt to extract manipulation commands directly from video demonstrations, yet face two critical challenges: (1) general video captioning models prioritize global scene features over task-relevant objects, producing descriptions unsuitable for precise robotic execution, and (2) end-to-end architectures coupling visual understanding with policy learning require extensive paired datasets and struggle to generalize across objects and scenarios. To address these limitations, we propose a novel ``Human-to-Robot'' imitation learning pipeline that enables robots to acquire manipulation skills directly from unstructured video demonstrations, inspired by the human ability to learn by watching and imitating. Our key innovation is a modular framework that decouples the learning process into two distinct stages: (1) Video Understanding, which combines Temporal Shift Modules (TSM) with Vision-Language Models (VLMs) to extract actions and identify interacted objects, and (2) Robot Imitation, which employs TD3-based deep reinforcement learning to execute the demonstrated manipulations. We validated our approach in PyBullet simulation environments with a UR5e manipulator and in a real-world experiment with a UF850 manipulator across four fundamental actions: reach, pick, move, and put. For video understanding, our method achieves 89.97% action classification accuracy and BLEU-4 scores of 0.351 on standard objects and 0.265 on novel objects, representing improvements of 76.4% and 128.4% over the best baseline, respectively. For robot manipulation, our framework achieves an average success rate of 87.5% across all actions, with 100% success on reaching tasks and up to 90% on complex pick-and-place operations. The project website is available at https://thanhnguyencanh.github.io/LfD4hri.
关键词
相关论文
面向大型复杂构件的移动机器人辅助磨削技术综述
Yusen Li, Ziwei Wang, Xiangye Zhu 等 12 位作者
Robotics and Computer-Integrated Manufacturing · 2026
基于物理信息与机器学习的五轴铣削TC4钛合金刀具磨损融合预测模型
Shaoqing Qin, Lida Zhu, Yanpeng Hao 等 10 位作者
Robotics and Computer-Integrated Manufacturing · 2026
通过新型压电主动阻尼刀柄提升机器人铣削质量
Bo Li, Yuanbo Zhao, Huijie Xiao 等 6 位作者
Robotics and Computer-Integrated Manufacturing · 2026
一种利用磁致非线性宽带多向被动减振器抑制机器人铣削低频颤振的新方法
Hao Li, Yuhui Yu, Rui Fu 等 6 位作者
Robotics and Computer-Integrated Manufacturing · 2026