Comparing Apples to Oranges: LLM-powered Multimodal Intention Prediction in an Object Categorization Task
Hassan Ali, Philipp Allgeuer, Stefan Wermter
- 发表年份
- 2024
- 访问权限
- 开放获取
摘要
Human intention-based systems enable robots to perceive and interpret user actions to interact with humans and adapt to their behavior proactively. Therefore, intention prediction is pivotal in creating a natural interaction with social robots in human-designed environments. In this paper, we examine using Large Language Models (LLMs) to infer human intention in a collaborative object categorization task with a physical robot. We propose a novel multimodal approach that integrates user non-verbal cues, like hand gestures, body poses, and facial expressions, with environment states and user verbal cues to predict user intentions in a hierarchical architecture. Our evaluation of five LLMs shows the potential for reasoning about verbal and non-verbal user cues, leveraging their context-understanding and real-world knowledge to support intention prediction while collaborating on a task with a social robot. Video: https://youtu.be/tBJHfAuzohI
关键词
相关论文
工业5.0中人机协作的多模态感知、互认知与具身执行综述与展望
Kai Ding, Qingyuan Mao, Yaqian Zhang 等 6 位作者
Robotics and Computer-Integrated Manufacturing · 2026
迈向以人为中心的制造:人机协作装配中不确定性下的任务规划
Yingchao You, Ze Ji, Changyun Wei
Robotics and Computer-Integrated Manufacturing · 2026
代理式人机协作:通过记忆实现上下文对齐
Jiahui Si, Wenchao Li, Xi Chen 等 7 位作者
Robotics and Computer-Integrated Manufacturing · 2026
自适应物理信息Transformer结合高斯过程残差补偿用于人机协作中的逆动力学建模
Rui Qian, Xi Zhang, Dongpeng Li 等 5 位作者
Robotics and Computer-Integrated Manufacturing · 2026