Enhancing Speech Instruction Understanding and Disambiguation in Robotics via Speech Prosody
David Sasu, Kweku Andoh Yamoah, Benedict Quartey, Natalie Schluter
- 发表年份
- 2025
- 访问权限
- 开放获取
摘要
Enabling robots to accurately interpret and execute spoken language instructions is essential for effective human-robot collaboration. Traditional methods rely on speech recognition to transcribe speech into text, often discarding crucial prosodic cues needed for disambiguating intent. We propose a novel approach that directly leverages speech prosody to infer and resolve instruction intent. Predicted intents are integrated into large language models via in-context learning to disambiguate and select appropriate task plans. Additionally, we present the first ambiguous speech dataset for robotics, designed to advance research in speech disambiguation. Our method achieves 95.79% accuracy in detecting referent intents within an utterance and determines the intended task plan of ambiguous instructions with 71.96% accuracy, demonstrating its potential to significantly improve human-robot communication.
关键词
相关论文
工业5.0中人机协作的多模态感知、互认知与具身执行综述与展望
Kai Ding, Qingyuan Mao, Yaqian Zhang 等 6 位作者
Robotics and Computer-Integrated Manufacturing · 2026
迈向以人为中心的制造:人机协作装配中不确定性下的任务规划
Yingchao You, Ze Ji, Changyun Wei
Robotics and Computer-Integrated Manufacturing · 2026
代理式人机协作:通过记忆实现上下文对齐
Jiahui Si, Wenchao Li, Xi Chen 等 7 位作者
Robotics and Computer-Integrated Manufacturing · 2026
自适应物理信息Transformer结合高斯过程残差补偿用于人机协作中的逆动力学建模
Rui Qian, Xi Zhang, Dongpeng Li 等 5 位作者
Robotics and Computer-Integrated Manufacturing · 2026