HRI

HuBo-VLM: Unified Vision-Language Model designed for HUman roBOt interaction tasks

Zichao Dong, Weikun Zhang, Xufeng Huang, Hang Ji, Xin Zhan, Junbo Chen

Year of publication: 2023
Access: Open access

Abstract

Human-robot interaction is an exciting task that aims to guide robots to follow instructions from humans. Because a huge gap lies between human natural language and machine code, building end-to-end human-robot interaction models is fairly challenging. Further, the visual information received from a robot's sensors is also a hard language for the robot to perceive. In this work, HuBo-VLM is proposed to tackle perception tasks associated with human-robot interaction, including object detection and visual grounding, with a unified transformer-based vision-language model. Extensive experiments on the Talk2Car benchmark demonstrate the effectiveness of our approach. Code will be made publicly available at https://github.com/dzcgaara/HuBo-VLM.

Keywords

cs.RO, cs.CV
