Open-Ended Multi-Modal Relational Reasoning for Video Question Answering

Haozheng Luo, Ruiyang Qin, Chenwei Xu, Guo Ye, Zening Luo

Publication year: 2020
Access: Open access

Abstract

In this paper, we introduce a robotic agent designed to analyze its external environment and answer participants' questions. The agent's primary purpose is to assist individuals through language-based interaction in video-based scenes. Our proposed method integrates video recognition technology and natural language processing models within the robotic agent. We investigate the crucial factors affecting human-robot interaction by examining pertinent issues that arise between participants and robotic agents. Our experimental findings reveal a positive relationship between trust and interaction efficiency. Furthermore, our model demonstrates a 2% to 3% performance improvement over other benchmark methods.

Keywords

cs.AI, cs.HC, cs.RO
