HRI

Full-GRU Natural Language Video Description for Service Robotics Applications

Silvia Cascianelli, Gabriele Costante, Thomas A. Ciarfuglia, Paolo Valigi, Mario Luca Fravolini

Publication year
2018
Citations
35

Abstract

Enabling effective human-robot interaction is crucial for any service robotics application. In this context, a fundamental aspect is the development of a user-friendly human-robot interface, such as a natural language interface. In this letter, we investigate the robot side of the interface, in particular the ability to generate natural language descriptions for the scene it observes. We achieve this capability via a deep recurrent neural network architecture completely based on the gated recurrent unit paradigm. The robot is able to generate complete sentences describing the scene, dealing with the hierarchical nature of the temporal information contained in image sequences. The proposed approach has fewer parameters than previous state-of-the-art architectures, thus it is faster to train and smaller in memory occupancy. These benefits do not affect the prediction performance. In fact, we show that our method outperforms or is comparable to previous approaches in terms of quantitative metrics and qualitative evaluation when tested on benchmark publicly available datasets and on a new dataset we introduce in this letter.
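The claim about parameter count can be illustrated with some quick arithmetic. The sketch below assumes the comparison is against an LSTM layer of the same size, which the abstract does not state explicitly, and it uses hypothetical layer sizes and a single bias vector per gate. It counts only one recurrent layer, not the full architecture described in the letter.

```python
def recurrent_layer_params(input_size: int, hidden_size: int, num_gates: int) -> int:
    """Count weights and biases in one recurrent layer.

    Each gate block has an input-to-hidden matrix, a hidden-to-hidden
    matrix, and one bias vector (a simplifying assumption).
    """
    per_gate = hidden_size * input_size + hidden_size * hidden_size + hidden_size
    return num_gates * per_gate


GRU_GATES = 3   # update gate, reset gate, candidate state
LSTM_GATES = 4  # input gate, forget gate, output gate, cell candidate

# Hypothetical sizes chosen for illustration only; not taken from the letter.
input_size, hidden_size = 512, 512

gru = recurrent_layer_params(input_size, hidden_size, GRU_GATES)
lstm = recurrent_layer_params(input_size, hidden_size, LSTM_GATES)

print(f"GRU layer parameters:  {gru}")
print(f"LSTM layer parameters: {lstm}")
print(f"GRU / LSTM ratio:      {gru / lstm:.2f}")
```

Under these assumptions a GRU layer carries three quarters of the parameters of an equally sized LSTM layer. Some libraries keep two bias vectors per gate, which changes the totals but not the 3:4 ratio.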

Keywords

Computer science, Artificial intelligence, Robotics, Robot, Benchmark (surveying), Interface (matter), Natural language, Context (archaeology), Service (business), Service robot
