HRI

Full-GRU Natural Language Video Description for Service Robotics Applications

Silvia Cascianelli, Gabriele Costante, Thomas A. Ciarfuglia, Paolo Valigi, Mario Luca Fravolini

Year of publication: 2018
Citations: 35

Abstract

Enabling effective human-robot interaction is crucial for any service robotics application. In this context, a fundamental aspect is the development of a user-friendly human-robot interface, such as a natural language interface. In this letter, we investigate the robot side of the interface, in particular the ability to generate natural language descriptions for the scene it observes. We achieve this capability via a deep recurrent neural network architecture completely based on the gated recurrent unit paradigm. The robot is able to generate complete sentences describing the scene, dealing with the hierarchical nature of the temporal information contained in image sequences. The proposed approach has fewer parameters than previous state-of-the-art architectures, thus it is faster to train and smaller in memory occupancy. These benefits do not affect the prediction performance. In fact, we show that our method outperforms or is comparable to previous approaches in terms of quantitative metrics and qualitative evaluation when tested on benchmark publicly available datasets and on a new dataset we introduce in this letter.
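The abstract's claim of fewer parameters can be illustrated with a rough count. The sketch below assumes that the previous architectures being compared are LSTM-based, which the abstract does not state, and it uses the standard textbook formulation of both cells with one bias vector per gate. The function names and the layer sizes are hypothetical and are not taken from the letter.

```python
def gru_param_count(input_size: int, hidden_size: int) -> int:
    """Parameters in one standard GRU layer.

    A GRU has 3 weight blocks (update gate, reset gate, candidate state).
    Each block holds input weights, recurrent weights, and one bias vector.
    Some libraries keep two bias vectors per block, which adds a small
    constant and does not change the 3-versus-4 ratio.
    """
    per_block = hidden_size * (input_size + hidden_size) + hidden_size
    return 3 * per_block


def lstm_param_count(input_size: int, hidden_size: int) -> int:
    """Parameters in one standard LSTM layer.

    An LSTM has 4 weight blocks (input, forget, and output gates,
    plus the cell candidate), each shaped like a GRU block.
    """
    per_block = hidden_size * (input_size + hidden_size) + hidden_size
    return 4 * per_block


if __name__ == "__main__":
    # Hypothetical sizes chosen for illustration only.
    input_size, hidden_size = 2048, 1000
    gru = gru_param_count(input_size, hidden_size)
    lstm = lstm_param_count(input_size, hidden_size)
    print(f"GRU layer:  {gru:,} parameters")
    print(f"LSTM layer: {lstm:,} parameters")
    print(f"GRU / LSTM ratio: {gru / lstm:.2f}")
```

Under these assumptions a GRU layer carries three quarters of the parameters of an LSTM layer of the same size, which is consistent with the smaller memory footprint the abstract reports. The actual savings in the letter depend on its full architecture, which the abstract does not detail.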

Keywords

Computer science, Artificial intelligence, Robotics, Robot, Benchmark (surveying), Interface (matter), Natural language, Context (archaeology), Service (business), Service robot
