Object Graph Networks for Spatial Language Grounding
Philip Hawkins, Frédéric Maire, Simon Denman, Mahsa Baktashmotlagh
- Publication year: 2019
- Citation count: 2
Abstract
Consider a domestic robot being asked to pick up "the cup nearest to the plate". Natural language is an intuitive way for humans to interact with robots. However, enabling robots to comprehend natural language and correctly interpret spatial references is challenging for two reasons. Firstly, phrases must be semantically represented in structures that can be processed computationally; secondly, correspondences must be found to map these structures to models that represent objects, relationships and actions in the environment. Recently, neural networks have demonstrated strong potential to address both challenges, most notably in the context of Visual Question Answering (VQA), where they have performed well at answering natural language questions about images. However, the state-of-the-art networks for VQA tasks are not directly applicable to robotic applications: they do not support interfaces suitable for integration with a robotic system, and most have a limited capacity to interpret spatial phrases. In this paper, we present a neural network architecture trained on synthetic data and evaluated on both synthetic and real data. It correctly interprets referring spatial relationships in phrases such as the one above and provides a modular interface that allows a robot to localise an object in the environment from such a phrase.