Object Graph Networks for Spatial Language Grounding

Philip Hawkins, Frédéric Maire, Simon Denman, Mahsa Baktashmotlagh

Year: 2019
Citations: 2

Abstract

Consider a domestic robot being asked to pick up "the cup nearest to the plate". Natural language is an intuitive way for humans to interact with robots. However, enabling robots to comprehend natural language and correctly interpret spatial references is challenging for two reasons. Firstly, phrases must be semantically represented in structures that can be processed computationally; secondly, correspondences must be found to map these structures to models that represent objects, relationships and actions in the environment. Recently, neural networks have demonstrated strong potential to address both challenges, most notably in the context of Visual Question Answering (VQA), where they have performed well at answering natural language questions about images. However, the state-of-the-art networks for VQA tasks are not directly applicable to robotic applications: they do not support interfaces suitable for integration with a robotic system, and most have a limited capacity to interpret spatial phrases. In this paper, we present a neural network architecture trained on synthetic data and evaluated on both synthetic and real data. It correctly interprets referring spatial relationships in phrases such as the one above and provides a modular interface that allows a robot to localise an object in the environment from such a phrase.

Keywords

Computer science, Artificial intelligence, Natural language, Robot, Phrase, Interface (matter), Modular design, Question answering, Spatial contextual awareness, Natural language understanding
