A generative framework for multimodal learning of spatial concepts and object categories: An unsupervised part-of-speech tagging and 3D visual perception based approach
Amir Aly, Akira Taniguchi, Tadahiro Taniguchi
- Year: 2017
- Citations: 13
Abstract
Future human-robot collaboration will rely on language to instruct a robot about specific tasks to perform in its surroundings. This requires the robot to associate spatial knowledge with language so that it can understand the details of an assigned task and behave appropriately in the context of the interaction. In this paper, we propose a probabilistic framework for learning the meaning of spatial concepts in language (spatial prepositions) and object categories, based on visual cues representing the spatial layouts and geometric characteristics of objects in a tabletop scene. The model investigates unsupervised Part-of-Speech (POS) tagging through a Hidden Markov Model (HMM) that infers the hidden tags corresponding to words. Spatial configurations and geometric characteristics of objects on the tabletop are described through 3D point cloud information that encodes the spatial semantics and categories of referents and landmarks in the environment. The proposed model is evaluated through human user interaction with a Toyota HSR robot; the results show that the model significantly helps the robot engage successfully in spatial interaction with the user.
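To make the idea of inferring hidden tags for words concrete, here is a minimal sketch of HMM decoding for a spatial instruction. It is not the paper's implementation: the tag set, the probability tables, and the function name `viterbi` are all hypothetical, and the parameters are set by hand. An unsupervised approach of the kind the abstract describes would instead estimate them from untagged sentences.

```python
import math

# Hypothetical tag set and toy parameters, chosen by hand for illustration only.
# An unsupervised tagger would estimate these from untagged text instead.
TAGS = ["DET", "NOUN", "PREP"]
START = {"DET": 0.6, "NOUN": 0.3, "PREP": 0.1}
TRANS = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "PREP": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.1, "PREP": 0.8},
    "PREP": {"DET": 0.85, "NOUN": 0.1, "PREP": 0.05},
}
EMIT = {
    "DET": {"the": 1.0},
    "NOUN": {"cup": 0.5, "box": 0.5},
    "PREP": {"behind": 1.0},
}
FLOOR = 1e-6  # small probability for words a tag never emits in the toy table


def viterbi(words):
    """Return the most likely tag sequence for `words` under the toy HMM."""
    if not words:
        return []

    def emit(tag, word):
        return math.log(EMIT[tag].get(word, FLOOR))

    # Log-score of the best path ending in each tag after the first word.
    scores = {t: math.log(START[t]) + emit(t, words[0]) for t in TAGS}
    backpointers = []

    for word in words[1:]:
        new_scores, pointers = {}, {}
        for t in TAGS:
            # Best previous tag for reaching tag t at this position.
            prev = max(TAGS, key=lambda p: scores[p] + math.log(TRANS[p][t]))
            new_scores[t] = scores[prev] + math.log(TRANS[prev][t]) + emit(t, word)
            pointers[t] = prev
        scores = new_scores
        backpointers.append(pointers)

    # Trace back from the best final tag to recover the full path.
    path = [max(TAGS, key=lambda t: scores[t])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


if __name__ == "__main__":
    print(viterbi(["the", "cup", "behind", "the", "box"]))
```

In this toy setting, the word tagged as a preposition ("behind") is the kind of word that would be linked to a spatial concept, and the nouns on either side of it would be linked to the referent and landmark object categories.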