首页 /研究 /A generative framework for multimodal learning of spatial concepts and object categories: An unsupervised part-of-speech tagging and 3D visual perception based approach
HRI

A generative framework for multimodal learning of spatial concepts and object categories: An unsupervised part-of-speech tagging and 3D visual perception based approach

Amir Aly, Akira Taniguchi, Tadahiro Taniguchi

发表年份
2017
引用次数
13

摘要

Future human-robot collaboration employs language in instructing a robot about specific tasks to perform in its surroundings. This requires the robot to be able to associate spatial knowledge with language to understand the details of an assigned task so as to behave appropriately in the context of interaction. In this paper, we propose a probabilistic framework for learning the meaning of language spatial concepts (spatial prepositions) and object categories based on visual cues representing spatial layouts and geometric characteristics of objects in a tabletop scene. The model investigates unsupervised Part-of-Speech (POS) tagging through a Hidden Markov Model (HMM) that infers the corresponding hidden tags to words. Spatial configurations and geometric characteristics of objects on the tabletop are described through 3D point cloud information that encodes spatial semantics and categories of referents and landmarks in the environment. The proposed model is evaluated through human user interaction with Toyota HSR robot, where the obtained results show the significant effect of the model in making the robot able to successfully engage in interaction with the user in space.

关键词

Computer scienceRobotArtificial intelligenceHidden Markov modelContext (archaeology)Semantics (computer science)Object (grammar)PerceptionGenerative modelSpatial contextual awareness

相关论文

查看 HRI 分类全部论文