PERCEPTION

Jointly Learning Grounded Task Structures from Language Instruction and Visual Demonstration

Changsong Liu, Shaohua Yang, Sari Saba-Sadiya, Nishant Shukla, Yunzhong He, Song‐Chun Zhu, Joyce Chai

Year of publication: 2016
Citations: 42
Access: Open access

Abstract

To enable language-based communication and collaboration with cognitive robots, this paper presents an approach in which an agent learns task models jointly from language instruction and visual demonstration, using an And-Or Graph (AoG) representation. The learned AoG captures a hierarchical task structure in which linguistic labels (for language communication) are grounded to the corresponding state changes in the physical environment (for perception and action). Our empirical results in a cloth-folding domain show that, although state detection through visual processing is uncertain and error-prone, tight integration with language enables the agent to learn an effective AoG for task representation. The learned AoG can further be applied to infer and interpret ongoing actions from new visual demonstrations, using linguistic labels at different levels of granularity.
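As a rough illustration of the representation described above, the following Python sketch builds a small hand-written And-Or graph whose terminal nodes pair a linguistic label with a symbolic state change. It is not the paper's learning procedure: the `Node` class, the `parses` and `explain` helpers, and every label and state name are hypothetical, chosen only to show how And-nodes compose sub-tasks, Or-nodes offer alternatives, and one observation can be described at several levels of granularity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str                          # linguistic label (hypothetical examples below)
    kind: str = "terminal"              # "and", "or", or "terminal"
    children: List["Node"] = field(default_factory=list)
    state_change: Optional[str] = None  # symbolic state change grounding a terminal


def parses(node):
    """Enumerate every state-change sequence this (sub)graph can generate."""
    if node.kind == "terminal":
        return [[node.state_change]]
    if node.kind == "or":
        # An Or-node selects exactly one of its alternatives.
        return [seq for child in node.children for seq in parses(child)]
    # An And-node concatenates its children's sequences in order.
    results = [[]]
    for child in node.children:
        results = [prefix + seq for prefix in results for seq in parses(child)]
    return results


def explain(node, observed):
    """Return labels, coarse to fine, of nodes that generate `observed` exactly."""
    labels = []
    if list(observed) in parses(node):
        labels.append(node.label)
    for child in node.children:
        labels.extend(explain(child, observed))
    return labels


# Hypothetical cloth-folding task; state names stand in for detected visual states.
left = Node("fold the left sleeve", state_change="left_sleeve_folded")
right = Node("fold the right sleeve", state_change="right_sleeve_folded")
half = Node("fold the shirt in half", state_change="body_folded")
sleeves = Node("fold the sleeves", "or", [
    Node("left sleeve first", "and", [left, right]),
    Node("right sleeve first", "and", [right, left]),
])
task = Node("fold the t-shirt", "and", [sleeves, half])

print(explain(task, ["right_sleeve_folded", "left_sleeve_folded"]))
# prints ['fold the sleeves', 'right sleeve first']
```

In this toy setup the same observed sequence is matched by both a coarse label and a finer one, which mirrors the idea of interpreting a demonstration at different levels of granularity. A real system would also have to handle noisy state detection, which this sketch ignores.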

Keywords

Computer science, Task (project management), Grounded theory, Human–computer interaction, Natural language processing, Visual language, Artificial intelligence, Linguistics, Qualitative research, Engineering
