A corpus-guided framework for robotic visual perception

Ching L. Teo, Yezhou Yang, Hal Daumé, Cornelia Fermüller, Yiannis Aloimonos

Year published: 2011
Citations: 6

Abstract

We present a framework that produces sentence-level summarizations of videos containing complex human activities and that can be implemented as part of the Robot Perception Control Unit (RPCU). This is done via: 1) detecting pertinent objects in the scene (tools and direct objects), 2) predicting actions guided by a large lexical corpus, and 3) generating the most likely sentence description of the video given the detections. We pursue an active object detection approach by focusing on regions of high optical flow. Next, an iterative EM strategy, guided by language, is used to predict the possible actions. Finally, we model the sentence generation process as an HMM optimization problem, combining visual detections and a trained language model to produce a readable description of the video. Experimental results validate our approach, and we discuss its implications for the RPCU in future applications.
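As a rough illustration of the last step, the sketch below treats sentence generation as Viterbi decoding over an HMM whose hidden states are candidate words, whose emission scores stand in for visual detection confidences, and whose transition scores stand in for a bigram language model. This is a minimal sketch under those assumptions and not the authors' implementation: the function name `viterbi`, the three-slot sentence layout, and every probability in the toy tables are hypothetical.

```python
import math

FLOOR = 1e-9  # small probability for unseen events (an assumption of this sketch)


def viterbi(slots, emission, transition, start):
    """Return the most likely word sequence, choosing one word per slot.

    slots      -- list of candidate-word lists, one list per sentence position
    emission   -- dict word -> visual detection score, used as P(evidence | word)
    transition -- dict (prev_word, word) -> language-model score P(word | prev_word)
    start      -- dict word -> prior probability of a word in the first slot
    """
    # best[w] holds (log-probability, path) of the best path ending in word w.
    best = {
        w: (math.log(start.get(w, FLOOR)) + math.log(emission.get(w, FLOOR)), [w])
        for w in slots[0]
    }
    for candidates in slots[1:]:
        nxt = {}
        for w in candidates:
            # Pick the predecessor that maximises path score plus transition score.
            score, path = max(
                (
                    (s + math.log(transition.get((p, w), FLOOR)), prev_path)
                    for p, (s, prev_path) in best.items()
                ),
                key=lambda item: item[0],
            )
            nxt[w] = (score + math.log(emission.get(w, FLOOR)), path + [w])
        best = nxt
    return max(best.values(), key=lambda item: item[0])[1]


# Hypothetical toy data: slots are action, direct object, tool.
SLOTS = [["cut", "pour"], ["bread", "cup"], ["knife", "spoon"]]
START = {"cut": 0.5, "pour": 0.5}
EMISSION = {"cut": 0.4, "pour": 0.6, "bread": 0.7, "cup": 0.3,
            "knife": 0.8, "spoon": 0.2}
TRANSITION = {
    ("cut", "bread"): 0.8, ("cut", "cup"): 0.2,
    ("pour", "bread"): 0.1, ("pour", "cup"): 0.9,
    ("bread", "knife"): 0.9, ("bread", "spoon"): 0.1,
    ("cup", "knife"): 0.3, ("cup", "spoon"): 0.7,
}

best_path = viterbi(SLOTS, EMISSION, TRANSITION, START)
print(best_path)
```

With these toy numbers the visual score alone slightly favours "pour" as the action, but the language-model transitions make "cut", "bread", "knife" the highest-scoring path overall, which is the kind of correction a corpus-guided model is meant to provide.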

Keywords

Computer science; Sentence; Artificial intelligence; Perception; Process (computing); Hidden Markov model; Natural language processing; Optical flow; Language model; Object (grammar)
