Learning from demonstration using a multi-valued function regressor for time-series data
Jesse Butterfield, Sarah Osentoski, Graylin Jay, Odest Chadwicke Jenkins
- Publication year: 2010
- Citations: 41
Abstract
Our goal is to learn, from data collected during human teleoperation, a control policy that maps perception to actuation. Such policies are potentially multi-valued with respect to perception: a single input may map to multiple outputs, depending on the user's objective at a particular time. We propose a multi-valued function regressor to learn a larger class of robot control policies from human demonstration. We extend the Hierarchical Dirichlet Process Hidden Markov Model to discover latent variables representing unknown objectives in the demonstrated data, along with the transitions between these objectives. Each objective requires only a single-valued policy function and can therefore be learned with a Gaussian process function regressor. The learned transitions between objectives determine the correct actuation where the complete policy function is multi-valued. We present the results of experiments conducted on the Nao humanoid robot platform.
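The core idea of the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the objective labels are assumed to be known rather than inferred with an HDP-HMM, the transition matrix values are invented, the data are synthetic, and the helper names (`rbf`, `solve`, `fit_gp`, `policy`) are hypothetical. The sketch shows that a single regressor fitted to pooled demonstrations averages two conflicting actions, while one Gaussian process mean per objective recovers each single-valued policy.

```python
import math

def rbf(a, b, length=0.5):
    """Squared-exponential kernel on scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_gp(xs, ys, noise=1e-2):
    """Return a function that evaluates the zero-mean GP posterior mean."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    alpha = solve(K, ys)
    return lambda x: sum(a * rbf(x, xi) for a, xi in zip(alpha, xs))

# Synthetic demonstrations (assumption): the same perception x leads to
# action +x under objective 0 and to action -x under objective 1.
xs = [i / 10 for i in range(11)]
demos = {0: [x for x in xs], 1: [-x for x in xs]}

# A single-valued regressor on the pooled data blends the two objectives.
pooled = fit_gp(xs + xs, demos[0] + demos[1])

# One single-valued GP per objective (labels assumed known here).
experts = {k: fit_gp(xs, ys) for k, ys in demos.items()}

# Hypothetical transition probabilities between objectives.
transition = [[0.9, 0.1], [0.2, 0.8]]

def policy(x, prev_objective):
    """Pick the most probable next objective, then query its regressor.
    This is a crude simplification of inference in a hidden Markov model."""
    row = transition[prev_objective]
    objective = max(range(len(row)), key=lambda k: row[k])
    return experts[objective](x)

print("pooled:", pooled(0.5))
print("objective 0:", policy(0.5, 0))
print("objective 1:", policy(0.5, 1))
```

In this toy setting the pooled prediction sits between the two demonstrated actions, which matches neither objective. Conditioning on the objective removes the ambiguity.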