ROCK* — Efficient black-box optimization for policy learning
Jemin Hwangbo, Christian Gehring, Hannes Sommer, Roland Siegwart, Jonas Buchli
- Publication year
- 2014
- Citations
- 11
Abstract
Robotic learning on real hardware requires an efficient algorithm that minimizes the number of trials needed to learn an optimal policy. Prolonged use of hardware causes wear and tear on the system and demands more attention from an operator. To this end, we present a novel black-box optimization algorithm, Reward Optimization with Compact Kernels and fast natural gradient regression (ROCK*). Our algorithm updates its knowledge immediately after each trial and is able to extrapolate in a controlled manner. These features make fast and safe learning on real hardware possible. We have evaluated our algorithm on two simulated reaching tasks of a 50-degree-of-freedom robot arm and on a hopping task of a real articulated legged system. ROCK* outperformed current state-of-the-art algorithms in all tasks by a factor of three or more.
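The abstract does not give the formulas behind ROCK*, so the following is only a minimal sketch of the two properties it names: an update after every single trial, and controlled extrapolation through kernels with compact support. The class `CompactKernelCostModel`, the biweight-shaped kernel, and the `prior_cost` fallback are all assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np


def compact_kernel(dist, radius):
    """Kernel with compact support: positive inside `radius`, exactly zero outside.

    The biweight-style shape is an illustrative choice, not the paper's kernel.
    """
    u = np.clip(dist / radius, 0.0, 1.0)
    return (1.0 - u ** 2) ** 2


class CompactKernelCostModel:
    """Hypothetical cost model over policy parameters (not the ROCK* model)."""

    def __init__(self, radius, prior_cost):
        self.radius = radius          # support radius of the kernel
        self.prior_cost = prior_cost  # value predicted where no data is nearby
        self.thetas = []              # evaluated policy parameters
        self.costs = []               # observed cost of each trial

    def add_trial(self, theta, cost):
        # Knowledge is updated immediately after a single trial.
        self.thetas.append(np.asarray(theta, dtype=float))
        self.costs.append(float(cost))

    def predict(self, theta):
        if not self.thetas:
            return self.prior_cost
        X = np.stack(self.thetas)
        d = np.linalg.norm(X - np.asarray(theta, dtype=float), axis=1)
        w = compact_kernel(d, self.radius)
        # The prior acts as one pseudo-observation of unit weight, so the
        # prediction falls back to `prior_cost` once every kernel weight is zero.
        return (self.prior_cost + w @ np.array(self.costs)) / (1.0 + w.sum())


model = CompactKernelCostModel(radius=1.0, prior_cost=10.0)
model.add_trial([0.0, 0.0], cost=2.0)
near = model.predict([0.0, 0.0])  # pulled toward the observed cost
far = model.predict([5.0, 5.0])   # outside every kernel's support: prior only
```

With a high `prior_cost`, an optimizer that minimizes the predicted cost has no incentive to jump far from parameters that have already been tried, which is one plausible reading of "extrapolate in a controlled manner".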