Autonomous Reinforcement Learning with Experience Replay for Humanoid Gait Optimization
Paweł Wawrzyński
- 发表年份
- 2012
- 引用次数
- 9
摘要
This paper demonstrates application of Reinforcement Learning to optimization of control of a complex system in realistic setting that requires efficiency and autonomy of the learning algorithm. Namely, Actor-Critic with experience replay (which addresses efficiency), and the Fixed Point method for step-size estimation (which addresses autonomy) is applied here to approximately optimize humanoid robot gait. With complex dynamics and tens of continuous state and action variables, humanoid gait optimization represents a challenge for analytical synthesis of control. The presented algorithm learns a nimble gait within 80 minutes of training.
关键词
相关论文
Statistical Learning Theory
Yuhai Wu, Vladimir Vapnik
1999
Artificial intelligence: a modern approach
1995
Applied Nonlinear Control
Jean-Jacques Slotine, Weiping Li
1991
A new optimizer using particle swarm theory
R.C. Eberhart, James Kennedy
2002