A reward allocation method for reinforcement learning in stabilizing control of T-inverted pendulum
Shu Hosokawa, Kazushi Nakano
- Publication year
- 2012
- Citations
- 3
Abstract
Reinforcement learning is a class of machine learning methods that does not require a detailed teaching signal from a human, and it is expected to be applied to real robots. In such applications, learning must be completed within a short period of time. Non-bootstrap reinforcement learning methods converge quickly on tasks such as Sutton's maze problem, where the aim is to reach a target state in minimum time. However, these methods have difficulty learning tasks in which a stable state must be maintained for as long as possible. This paper improves a reward allocation method for stabilizing control tasks. The validity of the proposed method is demonstrated through simulation of stabilizing control of a T-inverted pendulum. The proposed method can acquire a policy that maintains a stable state within a short learning period.