LEARNING
Greedy exploration policy of Q-learning based on state balance
Yu Zheng, Siwei Luo, Jing Zhang
- Publication year: 2005
- Citations: 3
Abstract
Q-learning is a well-established reinforcement learning algorithm that has been widely used in intelligent control systems, such as robot pose control. However, its random exploration policy leads to the curse of dimensionality and to difficulty in convergence. In this paper, we propose a greedy exploration policy for Q-learning with rule guidance. This policy reduces the exploration of non-optimal actions as much as possible and speeds up the convergence of Q-learning. Simulation results indicate the effectiveness of the proposed method.
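The idea of rule-guided greedy exploration can be illustrated with a small tabular sketch. This is not the authors' implementation: the abstract does not specify the rule or the environment, so the line-world environment, the `TARGET` state, the `rule_allowed` function, and all hyperparameters below are hypothetical and chosen only for illustration. The sketch shows a rule pruning actions before the greedy choice, so that actions the rule excludes are never explored.

```python
import random

N_STATES = 11        # toy line world with positions 0..10 (assumption, not from the paper)
TARGET = 5           # hypothetical "balanced" state in the middle
ACTIONS = (-1, 1)    # move left or move right


def step(state, action):
    """Toy transition: move one cell (clipped to the line); reward 1 on reaching TARGET."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == TARGET
    return nxt, (1.0 if done else 0.0), done


def rule_allowed(state):
    """Hypothetical guidance rule: keep only actions that do not move away from TARGET."""
    allowed = [a for a in ACTIONS if abs(state + a - TARGET) <= abs(state - TARGET)]
    return allowed or list(ACTIONS)  # fall back to all actions if the rule excludes everything


def choose_action(q, state, rng):
    """Greedy choice restricted to rule-permitted actions, with random tie-breaking."""
    allowed = rule_allowed(state)
    best = max(q[state][a] for a in allowed)
    return rng.choice([a for a in allowed if q[state][a] == best])


def train(episodes=200, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning using the rule-guided greedy exploration above."""
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    starts = [s for s in range(N_STATES) if s != TARGET]
    for _ in range(episodes):
        state = rng.choice(starts)
        done = False
        while not done:
            action = choose_action(q, state, rng)
            nxt, reward, done = step(state, action)
            # Standard Q-learning target: bootstrap from the best next-state value.
            target = reward if done else reward + gamma * max(q[nxt].values())
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


if __name__ == "__main__":
    q_table = train()
    for s, values in enumerate(q_table):
        print(s, values)
```

In this toy setting the rule leaves exactly one permitted action in every non-target state, so each episode ends in at most five steps. A less restrictive rule would leave several candidates, and the greedy step would then pick among them by Q-value.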
Keywords
Reinforcement learning, Convergence (economics), Computer science, Curse of dimensionality, Q-learning, Artificial intelligence, Machine learning, Control (management), Balance (ability), State (computer science)
Related papers
OTHER
📊 26,957 citations
Statistical Learning Theory
Yuhai Wu, Vladimir Vapnik
1999
PERCEPTION
📊 22,245 citations
Artificial intelligence: a modern approach
1995
OTHER
📊 18,993 citations
Applied Nonlinear Control
Jean-Jacques Slotine, Weiping Li
1991
SWARM
📊 14,853 citations
A new optimizer using particle swarm theory
R.C. Eberhart, James Kennedy
2002