Greedy exploration policy of Q-learning based on state balance

Yu Zheng, Siwei Luo, Jing Zhang

Year: 2005
Citations: 3

Abstract

Q-learning is one of the well-established algorithms for reinforcement learning and has been widely applied to intelligent control systems, such as the control of robot pose. However, Q-learning suffers from the curse of dimensionality and from difficulty in convergence, both arising from its random exploration policy. In this paper, we propose a greedy exploration policy for Q-learning with rule guidance. This exploration policy reduces the exploration of non-optimal actions as much as possible and speeds up the convergence of Q-learning. Simulation results indicate the effectiveness of the proposed method.

Keywords

Reinforcement learning, Convergence (economics), Computer science, Curse of dimensionality, Q-learning, Artificial intelligence, Machine learning, Control (management), Balance (ability), State (computer science)
