PQ-Learning: An Efficient Robot Learning Method for Intelligent Behavior Acquisition

Weiyu Zhu, Stephen E. Levinson

Publication year
2001
Citations
7

Abstract

This paper presents an efficient reinforcement learning method, called PQ-learning, for intelligent behavior acquisition by an autonomous robot. The method uses a special action-value propagation technique, consisting of spatial propagation and temporal propagation, to achieve fast learning convergence in large state spaces. Compared with the approaches in the literature, the proposed method offers three benefits for robot learning. First, it is a general method that should be applicable to most reinforcement learning tasks. Second, learning is guaranteed to converge to the optimum, and does so much faster than the traditional Q-learning and Q(λ)-learning methods. Third, it supports both self-directed and teacher-directed learning, where the teacher's help consists of directing the robot's exploration rather than explicitly providing labels or ground truth as in the supervised-learning regime. The proposed method was tested on a simulated robot navigation-learning problem. The results show that it significantly outperforms the Q(λ)-learning algorithm in learning speed in both the self-directed and teacher-directed learning regimes.

Keywords

Robot learning, Reinforcement learning, Robot, Computer science, Artificial intelligence, Semi-supervised learning, Unsupervised learning, Q-learning, Learning classifier system, Active learning (machine learning)
