
Q-PSP Learning: An Exploitation-Oriented Q-Learning Algorithm and Its Applications

Tadashi Horiuchi, Akinori Fujino, Osamu Katai, Tetsuo Sawaragi

Publication year
1999
Citations
18
Access
Open access

Abstract

Reinforcement learning algorithms can be classified into two approaches. One is the “exploitation-oriented” approach, which attempts to acquire action rules mainly by reinforcing and relying on good experiences; the other is the “exploration-oriented” approach, which pursues the optimality of actions, so as to receive the highest rewards, by exploring the environment. In this paper, we propose the Q-PSP Learning method, which incorporates the idea of PSP (Profit Sharing Plan), used in Classifier Systems as “exploitation-oriented” reinforcement learning, into Q-Learning, an “exploration-oriented” reinforcement learning method, in order to combine the merits of the two approaches. By applying Q-PSP Learning to several control problems and a robot navigation problem, we show that it can be expected not only to speed up learning but also to be effective for complex problems, and that an appropriate balance between exploration and exploitation can be attained.
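
To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' Q-PSP update rule, which this page does not give. It places a standard one-step Q-learning update next to a profit-sharing style update that distributes an episode's final reward backward along the visited state-action pairs. The function names, the shared table `Q`, the geometric `decay` credit function, and all parameter values are assumptions chosen for illustration.

```python
from collections import defaultdict


def q_learning_step(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update for a single transition."""
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def profit_sharing_episode(Q, trajectory, reward, alpha=0.1, decay=0.5):
    """Profit-sharing style update (illustrative assumption, not the paper's rule).

    The final reward is credited to every state-action pair of the episode,
    walking backward and shrinking the credit geometrically at each step.
    """
    credit = reward
    for s, a in reversed(trajectory):
        Q[(s, a)] += alpha * credit
        credit *= decay


# Hypothetical 3-step episode on a chain: state 0 -> 1 -> 2 -> goal (state 3).
Q = defaultdict(float)
actions = [0, 1]
trajectory = [(0, 1), (1, 1), (2, 1)]

# One-step Q-learning on the rewarded transition touches only the last pair.
q_learning_step(Q, 2, 1, 1.0, 3, actions, alpha=0.5)

# Profit sharing credits every pair visited in the episode at once.
profit_sharing_episode(Q, trajectory, 1.0, alpha=0.5, decay=0.5)

for pair in trajectory:
    print(pair, Q[pair])
```

In this toy episode the Q-learning step changes only the pair adjacent to the goal, while the profit-sharing pass also assigns credit to the earlier pairs. That is one intuition for why reinforcing good experiences could speed up learning; how the actual method balances the two updates is specified in the paper itself.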

Keywords

Reinforcement learning, Learning classifier system, Computer science, Artificial intelligence, Q-learning, Machine learning, Robot learning, Profit sharing, Error-driven learning, Unsupervised learning
