Parallel Reinforcement Learning Systems Using Exploration Agents and Dyna-Q Algorithm

Takeshi Tateyama, Seiichi Kawata, Y. Shimomura

Publication year
2007
Citations
10

Abstract

We propose a new strategy for parallel reinforcement learning that constructs the optimal value function and policy more quickly than traditional strategies. We define two types of agents: exploitation agents and exploration agents. The exploitation agents select actions mainly for the purpose of exploitation, while the exploration agents concentrate on exploration by using the extended κ-certainty exploration method. These agents learn in the same environment in parallel, periodically combine their value functions, and execute Dyna-Q. This strategy can be expected to construct the optimal value function and enables the exploration agents to quickly select the optimal actions. Experimental results from a mobile robot simulation showed the applicability of our method.
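
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes a hypothetical deterministic 4x4 grid world rather than the mobile robot simulation. A simple count-based rule stands in for the extended κ-certainty exploration method. The merge rule (keep the estimate of the agent that tried a state-action pair most often) is also an assumption. All names, including `Agent`, `merge`, `run`, and `greedy_path_length`, are illustrative.

```python
import random
from collections import defaultdict

ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up
SIZE, GOAL = 4, (3, 3)                        # assumed toy grid, not from the paper

def step(state, action):
    """Deterministic grid world: reward 1 on reaching GOAL, otherwise 0."""
    r = min(max(state[0] + action[0], 0), SIZE - 1)
    c = min(max(state[1] + action[1], 0), SIZE - 1)
    nxt = (r, c)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

class Agent:
    def __init__(self, explorer, rng):
        self.q = defaultdict(float)     # Q[(state, action_index)]
        self.model = {}                 # learned model: (s, a) -> (s', r, done)
        self.visits = defaultdict(int)  # real-experience counts per (s, a)
        self.explorer, self.rng = explorer, rng
        self.state = (0, 0)

    def choose(self, eps=0.1):
        s = self.state
        if self.explorer:
            # Assumed stand-in for kappa-certainty exploration:
            # try the least-visited action, breaking ties at random.
            return min(range(4), key=lambda a: (self.visits[(s, a)], self.rng.random()))
        if self.rng.random() < eps:     # exploitation agent: epsilon-greedy
            return self.rng.randrange(4)
        return max(range(4), key=lambda a: (self.q[(s, a)], self.rng.random()))

    def update(self, s, a, r, s2, done, alpha=0.5, gamma=0.9):
        """One-step Q-learning backup."""
        target = r if done else r + gamma * max(self.q[(s2, b)] for b in range(4))
        self.q[(s, a)] += alpha * (target - self.q[(s, a)])

    def act(self, planning_steps=5):
        s, a = self.state, self.choose()
        s2, r, done = step(s, ACTIONS[a])
        self.visits[(s, a)] += 1
        self.model[(s, a)] = (s2, r, done)
        self.update(s, a, r, s2, done)
        # Dyna-Q planning: replay transitions sampled from the learned model.
        for _ in range(planning_steps):
            (ps, pa), (ps2, pr, pdone) = self.rng.choice(list(self.model.items()))
            self.update(ps, pa, pr, ps2, pdone)
        self.state = (0, 0) if done else s2

def merge(agents):
    """Assumed merge rule: per (s, a), keep the estimate of the agent that tried it most."""
    keys = set().union(*(ag.q.keys() for ag in agents))
    merged_q = {k: max(agents, key=lambda ag: ag.visits[k]).q[k] for k in keys}
    merged_model = {}
    for ag in agents:
        merged_model.update(ag.model)
    for ag in agents:
        ag.q = defaultdict(float, merged_q)
        ag.model = dict(merged_model)

def run(total_steps=2000, merge_every=50, seed=0):
    rng = random.Random(seed)
    agents = [Agent(explorer=False, rng=rng), Agent(explorer=True, rng=rng)]
    for t in range(1, total_steps + 1):
        for ag in agents:               # agents are interleaved to mimic parallelism
            ag.act()
        if t % merge_every == 0:        # periodic combination of value functions
            merge(agents)
    return agents

def greedy_path_length(q, max_steps=50):
    """Number of moves the greedy policy needs from (0, 0) to GOAL."""
    s, n = (0, 0), 0
    while s != GOAL and n < max_steps:
        a = max(range(4), key=lambda b: q[(s, b)])
        s, _, _ = step(s, ACTIONS[a])
        n += 1
    return n
```

Calling `run()` and then `greedy_path_length(agents[0].q)` gives a quick check of whether the merged value function yields a shortest route, which is 6 moves in this assumed grid. Interleaving the agents in one loop is a simplification; a real parallel system would run them concurrently.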

Keywords

Reinforcement learning, Computer science, Function (biology), Q-learning, Bellman equation, Value (mathematics), Mobile robot, Artificial intelligence, Robot, Mathematical optimization
