Parallel Reinforcement Learning Systems Using Exploration Agents and Dyna-Q Algorithm

Takeshi Tateyama, Seiichi Kawata, Y. Shimomura

Publication year: 2007
Citations: 10

Abstract

We propose a new strategy for parallel reinforcement learning that constructs the optimal value function and policy more quickly than traditional strategies. We define two types of agents: exploitation agents and exploration agents. The exploitation agents select actions mainly for the purpose of exploitation, while the exploration agents concentrate on exploration using the extended κ-certainty exploration method. These agents learn in the same environment in parallel, periodically combine their value functions, and execute Dyna-Q. With this strategy, the optimal value function can be expected to be constructed, and the exploration agents can quickly select the optimal actions. Experimental results from a mobile robot simulation show the applicability of our method.
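
The general scheme in the abstract (two agent roles, periodic merging of value functions, then Dyna-Q planning) can be illustrated with a small sketch. This is not the authors' implementation. Everything specific here is an assumption made for illustration: the `ChainEnv` corridor is a hypothetical toy environment, a least-tried-action rule stands in for the extended κ-certainty exploration method (whose details are not given in the abstract), the merge rule is an element-wise maximum of Q-values, and the agents take turns rather than running truly in parallel.

```python
import random
from collections import defaultdict

ACTIONS = (0, 1)  # 0 = left, 1 = right


class ChainEnv:
    """Hypothetical corridor: states 0..n-1, reward 1 on reaching state n-1."""

    def __init__(self, n=6):
        self.n = n

    def step(self, s, a):
        s2 = max(0, s - 1) if a == 0 else s + 1
        done = s2 == self.n - 1
        return s2, (1.0 if done else 0.0), done


class Agent:
    def __init__(self, explore, rng, alpha=0.5, gamma=0.9, eps=0.1):
        self.explore, self.rng = explore, rng
        self.alpha, self.gamma, self.eps = alpha, gamma, eps
        self.q = defaultdict(float)     # Q[(state, action)]
        self.counts = defaultdict(int)  # visit counts per (state, action)
        self.model = {}                 # (state, action) -> (reward, next_state, done)

    def act(self, s):
        if self.explore:
            # Assumed stand-in for kappa-certainty exploration: least-tried action.
            return min(ACTIONS, key=lambda a: (self.counts[(s, a)], self.rng.random()))
        if self.rng.random() < self.eps:
            return self.rng.choice(ACTIONS)
        best = max(self.q[(s, a)] for a in ACTIONS)
        return self.rng.choice([a for a in ACTIONS if self.q[(s, a)] == best])

    def update(self, s, a, r, s2, done):
        # One-step Q-learning backup.
        target = r if done else r + self.gamma * max(self.q[(s2, b)] for b in ACTIONS)
        self.q[(s, a)] += self.alpha * (target - self.q[(s, a)])

    def observe(self, s, a, r, s2, done):
        self.counts[(s, a)] += 1
        self.model[(s, a)] = (r, s2, done)  # deterministic model of the environment
        self.update(s, a, r, s2, done)

    def plan(self, n_steps):
        # Dyna-Q planning: replay transitions sampled from the learned model.
        keys = list(self.model)
        for _ in range(n_steps if keys else 0):
            s, a = self.rng.choice(keys)
            r, s2, done = self.model[(s, a)]
            self.update(s, a, r, s2, done)


def merge(agents):
    # Assumed merge rule: element-wise maximum of Q-values, union of models.
    keys = set().union(*(ag.q.keys() for ag in agents))
    shared_q = {k: max(ag.q[k] for ag in agents) for k in keys}
    shared_model = {}
    for ag in agents:
        shared_model.update(ag.model)
    for ag in agents:
        ag.q = defaultdict(float, shared_q)
        ag.model = dict(shared_model)


def run(episodes=200, merge_every=5, plan_steps=20, max_steps=100, seed=0):
    rng = random.Random(seed)
    env = ChainEnv()
    agents = [Agent(explore=False, rng=rng), Agent(explore=True, rng=rng)]
    for ep in range(1, episodes + 1):
        for ag in agents:  # agents take turns; real parallelism is omitted
            s = 0
            for _ in range(max_steps):
                a = ag.act(s)
                s2, r, done = env.step(s, a)
                ag.observe(s, a, r, s2, done)
                s = s2
                if done:
                    break
        if ep % merge_every == 0:
            merge(agents)  # periodically combine value functions...
            for ag in agents:
                ag.plan(plan_steps)  # ...then run Dyna-Q planning on the shared model
    return agents


def greedy_policy(agent, n_states=6):
    return [max(ACTIONS, key=lambda a: agent.q[(s, a)]) for s in range(n_states - 1)]
```

Calling `greedy_policy(run()[0])` returns the exploitation agent's greedy action for each non-terminal state. The element-wise maximum is only a reasonable merge here because the toy environment is deterministic with non-negative rewards and zero-initialized Q-values; a stochastic environment would likely need a different rule, such as a visit-weighted average.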

Keywords

Reinforcement learning, Computer science, Function (biology), Q-learning, Bellman equation, Value (mathematics), Mobile robot, Artificial intelligence, Robot, Mathematical optimization
