Tree-based Dyna-Q agent

Kao‐Shing Hwang, Wei‐Cheng Jiang, Yu-Jen Chen

Year of publication: 2012
Citations: 4

Abstract

This article presents a Dyna-Q learning method based on a tree-structured world model to improve sample efficiency in reinforcement learning problems. The Q-learning mechanism learns the policy, while the tree learns the world model by observing the state transitions that follow each action. In the early stages of learning, the agent does not yet have an accurate model, so it explores the environment as much as possible to collect enough experience to approximate the environment model. Once the agent has developed a more accurate model, a planning method can use the model to produce simulated experiences that accelerate value iteration. The agent can thus obtain virtual experiences for updating the policy. Simulations of a mobile robot escaping from a labyrinth are used to verify the performance of a robot equipped with the proposed method. The results show that the tree-based Dyna-Q agent can speed up the learning process.
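The loop described in the abstract (direct Q-learning from real transitions, model learning, then planning with simulated experience) can be illustrated with a minimal sketch. Note the assumptions: the paper's world model is tree-based, whereas this sketch swaps in a plain lookup table, so it is generic tabular Dyna-Q rather than the authors' method. The 4x4 maze, the reward of 1 at the goal, the function names, and all hyperparameters are hypothetical.

```python
import random

# Hypothetical 4x4 maze (not from the paper): 'S' start, 'G' goal, '#' wall.
MAZE = ["S..#",
        ".#..",
        "...#",
        "#..G"]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Deterministic environment step: returns (next_state, reward, done)."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    outside = not (0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0]))
    if outside or MAZE[nr][nc] == "#":
        nr, nc = r, c  # blocked moves leave the agent in place
    done = MAZE[nr][nc] == "G"
    return (nr, nc), (1.0 if done else 0.0), done


def dyna_q(episodes=50, planning_steps=20, alpha=0.5, gamma=0.95,
           epsilon=0.1, max_steps=1000, seed=0):
    rng = random.Random(seed)
    q = {}      # (state, action) -> estimated value
    model = {}  # (state, action) -> (next_state, reward, done); a table
                # standing in for the paper's tree-based model (assumption)
    steps_per_episode = []

    def best_value(s):
        return max(q.get((s, a), 0.0) for a in range(len(ACTIONS)))

    def update(s, a, r, s2, done):
        # One-step Q-learning backup, used for real and simulated experience.
        target = r + (0.0 if done else gamma * best_value(s2))
        old = q.get((s, a), 0.0)
        q[(s, a)] = old + alpha * (target - old)

    for _ in range(episodes):
        state, done, steps = (0, 0), False, 0
        while not done and steps < max_steps:
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                values = [q.get((state, a), 0.0) for a in range(len(ACTIONS))]
                best = max(values)
                action = rng.choice(
                    [a for a, v in enumerate(values) if v == best])
            next_state, reward, done = step(state, action)
            update(state, action, reward, next_state, done)      # direct RL
            model[(state, action)] = (next_state, reward, done)  # model learning
            for _ in range(planning_steps):                      # planning
                (ps, pa), (ps2, pr, pdone) = rng.choice(list(model.items()))
                update(ps, pa, pr, ps2, pdone)
            state = next_state
            steps += 1
        steps_per_episode.append(steps)
    return q, steps_per_episode
```

Calling `dyna_q()` returns the learned Q-table and the number of real steps taken in each episode; comparing runs with `planning_steps=0` and a larger value is one way to see how simulated experience affects the number of real interactions needed.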

Keywords

Reinforcement learning; Computer science; Tree (set theory); Q-learning; Process (computing); Mobile robot; Artificial intelligence; Robot; Tree structure; Machine learning
