Tree-based Dyna-Q agent
Kao‐Shing Hwang, Wei‐Cheng Jiang, Yu-Jen Chen
- 发表年份
- 2012
- 引用次数
- 4
摘要
This article presented a Dyna-Q learning method based on a world model of tree structures to enhance the efficiency on sampling data in reinforcement learning problem. The Q-Learning mechanism is for policy learning as the tree is learning the world model by observing the transitions between the states after the actions taken. In early stages of learning, the learning agent does not have an accurate model but explores the environment as possible to collect sufficient experiences to approximate the environment model. When the agent develops a more accurate model, a planning method can use the model to produce simulated experiences to accelerate value iterations. Thus, the agent with the proposed method can obtain virtual experiences for updating the policy. Simulations on a mobile robot escaping from a labyrinth to verify the performance of the robot equipped with the proposed method. The result proves that tree-based Dyna-Q agent can speed up the learning process.
关键词
相关论文
Statistical Learning Theory
Yuhai Wu, Vladimir Vapnik
1999
Artificial intelligence: a modern approach
1995
Applied Nonlinear Control
Jean-Jacques Slotine, Weiping Li
1991
A new optimizer using particle swarm theory
R.C. Eberhart, James Kennedy
2002