Using Partial-Policy Q-Learning to Plan Path for Robot Navigation in Unknown Enviroment
Juli Zhang, Junyi Zhang, Zhong Ma, Zhanzhuang He
- Publication year
- 2017
- Citations
- 7
Abstract
Reinforcement learning (RL) has been widely used in robot navigation to learn good policies from the environment and to choose actions that optimize a cumulative future reward signal. Because their power is limited, robots should reach the final destination in the shortest time by taking the shortest path, so a method for finding a shortest path is needed. However, most RL methods suffer from slow convergence and from the trade-off between exploration and exploitation. To address these problems, a novel improved Q-learning method is proposed. Q-learning is a well-known model-free RL method. Model-free RL methods are computationally cheaper than model-based methods because action values can be read from a look-up table constructed through trial and error. The proposed method initializes the Q-table with non-zero values instead of the all-zero table of conventional Q-learning, and uses a partial policy to accelerate convergence. Experiments on simulated mazes show that the proposed method not only reduces the time consumed but also performs much better than the previous model-free Q-learning method.
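The abstract does not say how the non-zero initial Q-values are chosen or how the partial policy is constructed, so the following is only a minimal sketch of conventional tabular Q-learning in which the initial table value is exposed as a parameter (`q_init`). The maze layout, reward values, hyperparameters, and all function names are illustrative assumptions, not taken from the paper, and the partial-policy component is not reproduced.

```python
import random

# Hypothetical 4x4 maze (assumption): 0 = free cell, 1 = wall.
MAZE = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
START, GOAL = (0, 0), (3, 3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Move one cell; hitting a wall or the border leaves the state unchanged."""
    r, c = state[0] + action[0], state[1] + action[1]
    if 0 <= r < len(MAZE) and 0 <= c < len(MAZE[0]) and MAZE[r][c] == 0:
        state = (r, c)
    # Assumed reward scheme: small step cost, +1 on reaching the goal.
    reward = 1.0 if state == GOAL else -0.01
    return state, reward, state == GOAL


def train(q_init=0.0, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning; q_init=0.0 is the conventional all-zero table."""
    rng = random.Random(seed)
    q = {(r, c): [q_init] * len(ACTIONS)
         for r in range(len(MAZE)) for c in range(len(MAZE[0]))}
    for _ in range(episodes):
        state = START
        for _ in range(200):  # cap on episode length
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            nxt, reward, done = step(state, ACTIONS[a])
            # Standard Q-learning target: r + gamma * max_a' Q(s', a').
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=50):
    """Follow the greedy policy from START and return the visited cells."""
    state, path = START, [START]
    for _ in range(max_steps):
        if state == GOAL:
            break
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state, _, _ = step(state, ACTIONS[a])
        path.append(state)
    return path


if __name__ == "__main__":
    for init in (0.0, 0.5):  # all-zero table vs. an arbitrary non-zero constant
        print(init, greedy_path(train(q_init=init)))
```

A constant `q_init` is the simplest possible non-zero initialization and is used here only to show where such a choice enters the algorithm; the paper's actual initialization scheme may differ.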