LEARNING

An Improved Dyna-Q Algorithm for Mobile Robot Path Planning in Unknown Dynamic Environment

Muleilan Pei, Hao An, Bo Liu, Changhong Wang

Publication year
2021
Citations
95

Abstract

This article deals with the problem of mobile robot path planning in an unknown environment that contains both static and dynamic obstacles, utilizing a reinforcement learning approach. We propose an improved Dyna-Q algorithm, which incorporates heuristic search strategies, simulated annealing mechanism, and reactive navigation principle into Q-learning based on the Dyna architecture. A novel action-selection strategy combining ε-greedy policy with the cooling schedule control is presented, which, together with the heuristic reward function and heuristic actions, can tackle the exploration-exploitation dilemma and enhance the performance of global searching, convergence property, and learning efficiency for path planning. The proposed method is superior to the classical Q-learning and Dyna-Q algorithms in an unknown static environment, and it is successfully applied to an uncertain environment with multiple dynamic obstacles in simulations. Further, practical experiments are conducted by integrating MATLAB and robot operating system (ROS) on a physical robot platform, and the mobile robot manages to find a collision-free path, thus fulfilling autonomous navigation tasks in the real world.
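
For readers unfamiliar with the building blocks named in the abstract, the following is a minimal sketch of classical tabular Dyna-Q on a small static grid, with an ε-greedy policy whose ε decays along a geometric cooling schedule. It is not the authors' improved algorithm: the heuristic reward function, heuristic actions, and reactive navigation are omitted, and the grid layout, reward values, cooling rule, and all function names (`step`, `epsilon_at`, `select_action`, `q_update`, `dyna_q`, `greedy_path`) are assumptions chosen only for illustration.

```python
import random

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
GOAL = (4, 4)
OBSTACLES = frozenset({(1, 1), (2, 3), (3, 1)})  # hypothetical static obstacles
SIZE = 5


def step(state, action):
    """Deterministic grid move; blocked or off-grid moves leave the state unchanged."""
    nxt = (state[0] + action[0], state[1] + action[1])
    if not (0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE) or nxt in OBSTACLES:
        return state, -1.0, False   # assumed collision penalty
    if nxt == GOAL:
        return nxt, 10.0, True      # assumed goal reward
    return nxt, -0.1, False         # assumed per-step cost


def epsilon_at(episode, eps0=0.9, cooling=0.95, eps_min=0.01):
    """Assumed geometric cooling: exploration shrinks as the 'temperature' drops."""
    return max(eps_min, eps0 * cooling ** episode)


def select_action(q, state, eps, rng):
    """Epsilon-greedy over action indices, with ties broken at random."""
    if rng.random() < eps:
        return rng.randrange(len(ACTIONS))
    values = [q.get((state, a), 0.0) for a in range(len(ACTIONS))]
    best = max(values)
    return rng.choice([a for a, v in enumerate(values) if v == best])


def q_update(q, s, a, r, s2, done, alpha, gamma):
    """One-step Q-learning backup."""
    future = 0.0 if done else max(q.get((s2, b), 0.0) for b in range(len(ACTIONS)))
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (r + gamma * future - old)


def dyna_q(episodes=100, planning_steps=10, alpha=0.5, gamma=0.95, seed=0):
    rng = random.Random(seed)
    q, model, seen = {}, {}, []
    for ep in range(episodes):
        eps = epsilon_at(ep)
        s, done, steps = (0, 0), False, 0
        while not done and steps < 200:
            a = select_action(q, s, eps, rng)
            s2, r, done = step(s, ACTIONS[a])
            q_update(q, s, a, r, s2, done, alpha, gamma)  # direct RL from real experience
            if (s, a) not in model:
                seen.append((s, a))
            model[(s, a)] = (r, s2, done)                 # learn a deterministic model
            for _ in range(planning_steps):               # planning from simulated experience
                ps, pa = rng.choice(seen)
                pr, ps2, pdone = model[(ps, pa)]
                q_update(q, ps, pa, pr, ps2, pdone, alpha, gamma)
            s, steps = s2, steps + 1
    return q


def greedy_path(q, max_len=50):
    """Follow the greedy policy from the start cell until the goal or max_len."""
    s, path = (0, 0), [(0, 0)]
    while s != GOAL and len(path) < max_len:
        a = max(range(len(ACTIONS)), key=lambda b: q.get((s, b), 0.0))
        s, _, _ = step(s, ACTIONS[a])
        path.append(s)
    return path
```

Calling `greedy_path(dyna_q())` returns the sequence of grid cells visited by the learned greedy policy. The early episodes explore heavily (ε starts at 0.9), while later episodes mostly exploit the learned values. That is the trade-off the abstract's cooling-schedule control is meant to manage.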

Keywords

Notation, Heuristic, Algorithm, Computer science, Function (biology), Simulated annealing, Artificial intelligence, Mathematics, Arithmetic
