An Improved Dyna-Q Algorithm for Mobile Robot Path Planning in Unknown Dynamic Environment

Muleilan Pei, Hao An, Bo Liu, Changhong Wang

Year: 2021
Citations: 95

Abstract

This article deals with the problem of mobile robot path planning in an unknown environment that contains both static and dynamic obstacles, utilizing a reinforcement learning approach. We propose an improved Dyna-Q algorithm, which incorporates heuristic search strategies, a simulated annealing mechanism, and a reactive navigation principle into Q-learning based on the Dyna architecture. A novel action-selection strategy combining the ε-greedy policy with cooling schedule control is presented, which, together with the heuristic reward function and heuristic actions, can tackle the exploration-exploitation dilemma and enhance global search performance, convergence, and learning efficiency for path planning. The proposed method is superior to the classical Q-learning and Dyna-Q algorithms in an unknown static environment, and it is successfully applied to an uncertain environment with multiple dynamic obstacles in simulations. Further, practical experiments are conducted by integrating MATLAB and the Robot Operating System (ROS) on a physical robot platform, and the mobile robot manages to find a collision-free path, thus fulfilling autonomous navigation tasks in the real world.
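
The abstract names the ingredients of the method but not their formulas, so the following is only a minimal Python sketch of how tabular Dyna-Q might be combined with an annealed ε-greedy policy and a distance-based heuristic reward. Everything specific here is an assumption made for illustration and is not taken from the paper: the 5x5 grid and obstacle layout, the geometric cooling schedule in `annealed_epsilon`, the Manhattan-distance shaping term in `step`, and all hyperparameter values. The heuristic actions, the reactive navigation principle, and dynamic obstacles are left out.

```python
import random
from collections import defaultdict

# Hypothetical 5x5 grid world; layout, rewards, and hyperparameters are assumptions.
SIZE, START, GOAL = 5, (0, 0), (4, 4)
OBSTACLES = {(1, 1), (2, 3), (3, 1)}
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # unit moves along each grid axis


def manhattan(cell):
    return abs(cell[0] - GOAL[0]) + abs(cell[1] - GOAL[1])


def step(state, action):
    """Return (next_state, reward, done) for one move."""
    nxt = (state[0] + action[0], state[1] + action[1])
    inside = 0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE
    if not inside or nxt in OBSTACLES:
        return state, -1.0, False          # collision: stay put, pay a penalty
    if nxt == GOAL:
        return nxt, 10.0, True
    # Assumed heuristic shaping: reward progress toward the goal, minus a step cost.
    return nxt, 0.1 * (manhattan(state) - manhattan(nxt)) - 0.05, False


def annealed_epsilon(episode, eps0=0.9, cooling=0.95, eps_min=0.01):
    """Exploration rate under a geometric cooling schedule (assumed form)."""
    return max(eps_min, eps0 * cooling ** episode)


def greedy_action(q, state):
    values = [q[(state, a)] for a in range(len(ACTIONS))]
    return values.index(max(values))


def dyna_q(episodes=200, planning_steps=20, alpha=0.5, gamma=0.95, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)    # Q[(state, action_index)], initialised to 0
    model, seen = {}, []      # learned deterministic model and its visited keys

    def update(s, a, s2, r, done):
        best_next = 0.0 if done else max(q[(s2, b)] for b in range(len(ACTIONS)))
        q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])

    for episode in range(episodes):
        state, eps = START, annealed_epsilon(episode)
        for _ in range(200):                              # cap on episode length
            explore = rng.random() < eps
            a = rng.randrange(len(ACTIONS)) if explore else greedy_action(q, state)
            nxt, r, done = step(state, ACTIONS[a])
            update(state, a, nxt, r, done)                # learn from real experience
            if (state, a) not in model:
                seen.append((state, a))
            model[(state, a)] = (nxt, r, done)            # update the learned model
            for _ in range(planning_steps):               # replay simulated experience
                ps, pa = rng.choice(seen)
                update(ps, pa, *model[(ps, pa)])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_len=50):
    """Follow the greedy policy from START until GOAL or max_len moves."""
    path = [START]
    while path[-1] != GOAL and len(path) <= max_len:
        nxt, _, _ = step(path[-1], ACTIONS[greedy_action(q, path[-1])])
        path.append(nxt)
    return path
```

Calling `greedy_path(dyna_q())` returns the list of cells visited by the learned greedy policy. In this sketch the cooling schedule keeps exploration high in early episodes and reduces it as learning proceeds, while the planning loop reuses stored transitions so that each real move is followed by several simulated updates.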

Keywords

Notation, Heuristic, Algorithm, Computer science, Function (biology), Simulated annealing, Artificial intelligence, Mathematics, Arithmetic
