Dynamically Feasible Deep Reinforcement Learning Policy for Robot Navigation in Dense Mobile Crowds

Utsav Patel, Nithish Kumar, Adarsh Jagan Sathyamoorthy, Dinesh Manocha

Year: 2020
Citations: 8
Access: Open access

Abstract

We present a novel Deep Reinforcement Learning (DRL) based policy to compute dynamically feasible and spatially aware velocities for a robot navigating among mobile obstacles. Our approach combines the benefits of the Dynamic Window Approach (DWA) in terms of satisfying the robot's dynamics constraints with state-of-the-art DRL-based navigation methods that can handle moving obstacles and pedestrians well. Our formulation achieves these goals by embedding the environmental obstacles' motions in a novel low-dimensional observation space. It also uses a novel reward function to positively reinforce velocities that move the robot away from the obstacle's heading direction, leading to a significantly lower number of collisions. We evaluate our method in realistic 3-D simulated environments and on a real differential drive robot in challenging dense indoor scenarios with several walking pedestrians. We compare our method with state-of-the-art collision avoidance methods and observe significant improvements in terms of success rate (up to 33% increase), number of dynamics constraint violations (up to 61% decrease), and smoothness. We also conduct ablation studies to highlight the advantages of our observation space formulation and reward structure.
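The abstract does not spell out how dynamic feasibility is enforced, so the following is only a minimal sketch of the classic dynamic window computation that DWA is known for, not the authors' implementation. It shows how the set of (linear, angular) velocities reachable within one control step can be bounded by acceleration limits and sampled. All function names, parameter names, and numeric values here are hypothetical and chosen purely for illustration.

```python
def dynamic_window(v, w, v_limits, w_limits, max_lin_acc, max_ang_acc, dt):
    """Bounds on velocities reachable within one time step dt.

    v, w         : current linear and angular velocity
    v_limits     : (min, max) linear velocity of the robot (assumed values)
    w_limits     : (min, max) angular velocity of the robot (assumed values)
    max_lin_acc  : maximum linear acceleration magnitude (assumed value)
    max_ang_acc  : maximum angular acceleration magnitude (assumed value)
    """
    # Intersect the hardware velocity limits with what the
    # acceleration limits allow the robot to reach in dt.
    v_lo = max(v_limits[0], v - max_lin_acc * dt)
    v_hi = min(v_limits[1], v + max_lin_acc * dt)
    w_lo = max(w_limits[0], w - max_ang_acc * dt)
    w_hi = min(w_limits[1], w + max_ang_acc * dt)
    return v_lo, v_hi, w_lo, w_hi


def sample_window(window, n=3):
    """Evenly sample an n-by-n grid of (v, w) candidates inside the window."""
    if n < 2:
        raise ValueError("n must be at least 2")
    v_lo, v_hi, w_lo, w_hi = window
    vs = [v_lo + (v_hi - v_lo) * i / (n - 1) for i in range(n)]
    ws = [w_lo + (w_hi - w_lo) * j / (n - 1) for j in range(n)]
    return [(v, w) for v in vs for w in ws]


# Hypothetical robot state and limits, for illustration only.
window = dynamic_window(
    v=0.5, w=0.0,
    v_limits=(0.0, 0.6), w_limits=(-1.0, 1.0),
    max_lin_acc=1.0, max_ang_acc=2.0, dt=0.2,
)
candidates = sample_window(window, n=3)
```

In a learning-based setting, one plausible use of such a window is to restrict the policy's action choices to the sampled candidates, so that every commanded velocity respects the acceleration limits. Whether the paper does exactly this is not stated in the abstract.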

Keywords

Reinforcement learning, Collision avoidance, Computer science, Robot, Obstacle avoidance, Mobile robot, Constraint (computer-aided design), Smoothness, Artificial intelligence, Obstacle
