Toward Observation Based Least Restrictive Collision Avoidance Using Deep Meta Reinforcement Learning

Salar Asayesh, Mo Chen, Mehran Mehrandezh, Kamal Gupta

Year of publication: 2021
Citations: 6

Abstract

This letter presents the Observation-based Least-Restrictive Collision Avoidance Module (OLR-CAM), which can be added to any autonomous robot working in a shared environment to provide a high-level safety layer on top of each robot's existing policy. The OLR-CAM takes raw sensory observations as input, evaluates the agent's safety against dynamic and static obstacles, and intervenes in the default policy only when needed, in a least-restrictive fashion, to avoid a potential collision. In our approach, we meta-train the OLR-CAM policy within a “2D Navigation Meta World System”. Furthermore, to endow the policy with a notion of safety in multi-agent environments with obstacles, we propose a novel reward function based on a safety value function derived from Hamilton-Jacobi reachability theory and a local cost map. The proposed reward function does not need any additional information about the environment's map, which facilitates the adoption of the algorithm in a new environment at the meta-test stage. The proposed algorithm is fully meta-trained in simulation and tested on a real multi-agent system without any additional training in the real setting. Our results show that the OLR-CAM's success rate outperforms a well-known classical baseline approach by 10 percent on average, and that the OLR-CAM reduces the interruptions/changes to the preferred velocity by 15 percent.
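
The "least-restrictive" idea in the abstract can be illustrated with a minimal switching sketch. This is not the authors' implementation: in the letter the intervention is a meta-trained policy acting on raw observations, whereas the function below is a hand-written rule. The names `least_restrictive_filter`, `safety_value`, `avoidance`, and `threshold` are hypothetical, and the sketch assumes the sign convention in which a larger safety value means a safer state.

```python
from typing import Callable, Tuple

Velocity = Tuple[float, float]  # (vx, vy) command, units left to the caller


def least_restrictive_filter(
    preferred: Velocity,
    safety_value: float,
    avoidance: Callable[[], Velocity],
    threshold: float = 0.0,
) -> Tuple[Velocity, bool]:
    """Pass the preferred velocity through unless the state looks unsafe.

    Assumption: safety_value > threshold means the current state is safe.
    Returns the command to execute and a flag telling whether we intervened.
    """
    if safety_value > threshold:
        # Safe: leave the default policy's command untouched.
        return preferred, False
    # Unsafe: override with the (hypothetical) avoidance action.
    return avoidance(), True


def stop() -> Velocity:
    """Hypothetical fallback action: command zero velocity."""
    return (0.0, 0.0)


safe_cmd, safe_flag = least_restrictive_filter((1.0, 0.0), 0.5, stop)
unsafe_cmd, unsafe_flag = least_restrictive_filter((1.0, 0.0), -0.2, stop)
```

With this rule the default policy is interrupted only in the second call, which is the sense in which fewer changes to the preferred velocity indicate a less restrictive safety layer.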

Keywords

Reinforcement learning; Collision avoidance; Computer science; Function (biology); Robot; Collision; Reachability; Artificial intelligence; Simulation; Algorithm
