Robust Behavior Cloning with Adversarial Demonstration Detection

Mostafa Hussein, Brendan Crowe, Madison Clark-Turner, Paul Gesel, Marek Petrik, Momotaz Begum

Year of publication: 2021
Citations: 5

Abstract

Imitation learning (IL) frameworks in robotics typically assume that a domain expert's demonstration always contains a correct way of doing the task. Despite its theoretical convenience, this assumption has limited practical value for an IL-powered robot in the real world. There are many reasons why an expert in the real world may provide demonstrations that contain an incorrect or potentially unsafe way of doing a task. For IL-powered robots to work in the real world, IL frameworks need to detect such adversarial demonstrations and avoid learning from them. This paper proposes an IL framework that can autonomously detect and remove adversarial demonstrations, if they exist in the demonstration set, while it learns a task policy directly from the expert. The proposed framework, which we term Robust Maximum Entropy behavior cloning (R-MaxEnt), learns a stochastic model that maps states to actions. In doing so, R-MaxEnt solves a minmax problem that leverages the entropy of the model to assign weights to different demonstrations, giving low weights to adversarial samples. Our empirical results show that R-MaxEnt outperforms existing IL approaches in both real and simulated robotics tasks.

Keywords

Adversarial system, Computer science, Artificial intelligence, Robotics, Robot, Machine learning, Minimax, Task (project management), Entropy (arrow of time), Principle of maximum entropy
