AVID: Learning Multi-Stage Tasks via Pixel-Level Translation of Human Videos

Laura Smith, Nikita Dhawan, Marvin Zhang, Pieter Abbeel, Sergey Levine

Year: 2019
Citations: 2
Access: Open access

Abstract

Robotic reinforcement learning (RL) holds the promise of enabling robots to learn complex behaviors through experience. However, realizing this promise for long-horizon tasks in the real world requires mechanisms to reduce human burden in terms of defining the task and scaffolding the learning process. In this paper, we study how these challenges can be alleviated with an automated robotic learning framework, in which multi-stage tasks are defined simply by providing videos of a human demonstrator and then learned autonomously by the robot from raw image observations. A central challenge in imitating human videos is the difference in appearance between the human and robot, which typically requires manual correspondence. We instead take an automated approach and perform pixel-level image translation via CycleGAN to convert the human demonstration into a video of a robot, which can then be used to construct a reward function for a model-based RL algorithm. The robot then learns the task one stage at a time, automatically learning how to reset each stage to retry it multiple times without human-provided resets. This makes the learning process largely automatic, from intuitive task specification via a video to automated training with minimal human intervention. We demonstrate that our approach is capable of learning complex tasks, such as operating a coffee machine, directly from raw image observations, requiring only 20 minutes to provide human demonstrations and about 180 minutes of robot interaction.
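
The abstract relies on CycleGAN for the human-to-robot translation. As a rough illustration of the cycle-consistency idea behind CycleGAN (not the authors' implementation), the sketch below computes an L1 cycle-consistency loss with NumPy. The generators `G`, `F`, and `F_bad` are hypothetical toy functions standing in for learned networks, and the random arrays stand in for batches of human and robot frames.

```python
import numpy as np

def cycle_consistency_loss(G, F, human_batch, robot_batch):
    """Mean L1 reconstruction error over both translation directions."""
    # human -> robot -> human should reproduce the human frames
    human_rec = F(G(human_batch))
    # robot -> human -> robot should reproduce the robot frames
    robot_rec = G(F(robot_batch))
    forward = np.mean(np.abs(human_rec - human_batch))
    backward = np.mean(np.abs(robot_rec - robot_batch))
    return forward + backward

# Hypothetical toy generators (assumptions for illustration only).
def G(x):
    return x + 0.5      # "human to robot": shift pixel values up

def F(y):
    return y - 0.5      # "robot to human": exact inverse of G

def F_bad(y):
    return y            # an imperfect inverse that ignores the shift

rng = np.random.default_rng(0)
human = rng.random((4, 8, 8, 3))   # stand-in for 4 human frames (8x8 RGB)
robot = rng.random((4, 8, 8, 3))   # stand-in for 4 robot frames (8x8 RGB)

loss_perfect = cycle_consistency_loss(G, F, human, robot)
loss_imperfect = cycle_consistency_loss(G, F_bad, human, robot)
print(loss_perfect, loss_imperfect)
```

With the exact inverse the loss is numerically zero, while the imperfect inverse is penalized in both directions. In an actual CycleGAN this term is combined with adversarial losses, which the sketch omits.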

Keywords

Computer science, Robot, Artificial intelligence, Task (project management), Process (computing), Reinforcement learning, Robot learning, Computer vision, Human–robot interaction, Construct (python library)
