PERCEPTION

Composable Action-Conditioned Predictors: Flexible Off-Policy Learning for Robot Navigation

Gregory Kahn, Adam Villaflor, Pieter Abbeel, Sergey Levine

Publication year
2018
Citations
10
Access
Open access

Abstract

A general-purpose intelligent robot must be able to learn autonomously and be able to accomplish multiple tasks in order to be deployed in the real world. However, standard reinforcement learning approaches learn separate task-specific policies and assume the reward function for each task is known a priori. We propose a framework that learns event cues from off-policy data, and can flexibly combine these event cues at test time to accomplish different tasks. These event cue labels are not assumed to be known a priori, but are instead labeled using learned models, such as computer vision detectors, and then "backed up" in time using an action-conditioned predictive model. We show that a simulated robotic car and a real-world RC car can gather data and train fully autonomously without any human-provided labels beyond those needed to train the detectors, and then at test-time be able to accomplish a variety of different tasks. Videos of the experiments and code can be found at https://github.com/gkahn13/CAPs
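
The idea of combining event cues at test time can be illustrated with a minimal sketch. This is not the authors' implementation: the cue names (`collision`, `progress`), the weighted-sum composition, the helper functions, and the toy predictor standing in for a learned action-conditioned model are all assumptions made for illustration. The sketch only shows how one set of predicted cues might serve several tasks by changing the weights, without retraining.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical interface: a predictor maps a candidate action sequence to
# predicted event-cue values. In the paper this role is played by a learned
# action-conditioned model; here it is a hand-written stand-in.
CuePredictor = Callable[[Sequence[float]], Dict[str, float]]


def compose_reward(cues: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine predicted cues into one scalar using task-specific weights.

    Cues without a weight contribute nothing (assumed weighted-sum composition).
    """
    return sum(weights.get(name, 0.0) * value for name, value in cues.items())


def select_actions(
    predict: CuePredictor,
    candidates: List[List[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Return the candidate action sequence with the highest composed reward."""
    return max(candidates, key=lambda actions: compose_reward(predict(actions), weights))


def toy_predictor(actions: Sequence[float]) -> Dict[str, float]:
    """Toy stand-in: actions are speeds; faster means more progress and more risk."""
    speed = sum(actions) / len(actions)
    return {"collision": speed ** 2, "progress": speed}


candidates = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]

# Two tasks defined only by how the same cues are weighted.
cautious = {"collision": -1.0}
fast_but_safe = {"collision": -1.0, "progress": 1.0}

best_cautious = select_actions(toy_predictor, candidates, cautious)   # [0.0, 0.0]
best_fast = select_actions(toy_predictor, candidates, fast_but_safe)  # [0.5, 0.5]
```

Switching from `cautious` to `fast_but_safe` changes the selected action sequence while the predictor stays fixed, which is the flexibility the abstract describes.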

Keywords

Action (physics), Computer science, Human–computer interaction, Robot, Policy learning, Artificial intelligence, Machine learning, Physics
