Composable Action-Conditioned Predictors: Flexible Off-Policy Learning for Robot Navigation
Gregory Kahn, Adam Villaflor, Pieter Abbeel, Sergey Levine
- Year: 2018
- Citations: 10
- Access: Open access
Abstract
A general-purpose intelligent robot must be able to learn autonomously and be able to accomplish multiple tasks in order to be deployed in the real world. However, standard reinforcement learning approaches learn separate task-specific policies and assume the reward function for each task is known a priori. We propose a framework that learns event cues from off-policy data, and can flexibly combine these event cues at test time to accomplish different tasks. These event cue labels are not assumed to be known a priori, but are instead labeled using learned models, such as computer vision detectors, and then "backed up" in time using an action-conditioned predictive model. We show that a simulated robotic car and a real-world RC car can gather data and train fully autonomously without any human-provided labels beyond those needed to train the detectors, and then at test time be able to accomplish a variety of different tasks. Videos of the experiments and code can be found at https://github.com/gkahn13/CAPs
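The idea of combining predicted event cues at test time can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_predictor` is a hypothetical hand-written stand-in for a learned action-conditioned model, the cue names `collision` and `on_road` are assumed for illustration, and random-shooting planning is an assumed choice of action selection. The only point shown is that one predictor can serve different tasks when the cue weights are changed.

```python
import random


def toy_predictor(actions):
    """Hypothetical stand-in for a learned action-conditioned predictor.

    Takes a sequence of steering commands in [-1, 1] and returns, for each
    future step, an assumed probability for each event cue.
    """
    cues = {"collision": [], "on_road": []}
    heading = 0.0
    for a in actions:
        heading += a  # toy dynamics: steering accumulates into heading
        cues["collision"].append(min(1.0, abs(heading) / 3.0))
        cues["on_road"].append(max(0.0, 1.0 - abs(heading) / 2.0))
    return cues


def task_reward(cues, weights):
    """Combine predicted event cues into a scalar using task-specific weights."""
    return sum(w * sum(cues[name]) for name, w in weights.items())


def plan(predictor, weights, horizon=5, num_samples=200, seed=0):
    """Random shooting: sample action sequences and keep the highest-scoring one."""
    rng = random.Random(seed)
    best_actions, best_reward = None, float("-inf")
    for _ in range(num_samples):
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        reward = task_reward(predictor(actions), weights)
        if reward > best_reward:
            best_actions, best_reward = actions, reward
    return best_actions, best_reward


# Two hypothetical tasks defined only by how the same cues are weighted.
avoid_collisions = {"collision": -1.0, "on_road": 0.0}
stay_on_road = {"collision": -1.0, "on_road": 1.0}

actions_a, reward_a = plan(toy_predictor, avoid_collisions)
actions_b, reward_b = plan(toy_predictor, stay_on_road)
```

In this sketch the predictor is never retrained between tasks; only the `weights` dictionary changes, which mirrors the abstract's description of composing event cues at test time.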