Data-Efficient Reinforcement Learning with Probabilistic Model Predictive Control
Sanket Kamthe, Marc Peter Deisenroth
- Year: 2017
- Citations: 74
Abstract
Trial-and-error based reinforcement learning (RL) has seen rapid advancements in recent times, especially with the advent of deep neural networks. However, the majority of autonomous RL algorithms require a large number of interactions with the environment. A large number of interactions may be impractical in many real-world applications, such as robotics, and many practical systems have to obey limitations in the form of state space or control constraints. To reduce the number of system interactions while simultaneously handling constraints, we propose a model-based RL framework based on probabilistic Model Predictive Control (MPC). In particular, we propose to learn a probabilistic transition model using Gaussian Processes (GPs) to incorporate model uncertainty into long-term predictions, thereby reducing the impact of model errors. We then use MPC to find a control sequence that minimises the expected long-term cost. We provide theoretical guarantees for first-order optimality in the GP-based transition models with deterministic approximate inference for long-term planning. We demonstrate that our approach not only achieves state-of-the-art data efficiency, but is also a principled way for RL in constrained environments.
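The loop described in the abstract (fit a GP transition model, plan a constrained control sequence that minimises expected cost, apply the first control, collect the new data point) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the one-dimensional system in `true_step`, the fixed kernel hyperparameters, and all function names are hypothetical. Random shooting stands in for the paper's optimiser, and the per-step GP predictive variance is a crude stand-in for the deterministic approximate inference the abstract refers to.

```python
import numpy as np

def rbf(A, B, ell=1.0):
    """Squared-exponential kernel with unit signal variance (assumed, not learned)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

class GP:
    """Minimal zero-mean GP regressor with fixed hyperparameters."""
    def __init__(self, X, y, noise_var=1e-2):
        self.X = X
        self.L = np.linalg.cholesky(rbf(X, X) + noise_var * np.eye(len(X)))
        self.alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, y))

    def predict(self, Xs):
        """Return predictive mean and latent variance at the rows of Xs."""
        Ks = rbf(Xs, self.X)
        v = np.linalg.solve(self.L, Ks.T)
        var = 1.0 - (v**2).sum(axis=0)  # prior variance is 1 for this kernel
        return Ks @ self.alpha, np.maximum(var, 0.0)

def true_step(x, u):
    """Hypothetical ground-truth system; the planner never calls this."""
    return x + 0.5 * u

def plan(gp, x0, target, horizon, u_max, n_samples, rng):
    """Random-shooting MPC: sample bounded control sequences, keep the cheapest."""
    U = rng.uniform(-u_max, u_max, size=(n_samples, horizon))  # control constraint
    x = np.full(n_samples, x0)
    cost = np.zeros(n_samples)
    for t in range(horizon):
        mean, var = gp.predict(np.column_stack([x, U[:, t]]))
        x = x + mean  # the GP models the state change x_{t+1} - x_t
        # Expected quadratic cost of a Gaussian: (mean - target)^2 + variance.
        # Simplification: only the current step's model variance is counted.
        cost += (x - target) ** 2 + var
    return U[np.argmin(cost), 0]  # receding horizon: apply the first control only

def run_demo(seed=0, target=2.0, u_max=1.0, steps=20):
    rng = np.random.default_rng(seed)
    # Small initial data set of random (state, control) pairs.
    X = np.column_stack([rng.uniform(-1, 3, 20), rng.uniform(-u_max, u_max, 20)])
    y = true_step(X[:, 0], X[:, 1]) - X[:, 0]
    x, xs, us = 0.0, [0.0], []
    for _ in range(steps):
        gp = GP(X, y)  # refit the model on all data seen so far
        u = plan(gp, x, target, horizon=3, u_max=u_max, n_samples=1000, rng=rng)
        x_next = true_step(x, u)
        X = np.vstack([X, [x, u]])  # add the observed transition
        y = np.append(y, x_next - x)
        x = x_next
        xs.append(x)
        us.append(u)
    return np.array(xs), np.array(us)
```

Calling `run_demo()` returns the visited states and the applied controls. Because candidate controls are sampled inside `[-u_max, u_max]`, the control constraint holds by construction, which is the simplest way to mirror the constraint handling the abstract attributes to MPC.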