Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework
Wanxin Jin, Zhaoran Wang, Zhuoran Yang, Shaoshuai Mou
- Year: 2019
- Citations: 29
- Access: Open access
Abstract
This paper develops a Pontryagin Differentiable Programming (PDP) methodology, which establishes a unified framework for solving a broad class of learning and control tasks. PDP differs from existing methods in two novel techniques. First, we differentiate through Pontryagin's Maximum Principle, which yields the analytical derivative of a trajectory with respect to tunable parameters within an optimal control system and enables end-to-end learning of dynamics, policies, and/or control objective functions. Second, we propose an auxiliary control system in the backward pass of the PDP framework; the output of this auxiliary control system is the analytical derivative of the original system's trajectory with respect to the parameters, and it can be solved iteratively using standard control tools. We investigate three learning modes of PDP: inverse reinforcement learning, system identification, and control/planning. We demonstrate the capability of PDP in each learning mode on different high-dimensional systems, including a multi-link robot arm, a 6-DoF maneuvering quadrotor, and a 6-DoF rocket powered landing.
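To make the phrase "analytical derivative of a trajectory with respect to tunable parameters" concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical scalar system x_{t+1} = x_t - DT * theta * sin(x_t) with no control input, which is closest in spirit to the system identification mode. The derivative of each state with respect to theta is propagated by a linear recursion and then checked against finite differences. The names `step`, `rollout`, `trajectory_sensitivity`, and the constant `DT` are illustrative assumptions; the full PDP backward pass for an optimal control system is not shown here.

```python
import math

DT = 0.1  # hypothetical integration step, chosen only for illustration


def step(x, theta):
    """One step of the assumed toy dynamics x_{t+1} = x_t - DT * theta * sin(x_t)."""
    return x + DT * (-theta * math.sin(x))


def rollout(x0, theta, horizon):
    """Simulate the trajectory [x_0, ..., x_horizon] for a given parameter theta."""
    xs = [x0]
    for _ in range(horizon):
        xs.append(step(xs[-1], theta))
    return xs


def trajectory_sensitivity(x0, theta, horizon):
    """Return the trajectory and its analytical derivative d x_t / d theta.

    Differentiating the dynamics by the chain rule gives the linear recursion
        X_{t+1} = F_t * X_t + E_t,  X_0 = 0,
    where F_t = d step / d x and E_t = d step / d theta, both evaluated at x_t.
    """
    xs = rollout(x0, theta, horizon)
    sens = [0.0]  # x_0 does not depend on theta
    for t in range(horizon):
        F = 1.0 - DT * theta * math.cos(xs[t])  # partial derivative w.r.t. x
        E = -DT * math.sin(xs[t])               # partial derivative w.r.t. theta
        sens.append(F * sens[-1] + E)
    return xs, sens


def finite_difference(x0, theta, horizon, eps=1e-6):
    """Central-difference estimate of d x_t / d theta, used only as a check."""
    hi = rollout(x0, theta + eps, horizon)
    lo = rollout(x0, theta - eps, horizon)
    return [(a - b) / (2.0 * eps) for a, b in zip(hi, lo)]


if __name__ == "__main__":
    xs, sens = trajectory_sensitivity(x0=1.0, theta=2.0, horizon=20)
    fd = finite_difference(x0=1.0, theta=2.0, horizon=20)
    max_err = max(abs(a - b) for a, b in zip(sens, fd))
    print("max |analytic - finite difference| =", max_err)
```

With such a derivative in hand, a loss defined on the trajectory can be differentiated with respect to theta by the chain rule and minimized with gradient descent, which is the end-to-end learning pattern the abstract describes. In the optimal control setting, the simple recursion above would be replaced by the auxiliary control system mentioned in the abstract.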