LEARNING

Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework

Wanxin Jin, Zhaoran Wang, Zhuoran Yang, Shaoshuai Mou

Year
2019
Citations
29
Access
Open access

Abstract

This paper develops a Pontryagin Differentiable Programming (PDP) methodology, which establishes a unified framework to solve a broad class of learning and control tasks. PDP is distinguished from existing methods by two novel techniques: first, we differentiate through Pontryagin's Maximum Principle, which allows us to obtain the analytical derivative of a trajectory with respect to tunable parameters within an optimal control system, enabling end-to-end learning of dynamics, policies, and/or control objective functions; and second, we propose an auxiliary control system in the backward pass of the PDP framework, whose output is the analytical derivative of the original system's trajectory with respect to the parameters, and which can be solved iteratively using standard control tools. We investigate three learning modes of PDP: inverse reinforcement learning, system identification, and control/planning. We demonstrate the capability of PDP in each learning mode on different high-dimensional systems, including a multi-link robot arm, a 6-DoF maneuvering quadrotor, and 6-DoF rocket powered landing.
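
The idea that a trajectory's derivative with respect to a parameter can itself be generated by a linear recursion can be illustrated on a toy problem. The sketch below is not the PDP algorithm from the paper: it covers only a control-free scalar system, and the dynamics x_{t+1} = x_t + dt * (-theta * x_t), the function names, and the default values are all assumptions chosen for illustration. It propagates dx_t/dtheta alongside the state and compares the result with a central finite difference.

```python
# Toy illustration only: hypothetical scalar dynamics, not the paper's systems.
# x_{t+1} = f(x_t, theta) = x_t + dt * (-theta * x_t)

def rollout(theta, x0=1.0, dt=0.1, horizon=20):
    """Simulate the trajectory x_0, ..., x_T for a given parameter theta."""
    xs = [x0]
    for _ in range(horizon):
        xs.append(xs[-1] + dt * (-theta * xs[-1]))
    return xs

def rollout_with_sensitivity(theta, x0=1.0, dt=0.1, horizon=20):
    """Return the trajectory and its analytical derivative dx_t/dtheta.

    The derivative obeys its own linear recursion:
        dx_{t+1} = (df/dx) * dx_t + (df/dtheta),  with dx_0 = 0.
    """
    xs, dxs = [x0], [0.0]
    for _ in range(horizon):
        x, dx = xs[-1], dxs[-1]
        df_dx = 1.0 - dt * theta   # partial derivative of f w.r.t. the state
        df_dtheta = -dt * x        # partial derivative of f w.r.t. theta
        xs.append(x + dt * (-theta * x))
        dxs.append(df_dx * dx + df_dtheta)
    return xs, dxs

def finite_difference(theta, eps=1e-6, **kwargs):
    """Central-difference approximation of dx_t/dtheta, used as a check."""
    hi = rollout(theta + eps, **kwargs)
    lo = rollout(theta - eps, **kwargs)
    return [(a - b) / (2 * eps) for a, b in zip(hi, lo)]

if __name__ == "__main__":
    xs, dxs = rollout_with_sensitivity(0.5)
    fd = finite_difference(0.5)
    max_gap = max(abs(a - b) for a, b in zip(dxs, fd))
    print("largest gap between analytical and finite-difference derivative:", max_gap)
```

In this simplified setting the derivative recursion is driven only by the dynamics. In the full framework described in the abstract, the corresponding auxiliary system also has to account for the optimal control and the costate conditions from Pontryagin's Maximum Principle, which this sketch leaves out.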

Keywords

Differentiable function, Trajectory, Control theory (sociology), Computer science, Optimal control, Inverse, Inverse system, Control (management), Control system, Control engineering
