LEARNING

Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework

Wanxin Jin, Zhaoran Wang, Zhuoran Yang, Shaoshuai Mou

Publication year
2019
Citations
29
Access
Open access

Abstract

This paper develops a Pontryagin Differentiable Programming (PDP) methodology, which establishes a unified framework for solving a broad class of learning and control tasks. PDP differs from existing methods in two novel techniques. First, we differentiate through Pontryagin's Maximum Principle, which yields the analytical derivative of a trajectory with respect to tunable parameters within an optimal control system and enables end-to-end learning of dynamics, policies, and/or control objective functions. Second, we propose an auxiliary control system in the backward pass of the PDP framework; the output of this auxiliary control system is the analytical derivative of the original system's trajectory with respect to the parameters, and it can be solved iteratively using standard control tools. We investigate three learning modes of PDP: inverse reinforcement learning, system identification, and control/planning. We demonstrate the capability of PDP in each learning mode on different high-dimensional systems, including a multi-link robot arm, a 6-DoF maneuvering quadrotor, and 6-DoF rocket powered landing.
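
For readers unfamiliar with the principle being differentiated, the following is a sketch of the standard discrete-time Pontryagin conditions for a parameterized optimal control problem. The notation (dynamics f, stage cost c, terminal cost h, costate λ, parameter θ, horizon T) is assumed here for illustration and is not taken from the abstract.

```latex
\[
\begin{aligned}
&\min_{u_{0:T-1}} \; J(\theta) = \sum_{t=0}^{T-1} c(x_t, u_t; \theta) + h(x_T; \theta)
  \quad \text{s.t.} \quad x_{t+1} = f(x_t, u_t; \theta), \quad x_0 \text{ given}, \\[4pt]
&H_t = c(x_t, u_t; \theta) + f(x_t, u_t; \theta)^{\top} \lambda_{t+1}, \\[4pt]
&x_{t+1} = \frac{\partial H_t}{\partial \lambda_{t+1}} = f(x_t, u_t; \theta), \\
&\lambda_t = \frac{\partial H_t}{\partial x_t}
  = \frac{\partial c}{\partial x_t} + \left(\frac{\partial f}{\partial x_t}\right)^{\!\top} \lambda_{t+1}, \\
&0 = \frac{\partial H_t}{\partial u_t}
  = \frac{\partial c}{\partial u_t} + \left(\frac{\partial f}{\partial u_t}\right)^{\!\top} \lambda_{t+1}, \\
&\lambda_T = \frac{\partial h}{\partial x_T}.
\end{aligned}
\]
```

Differentiating these conditions with respect to θ gives equations that are linear in ∂x_t/∂θ, ∂u_t/∂θ, and ∂λ_t/∂θ. This linear system is presumably what the abstract calls the auxiliary control system, whose solution is the derivative of the trajectory with respect to the parameters.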

Keywords

Differentiable function; Trajectory; Control theory (sociology); Computer science; Optimal control; Inverse; Inverse system; Control (management); Control system; Control engineering
