LG · RO · SY · OC · Dec 30, 2019

Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework

arXiv:1912.12970v5 · 123 citations
Originality: Incremental advance
AI Analysis

This work gives researchers and practitioners in robotics and control a new framework for handling complex learning and control tasks more efficiently, though it appears incremental because it builds on existing optimal control principles.

The paper tackles the problem of solving learning and control tasks by developing Pontryagin Differentiable Programming (PDP), a unified framework that enables end-to-end learning of dynamics, policies, or control objectives through differentiation of Pontryagin's Maximum Principle and an auxiliary control system, demonstrated on high-dimensional systems like multi-link robot arms and quadrotors.

This paper develops a Pontryagin Differentiable Programming (PDP) methodology, which establishes a unified framework for solving a broad class of learning and control tasks. PDP differs from existing methods through two novel techniques. First, we differentiate through Pontryagin's Maximum Principle, which yields the analytical derivative of a trajectory with respect to tunable parameters within an optimal control system and enables end-to-end learning of dynamics, policies, and/or control objective functions. Second, we propose an auxiliary control system in the backward pass of the PDP framework; the output of this auxiliary control system is the analytical derivative of the original system's trajectory with respect to the parameters, and it can be solved iteratively using standard control tools. We investigate three learning modes of PDP: inverse reinforcement learning, system identification, and control/planning. We demonstrate the capability of PDP in each learning mode on different high-dimensional systems, including a multi-link robot arm, a 6-DoF maneuvering quadrotor, and 6-DoF rocket powered landing.
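The two techniques in the abstract can be illustrated on a toy problem. The sketch below is not the authors' implementation. It assumes a scalar system x[t+1] = theta*x[t] + u[t] with the quadratic cost sum of 0.5*(q*x^2 + r*u^2) plus terminal cost 0.5*qT*x[T]^2, chosen only because every step can be written in closed form. The function names `solve_lqr` and `trajectory_derivative` and the default weights are hypothetical. The first function solves the optimal control problem (forward pass). The second differentiates the Pontryagin conditions with respect to theta, which gives an auxiliary linear system whose solution is the derivative of the optimal trajectory (backward pass).

```python
def solve_lqr(theta, x0, T, q=1.0, r=0.5, qT=2.0):
    """Optimal trajectory of the assumed toy problem x[t+1] = theta*x[t] + u[t]."""
    # Backward Riccati recursion for the value-function curvature P[t].
    P = [0.0] * (T + 1)
    P[T] = qT
    for t in range(T - 1, -1, -1):
        P[t] = q + theta ** 2 * P[t + 1] * r / (r + P[t + 1])
    # Forward rollout with the optimal feedback u[t] = -K[t] * x[t].
    x, u = [x0], []
    for t in range(T):
        K = theta * P[t + 1] / (r + P[t + 1])
        u.append(-K * x[t])
        x.append(theta * x[t] + u[t])
    # Costate of Pontryagin's Maximum Principle; for this problem lam[t] = P[t]*x[t].
    lam = [P[t] * x[t] for t in range(T + 1)]
    return x, u, lam, P


def trajectory_derivative(theta, x0, T, q=1.0, r=0.5, qT=2.0):
    """Return X[t] = dx[t]/dtheta and U[t] = du[t]/dtheta along the optimum.

    Differentiating the Pontryagin conditions in theta gives the auxiliary system
        X[t+1] = theta*X[t] + U[t] + x[t]
        Lam[t] = q*X[t] + theta*Lam[t+1] + lam[t+1]
        0      = r*U[t] + Lam[t+1]
        Lam[T] = qT*X[T],   X[0] = 0
    which is solved here with the affine ansatz Lam[t] = P[t]*X[t] + W[t].
    """
    x, u, lam, P = solve_lqr(theta, x0, T, q, r, qT)
    # Backward pass: affine offset W[t] (the curvature P[t] is reused from above).
    W = [0.0] * (T + 1)
    for t in range(T - 1, -1, -1):
        W[t] = theta * r * (P[t + 1] * x[t] + W[t + 1]) / (r + P[t + 1]) + lam[t + 1]
    # Forward pass: roll out the derivative trajectory from X[0] = 0.
    X, U = [0.0], []
    for t in range(T):
        X_next = (r * (theta * X[t] + x[t]) - W[t + 1]) / (r + P[t + 1])
        U.append(-(P[t + 1] * X_next + W[t + 1]) / r)
        X.append(X_next)
    return X, U
```

A reasonable sanity check under these assumptions is to compare `X` and `U` against central finite differences of `solve_lqr` at `theta + eps` and `theta - eps`. In a learning loop, these derivatives would be chained with the gradient of a loss on the trajectory to update theta.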

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
