SY · Jan 7, 2017

Differential Dynamic Programming for time-delayed systems

arXiv:1701.01882 · 5 citations · h-index: 47

AI Analysis

Provides a method for optimal control of time-delayed systems, an incremental contribution for the robotics and control communities.

Extended Differential Dynamic Programming (DDP) to handle multiple time-delays in the state, enabling trajectory optimization for time-delayed systems. Demonstrated on a two-tank reactor and a recurrent neural network model of an inverted pendulum.

Trajectory optimization considers the problem of deciding how to control a dynamical system to move along a trajectory which minimizes some cost function. Differential Dynamic Programming (DDP) is an optimal control method which utilizes a second-order approximation of the problem to find the control. It is fast enough to allow real-time control and has been shown to work well for trajectory optimization in robotic systems. Here we extend classic DDP to systems with multiple time-delays in the state. Being able to find optimal trajectories for time-delayed systems with DDP opens up the possibility to use richer models for system identification and control, including recurrent neural networks with multiple timesteps in the state. We demonstrate the algorithm on a two-tank continuous stirred tank reactor. We also demonstrate the algorithm on a recurrent neural network trained to model an inverted pendulum with position information only.
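To make the notion of "multiple time-delays in the state" concrete, here is a minimal sketch of a discrete-time delayed system rolled out through a stacked (augmented) state. State augmentation is a standard way to view such systems and is not necessarily the formulation the paper uses; the function names, the delay depth `d`, and the scalar example dynamics are all hypothetical and chosen only for illustration.

```python
import numpy as np

def delayed_step(z, u, f, n, d):
    """Advance the augmented state z = [x_t, x_{t-1}, ..., x_{t-d}] by one step.

    f(xs, u) receives the list [x_t, ..., x_{t-d}] and returns x_{t+1}.
    n is the dimension of a single state x_t, d the largest delay.
    """
    xs = [z[i * n:(i + 1) * n] for i in range(d + 1)]
    x_next = f(xs, u)
    # Shift the window: the newest state enters, the oldest drops out.
    return np.concatenate([x_next] + xs[:-1])

def rollout(z0, controls, f, n, d):
    """Simulate the delayed system forward under a control sequence."""
    zs = [z0]
    for u in controls:
        zs.append(delayed_step(zs[-1], u, f, n, d))
    return np.array(zs)

# Hypothetical scalar dynamics (an assumption, not from the paper):
# x_{t+1} = 0.5 * x_t + 0.25 * x_{t-2} + u_t
def f_example(xs, u):
    return 0.5 * xs[0] + 0.25 * xs[2] + u

z0 = np.array([1.0, 0.0, 0.0])   # [x_0, x_{-1}, x_{-2}]
controls = np.zeros((3, 1))      # three steps with zero control
zs = rollout(z0, controls, f_example, n=1, d=2)
```

A trajectory optimizer such as DDP would wrap a forward rollout like this with a backward pass over a second-order approximation of the cost; that part is omitted here.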
