SY, Mar 9, 2018

Acceleration of Gradient-based Path Integral Method for Efficient Optimal and Inverse Optimal Control

arXiv:1710.06578 | 22 citations | h-index: 30

AI Analysis

For researchers in optimal control and reinforcement learning, this work provides a practical acceleration of the path integral method, though it is an incremental improvement applying existing optimization techniques.

The paper introduces momentum-based acceleration methods (Nesterov Accelerated Gradient and Adam) to the path integral method for optimal control, achieving significantly faster convergence in simulated control systems and improved performance in model predictive control for vehicle navigation. The accelerated method also enables more efficient training of path integral networks for inverse optimal control with reduced RAM usage.

This paper presents a new accelerated path integral method, which iteratively searches for optimal controls in a small number of iterations. The study builds on the recent observation that a path integral method for reinforcement learning can be interpreted as gradient descent. This observation also applies to an iterative path integral method for optimal control, which makes a convincing argument for applying optimization methods developed for gradient descent, such as momentum-based acceleration, step-size adaptation, and their combination. We introduce these methods to the path integral method and demonstrate that momentum-based methods, such as Nesterov Accelerated Gradient and Adam, can significantly improve the convergence rate of the search for optimal controls in simulated control systems. We also demonstrate that the accelerated path integral method could improve performance in model predictive control for various vehicle navigation tasks. Finally, we represent the accelerated path integral method as a recurrent network, an accelerated version of the previously proposed path integral network (PI-Net). The accelerated PI-Net can be trained more efficiently for inverse optimal control, and with less RAM, than the original PI-Net.
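To make the gradient-descent reading of the abstract concrete, here is a minimal sketch, not the paper's implementation. It assumes a toy scalar integrator, a quadratic cost, and illustrative values for the sample count, noise scale `sigma`, temperature `lam`, and momentum `gamma`; all function names are hypothetical. The softmin-weighted mean of the sampled noise is treated as a descent step, and a Nesterov-style look-ahead with momentum is wrapped around it.

```python
import numpy as np

def rollout_cost(u, dt=0.1, target=1.0, r=0.01):
    """Cost of a control sequence on a toy integrator x_{t+1} = x_t + u_t * dt (assumed system)."""
    x = np.cumsum(u * dt, axis=-1)           # states x_1..x_T, starting from x_0 = 0
    return np.sum((x - target) ** 2 + r * u ** 2, axis=-1)

def pi_step(u, rng, n_samples=200, sigma=1.0, lam=1.0):
    """One path integral update direction: softmin-weighted mean of the sampled noise."""
    eps = rng.normal(0.0, sigma, size=(n_samples, u.size))
    costs = rollout_cost(u + eps)             # one rollout cost per noisy control sequence
    w = np.exp(-(costs - costs.min()) / lam)  # subtract the minimum for numerical stability
    w /= w.sum()
    return w @ eps                            # plays the role of a (negative) gradient step

def optimize(T=20, iters=50, gamma=0.0, seed=0):
    """Iterative path integral search; gamma > 0 adds Nesterov-style momentum."""
    rng = np.random.default_rng(seed)
    u = np.zeros(T)
    v = np.zeros(T)                           # momentum (velocity) term
    for _ in range(iters):
        delta = pi_step(u + gamma * v, rng)   # evaluate the step at the look-ahead point
        v = gamma * v + delta
        u = u + v
    return u

if __name__ == "__main__":
    print("initial cost :", rollout_cost(np.zeros(20)))
    print("plain PI     :", rollout_cost(optimize(gamma=0.0)))
    print("PI + momentum:", rollout_cost(optimize(gamma=0.5)))
```

With `gamma=0.0` the loop reduces to the plain iterative update `u <- u + delta`. Whether momentum helps on a given problem depends on the system and the chosen hyperparameters, so the two runs are best compared empirically rather than assumed to reproduce the paper's results.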

