RO AI SY
Dec 5, 2022

Learning Sampling Distributions for Model Predictive Control

arXiv:2212.02587v114.233 citations
h-index: 47

Originality: Incremental advance

AI Analysis

This work addresses a bottleneck in MPC for robotics by improving sampling efficiency, though it is incremental as it builds on existing latent space approaches.

The paper addresses the poor performance of sampling-based Model Predictive Control (MPC) when the sampling distribution is suboptimal. It proposes performing all operations, including parameter updates and warm-starting, in a learned latent space whose distribution is parameterized by a normalizing flow. On simulated robotics tasks, the method surpasses prior methods and scales better when fewer samples are used.

Sampling-based methods have become a cornerstone of contemporary approaches to Model Predictive Control (MPC), as they make no restrictions on the differentiability of the dynamics or cost function and are straightforward to parallelize. However, their efficacy is highly dependent on the quality of the sampling distribution itself, which is often assumed to be simple, like a Gaussian. This restriction can result in samples which are far from optimal, leading to poor performance. Recent work has explored improving the performance of MPC by sampling in a learned latent space of controls. However, these methods ultimately perform all MPC parameter updates and warm-starting between time steps in the control space. This requires us to rely on a number of heuristics for generating samples and updating the distribution and may lead to sub-optimal performance. Instead, we propose to carry out all operations in the latent space, allowing us to take full advantage of the learned distribution. Specifically, we frame the learning problem as bi-level optimization and show how to train the controller with backpropagation-through-time. By using a normalizing flow parameterization of the distribution, we can leverage its tractable density to avoid requiring differentiability of the dynamics and cost function. Finally, we evaluate the proposed approach on simulated robotics tasks and demonstrate its ability to surpass the performance of prior methods and scale better with a reduced number of samples.
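To make the idea of keeping the sampling, the distribution update, and the warm start in a latent space concrete, here is a minimal sketch in Python with NumPy. It is not the paper's algorithm: the flow is a fixed, hand-written elementwise bijection (`toy_flow`) standing in for a learned normalizing flow, the update is a generic cost-weighted average, and the double-integrator dynamics, cost, and all parameter values are assumptions chosen only for illustration.

```python
import numpy as np

def toy_flow(z, scale=1.0, u_max=2.0):
    # Hypothetical stand-in for a learned normalizing flow: an elementwise
    # bijection from unbounded latents to controls bounded in (-u_max, u_max).
    return u_max * np.tanh(scale * z)

def dynamics(x, u, dt=0.1):
    # Assumed toy system: 1-D double integrator, x = [position, velocity].
    pos, vel = x
    return np.array([pos + dt * vel, vel + dt * u])

def rollout_cost(x0, controls):
    # Sum of quadratic stage costs along one simulated trajectory.
    # Only forward evaluations are used; no gradients of dynamics or cost.
    x = x0.copy()
    cost = 0.0
    for u in controls:
        x = dynamics(x, u)
        cost += x[0] ** 2 + 0.1 * x[1] ** 2 + 0.01 * u ** 2
    return cost

def latent_mpc_step(x0, mu, rng, num_samples=64, sigma=0.5, temperature=0.1):
    horizon = mu.shape[0]
    # Sample latent sequences around the current latent mean.
    z = mu + sigma * rng.standard_normal((num_samples, horizon))
    # Map each latent sequence to controls and evaluate its cost.
    costs = np.array([rollout_cost(x0, toy_flow(zi)) for zi in z])
    # Exponentiated-cost weights (shifted by the minimum for stability).
    weights = np.exp(-(costs - costs.min()) / temperature)
    weights /= weights.sum()
    # Update the distribution parameter in the latent space.
    new_mu = weights @ z
    # The first control to apply is decoded from the updated latent mean.
    u0 = toy_flow(new_mu[0])
    # Warm start in the latent space: shift by one step, pad with zero.
    shifted = np.append(new_mu[1:], 0.0)
    return u0, shifted

def run(steps=50, horizon=15, seed=0):
    rng = np.random.default_rng(seed)
    x = np.array([1.0, 0.0])
    mu = np.zeros(horizon)
    for _ in range(steps):
        u, mu = latent_mpc_step(x, mu, rng)
        x = dynamics(x, u)
    return x
```

Calling `run()` executes a short closed-loop simulation and returns the final state. In the approach described above, the flow would be learned rather than fixed, which this sketch does not attempt.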
