SY · May 8

Sampling-based Model Predictive Control Using Trust Regions

Markus Walker, Marcel Reith-Braun, Daniel Frisch, Uwe D. Hanebeck

arXiv:2605.07801 · 3.7

Predicted impact: top 61% in SY · last 90 days
Originality: Incremental advance

AI Analysis

For practitioners of sampling-based MPC, this provides a principled alternative to manual hyperparameter tuning, improving performance in sample-limited settings.

Sampling-based MPC methods like MPPI rely on heuristics or manual tuning for hyperparameters such as temperature and momentum. The authors propose a trust region formulation, with a KL divergence bound and an optional entropy lower bound, that replaces these heuristics with values that are optimal with respect to the underlying Lagrangian, achieving faster convergence and better sample efficiency in low-sample and low-iteration regimes.

Sampling-based model predictive control (MPC) algorithms, such as model predictive path integral (MPPI), enable approximate, gradient-free solutions to optimal control problems by drawing samples from a proposal distribution, evaluating their trajectory costs, and updating the proposal parameters accordingly. However, these approaches typically rely on heuristics for adjusting hyperparameters, such as temperature or momentum, or manual tuning. We propose a trust region formulation for sampling-based MPC that constrains updates of the proposal distribution via a principled Kullback-Leibler (KL) divergence bound and, optionally, an entropy lower bound. This replaces heuristic hyperparameter adaptation with values that are optimal w.r.t. the underlying Lagrangian. We further improve sample efficiency and convergence by combining the trust region update with deterministic localized cumulative distribution (LCD)-based sampling. Experiments on two benchmark environments demonstrate that the proposed trust region update achieves faster convergence and better sample efficiency in low-sample and low-iteration regimes, especially when paired with deterministic LCD-based sampling.
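To illustrate how a KL bound can stand in for a hand-tuned temperature, here is a minimal Python sketch. It is not the authors' algorithm. It assumes a generic sample-based surrogate in which the KL divergence is measured between the softmin sample weights and uniform weights, and the temperature is found by bisection so that this KL equals a chosen bound `epsilon`. That condition is the stationarity condition of the Lagrangian dual of the KL-constrained reweighting problem. The function names, the bound `epsilon`, and the search bracket are hypothetical. The entropy lower bound and LCD-based sampling from the abstract are omitted.

```python
import numpy as np

def softmin_weights(costs, eta):
    """Normalized weights exp(-cost / eta), shifted by the minimum cost for stability."""
    w = np.exp(-(costs - costs.min()) / eta)
    return w / w.sum()

def kl_to_uniform(w):
    """KL divergence between the weights w and uniform weights over the same samples."""
    nz = w > 0  # skip zero weights; 0 * log(0) is treated as 0
    return float(np.sum(w[nz] * np.log(w.size * w[nz])))

def solve_temperature(costs, epsilon, lo=1e-6, hi=1e6, iters=100):
    """Bisection in log-space for the temperature eta at which the KL equals epsilon.

    The KL shrinks as eta grows, so a KL above epsilon means eta must increase.
    The bracket [lo, hi] is an assumption of this sketch.
    """
    log_lo, log_hi = np.log(lo), np.log(hi)
    for _ in range(iters):
        mid = 0.5 * (log_lo + log_hi)
        if kl_to_uniform(softmin_weights(costs, np.exp(mid))) > epsilon:
            log_lo = mid  # update too greedy: raise the temperature
        else:
            log_hi = mid  # within the trust region: try a lower temperature
    return float(np.exp(log_hi))

def trust_region_mean_update(samples, costs, epsilon):
    """Weighted mean of the sampled control sequences under the KL-bounded weights."""
    eta = solve_temperature(costs, epsilon)
    w = softmin_weights(costs, eta)
    return w @ samples, eta

# Usage with synthetic data: 64 sampled control vectors of dimension 3.
rng = np.random.default_rng(0)
samples = rng.normal(size=(64, 3))
costs = np.sum(samples**2, axis=1)
new_mean, eta = trust_region_mean_update(samples, costs, epsilon=0.5)
```

In this sketch the only quantity the user chooses is the bound `epsilon`; the temperature follows from it for each batch of costs, instead of being fixed in advance.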
