LG · RO · Jun 3, 2025

Accelerating Model-Based Reinforcement Learning using Non-Linear Trajectory Optimization

arXiv:2506.02767v1 · h-index: 10 · 2025 33rd Mediterranean Conference on Control and Automation (MED)
Originality: Incremental advance
AI Analysis

This work addresses a performance bottleneck, slow policy optimization, in model-based reinforcement learning for robotic control. It is an incremental improvement over existing methods.

This paper tackles the slow policy optimization convergence of the MC-PILCO model-based reinforcement learning algorithm by integrating it with iLQR trajectory optimization. On a cart-pole task, the combined method cuts execution time by up to 45.9% (when both methods solve the task in four trials) while maintaining a 100% success rate.

This paper addresses the slow policy optimization convergence of Monte Carlo Probabilistic Inference for Learning Control (MC-PILCO), a state-of-the-art model-based reinforcement learning (MBRL) algorithm, by integrating it with iterative Linear Quadratic Regulator (iLQR), a fast trajectory optimization method suitable for nonlinear systems. The proposed method, Exploration-Boosted MC-PILCO (EB-MC-PILCO), leverages iLQR to generate informative, exploratory trajectories and initialize the policy, significantly reducing the number of required optimization steps. Experiments on the cart-pole task demonstrate that EB-MC-PILCO accelerates convergence compared to standard MC-PILCO, achieving up to 45.9% reduction in execution time when both methods solve the task in four trials. EB-MC-PILCO also maintains a 100% success rate across trials while solving the task faster, even in cases where MC-PILCO converges in fewer iterations.
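The two roles the abstract gives to iLQR (producing a trajectory, then initializing a policy from it) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: a toy torque-driven pendulum with known dynamics stands in for the cart-pole and for the learned model, the cost is a plain quadratic, derivatives come from finite differences, and the "policy initialization" is a hypothetical least-squares fit of a linear-in-features policy to the iLQR state-action pairs.

```python
import numpy as np

DT, HORIZON = 0.05, 60
GOAL = np.array([np.pi, 0.0])  # upright pendulum: (angle, angular velocity)
Q, R, QF = np.diag([1.0, 0.1]), np.array([[0.01]]), np.diag([100.0, 10.0])

def step(x, u):
    """Euler step of a toy pendulum (assumed stand-in for a learned model)."""
    th, om = x
    return np.array([th + DT * om,
                     om + DT * (-9.81 * np.sin(th) - 0.1 * om + u[0])])

def jacobians(x, u, eps=1e-5):
    """Central finite differences of step() w.r.t. state (A) and control (B)."""
    A = np.column_stack([(step(x + eps * e, u) - step(x - eps * e, u)) / (2 * eps)
                         for e in np.eye(2)])
    B = np.column_stack([(step(x, u + eps * e) - step(x, u - eps * e)) / (2 * eps)
                         for e in np.eye(1)])
    return A, B

def rollout(x0, us):
    xs = [x0]
    for u in us:
        xs.append(step(xs[-1], u))
    return np.array(xs)

def cost(xs, us):
    run = sum((x - GOAL) @ Q @ (x - GOAL) + u @ R @ u for x, u in zip(xs[:-1], us))
    return run + (xs[-1] - GOAL) @ QF @ (xs[-1] - GOAL)

def ilqr(x0, us, iters=30):
    xs = rollout(x0, us)
    J = cost(xs, us)
    for _ in range(iters):
        # Backward pass: quadratic value function around the current trajectory.
        Vx, Vxx = 2 * QF @ (xs[-1] - GOAL), 2 * QF
        ks, Ks = [], []
        for t in reversed(range(HORIZON)):
            A, B = jacobians(xs[t], us[t])
            Qx = 2 * Q @ (xs[t] - GOAL) + A.T @ Vx
            Qu = 2 * R @ us[t] + B.T @ Vx
            Qxx = 2 * Q + A.T @ Vxx @ A
            Quu = 2 * R + B.T @ Vxx @ B
            Qux = B.T @ Vxx @ A
            k = -np.linalg.solve(Quu, Qu)    # feedforward correction
            K = -np.linalg.solve(Quu, Qux)   # feedback gain
            Vx = Qx + K.T @ Quu @ k + K.T @ Qu + Qux.T @ k
            Vxx = Qxx + K.T @ Quu @ K + K.T @ Qux + Qux.T @ K
            ks.append(k)
            Ks.append(K)
        ks.reverse()
        Ks.reverse()
        # Forward pass: backtracking line search on the feedforward term.
        for alpha in (1.0, 0.5, 0.25, 0.1, 0.05, 0.01):
            new_xs, new_us = [x0], []
            for t in range(HORIZON):
                u = us[t] + alpha * ks[t] + Ks[t] @ (new_xs[t] - xs[t])
                new_us.append(u)
                new_xs.append(step(new_xs[t], u))
            new_xs, new_us = np.array(new_xs), np.array(new_us)
            new_J = cost(new_xs, new_us)
            if new_J < J:                    # accept only strict improvements
                xs, us, J = new_xs, new_us, new_J
                break
        else:
            break                            # no step size improved the cost
    return xs, us, J

def fit_policy(xs, us):
    """Hypothetical initialization: least-squares fit u ~ features(x) @ W."""
    feats = np.column_stack([np.sin(xs[:-1, 0]), np.cos(xs[:-1, 0]),
                             xs[:-1, 1], np.ones(len(us))])
    W, *_ = np.linalg.lstsq(feats, us, rcond=None)
    return W

x0 = np.zeros(2)                             # pendulum hanging at rest
us0 = np.zeros((HORIZON, 1))                 # zero-torque initial guess
J0 = cost(rollout(x0, us0), us0)
xs, us, J = ilqr(x0, us0)
W = fit_policy(xs, us)
print("initial cost:", J0, "cost after iLQR:", J)
```

In this sketch the optimized pairs `(xs, us)` play the part of the informative trajectory, and `W` is a warm start that a subsequent policy optimizer could refine. The line search only accepts strict improvements, so the returned cost can never exceed the cost of the initial guess.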
