SY AI LG OC
Jul 14, 2025

Intersection of Reinforcement Learning and Bayesian Optimization for Intelligent Control of Industrial Processes: A Safe MPC-based DPG using Multi-Objective BO

arXiv:2507.09864v1
Originality: Incremental advance
AI Analysis

The paper addresses safety and efficiency challenges in industrial process control, but the advance appears incremental because it combines existing methods such as MPC-RL and MOBO.

The paper tackles slow convergence, suboptimal policy learning, and safety issues in Model Predictive Control-based Reinforcement Learning (MPC-RL) by integrating it with Multi-Objective Bayesian Optimization (MOBO). In a numerical example, the combined method yields sample-efficient, stable learning and improved closed-loop performance.

Model Predictive Control (MPC)-based Reinforcement Learning (RL) offers a structured and interpretable alternative to Deep Neural Network (DNN)-based RL methods, with lower computational complexity and greater transparency. However, standard MPC-RL approaches often suffer from slow convergence, suboptimal policy learning due to limited parameterization, and safety issues during online adaptation. To address these challenges, we propose a novel framework that integrates MPC-RL with Multi-Objective Bayesian Optimization (MOBO). The proposed MPC-RL-MOBO utilizes noisy evaluations of the RL stage cost and its gradient, estimated via a Compatible Deterministic Policy Gradient (CDPG) approach, and incorporates them into a MOBO algorithm using the Expected Hypervolume Improvement (EHVI) acquisition function. This fusion enables efficient and safe tuning of the MPC parameters to achieve improved closed-loop performance, even under model imperfections. A numerical example demonstrates the effectiveness of the proposed approach in achieving sample-efficient, stable, and high-performance learning for control systems.
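To make the role of the EHVI acquisition function more concrete, the sketch below estimates expected hypervolume improvement for two minimized objectives by Monte Carlo sampling. It is an illustration under stated assumptions, not the authors' implementation: the two objectives are assumed to stand for a closed-loop cost and a gradient-norm term, the surrogate's prediction at a candidate MPC parameter is assumed to be an independent Gaussian per objective, and the function names `pareto_front`, `hypervolume_2d`, and `ehvi_mc` are hypothetical. A Monte Carlo estimate is swapped in for any closed-form EHVI expression.

```python
import numpy as np

def pareto_front(points):
    """Return the non-dominated rows of `points` (both objectives minimized)."""
    pts = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(pts):
        # A row dominates p if it is <= p everywhere and < p somewhere.
        dominated = np.any(np.all(pts <= p, axis=1) & np.any(pts < p, axis=1))
        if not dominated:
            keep.append(i)
    return pts[keep]

def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by the reference point `ref`."""
    pts = np.asarray(points, dtype=float).reshape(-1, 2)
    pts = pts[np.all(pts < ref, axis=1)]  # points beyond ref contribute nothing
    if len(pts) == 0:
        return 0.0
    front = pareto_front(pts)
    front = front[np.argsort(front[:, 0])]  # f1 ascending, hence f2 descending
    area, prev_f2 = 0.0, ref[1]
    for f1, f2 in front:
        area += (ref[0] - f1) * (prev_f2 - f2)  # horizontal strip for this point
        prev_f2 = f2
    return area

def ehvi_mc(mean, std, observed, ref, n_samples=2000, seed=0):
    """Monte Carlo EHVI, assuming independent Gaussian predictions per objective."""
    rng = np.random.default_rng(seed)
    samples = rng.normal(mean, std, size=(n_samples, 2))
    base = hypervolume_2d(observed, ref)
    gains = [hypervolume_2d(np.vstack([observed, s]), ref) - base for s in samples]
    return float(np.mean(gains))

# Hypothetical observations: columns are (closed-loop cost, gradient norm).
observed = np.array([[1.0, 3.0], [2.0, 2.0], [3.0, 1.0]])
ref = (4.0, 4.0)
score = ehvi_mc(mean=[1.5, 1.5], std=[0.3, 0.3], observed=observed, ref=ref)
```

In a tuning loop, a score like this would be computed for each candidate parameter vector, and the candidate with the largest value would be evaluated next on the closed-loop system. How the safety considerations mentioned in the abstract enter that selection is not covered by this sketch.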
