RO · May 21

Real-Time Auto-Optimization in Unknown Environments via Structure-Exploiting Dual Control for Exploration and Exploitation

Shiying Dong, Haoyang Yang, Qiwei Liu, Wen-Hua Chen

arXiv:2605.224313.1

Predicted impact: top 82% in RO · last 90 days
Originality: Incremental advance

AI Analysis

For real-time control systems operating in unknown environments, this work provides a computationally efficient dual control method that reduces computation time by roughly an order of magnitude while improving control performance.

The paper develops a fast numerical dual control method for auto-optimization in unknown environments, achieving a speedup of approximately one order of magnitude with a maximum computation time of 83 μs on a vehicle embedded CPU, as demonstrated in simulation and hardware-in-the-loop experiments.

This paper develops a fast numerical dual control for exploration and exploitation (DCEE) method to address auto-optimization problems in unknown environments. In auto-optimization problems, the optimal operating condition is unknown a priori and may vary with the environment. As in classical dual control techniques, computational burden remains a major concern in DCEE for active learning. Existing DCEE methods provide a principled exploration-exploitation objective, but they are mainly realized through standard optimization packages or explicit gradient-type update laws, so the numerical structure of the DCEE problem has not been fully exploited. This paper shows that the reward function in DCEE has an inherent convex-over-nonlinear structure, in which the exploitation and exploration terms form a unified nonlinear residual map equipped with a convex outer loss. Building on this structure, a numerical method is developed that linearizes only the nonlinear residual map while preserving the convex outer loss. Each subproblem is thus transformed into a structured convex form that can be solved reliably. The resulting generalized Gauss-Newton Hessian approximation is positive semidefinite and depends only on first-order derivatives of the residual map, thereby supporting fast online computation. The proposed method is evaluated on a vehicle cruising auto-optimization problem and compared with existing methods. Simulation and hardware-in-the-loop experimental results show that the proposed method improves control performance and achieves a speedup of approximately one order of magnitude, with a maximum computation time of only 83 μs on a typical vehicle embedded CPU.
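The convex-over-nonlinear idea described in the abstract can be illustrated with a minimal generalized Gauss-Newton sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the scalar decision variable `u`, the two-term toy residual map `residual`, the pseudo-Huber outer loss, and the backtracking line search. The sketch only shows the general pattern: the curvature is built from the Jacobian of the residual map and the Hessian of the convex outer loss, so no second derivatives of the residual map are needed and the curvature is nonnegative.

```python
import math

# Toy nonlinear residual map r(u). Both entries are hypothetical stand-ins,
# not the DCEE exploitation/exploration terms from the paper.
def residual(u):
    return [u * u - 2.0, 0.1 * u]

# First-order derivatives of the residual map (its Jacobian for scalar u).
def jacobian(u):
    return [2.0 * u, 0.1]

# Convex outer loss: pseudo-Huber, chosen here only as an example.
def loss(r):
    return sum(math.sqrt(1.0 + ri * ri) - 1.0 for ri in r)

def loss_grad(r):
    return [ri / math.sqrt(1.0 + ri * ri) for ri in r]

# Diagonal of the outer-loss Hessian; every entry is positive.
def loss_hess_diag(r):
    return [(1.0 + ri * ri) ** -1.5 for ri in r]

def objective(u):
    return loss(residual(u))

# Exact gradient by the chain rule: J^T * grad_loss(r).
def gradient(u):
    return sum(j * g for j, g in zip(jacobian(u), loss_grad(residual(u))))

# Generalized Gauss-Newton curvature: J^T * hess_loss(r) * J.
# It uses no second derivatives of the residual map and is always >= 0.
def ggn_curvature(u):
    return sum(j * j * h for j, h in zip(jacobian(u), loss_hess_diag(residual(u))))

def ggn_minimize(u0, iters=50, damping=1e-8, tol=1e-10):
    u = u0
    for _ in range(iters):
        g = gradient(u)
        if abs(g) < tol:
            break
        # Small damping keeps the division well defined if the curvature is 0.
        d = -g / (ggn_curvature(u) + damping)
        # Backtracking (Armijo) line search: an added safeguard in this sketch.
        t, f0 = 1.0, objective(u)
        for _ in range(30):
            if objective(u + t * d) <= f0 + 1e-4 * t * g * d:
                break
            t *= 0.5
        u += t * d
    return u

u_star = ggn_minimize(1.0)
```

In a vector-valued setting the same pattern would replace the scalar division with a linear solve against the positive semidefinite matrix J^T H J plus damping, which is what makes each subproblem convex.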

