Distributionally Robust Regret Optimal LQR with Common Stage-Law Ambiguity
This work offers a less conservative control method for stochastic systems with uncertain disturbance distributions, though the contribution is incremental in that it builds on existing DRO and regret-optimization frameworks.
The authors address the design of a tractable distributionally robust regret-optimal controller for finite-horizon linear-quadratic regulator (LQR) problems under common stage-law ambiguity. They show that, over linear disturbance-feedback policies, the problem can be reformulated exactly as a semidefinite program, and that the resulting controller is often substantially less conservative than the corresponding distributionally robust optimization (DRO) controller while preserving the intended regret guarantee.
We study, to our knowledge, the first tractable multistage ex-ante distributionally robust regret optimization (DRRO) formulation for stochastic control. We consider finite-horizon LQR under common stage-law ambiguity: disturbances are independent across time but share an unknown stage law whose mean and covariance lie in a Gelbrich ball around nominal parameters. Unlike the single-stage quadratic case, the nominal certainty-equivalent (CE) controller is generally not regret-optimal, because reuse of the stage law makes past disturbances informative for future decisions. Despite the general NP-hardness of DRRO, we show that over linear disturbance-feedback policies the resulting multistage DRRO-LQR problem admits an exact semidefinite programming reformulation. The optimal controller is the nominal certainty-equivalent LQR law plus a strictly causal empirical-mean correction. We also characterize worst-case distributions and show that those for the DRRO-optimal policy are nonunique. Numerical results show that, relative to the corresponding DRO controller under the same ambiguity set, DRRO is often substantially less conservative while preserving the intended regret guarantee, and that its correction coefficients empirically approach the certainty-equivalent feedforward coefficient.
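To make the ambiguity set in the abstract concrete, the sketch below computes the Gelbrich distance between two mean-covariance pairs, using the standard definition G^2 = ||m1 - m2||^2 + tr(S1 + S2 - 2 (S2^{1/2} S1 S2^{1/2})^{1/2}), and checks membership in a ball around nominal parameters. The function names (psd_sqrt, gelbrich_distance, in_gelbrich_ball) and the example values are illustrative assumptions, not taken from the paper; this is not the authors' implementation.

```python
import numpy as np


def psd_sqrt(mat):
    """Square root of a symmetric PSD matrix via eigendecomposition."""
    sym = 0.5 * (mat + mat.T)          # symmetrize against round-off
    vals, vecs = np.linalg.eigh(sym)
    vals = np.clip(vals, 0.0, None)    # clamp tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T


def gelbrich_distance(mean_a, cov_a, mean_b, cov_b):
    """Gelbrich distance between (mean_a, cov_a) and (mean_b, cov_b)."""
    mean_a = np.asarray(mean_a, dtype=float)
    mean_b = np.asarray(mean_b, dtype=float)
    cov_a = np.asarray(cov_a, dtype=float)
    cov_b = np.asarray(cov_b, dtype=float)
    root_b = psd_sqrt(cov_b)
    cross = psd_sqrt(root_b @ cov_a @ root_b)
    squared = np.sum((mean_a - mean_b) ** 2) + np.trace(cov_a + cov_b - 2.0 * cross)
    return float(np.sqrt(max(squared, 0.0)))


def in_gelbrich_ball(mean, cov, nominal_mean, nominal_cov, radius):
    """True if (mean, cov) lies within `radius` of the nominal pair."""
    return gelbrich_distance(mean, cov, nominal_mean, nominal_cov) <= radius


# Hypothetical nominal stage law: zero mean, identity covariance in 2-D.
nominal_mean = np.zeros(2)
nominal_cov = np.eye(2)

# Candidate stage law with a shifted mean and the same covariance.
candidate_mean = np.array([3.0, 4.0])
dist = gelbrich_distance(candidate_mean, nominal_cov, nominal_mean, nominal_cov)
inside = in_gelbrich_ball(candidate_mean, nominal_cov, nominal_mean, nominal_cov, radius=1.0)
```

With equal covariances the distance reduces to the Euclidean distance between the means, so the candidate above sits at distance 5 from the nominal pair and falls outside a ball of radius 1.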