OC, LG · Feb 18, 2024

Model-Free $μ$-Synthesis: A Nonsmooth Optimization Perspective

arXiv:2402.11654v1 · h-index: 37
Originality: Incremental advance
AI Analysis

This work addresses robust control problems for engineers and researchers, offering a competitive model-free approach, though it is incremental as it builds on existing policy optimization methods.

The paper tackles the problem of model-free policy search for robust control in μ-synthesis, extending subgradient-based methods to a model-free setting. The results show that model-free methods consistently replicate design outcomes from model-based counterparts, with theoretical convergence guarantees under certain assumptions.

In this paper, we revisit model-free policy search on an important robust control benchmark, namely $μ$-synthesis. In the general output-feedback setting, there do not exist convex formulations for this problem, and hence global optimality guarantees are not expected. Apkarian (2011) presented a nonconvex nonsmooth policy optimization approach for this problem, and achieved state-of-the-art design results by using subgradient-based policy search algorithms which generate update directions in a model-based manner. Despite the lack of convexity and global optimality guarantees, these subgradient-based policy search methods have led to impressive numerical results in practice. Building on this policy optimization perspective, our paper extends these subgradient-based search methods to a model-free setting. Specifically, we examine the effectiveness of two model-free policy optimization strategies: the model-free non-derivative sampling method and the zeroth-order policy search with uniform smoothing. We performed an extensive numerical study to demonstrate that both methods consistently replicate the design outcomes achieved by their model-based counterparts. Additionally, we provide some theoretical justifications showing that convergence guarantees to stationary points can be established for our model-free $μ$-synthesis under some assumptions related to the coerciveness of the cost function. Overall, our results demonstrate that derivative-free policy optimization offers a competitive and viable approach for solving general output-feedback $μ$-synthesis problems in the model-free setting.
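
To make the second strategy concrete, here is a minimal sketch of zeroth-order search with uniform smoothing on a toy problem. It is not the paper's implementation. The two-point estimator, the step size, the smoothing radius `delta`, and every function name are assumptions chosen for illustration, and `toy_cost` is a hypothetical nonsmooth stand-in for the $μ$-synthesis cost, which in practice would be obtained from closed-loop evaluations rather than a formula.

```python
import math
import random


def sample_unit_sphere(dim, rng):
    """Draw a direction uniformly from the unit sphere (normalized Gaussian)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def zo_gradient(cost, x, delta, rng, num_samples=20):
    """Two-point estimate of the gradient of the uniformly smoothed cost.

    Only cost evaluations are used; no derivative or model is required.
    """
    dim = len(x)
    grad = [0.0] * dim
    for _ in range(num_samples):
        u = sample_unit_sphere(dim, rng)
        x_plus = [xi + delta * ui for xi, ui in zip(x, u)]
        x_minus = [xi - delta * ui for xi, ui in zip(x, u)]
        scale = dim * (cost(x_plus) - cost(x_minus)) / (2.0 * delta)
        for i in range(dim):
            grad[i] += scale * u[i] / num_samples
    return grad


def zo_policy_search(cost, x0, delta=0.05, step=0.05, iters=300, seed=0):
    """Plain descent on the estimated gradient, keeping the best iterate seen."""
    rng = random.Random(seed)
    x = list(x0)
    best_x, best_cost = list(x), cost(x)
    for _ in range(iters):
        g = zo_gradient(cost, x, delta, rng)
        x = [xi - step * gi for xi, gi in zip(x, g)]
        c = cost(x)
        if c < best_cost:
            best_x, best_cost = list(x), c
    return best_x, best_cost


def toy_cost(x):
    """Hypothetical nonsmooth cost (a max of absolute values), for illustration only."""
    return max(abs(x[0] - 1.0), 2.0 * abs(x[1] + 0.5))


x_best, c_best = zo_policy_search(toy_cost, [3.0, 2.0])
print("best parameters:", x_best, "best cost:", c_best)
```

Averaging several random directions per step reduces the variance of the estimate at the price of more cost evaluations. Tracking the best iterate is a simple safeguard, since a nonsmooth cost need not decrease at every step.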

