A Moreau Envelope Approach for LQR Meta-Policy Estimation
This work addresses policy adaptation for control systems such as robots and other autonomous systems, but its contribution is incremental because it builds on existing meta-learning and LQR methods.
The paper tackles policy estimation for the Linear Quadratic Regulator (LQR) in uncertain dynamical systems by proposing a Moreau Envelope-based surrogate cost that defines a meta-policy efficiently adjustable to new realizations. Numerical results show that it outperforms naive averaging of controllers, and empirical evidence suggests better sample complexity than MAML approaches.
We study the problem of policy estimation for the Linear Quadratic Regulator (LQR) in discrete-time linear time-invariant uncertain dynamical systems. We propose a Moreau Envelope-based surrogate LQR cost, built from a finite set of realizations of the uncertain system, to define a meta-policy efficiently adjustable to new realizations. Moreover, we design an algorithm to find an approximate first-order stationary point of the meta-LQR cost function. Numerical results show that the proposed approach outperforms naive averaging of controllers on new realizations of the linear system. We also provide empirical evidence that our method has better sample complexity than Model-Agnostic Meta-Learning (MAML) approaches.
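To make the idea of a Moreau Envelope-based surrogate concrete, the following is a minimal sketch, not the authors' algorithm. It assumes scalar systems x_{t+1} = a x_t + b u_t with static feedback u_t = -k x_t, so that the LQR cost has a closed form, and it assumes the meta-cost is the average over realizations of the Moreau envelope min_z J_i(z) + (z - k)^2 / (2 lam), which the abstract does not spell out. All function names, the realizations (a, b), and the parameters lam, eta, lr, and steps are hypothetical choices made for illustration.

```python
# Illustrative sketch only: scalar LQR with a Moreau-envelope meta-objective.
# Names and numerical values below are assumptions, not taken from the paper.

def lqr_cost(k, a, b, q=1.0, r=1.0):
    """Infinite-horizon cost sum_t (q x_t^2 + r u_t^2) with x_0^2 = 1.

    Requires a stabilizing gain, i.e. |a - b k| < 1.
    """
    c = a - b * k
    if abs(c) >= 1.0:
        raise ValueError("gain is not stabilizing")
    return (q + r * k * k) / (1.0 - c * c)


def lqr_grad(k, a, b, q=1.0, r=1.0):
    """Derivative of lqr_cost with respect to the gain k (quotient rule)."""
    c = a - b * k
    d = 1.0 - c * c
    return (2.0 * r * k * d - (q + r * k * k) * 2.0 * c * b) / (d * d)


def approx_prox(k, system, lam, steps=300, lr=1e-2):
    """Approximately solve argmin_z J(z) + (z - k)^2 / (2 lam) by gradient descent."""
    z = k
    for _ in range(steps):
        z -= lr * (lqr_grad(z, *system) + (z - k) / lam)
    return z


def moreau_envelope(k, system, lam):
    """Approximate Moreau envelope of the LQR cost of one realization at k."""
    z = approx_prox(k, system, lam)
    return lqr_cost(z, *system) + (z - k) ** 2 / (2.0 * lam)


def meta_step(k, systems, lam, eta):
    """One outer gradient step on the averaged envelope.

    The gradient of each envelope at k is (k - prox) / lam.
    """
    g = sum((k - approx_prox(k, s, lam)) / lam for s in systems) / len(systems)
    return k - eta * g


# Made-up realizations (a, b) of the uncertain system.
systems = [(0.7, 1.0), (0.8, 1.0), (0.9, 1.0)]

k_meta = 0.0  # stabilizing for every realization because |a| < 1
for _ in range(50):
    k_meta = meta_step(k_meta, systems, lam=1.0, eta=0.5)

# Adapting to a new (hypothetical) realization takes only a few inner steps.
new_system = (0.85, 1.0)
k_new = approx_prox(k_meta, new_system, lam=1.0, steps=5)
```

In this sketch the outer loop never differentiates through the inner adaptation: it only needs the approximate proximal point of each realization. That is one plausible reading of why an envelope-based surrogate can be cheaper than MAML-style updates, which typically require second-order information.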