Data-Driven Continuous-Time Linear Quadratic Regulator via Closed-Loop and Reinforcement Learning Parameterizations

Armin Gießler, Felix Thömmes, Sören Hohmann

arXiv:2604.27922

Predicted impact: top 65% in OC (last 90 days) · Originality: Synthesis-oriented

AI Analysis

For control theorists and practitioners, this work unifies and extends existing data-driven LQR methods to continuous-time systems, but the contributions are primarily theoretical and incremental.

The paper adapts closed-loop and integral reinforcement learning parameterizations to continuous-time LQR, deriving policy iteration schemes, data-driven Riccati equations, and convex reformulations, while providing a unified framework that clarifies structural relationships between approaches.

This paper studies data-driven approaches to the continuous-time linear quadratic regulator (LQR) problem based on two existing parameterizations, namely a closed-loop (CL) parameterization from behavioral system theory and an integral reinforcement learning (IRL) parameterization. The CL parameterization characterizes the closed-loop system via a matrix that satisfies equality constraints. While this parameterization has been extensively studied for discrete-time systems, we adapt key results to the continuous-time setting and develop a policy iteration (PI) scheme, derive a data-driven continuous-time algebraic Riccati equation (CARE), and introduce an alternative convex problem formulation. The IRL parameterization utilizes off-policy data to perform policy evaluation, which is then used for PI or value iteration. Within the IRL framework, we derive a policy gradient flow and propose convex reformulations of the LQR problem. Finally, we provide a unified treatment of these parameterizations that enables a systematic understanding of existing approaches and clarifies their structural relationships.
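For readers unfamiliar with policy iteration for the continuous-time LQR, the following is a minimal sketch of the classical model-based scheme (Kleinman's algorithm), restricted to a scalar system so that it runs with only the standard library. It is not the data-driven method of the paper: it assumes the model parameters `a` and `b` are known, whereas the CL and IRL parameterizations replace that knowledge with measured data. The plant, the weights, and the function names `care_scalar` and `kleinman_pi_scalar` are hypothetical and chosen only for illustration.

```python
import math


def care_scalar(a, b, q, r):
    """Stabilizing solution p > 0 of the scalar CARE 2*a*p - (b*p)**2 / r + q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


def kleinman_pi_scalar(a, b, q, r, k0, iters=20):
    """Model-based policy iteration for dx/dt = a*x + b*u with u = -k*x.

    k0 must be stabilizing, i.e. a - b*k0 < 0.
    """
    k = k0
    history = []
    for _ in range(iters):
        a_cl = a - b * k
        if a_cl >= 0:
            raise ValueError("gain k must be stabilizing (a - b*k < 0)")
        # Policy evaluation: scalar Lyapunov equation 2*a_cl*p + q + r*k**2 = 0.
        p = (q + r * k * k) / (-2.0 * a_cl)
        # Policy improvement: scalar form of K = R^{-1} B^T P.
        k = b * p / r
        history.append(p)
    return p, k, history


# Hypothetical unstable plant and cost weights (assumptions, not from the paper).
a, b, q, r = 1.0, 1.0, 1.0, 1.0
p_star = care_scalar(a, b, q, r)
p_pi, k_pi, history = kleinman_pi_scalar(a, b, q, r, k0=3.0)
print(f"CARE solution: {p_star:.6f}, policy iteration: {p_pi:.6f}, gain: {k_pi:.6f}")
```

Each pass evaluates the current gain through a Lyapunov equation and then improves it; starting from a stabilizing gain, the value estimates decrease toward the CARE solution. The data-driven schemes described in the abstract keep this evaluate-then-improve structure but carry out the evaluation step from data.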

