OC · LG · Mar 29, 2023

Policy Gradient Methods for Discrete Time Linear Quadratic Regulator With Random Parameters

arXiv:2303.16548v2
h-index: 4
Originality: Incremental advance
AI Analysis

This work addresses control under parameter uncertainty and is aimed at researchers and practitioners in reinforcement learning and control theory. It is an incremental advance whose main contribution is a convergence result that rests on assumptions that are easier to verify.

The paper tackles the infinite-horizon optimal control problem for discrete-time linear quadratic regulators with random parameters. It applies policy gradient methods that need no statistical knowledge of the parameters, establishes global linear convergence under weaker assumptions than existing results, and illustrates the findings with numerical experiments.
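For orientation, the setting summarized above can be written out roughly as follows. This is only a sketch: the symbols $A_t, B_t, Q_t, R_t$, the gain $K$, the step size $\eta$, and the rate constant $c$ are notation assumed here and may not match the paper's own conventions or indexing.

```latex
% Dynamics and cost with i.i.d. random parameters (A_t, B_t, Q_t, R_t)
\begin{aligned}
x_{t+1} &= A_t x_t + B_t u_t, \qquad u_t = -K x_t, \\
J(K) &= \mathbb{E}\Big[\sum_{t=0}^{\infty} \big(x_t^\top Q_t x_t + u_t^\top R_t u_t\big)\Big].
\end{aligned}

% Policy gradient iteration on the feedback gain
K_{n+1} = K_n - \eta \, \nabla J(K_n)

% "Global linear convergence" then means a bound of the form, for some c in (0, 1):
J(K_n) - J(K^\ast) \le (1 - c)^n \big(J(K_0) - J(K^\ast)\big)
```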

This paper studies an infinite-horizon optimal control problem for a discrete-time linear system with a quadratic criterion, both with random parameters that are independent and identically distributed with respect to time. In this general setting, we apply the policy gradient method, a reinforcement learning technique, to search for the optimal control without requiring knowledge of the statistical information of the parameters. We investigate the sub-Gaussianity of the state process and establish a global linear convergence guarantee for this approach based on assumptions that are weaker and easier to verify than those in existing results. Numerical experiments are presented to illustrate our result.
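To make the idea of searching for the control "without requiring knowledge of statistical information" concrete, here is a minimal sketch on a scalar toy problem. It is not the paper's algorithm: the parameter distributions, the truncated horizon, the two-point finite-difference gradient estimate, and all function names (`sample_cost`, `average_cost`, `policy_gradient`) are assumptions chosen for illustration. The learner only observes sampled costs and never uses the means or variances of the random parameters.

```python
import random

def sample_cost(K, rng, horizon=30):
    """Truncated cost of the policy u_t = -K * x_t on one simulated trajectory.

    Toy scalar instance (all numbers are illustrative assumptions):
    A_t ~ N(0.8, 0.1^2), B_t ~ N(1.0, 0.1^2), Q_t, R_t ~ Uniform(0.5, 1.5),
    drawn independently at every time step.
    """
    x = rng.gauss(0.0, 1.0)  # random initial state
    total = 0.0
    for _ in range(horizon):
        a = rng.gauss(0.8, 0.1)
        b = rng.gauss(1.0, 0.1)
        q = rng.uniform(0.5, 1.5)
        r = rng.uniform(0.5, 1.5)
        u = -K * x
        total += q * x * x + r * u * u
        x = a * x + b * u
    return total

def average_cost(K, rng, n_rollouts=400):
    """Monte Carlo estimate of the expected (truncated) cost for gain K."""
    return sum(sample_cost(K, rng) for _ in range(n_rollouts)) / n_rollouts

def policy_gradient(K0=0.0, step=0.02, radius=0.05, iters=30, seed=0):
    """Gradient descent on K using a two-point gradient estimate from sampled costs."""
    rng = random.Random(seed)
    K = K0
    for _ in range(iters):
        # Central difference of two independent cost estimates; no model knowledge used.
        grad = (average_cost(K + radius, rng) - average_cost(K - radius, rng)) / (2.0 * radius)
        K -= step * grad
    return K
```

Calling `policy_gradient()` returns a learned gain; comparing `average_cost(0.0, rng)` with `average_cost(K, rng)` on a fresh `random.Random` instance gives a rough check that the learned gain lowers the cost relative to the initial gain `K0 = 0`. The scalar case keeps the estimator to a plain central difference; a matrix-valued gain would need random perturbation directions instead.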
