Amirreza Neshaei Moghaddam

2 Papers

5.9 SY Apr 16, 2024

Sample Complexity of the Linear Quadratic Regulator: A Reinforcement Learning Lens

Amirreza Neshaei Moghaddam, Alex Olshevsky, Bahman Gharesifard

We provide the first known algorithm that provably achieves $\varepsilon$-optimality within $\widetilde{\mathcal{O}}(1/\varepsilon)$ function evaluations for the discounted discrete-time LQR problem with unknown parameters, without relying on two-point gradient estimates. These estimates are known to be unrealistic in many settings, as they require evaluating two different policies from exactly the same randomly selected initialization. Our results substantially improve upon the existing literature that avoids two-point gradient estimates, which either attains only $\widetilde{\mathcal{O}}(1/\varepsilon^2)$ rates or relies heavily on stability assumptions.
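To make the distinction in this abstract concrete, the following is a minimal sketch of one-point versus two-point zeroth-order gradient estimates on a scalar discounted LQR problem. It is not the authors' algorithm: the system constants, the function names (`rollout_cost`, `one_point_estimate`, `two_point_estimate`), the finite rollout horizon, and the smoothing radius are all assumptions chosen for illustration.

```python
import random


def rollout_cost(k, x0, a=0.9, b=0.5, q=1.0, r=0.1, gamma=0.9, horizon=200):
    """Discounted cost of the policy u_t = -k * x_t, started from x0.

    Hypothetical scalar system x_{t+1} = a * x_t + b * u_t with stage cost
    q * x_t^2 + r * u_t^2; the infinite sum is truncated at `horizon` steps.
    """
    x, cost, discount = x0, 0.0, 1.0
    for _ in range(horizon):
        u = -k * x
        cost += discount * (q * x * x + r * u * u)
        x = a * x + b * u
        discount *= gamma
    return cost


def one_point_estimate(k, radius, rng):
    """One function evaluation, with a fresh random initial state."""
    direction = rng.choice((-1.0, 1.0))  # uniform on the unit sphere in 1-D
    x0 = rng.gauss(0.0, 1.0)
    return rollout_cost(k + radius * direction, x0) * direction / radius


def two_point_estimate(k, radius, rng):
    """Two function evaluations that must share the SAME random initial state."""
    direction = rng.choice((-1.0, 1.0))
    x0 = rng.gauss(0.0, 1.0)  # reused below for both perturbed policies
    diff = (rollout_cost(k + radius * direction, x0)
            - rollout_cost(k - radius * direction, x0))
    return diff * direction / (2.0 * radius)


rng = random.Random(0)
print("one-point:", one_point_estimate(0.0, 0.05, rng))
print("two-point:", two_point_estimate(0.0, 0.05, rng))
```

In this sketch the two-point estimator only works because `x0` is drawn once and reused for both perturbed gains, which is the requirement the abstract calls unrealistic; the one-point estimator needs no such control over the initial state but is typically much noisier.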

7.1 OC Feb 20, 2025

Sample Complexity of Linear Quadratic Regulator Without Initial Stability

Amirreza Neshaei Moghaddam, Alex Olshevsky, Bahman Gharesifard

Inspired by REINFORCE, we introduce a novel receding-horizon algorithm for the Linear Quadratic Regulator (LQR) problem with unknown dynamics. Unlike prior methods, our algorithm avoids reliance on two-point gradient estimates while maintaining the same order of sample complexity. Furthermore, it eliminates the restrictive requirement of starting with a stable initial policy, broadening its applicability. Beyond these improvements, we introduce a refined analysis of error propagation through the contraction of the Riccati operator under the Riemannian distance. This refinement leads to a better sample complexity and ensures improved convergence guarantees.
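The contraction property mentioned in this abstract can be illustrated in the scalar case, where the Riemannian distance between two positive numbers reduces to $|\log(p_1/p_2)|$. The sketch below is only an illustration under assumed constants (`a`, `b`, `q`, `r` are hypothetical, and the open-loop system is deliberately unstable); it does not reproduce the paper's analysis or algorithm.

```python
import math


def riccati_step(p, a=1.2, b=1.0, q=1.0, r=1.0):
    """One application of the scalar discrete-time Riccati operator.

    Hypothetical system x_{t+1} = a * x_t + b * u_t with stage cost
    q * x^2 + r * u^2. Note |a| > 1, so the uncontrolled system is unstable.
    """
    return q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)


def riemannian_distance(p1, p2):
    """Riemannian distance between positive scalars: |log(p1 / p2)|."""
    return abs(math.log(p1 / p2))


# Two different positive starting values approach each other under the operator.
p1, p2 = 0.5, 5.0
for step in range(5):
    print(step, riemannian_distance(p1, p2))
    p1, p2 = riccati_step(p1), riccati_step(p2)
```

Running the loop prints a distance that shrinks at every step, which is the kind of behavior a contraction argument would exploit to bound how estimation errors propagate through the recursion.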