Learning the Kalman Filter with Fine-Grained Sample Complexity
This work provides a theoretical foundation for applying model-free reinforcement learning to linear dynamical systems with noisy or adversarial disturbances, though its advance in sample complexity bounds is incremental.
The paper tackles the problem of learning the Kalman filter without prior stability assumptions, achieving an $\tilde{\mathcal{O}}(ε^{-2})$ sample complexity for a model-free policy gradient method to find a filter close to optimal.
We develop the first end-to-end sample complexity analysis of model-free policy gradient (PG) methods in discrete-time infinite-horizon Kalman filtering. Specifically, we introduce the receding-horizon policy gradient (RHPG-KF) framework and demonstrate $\tilde{\mathcal{O}}(ε^{-2})$ sample complexity for RHPG-KF in learning a stabilizing filter that is $ε$-close to the optimal Kalman filter. Notably, the proposed RHPG-KF framework neither requires the system to be open-loop stable nor assumes any prior knowledge of a stabilizing filter. Our results shed light on applying model-free PG methods to control a linear dynamical system where the state measurements could be corrupted by statistical noise and other (possibly adversarial) disturbances.