Learning Expected Reward for Switched Linear Control Systems: A Non-Asymptotic View
This work provides a foundation for non-asymptotic analysis of average reward-based optimal control in switched linear dynamical systems. The contribution is incremental, as it builds on existing ergodic theory and control methods.
The paper tackles the problem of learning expected rewards for switched linear control systems. It establishes the existence of an invariant ergodic measure under a norm-stability assumption, derives non-asymptotic bounds for the learning process using Birkhoff's Ergodic Theorem, and illustrates the results in two case studies.
In this work, we show the existence of an invariant ergodic measure for switched linear dynamical systems (SLDSs) under a norm-stability assumption on the system dynamics in some unbounded subset of $\mathbb{R}^{n}$. Consequently, given a stationary Markov control policy, we use Birkhoff's Ergodic Theorem to derive non-asymptotic bounds for learning the expected reward from time averages, where the expectation is taken w.r.t. the invariant ergodic measure to which our closed-loop system mixes. The presented results provide a foundation for non-asymptotic analysis of average reward-based optimal control of SLDSs. Finally, we illustrate the theoretical results in two case studies.
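To make the time-average estimator concrete, the following is a minimal Python sketch, not the paper's setup. Everything in it is an assumption chosen for illustration: the two mode matrices in `MODES` (each with Frobenius norm below 1, hence induced 2-norm below 1, as a stand-in for norm-stability), the additive Gaussian noise, the state-dependent switching rule standing in for a stationary Markov policy, and the quadratic reward. It only shows the quantity $\frac{1}{T}\sum_{t=0}^{T-1} r(x_t)$ that Birkhoff's Ergodic Theorem relates to the expected reward under the invariant measure.

```python
import random

# Hypothetical two-mode system (illustrative values, not from the paper).
# Both matrices have Frobenius norm < 1, so their induced 2-norm is < 1.
MODES = [
    ((0.5, 0.1), (0.0, 0.4)),
    ((0.3, 0.0), (0.2, 0.6)),
]


def policy(x):
    """Assumed stationary Markov policy: pick the mode from the current state only."""
    return 0 if x[0] >= 0 else 1


def step(x, mode, rng, noise_std=0.1):
    """One closed-loop transition x' = A_mode x + w, with w Gaussian (assumed noise model)."""
    A = MODES[mode]
    return tuple(
        A[i][0] * x[0] + A[i][1] * x[1] + rng.gauss(0.0, noise_std)
        for i in range(2)
    )


def reward(x):
    """Assumed reward: negative squared Euclidean norm of the state."""
    return -(x[0] ** 2 + x[1] ** 2)


def time_average_reward(T, seed, x0=(1.0, -1.0)):
    """Return (1/T) * sum_{t=0}^{T-1} reward(x_t) along one simulated trajectory."""
    rng = random.Random(seed)
    x, total = x0, 0.0
    for _ in range(T):
        total += reward(x)
        x = step(x, policy(x), rng)
    return total / T


if __name__ == "__main__":
    # Averages from different seeds and initial states should roughly agree for large T.
    for seed, x0 in [(0, (1.0, -1.0)), (1, (5.0, 5.0))]:
        print(seed, x0, time_average_reward(50_000, seed, x0))
```

In this sketch, agreement between trajectories started from different initial states is only an empirical indication of mixing. How large $T$ must be for a given accuracy is exactly what the non-asymptotic bounds are meant to quantify.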