STLGMLFeb 1, 2019

Estimating the Mixing Time of Ergodic Markov Chains

arXiv:1902.01224v450 citations
Originality Highly original
AI Analysis

This work solves an open problem in Markov chain analysis for researchers in statistics and machine learning, offering practical tools for non-reversible chains with incremental improvements in the reversible setting.

The paper tackles the problem of estimating the mixing time of arbitrary ergodic Markov chains from a single trajectory, achieving polynomial dependence on key parameters and providing empirical confidence intervals that shrink at a rate of roughly 1/√m, improving state-of-the-art results even in the reversible case.

We address the problem of estimating the mixing time $t_{\mathsf{mix}}$ of an arbitrary ergodic finite-state Markov chain from a single trajectory of length $m$. The reversible case was addressed by Hsu et al. [2019], who left the general case as an open problem. In the reversible case, the analysis is greatly facilitated by the fact that the Markov operator is self-adjoint, and Weyl's inequality allows for a dimension-free perturbation analysis of the empirical eigenvalues. As Hsu et al. point out, in the absence of reversibility (which induces asymmetric pair probabilities matrices), the existing perturbation analysis has a worst-case exponential dependence on the number of states $d$. Furthermore, even if an eigenvalue perturbation analysis with better dependence on $d$ were available, in the non-reversible case the connection between the spectral gap and the mixing time is not nearly as straightforward as in the reversible case. Our key insight is to estimate the pseudo-spectral gap $γ_{\mathsf{ps}}$ instead, which allows us to overcome the loss of symmetry and to achieve a polynomial dependence on the minimal stationary probability $π_\star$ and $γ_{\mathsf{ps}}$. Additionally, in the reversible case, we obtain simultaneous nearly (up to logarithmic factors) minimax rates in $t_{\mathsf{mix}}$ and precision $\varepsilon$, closing a gap in Hsu et al., who treated $\varepsilon$ as constant in the lower bounds. Finally, we construct fully empirical confidence intervals for $γ_{\mathsf{ps}}$, which shrink to zero at a rate of roughly $1/\sqrt{m}$, and improve the state of the art in even the reversible case.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes