LG | ML | Jun 16, 2022

Learning with little mixing

arXiv:2206.08269v3 | 38 citations | h-index: 30
AI Analysis

This paper addresses efficient learning from dependent time-series data, improving on existing results whose rates degrade with the mixing time of the underlying process.

The paper tackles the problem of learning from dependent time-series data. It shows that, under a trajectory hypercontractivity condition, the least-squares estimator achieves an excess risk bound that matches the i.i.d. rate order-wise after a burn-in time, without the mixing-time deflation of the effective sample size seen in prior work. Examples where this holds include bounded function classes with equivalent $L^2$ and $L^{2+ε}$ norms and ergodic finite state Markov chains.

We study square loss in a realizable time-series framework with martingale difference noise. Our main result is a fast rate excess risk bound which shows that whenever a trajectory hypercontractivity condition holds, the risk of the least-squares estimator on dependent data matches the iid rate order-wise after a burn-in time. In comparison, many existing results in learning from dependent data have rates where the effective sample size is deflated by a factor of the mixing-time of the underlying process, even after the burn-in time. Furthermore, our results allow the covariate process to exhibit long range correlations which are substantially weaker than geometric ergodicity. We call this phenomenon learning with little mixing, and present several examples for when it occurs: bounded function classes for which the $L^2$ and $L^{2+ε}$ norms are equivalent, ergodic finite state Markov chains, various parametric models, and a broad family of infinite dimensional $\ell^2(\mathbb{N})$ ellipsoids. By instantiating our main result to system identification of nonlinear dynamics with generalized linear model transitions, we obtain a nearly minimax optimal excess risk bound after only a polynomial burn-in time.
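As a rough illustration of the setting described in the abstract, the sketch below fits a least-squares estimator to a single dependent trajectory and measures how the estimate improves with trajectory length. It is not the paper's experiment: it assumes linear dynamics $x_{t+1} = A x_t + w_t$ with Gaussian noise (a simple special case of the realizable, martingale-difference-noise setup), uses a hypothetical transition matrix $A = 0.9 I$, and reports the Frobenius-norm parameter error as a proxy for excess risk. All function names are illustrative.

```python
import numpy as np


def simulate_trajectory(A, T, noise_std, rng):
    """Roll out x_{t+1} = A x_t + w_t from x_0 = 0 (assumed linear dynamics)."""
    d = A.shape[0]
    X = np.zeros((T + 1, d))
    for t in range(T):
        # w_t is i.i.d. Gaussian here, which is one instance of martingale difference noise.
        X[t + 1] = A @ X[t] + noise_std * rng.standard_normal(d)
    return X


def least_squares_fit(X):
    """Least-squares estimate of A from consecutive pairs (x_t, x_{t+1}) of one trajectory."""
    inputs, targets = X[:-1], X[1:]
    # lstsq solves inputs @ W = targets in the least-squares sense, so W estimates A^T.
    W, *_ = np.linalg.lstsq(inputs, targets, rcond=None)
    return W.T


def estimation_error(T, seed=0):
    """Frobenius-norm error of the estimate after T steps (proxy for excess risk)."""
    rng = np.random.default_rng(seed)
    A = 0.9 * np.eye(3)  # hypothetical stable system, chosen only for illustration
    X = simulate_trajectory(A, T, noise_std=1.0, rng=rng)
    return np.linalg.norm(least_squares_fit(X) - A, "fro")


if __name__ == "__main__":
    for T in (100, 1000, 10000):
        errors = [estimation_error(T, seed=s) for s in range(5)]
        print(T, float(np.mean(errors)))
```

The samples along the trajectory are correlated, yet the covariates and targets are fed to ordinary least squares exactly as they would be for independent data. The printed errors give an informal sense of how the estimate sharpens with the trajectory length $T$ in this toy case.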

