MLSep 19, 2013

Predictive PAC Learning and Process Decompositions

Cosma Rohilla Shalizi, Aryeh Kontorovich

arXiv:1309.4859v138 citations

Originality Highly original

AI Analysis

This addresses a theoretical challenge in learning from dependent data, offering a new perspective for researchers in statistical learning theory.

The paper tackles the problem of learning from mixtures of stochastic processes in predictive PAC learning, showing that conditioning on the mixture component yields a generalization bound not worse than that of each component.

We informally call a stochastic process learnable if it admits a generalization error approaching zero in probability for any concept class with finite VC-dimension (IID processes are the simplest example). A mixture of learnable processes need not be learnable itself, and certainly its generalization error need not decay at the same rate. In this paper, we argue that it is natural in predictive PAC to condition not on the past observations but on the mixture component of the sample path. This definition not only matches what a realistic learner might demand, but also allows us to sidestep several otherwise grave problems in learning from dependent data. In particular, we give a novel PAC generalization bound for mixtures of learnable processes with a generalization error that is not worse than that of each mixture component. We also provide a characterization of mixtures of absolutely regular ($β$-mixing) processes, of independent probability-theoretic interest.

View on arXiv PDF

Similar