LGITSTMLDec 20, 2018

Finite-time optimality of Bayesian predictors

arXiv:1812.08292v2
Originality Highly original
AI Analysis

This provides a foundational result for sequential prediction, applicable broadly in machine learning and statistics, though it is incremental in extending Bayesian optimality to non-asymptotic bounds.

The paper tackles the problem of sequential probability forecasting in a general setting without assumptions on the model class, showing that a Bayesian predictor with a discrete prior can match any predictor's cumulative loss up to an additive term of order log n, with a non-matching lower bound that may increase arbitrarily slowly.

The problem of sequential probability forecasting is considered in the most general setting: a model set C is given, and it is required to predict as well as possible if any of the measures (environments) in C is chosen to generate the data. No assumptions whatsoever are made on the model class C, in particular, no independence or mixing assumptions; C may not be measurable; there may be no predictor whose loss is sublinear, etc. It is shown that the cumulative loss of any possible predictor can be matched by that of a Bayesian predictor whose prior is discrete and is concentrated on C, up to an additive term of order $\log n$, where $n$ is the time step. The bound holds for every $n$ and every measure in C. This is the first non-asymptotic result of this kind. In addition, a non-matching lower bound is established: it goes to infinity with $n$ but may do so arbitrarily slow.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes