Universality of Bayesian mixture predictors
The result provides a theoretical guarantee for optimal forecasting in broad, unstructured settings, which is foundational for statistical learning and time series analysis.
The paper tackles the problem of sequential probability forecasting for arbitrary sets of unknown probability distributions, showing that the minimax asymptotic performance is always attainable using a Bayesian mixture of countably many measures from the set, extending prior results limited to cases with zero asymptotic error.
The problem is that of sequential probability forecasting for finite-valued time series. The data is generated by an unknown probability distribution over the space of all one-way infinite sequences. It is known that this measure belongs to a given set C, but the latter is completely arbitrary (uncountably infinite, without any structure given). The performance is measured with asymptotic average log loss. In this work it is shown that the minimax asymptotic performance is always attainable, and it is attained by a convex combination of countably many measures from the set C (a Bayesian mixture). This was previously only known for the case when the best achievable asymptotic error is 0. This also contrasts with previous results that show that in the non-realizable case all Bayesian mixtures may be suboptimal, while there is a predictor that achieves the optimal performance.
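As a rough illustration of the objects involved (not the paper's construction), the following Python sketch runs a Bayesian mixture predictor on a binary sequence and reports its average log loss. The finite family of i.i.d. Bernoulli measures, the uniform prior, and the function names `average_log_loss_of_measure` and `average_log_loss_of_mixture` are all assumptions made for illustration; the result described above concerns arbitrary sets C and mixtures of countably many measures.

```python
import math


def average_log_loss_of_measure(sequence, theta):
    """Average log loss (nats per symbol) of one i.i.d. Bernoulli(theta) predictor."""
    total = 0.0
    for x in sequence:
        total += -math.log(theta if x == 1 else 1.0 - theta)
    return total / len(sequence)


def average_log_loss_of_mixture(sequence, thetas, prior=None):
    """Sequentially predict a binary sequence with a Bayesian mixture.

    thetas: parameters of the candidate Bernoulli measures (hypothetical toy set,
            each strictly between 0 and 1 so that no probability is zero).
    prior:  mixture weights summing to 1; uniform if omitted (an assumption).
    Returns (average log loss in nats, posterior weights after the last symbol).
    """
    k = len(thetas)
    weights = list(prior) if prior is not None else [1.0 / k] * k
    total = 0.0
    for x in sequence:
        # Likelihood each candidate measure assigns to the observed symbol.
        likes = [t if x == 1 else 1.0 - t for t in thetas]
        # Mixture predictive probability of the observed symbol.
        p_x = sum(w * l for w, l in zip(weights, likes))
        total += -math.log(p_x)
        # Bayes update: reweight each measure by how well it predicted x.
        weights = [w * l / p_x for w, l in zip(weights, likes)]
    return total / len(sequence), weights


if __name__ == "__main__":
    seq = [1, 1, 0, 1, 1, 1, 0, 1] * 50   # toy data, 75% ones
    thetas = [0.25, 0.5, 0.75]            # toy candidate set
    mix_loss, posterior = average_log_loss_of_mixture(seq, thetas)
    best = min(average_log_loss_of_measure(seq, t) for t in thetas)
    print("mixture:", mix_loss, "best single measure:", best)
    print("posterior weights:", posterior)
```

In this finite toy setting, the cumulative log loss of the mixture exceeds that of any single component by at most the logarithm of the inverse of that component's prior weight, so the gap in average loss vanishes as the sequence grows. That is the easy case; the abstract's claim is about what remains achievable when C is arbitrary and the best attainable asymptotic error may be nonzero.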