ML, LG · Jan 13, 2021

On Misspecification in Prediction Problems and Robustness via Improper Learning

Annie Marsden, John Duchi, Gregory Valiant

arXiv:2101.05234v23.62 citations

Originality: Incremental advance

AI Analysis

This paper addresses robustness in prediction under model misspecification, highlighting for machine learning practitioners the limitations of proper predictors and the advantages of improper learning. Its contribution is incremental, building on existing aggregation methods.

The paper studies prediction under model misspecification. It shows that proper predictors incur regret of at least order √(γn), where γ measures the misspecification in total variation distance, whereas improper (aggregation-based) learners achieve regret d log n for any generating distribution, where d is the parameter dimension. Experiments support this robustness.

We study probabilistic prediction games when the underlying model is misspecified, investigating the consequences of predicting using an incorrect parametric model. We show that for a broad class of loss functions and parametric families of distributions, the regret of playing a "proper" predictor -- one from the putative model class -- relative to the best predictor in the same model class has lower bound scaling at least as $\sqrt{γn}$, where $γ$ is a measure of the model misspecification to the true distribution in terms of total variation distance. In contrast, using an aggregation-based (improper) learner, one can obtain regret $d \log n$ for any underlying generating distribution, where $d$ is the dimension of the parameter; we exhibit instances in which this is unimprovable even over the family of all learners that may play distributions in the convex hull of the parametric family. These results suggest that simple strategies for aggregating multiple learners together should be more robust, and several experiments conform to this hypothesis.
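To make the proper/improper distinction concrete, here is a minimal sketch of an aggregation-based predictor under log loss. It is not the authors' algorithm. It assumes a finite grid of unit-variance Gaussian location models (the grid, the function names, and the bimodal data source are all illustrative choices) and plays the uniform-prior Bayesian mixture of those models. The mixture is improper because a mixture of Gaussians is generally not itself a Gaussian. For a finite grid of K models, this mixture's cumulative log loss exceeds that of the best grid model by at most log K on any sequence. The d log n rate in the abstract concerns continuous parametric families and is not reproduced here.

```python
import math
import random


def gaussian_logpdf(y, mean, sigma=1.0):
    """Log density of N(mean, sigma^2) at y."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (y - mean) ** 2 / (2 * sigma ** 2)


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def aggregate_log_loss(ys, means):
    """Return (mixture_loss, per_model_losses) under cumulative log loss.

    The mixture starts from uniform weights over `means` (a hypothetical grid)
    and reweights each model by the likelihood it assigned to past outcomes.
    """
    k = len(means)
    log_w = [-math.log(k)] * k  # unnormalized log weights, uniform prior
    mix_loss = 0.0
    model_loss = [0.0] * k
    for y in ys:
        log_p = [gaussian_logpdf(y, m) for m in means]
        # Log of the mixture's predictive density at y: log(sum_i w_i p_i(y) / sum_i w_i).
        log_mix = logsumexp([lw + lp for lw, lp in zip(log_w, log_p)]) - logsumexp(log_w)
        mix_loss -= log_mix
        for i in range(k):
            model_loss[i] -= log_p[i]
            log_w[i] += log_p[i]  # Bayesian update (exponential weights with rate 1)
    return mix_loss, model_loss


if __name__ == "__main__":
    rng = random.Random(0)
    # Misspecified source (an assumption for illustration): a two-component mixture,
    # which no single unit-variance Gaussian in the grid matches.
    ys = [rng.gauss(-2.0 if rng.random() < 0.5 else 2.0, 1.0) for _ in range(500)]
    means = [-3.0 + 0.25 * i for i in range(25)]
    mix, per_model = aggregate_log_loss(ys, means)
    print("regret vs best grid model:", mix - min(per_model))
    print("log K bound:", math.log(len(means)))
```

The log K bound holds for every outcome sequence, with no assumption on how the data were generated. The sketch illustrates that distribution-free property; it does not illustrate the paper's lower bound for proper predictors.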
