LG · Sep 27, 2021

Training on Test Data with Bayesian Adaptation for Covariate Shift

arXiv:2109.12746v1 · 14 citations
Originality: Highly original
AI Analysis

This paper addresses unreliable predictions under covariate shift, a practical problem for deep learning practitioners. Rather than robustifying a network against all possible shifts, it adapts the network directly to the shift encountered at test time.

The paper proposes a Bayesian model that enables adaptation to unlabeled test data under distribution shift, improving both accuracy and uncertainty estimation on image corruptions, natural distribution shifts, and domain adaptation settings in image classification.

When faced with distribution shift at test time, deep neural networks often make inaccurate predictions with unreliable uncertainty estimates. While improving the robustness of neural networks is one promising approach to mitigate this issue, an appealing alternative to robustifying networks against all possible test-time shifts is to instead directly adapt them to unlabeled inputs from the particular distribution shift we encounter at test time. However, this poses a challenging question: in the standard Bayesian model for supervised learning, unlabeled inputs are conditionally independent of model parameters when the labels are unobserved, so what can unlabeled data tell us about the model parameters at test time? In this paper, we derive a Bayesian model that provides for a well-defined relationship between unlabeled inputs under distributional shift and model parameters, and show how approximate inference in this model can be instantiated with a simple regularized entropy minimization procedure at test time. We evaluate our method on a variety of distribution shifts for image classification, including image corruptions, natural distribution shifts, and domain adaptation settings, and show that our method improves both accuracy and uncertainty estimation.
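To make "regularized entropy minimization at test time" concrete, here is a minimal NumPy sketch. It is not the paper's implementation. It assumes a linear softmax classifier, and it assumes the regularizer is a quadratic penalty `lam / 2 * ||W - W0||^2` that keeps the adapted weights close to the trained ones. The names `adapt`, `mean_entropy`, `lam`, `lr`, and `steps` are all hypothetical and chosen for illustration.

```python
import numpy as np


def log_softmax(z):
    """Row-wise log-softmax, shifted by the row max for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))


def mean_entropy(W, X):
    """Average predictive entropy of a linear softmax classifier on inputs X."""
    logp = log_softmax(X @ W)
    return float(-(np.exp(logp) * logp).sum(axis=1).mean())


def adapt(W0, X, lam=1.0, lr=0.05, steps=100):
    """Gradient descent on: mean entropy on X + lam/2 * ||W - W0||^2.

    W0 holds the trained weights; X holds unlabeled test inputs.
    The quadratic penalty is an assumed stand-in for a posterior-based regularizer.
    """
    W = W0.copy()
    for _ in range(steps):
        logp = log_softmax(X @ W)
        p = np.exp(logp)
        H = -(p * logp).sum(axis=1, keepdims=True)  # per-example entropy
        G = -p * (logp + H)                          # d(entropy)/d(logits)
        grad = X.T @ G / len(X) + lam * (W - W0)
        W -= lr * grad
    return W


# Toy demo: synthetic "shifted" inputs and random "trained" weights (both made up).
rng = np.random.default_rng(0)
X_test = rng.normal(size=(64, 5)) + 0.5
W_trained = 0.5 * rng.normal(size=(5, 3))
W_adapted = adapt(W_trained, X_test)
print(mean_entropy(W_trained, X_test), mean_entropy(W_adapted, X_test))
```

No labels are used anywhere in `adapt`: the only training signal is the model's own predictive entropy on the test inputs, and `lam` controls how far the weights may drift from the trained solution.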
