LG · ML · Jun 17, 2021

PAC Prediction Sets Under Covariate Shift

arXiv:2106.09848v2 · 55 citations
Originality: Highly original
AI Analysis

This paper addresses uncertainty quantification for machine learning models under distribution shift, a setting that matters for real-world deployment and in which most existing methods break down.

The paper quantifies prediction uncertainty under covariate shift by constructing probably approximately correct (PAC) prediction sets. On covariate shifts based on DomainNet and ImageNet, the method yields the smallest average normalized set size among approaches that always satisfy the PAC constraint.

An important challenge facing modern machine learning is how to rigorously quantify the uncertainty of model predictions. Conveying uncertainty is especially important when there are changes to the underlying data distribution that might invalidate the predictive model. Yet, most existing uncertainty quantification algorithms break down in the presence of such shifts. We propose a novel approach that addresses this challenge by constructing probably approximately correct (PAC) prediction sets in the presence of covariate shift. Our approach focuses on the setting where there is a covariate shift from the source distribution (where we have labeled training examples) to the target distribution (for which we want to quantify uncertainty). Our algorithm assumes given importance weights that encode how the probabilities of the training examples change under the covariate shift. In practice, importance weights typically need to be estimated; thus, we extend our algorithm to the setting where we are given confidence intervals for the importance weights. We demonstrate the effectiveness of our approach on covariate shifts based on DomainNet and ImageNet. Our algorithm satisfies the PAC constraint, and gives prediction sets with the smallest average normalized size among approaches that always satisfy the PAC constraint.
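
To make the idea in the abstract concrete, the following is a minimal sketch of one way a PAC prediction set could be built when exact importance weights are given. It is an illustration under stated assumptions, not the authors' implementation. It assumes that (a) labeled source examples are resampled toward the target distribution by rejection sampling with acceptance probability w / b, where b is a known upper bound on the weights, and (b) the threshold is then chosen with an exact binomial tail bound, so that with probability at least 1 - delta over the calibration data the set misses the true label with probability at most eps. The function names (`rejection_sample`, `max_allowed_errors`, `pac_threshold`, `prediction_set`) and the parameters `eps`, `delta`, and `weight_bound` are hypothetical and chosen for this example only. The extension to confidence intervals on the weights is not covered.

```python
import math
import random


def rejection_sample(scores, weights, weight_bound, rng):
    """Keep each source score with probability w / weight_bound.

    Assumption: weights are exact importance weights q(x) / p(x) and
    weight_bound >= max(weights), so kept scores behave like target draws.
    """
    return [s for s, w in zip(scores, weights) if rng.random() < w / weight_bound]


def max_allowed_errors(n, eps, delta):
    """Largest k with P[Binomial(n, eps) <= k] <= delta, or -1 if none exists.

    Requires 0 < eps < 1. Terms are computed in log space to avoid overflow.
    """
    cdf = 0.0
    k = -1
    for i in range(n + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(eps) + (n - i) * math.log(1.0 - eps)
        )
        cdf += math.exp(log_term)
        if cdf > delta:
            break
        k = i
    return k


def pac_threshold(scores, eps, delta):
    """Pick tau so that at most k calibration scores fall strictly below it.

    scores[i] is the model probability assigned to the true label of
    calibration example i. Returns 0.0 (every label is included) when the
    sample is too small to certify any nontrivial threshold.
    """
    k = max_allowed_errors(len(scores), eps, delta)
    if k < 0:
        return 0.0
    return sorted(scores)[k]


def prediction_set(label_probs, tau):
    """Return all labels whose predicted probability is at least tau."""
    return {label for label, p in label_probs.items() if p >= tau}


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-ins for true-label scores and importance weights.
    source_scores = [rng.random() for _ in range(2000)]
    source_weights = [rng.uniform(0.2, 2.0) for _ in range(2000)]
    kept = rejection_sample(source_scores, source_weights, 2.0, rng)
    tau = pac_threshold(kept, eps=0.1, delta=0.05)
    print(len(kept), tau)
    print(prediction_set({"cat": 0.7, "dog": 0.25, "fox": 0.05}, tau))
```

A larger weight bound rejects more source examples, which leaves fewer calibration points and forces a more conservative (smaller) threshold, so the resulting sets grow. This gives some intuition for why set size is the quantity compared across methods that satisfy the PAC constraint.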

Code Implementations: 1 repo
