ML · AI · LG · Feb 18, 2025

Conformal Inference under High-Dimensional Covariate Shifts via Likelihood-Ratio Regularization

arXiv:2502.13030v59 citations · h-index: 28
Originality: Incremental advance
AI Analysis

The paper addresses the challenge of reliable uncertainty quantification for machine learning models in real-world scenarios with domain shifts. The advance is incremental, since it builds on existing conformal prediction methods.

The paper tackles the problem of constructing prediction sets with valid coverage under covariate shift in high-dimensional data, such as images. It introduces the LR-QR algorithm, which achieves coverage at the desired level up to a small, controllable error. In the reported experiments, LR-QR outperforms existing methods on regression, image classification, and LLM question-answering tasks.

We consider the problem of conformal prediction under covariate shift. Given labeled data from a source domain and unlabeled data from a covariate shifted target domain, we seek to construct prediction sets with valid marginal coverage in the target domain. Most existing methods require estimating the unknown likelihood ratio function, which can be prohibitive for high-dimensional data such as images. To address this challenge, we introduce the likelihood ratio regularized quantile regression (LR-QR) algorithm, which combines the pinball loss with a novel choice of regularization in order to construct a threshold function without directly estimating the unknown likelihood ratio. We show that the LR-QR method has coverage at the desired level in the target domain, up to a small error term that we can control. Our proofs draw on a novel analysis of coverage via stability bounds from learning theory. Our experiments demonstrate that the LR-QR algorithm outperforms existing methods on high-dimensional prediction tasks, including a regression task for the Communities and Crime dataset, an image classification task from the WILDS repository, and an LLM question-answering task on the MMLU benchmark.
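To make the pinball-loss ingredient mentioned in the abstract concrete, here is a minimal sketch in plain Python. It is not the LR-QR algorithm: the regularization term and the threshold function class are not specified in this summary, so both are omitted, and the threshold is restricted to a single constant. The function names and the score values are hypothetical, chosen only for illustration. The sketch shows the standard fact that minimizing the mean pinball loss at level tau yields an empirical tau-quantile of the scores, which is what lets a quantile-regression objective produce a coverage threshold.

```python
def pinball_loss(threshold, score, tau):
    """Pinball (quantile) loss of using `threshold` for an observed `score`.

    Under-covering (score above threshold) is penalised with weight tau,
    over-covering (score below threshold) with weight 1 - tau.
    """
    diff = score - threshold
    return max(tau * diff, (tau - 1.0) * diff)


def mean_pinball_loss(threshold, scores, tau):
    """Average pinball loss of a constant threshold over a list of scores."""
    return sum(pinball_loss(threshold, s, tau) for s in scores) / len(scores)


def best_constant_threshold(scores, tau):
    """Constant threshold minimising the mean pinball loss.

    The objective is piecewise linear and convex in the threshold, so a
    minimiser is always attained at one of the observed scores; a brute-force
    search over the scores is enough for this illustration.
    """
    return min(sorted(scores), key=lambda q: mean_pinball_loss(q, scores, tau))


# Hypothetical nonconformity scores (assumption: made up for this example).
scores = [0.1, 0.4, 0.2, 0.9, 0.5, 0.3, 0.7, 0.6, 0.8, 1.0, 1.1]
tau = 0.9

q_hat = best_constant_threshold(scores, tau)

# Fraction of the same scores that fall at or below the fitted threshold.
covered = sum(s <= q_hat for s in scores) / len(scores)
print(q_hat, covered)
```

On these made-up scores the fitted threshold is an empirical 0.9-quantile, so at least 90% of the scores fall at or below it. As the abstract describes it, LR-QR replaces the constant with a learned threshold function and adds a regularizer to handle the shift between source and target domains; neither of those parts is reproduced here.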

