Invariant Probabilistic Prediction
This work addresses reliable uncertainty quantification in machine learning under distribution shift, which matters for applications such as healthcare and finance. The contribution is incremental in that it builds on existing causality-inspired frameworks.
The paper studies how to make probabilistic predictions robust to distribution shifts between training and test data. It shows that arbitrary shifts do not, in general, admit invariant predictions, and it proposes a method, IPP, that achieves invariance under a restricted class of shifts, with empirical validation on simulated and single-cell data.
In recent years, there has been a growing interest in statistical methods that exhibit robust performance under distribution changes between training and test data. While most of the related research focuses on point predictions with the squared error loss, this article turns the focus towards probabilistic predictions, which aim to comprehensively quantify the uncertainty of an outcome variable given covariates. Within a causality-inspired framework, we investigate the invariance and robustness of probabilistic predictions with respect to proper scoring rules. We show that arbitrary distribution shifts do not, in general, admit invariant and robust probabilistic predictions, in contrast to the setting of point prediction. We illustrate how to choose evaluation metrics and restrict the class of distribution shifts to allow for identifiability and invariance in the prototypical Gaussian heteroscedastic linear model. Motivated by these findings, we propose a method to yield invariant probabilistic predictions, called IPP, and study the consistency of the underlying parameters. Finally, we demonstrate the empirical performance of our proposed procedure on simulated as well as on single-cell data.
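To make the notions of proper scoring rules and a Gaussian heteroscedastic linear model concrete, the following is a minimal sketch, not the IPP procedure itself. It assumes a one-dimensional covariate, a hypothetical parameterisation in which the conditional mean is linear in x and the log standard deviation is linear in x, and made-up coefficients. It scores the true conditional distribution with two standard proper scoring rules, the logarithmic score and the CRPS (using the known closed form for a Gaussian forecast), in two simulated environments that differ only in the covariate mean.

```python
import math
import random


def gaussian_log_score(mu, sigma, y):
    """Logarithmic score: negative log density of N(mu, sigma^2) at y (lower is better)."""
    z = (y - mu) / sigma
    return 0.5 * math.log(2.0 * math.pi) + math.log(sigma) + 0.5 * z * z


def gaussian_crps(mu, sigma, y):
    """Closed-form CRPS of a Gaussian forecast N(mu, sigma^2) at y (lower is better)."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


# Hypothetical coefficients, chosen only for illustration (not from the paper).
MEAN_INTERCEPT, MEAN_SLOPE = 1.0, 2.0
LOG_SD_SLOPE = 0.5


def predictive_params(x):
    """Assumed heteroscedastic linear model: mean linear in x, log-sd linear in x."""
    return MEAN_INTERCEPT + MEAN_SLOPE * x, math.exp(LOG_SD_SLOPE * x)


def simulate_environment(n, x_mean, rng):
    """Draw n pairs (x, y); environments differ only in the covariate mean x_mean."""
    data = []
    for _ in range(n):
        x = rng.gauss(x_mean, 1.0)
        mu, sigma = predictive_params(x)
        data.append((x, rng.gauss(mu, sigma)))
    return data


def mean_scores(data):
    """Average log score and CRPS of the true conditional distribution on data."""
    log_total = 0.0
    crps_total = 0.0
    for x, y in data:
        mu, sigma = predictive_params(x)
        log_total += gaussian_log_score(mu, sigma, y)
        crps_total += gaussian_crps(mu, sigma, y)
    return log_total / len(data), crps_total / len(data)


if __name__ == "__main__":
    rng = random.Random(0)
    for name, x_mean in [("training-like", 0.0), ("shifted", 2.0)]:
        log_s, crps = mean_scores(simulate_environment(5000, x_mean, rng))
        print(f"{name}: mean log score = {log_s:.3f}, mean CRPS = {crps:.3f}")
```

In this toy setup, the average score of even the true conditional distribution changes with the covariate mean, because the predictive spread depends on x. This is only meant to show how scores are compared across environments; which shifts and scoring rules allow invariance is the question the paper itself studies.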