ML · LG · Feb 4, 2025

Achievable distributional robustness when the robust risk is only partially identified

arXiv:2502.02710v1 · 4 citations · h-index: 1 · NIPS
Originality: Incremental advance
AI Analysis

This work addresses a gap relevant to safety-critical applications: distribution shifts are common, but the identifiability conditions that invariance-based methods rely on are rarely met. It offers a measure of robustness that remains well-defined without those conditions.

The paper studies how machine learning models generalize under worst-case distribution shifts when the robust risk is only partially identifiable. It introduces a new measure, the worst-case robust risk, and shows that existing robustness methods are provably suboptimal in this setting. On real-world gene expression data, the test error of existing methods grows increasingly suboptimal as the fraction of data from unseen environments increases, whereas accounting for partial identifiability generalizes better.

In safety-critical applications, machine learning models should generalize well under worst-case distribution shifts, that is, have a small robust risk. Invariance-based algorithms can provably take advantage of structural assumptions on the shifts when the training distributions are heterogeneous enough to identify the robust risk. However, in practice, such identifiability conditions are rarely satisfied -- a scenario so far underexplored in the theoretical literature. In this paper, we aim to fill the gap and propose to study the more general setting when the robust risk is only partially identifiable. In particular, we introduce the worst-case robust risk as a new measure of robustness that is always well-defined regardless of identifiability. Its minimum corresponds to an algorithm-independent (population) minimax quantity that measures the best achievable robustness under partial identifiability. While these concepts can be defined more broadly, in this paper we introduce and derive them explicitly for a linear model for concreteness of the presentation. First, we show that existing robustness methods are provably suboptimal in the partially identifiable case. We then evaluate these methods and the minimizer of the (empirical) worst-case robust risk on real-world gene expression data and find a similar trend: the test error of existing robustness methods grows increasingly suboptimal as the fraction of data from unseen environments increases, whereas accounting for partial identifiability allows for better generalization.
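
The three quantities named in the abstract can be written schematically as follows. This is only a sketch: the notation is assumed for illustration and is not taken from the paper. Here \beta is a linear predictor, \theta is the unknown model parameter, \Theta_{\mathrm{eq}} is the (assumed) set of parameters consistent with the training distributions, \mathcal{P}_{\theta} is the (assumed) set of test distributions allowed by the structural assumptions on the shifts, and squared loss is an assumption as well.

```latex
% Robust risk of a linear predictor \beta if the true parameter were \theta
% (squared loss and the shift set \mathcal{P}_{\theta} are illustrative assumptions)
\[
  \mathcal{R}_{\mathrm{rob}}(\beta;\theta)
    = \sup_{P \in \mathcal{P}_{\theta}} \mathbb{E}_{P}\!\left[(Y - \beta^{\top} X)^{2}\right]
\]

% Worst-case robust risk: the largest robust risk over all parameters
% that the training distributions cannot tell apart
\[
  \mathcal{R}_{\mathrm{wc}}(\beta)
    = \sup_{\theta \in \Theta_{\mathrm{eq}}} \mathcal{R}_{\mathrm{rob}}(\beta;\theta)
\]

% Algorithm-independent minimax quantity: the best achievable robustness
\[
  \mathcal{M}
    = \inf_{\beta} \, \sup_{\theta \in \Theta_{\mathrm{eq}}} \mathcal{R}_{\mathrm{rob}}(\beta;\theta)
\]
```

In this sketch, when \Theta_{\mathrm{eq}} contains a single parameter the robust risk is identified and the worst-case robust risk coincides with it; the partially identified case corresponds to a larger \Theta_{\mathrm{eq}}.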

