ML LG | Dec 17, 2024

Sequential Harmful Shift Detection Without Labels

Salim I. Amoukou, Tom Bewley, Saumitra Mishra, Freddy Lecue, Daniele Magazzeni, Manuela Veloso

arXiv:2412.12910v116.810 citations | h-index: 7 | NIPS

Originality: Incremental advance

AI Analysis

This work addresses the challenge of monitoring model performance in real-world applications where labels are unavailable. It is an incremental advance because it builds on prior work.

The paper tackles the problem of detecting harmful distribution shifts in machine learning models during continuous production, without needing ground truth labels. It extends an existing framework by replacing the true error with a proxy derived from a trained error estimator. Experiments demonstrate high power and false alarm control under various shift types.

We introduce a novel approach for detecting distribution shifts that negatively impact the performance of machine learning models in continuous production environments, which requires no access to ground truth data labels. It builds upon the work of Podkopaev and Ramdas [2022], who address scenarios where labels are available for tracking model errors over time. Our solution extends this framework to work in the absence of labels, by employing a proxy for the true error. This proxy is derived using the predictions of a trained error estimator. Experiments show that our method has high power and false alarm control under various distribution shifts, including covariate and label shifts and natural shifts over geography and time.
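The idea of monitoring a proxy error sequentially can be illustrated with a minimal sketch. This is not the authors' procedure: it swaps in a simple Hoeffding bound with a union bound over time in place of the confidence sequences used by Podkopaev and Ramdas, and it assumes the proxy errors are values in [0, 1] produced by some error estimator (for example, thresholded predictions of whether the model is wrong). The names `hoeffding_radius`, `monitor`, `source_risk_upper`, and `tolerance` are hypothetical and chosen for illustration only.

```python
import math


def hoeffding_radius(t, alpha):
    """Time-uniform radius for the mean of t values in [0, 1].

    Assumption: a crude union bound that spends alpha / (t * (t + 1))
    at step t, so the total error budget over all steps sums to alpha.
    """
    alpha_t = alpha / (t * (t + 1))
    return math.sqrt(math.log(1.0 / alpha_t) / (2.0 * t))


def monitor(proxy_errors, source_risk_upper, tolerance=0.05, alpha=0.05):
    """Return the first 1-based step at which an alarm is raised, or None.

    proxy_errors: iterable of values in [0, 1], standing in for the
        unobserved true errors (assumed to come from an error estimator).
    source_risk_upper: an upper bound on the model's risk before deployment.
    tolerance: how much extra risk is accepted before a shift counts as harmful.
    """
    total = 0.0
    for t, err in enumerate(proxy_errors, start=1):
        total += err
        # Lower confidence bound on the running proxy risk.
        lower = total / t - hoeffding_radius(t, alpha)
        if lower > source_risk_upper + tolerance:
            return t
    return None


if __name__ == "__main__":
    # A stream with no proxy errors never triggers an alarm.
    print(monitor([0.0] * 200, source_risk_upper=0.1))
    # A stream where every proxy error is 1 triggers an alarm early.
    print(monitor([1.0] * 200, source_risk_upper=0.1))
```

The alarm fires only when the lower bound on the running proxy risk clears the source risk plus the tolerance, which is what keeps false alarms controlled in this simplified setting. How well the proxy tracks the true error is a separate question that the sketch does not address.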
