ME · ML · Sep 18, 2020

Sequential changepoint detection in classification data under label shift

arXiv:2009.08592v2 · 1 citation
AI Analysis

This paper addresses the degradation of classifier performance when data distributions change, which is critical for real-world applications such as monitoring systems. The contribution is incremental, since it builds on existing changepoint detection methods.

The paper tackles the problem of detecting distribution changes in sequentially observed, unlabeled classification data under label shift, where the class priors shift but the class-conditional distributions remain unchanged. In simulations, its method outperforms other detection procedures.

Classifier predictions often rely on the assumption that new observations come from the same distribution as training data. When the underlying distribution changes, so does the optimal classification rule, and performance may degrade. We consider the problem of detecting such a change in distribution in sequentially-observed, unlabeled classification data. We focus on label shift changes to the distribution, where the class priors shift but the class conditional distributions remain unchanged. We reduce this problem to the problem of detecting a change in the one-dimensional classifier scores, leading to simple nonparametric sequential changepoint detection procedures. Our procedures leverage classifier training data to estimate the detection statistic, and converge to their parametric counterparts in the size of the training data. In simulations, we show that our method outperforms other detection procedures in this label shift setting.
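The reduction to one-dimensional classifier scores can be illustrated with a small sketch. It is not the paper's procedure: it assumes binary classification, a calibrated score s(x) estimating P(Y=1 | x) under the training prior, and a known post-change prior, whereas the abstract describes nonparametric procedures that estimate the detection statistic from training data. Under label shift the post/pre-change likelihood ratio depends on x only through the score, namely (pi1/pi0) s(x) + ((1 - pi1)/(1 - pi0)) (1 - s(x)), so a standard CUSUM recursion can run on the scores alone. The names `label_shift_log_lr`, `cusum_alarm`, `pi0`, `pi1`, and `threshold` are hypothetical and chosen for illustration.

```python
import math
from typing import Iterable, Optional


def label_shift_log_lr(score: float, pi0: float, pi1: float) -> float:
    """Log-likelihood ratio (post-change vs pre-change) for one observation.

    Assumption: `score` estimates P(Y=1 | x) under the training prior pi0,
    and the class-conditional distributions do not change (label shift).
    """
    ratio = (pi1 / pi0) * score + ((1.0 - pi1) / (1.0 - pi0)) * (1.0 - score)
    return math.log(ratio)


def cusum_alarm(scores: Iterable[float], pi0: float, pi1: float,
                threshold: float) -> Optional[int]:
    """Return the 0-based index of the first CUSUM alarm, or None.

    Assumption: the post-change prior pi1 is known; this is a simplification
    made only to keep the sketch short.
    """
    stat = 0.0
    for t, s in enumerate(scores):
        # CUSUM recursion: accumulate evidence for a change, reset at zero.
        stat = max(0.0, stat + label_shift_log_lr(s, pi0, pi1))
        if stat >= threshold:
            return t
    return None


# Toy stream: low scores before the change, high scores after it.
scores = [0.1] * 20 + [0.9] * 20
print(cusum_alarm(scores, pi0=0.5, pi1=0.8, threshold=2.0))
```

A larger `threshold` lowers the false-alarm rate at the cost of a longer detection delay; in practice it would be tuned to a target average run length.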

