LG Nov 3, 2021

Shift Happens: Adjusting Classifiers

Theodore James Thibault Heiser, Mari-Liis Allikivi, Meelis Kull

arXiv:2111.02529v1 | 5.54 citations

Originality: Incremental advance

AI Analysis

The paper addresses the problem of keeping a classifier accurate when the data distribution changes after training. The contribution is incremental, since it builds on existing adjustment techniques.

The paper tackles classifier performance degradation under dataset shift by proposing unbounded and bounded general adjustment (UGA and BGA) methods, which transform predictions so that the average prediction equals the class distribution. It gives a theoretical guarantee of reduced loss when the class distribution is known exactly, and shows experimentally that loss is often still reduced when the class distribution is known only approximately.

Minimizing expected loss measured by a proper scoring rule, such as Brier score or log-loss (cross-entropy), is a common objective while training a probabilistic classifier. If the data have experienced dataset shift where the class distributions change post-training, then often the model's performance will decrease, over-estimating the probabilities of some classes while under-estimating the others on average. We propose unbounded and bounded general adjustment (UGA and BGA) methods that transform all predictions to (re-)equalize the average prediction and the class distribution. These methods act differently depending on which proper scoring rule is to be minimized, and we have a theoretical guarantee of reducing loss on test data, if the exact class distribution is known. We also demonstrate experimentally that, when in practice the class distribution is known only approximately, there is often still a reduction in loss depending on the amount of shift and the precision to which the class distribution is known.
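To make the idea of re-equalizing the average prediction and the class distribution concrete, here is a minimal sketch for the Brier score case. It assumes the simplest transformation with that property: adding the same shift vector to every prediction. The function names `additive_adjust` and `brier_score` and the toy data are illustrative assumptions, not taken from the paper, and the sketch does not include the bounding step that keeps outputs inside the probability simplex.

```python
import numpy as np

def additive_adjust(preds, target_dist):
    """Shift every prediction by the same vector so that the mean
    prediction equals target_dist (hypothetical helper, unbounded:
    individual entries may leave the interval [0, 1])."""
    preds = np.asarray(preds, dtype=float)
    target_dist = np.asarray(target_dist, dtype=float)
    shift = target_dist - preds.mean(axis=0)
    return preds + shift

def brier_score(preds, labels):
    """Mean squared distance between predictions and one-hot labels."""
    preds = np.asarray(preds, dtype=float)
    onehot = np.eye(preds.shape[1])[labels]
    return float(np.mean(np.sum((preds - onehot) ** 2, axis=1)))

# Toy shifted test set: the model leans towards class 0,
# but class 1 is now the majority.
preds = np.array([[0.7, 0.3],
                  [0.6, 0.4],
                  [0.8, 0.2],
                  [0.5, 0.5]])
labels = np.array([0, 1, 1, 1])

# Assumption for illustration: the exact test class distribution is known.
target = np.bincount(labels, minlength=2) / len(labels)

adjusted = additive_adjust(preds, target)

print("mean prediction before:", preds.mean(axis=0))
print("mean prediction after: ", adjusted.mean(axis=0))
print("Brier score before:", brier_score(preds, labels))
print("Brier score after: ", brier_score(adjusted, labels))
```

Because the shift vector sums to zero, each adjusted row still sums to one. On this toy data the Brier score drops by the squared norm of the shift, which is what one expects when the target distribution matches the labels exactly. With only an approximate target the reduction is not guaranteed.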
