LG · May 23

Testing the Test: Score-Direction Instability in Class-Split Anomaly Detection

arXiv:2606.02601
AI Analysis

For researchers who use class-split benchmarks for anomaly detection, this work identifies a weakness in the evaluation protocol: results depend on how the held-out class sits relative to the normal classes in representation space, so they are not unconditional evidence of anomaly-detection ability.

The paper shows that within-dataset class-split anomaly detection can become ill-posed when the held-out anomaly class overlaps the normal mixture in representation space; in that regime, anomaly scores can collapse toward chance or invert. The authors introduce a training-free diagnostic, neighborhood class leakage, which predicts this instability across Fashion-MNIST, CIFAR-10, and Imagenette, in both pixel and VAE latent spaces.
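
To make "score inversion" concrete, the following toy sketch may help. It is not taken from the paper: the 1-D data, the distance-to-mean scorer, and the names `auroc`, `score`, `anomaly_inside`, and `anomaly_outside` are all assumptions chosen for illustration. The normal data is a mixture of two classes, and the held-out class is placed either between them (overlapping the mixture's interior) or far away.

```python
import numpy as np

def auroc(scores_normal, scores_anomaly):
    """Probability that a random anomaly outscores a random normal (ties count 0.5)."""
    s_n = np.asarray(scores_normal)[:, None]
    s_a = np.asarray(scores_anomaly)[None, :]
    return float(np.mean((s_a > s_n) + 0.5 * (s_a == s_n)))

rng = np.random.default_rng(0)

# Hypothetical normal mixture: two "normal" classes centred at -3 and +3.
normal_train = np.concatenate([rng.normal(-3, 0.5, 500), rng.normal(3, 0.5, 500)])
normal_test = np.concatenate([rng.normal(-3, 0.5, 200), rng.normal(3, 0.5, 200)])

# Toy scorer (an assumption, not the paper's method):
# distance to the mean of the normal training data.
center = normal_train.mean()

def score(x):
    return np.abs(x - center)

# Held-out class placed between the normal classes vs. far outside them.
anomaly_inside = rng.normal(0, 0.5, 200)
anomaly_outside = rng.normal(10, 0.5, 200)

auroc_inside = auroc(score(normal_test), score(anomaly_inside))
auroc_outside = auroc(score(normal_test), score(anomaly_outside))

print(f"AUROC, held-out class inside the mixture:  {auroc_inside:.3f}")
print(f"AUROC, held-out class outside the mixture: {auroc_outside:.3f}")
```

With the same scorer and the same normal data, the AUROC falls below 0.5 in the first case and above 0.5 in the second, so the better score direction depends on which class was held out.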

Within-dataset class-split evaluation is widely used as a proxy for fully unconditional out-of-distribution anomaly detection. We show that this protocol can become ill-posed when the held-out anomaly class overlaps the normal mixture in representation space. In this regime, anomaly scores may collapse toward chance or even invert, and the preferred score direction can depend on the unknown anomaly class. We introduce a simple training-free diagnostic, neighborhood class leakage, and show that it predicts score-direction instability across Fashion-MNIST, CIFAR-10, and Imagenette, in both pixel and VAE latent spaces. Our results suggest that class-split AD benchmarks should be treated as geometry-dependent stress tests rather than unconditional evidence of anomaly-detection ability.
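
The abstract does not define neighborhood class leakage, so the sketch below is only one plausible reading, stated as an assumption: for each held-out-class point, take its k nearest neighbors in the pooled labelled data and measure the fraction that belong to the normal classes. The function name `neighborhood_class_leakage`, the parameter `k`, and the synthetic 2-D data are hypothetical; the paper's actual definition may differ.

```python
import numpy as np

def neighborhood_class_leakage(X, y, held_out_class, k=10):
    """Assumed definition: mean fraction of normal-class points among the
    k nearest neighbors (Euclidean) of each held-out-class point."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    held = np.flatnonzero(y == held_out_class)
    # Squared distances from every held-out point to every point.
    d2 = ((X[held, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    # Exclude each point from its own neighborhood.
    d2[np.arange(len(held)), held] = np.inf
    # Indices of the k smallest distances per row (order within the k is arbitrary).
    nn = np.argpartition(d2, k - 1, axis=1)[:, :k]
    return float(np.mean(y[nn] != held_out_class))

rng = np.random.default_rng(0)
n = 200
# Synthetic stand-in for a representation space (pixels or latents in the paper).
X = np.concatenate([
    rng.normal([0.0, 0.0], 1.0, (n, 2)),   # class 0 (normal)
    rng.normal([6.0, 0.0], 1.0, (n, 2)),   # class 1 (normal)
    rng.normal([0.5, 0.0], 1.0, (n, 2)),   # class 2, overlaps class 0
    rng.normal([0.0, 20.0], 1.0, (n, 2)),  # class 3, well separated
])
y = np.repeat([0, 1, 2, 3], n)

leak_overlap = neighborhood_class_leakage(X, y, held_out_class=2)
leak_far = neighborhood_class_leakage(X, y, held_out_class=3)

print(f"leakage when holding out the overlapping class: {leak_overlap:.3f}")
print(f"leakage when holding out the separated class:   {leak_far:.3f}")
```

Because it needs only labels and pairwise distances, a quantity of this kind can be computed without training any detector, which is consistent with the "training-free" description. High leakage for a held-out class would flag a split where score direction may be unstable.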

