Bernhard Rabus

2papers

2 Papers

LGSep 17, 2022

Linking Neural Collapse and L2 Normalization with Improved Out-of-Distribution Detection in Deep Neural Networks

Jarrod Haas, William Yolland, Bernhard Rabus

We propose a simple modification to standard ResNet architectures--L2 normalization over feature space--that substantially improves out-of-distribution (OoD) performance on the previously proposed Deep Deterministic Uncertainty (DDU) benchmark. We show that this change also induces early Neural Collapse (NC), an effect linked to better OoD performance. Our method achieves comparable or superior OoD detection scores and classification accuracy in a small fraction of the training time of the benchmark. Additionally, it substantially improves worst case OoD performance over multiple, randomly initialized models. Though we do not suggest that NC is the sole mechanism or a comprehensive explanation for OoD behaviour in deep neural networks (DNN), we believe NC's simple mathematical and geometric structure can provide a framework for analysis of this complex phenomenon in future work.

LGJun 7, 2023

Exploring Simple, High Quality Out-of-Distribution Detection with L2 Normalization

Jarrod Haas, William Yolland, Bernhard Rabus

We demonstrate that L2 normalization over feature space can produce capable performance for Out-of-Distribution (OoD) detection for some models and datasets. Although it does not demonstrate outright state-of-the-art performance, this method is notable for its extreme simplicity: it requires only two addition lines of code, and does not need specialized loss functions, image augmentations, outlier exposure or extra parameter tuning. We also observe that training may be more efficient for some datasets and architectures. Notably, only 60 epochs with ResNet18 on CIFAR10 (or 100 epochs with ResNet50) can produce performance within two percentage points (AUROC) of several state-of-the-art methods for some near and far OoD datasets. We provide theoretical and empirical support for this method, and demonstrate viability across five architectures and three In-Distribution (ID) datasets.