On the Importance of Feature Separability in Predicting Out-Of-Distribution Error
This work addresses a practical problem for machine learning practitioners: estimating generalization performance on out-of-distribution (OOD) data. It offers a more efficient method, though the contribution is incremental because it builds on existing feature-based approaches.
The paper tackles the problem of predicting out-of-distribution (OOD) error without ground-truth labels by focusing on feature separability, showing that a large domain gap does not always lead to low test accuracy. It proposes a dataset-level score based on feature dispersion, with experiments demonstrating superior prediction performance and computational efficiency.
Estimating generalization performance on out-of-distribution (OOD) data without ground-truth labels is challenging in practice. While previous methods emphasize the connection between distribution difference and OOD accuracy, we show that a large domain gap does not necessarily lead to low test accuracy. In this paper, we investigate this problem from the perspective of feature separability, both empirically and theoretically. Specifically, we propose a dataset-level score based upon feature dispersion to estimate the test accuracy under distribution shift. Our method is inspired by desirable properties of features in representation learning: high inter-class dispersion and high intra-class compactness. Our analysis shows that inter-class dispersion is strongly correlated with model accuracy, while intra-class compactness does not reflect the generalization performance on OOD data. Extensive experiments demonstrate the superiority of our method in both prediction performance and computational efficiency.
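To make the two feature properties named in the abstract concrete, the following is a minimal NumPy sketch, not the paper's exact score. It assumes that features come from the model's penultimate layer and that, since no ground-truth labels are available, the model's predicted classes serve as pseudo-labels. The function names and the specific formulas (size-weighted squared distance of class centroids to the global mean, and mean squared distance of samples to their own centroid) are illustrative assumptions.

```python
import numpy as np


def inter_class_dispersion(features, pseudo_labels):
    """Size-weighted mean squared distance from class centroids to the global mean.

    Illustrative assumption: one plausible dispersion measure, not necessarily
    the score defined in the paper.
    """
    features = np.asarray(features, dtype=float)
    pseudo_labels = np.asarray(pseudo_labels)
    global_mean = features.mean(axis=0)
    classes, counts = np.unique(pseudo_labels, return_counts=True)
    total = 0.0
    for cls, count in zip(classes, counts):
        centroid = features[pseudo_labels == cls].mean(axis=0)
        total += count * np.sum((centroid - global_mean) ** 2)
    return total / len(features)


def intra_class_compactness(features, pseudo_labels):
    """Mean squared distance from each sample to its own class centroid.

    Smaller values mean tighter (more compact) classes.
    """
    features = np.asarray(features, dtype=float)
    pseudo_labels = np.asarray(pseudo_labels)
    total = 0.0
    for cls in np.unique(pseudo_labels):
        members = features[pseudo_labels == cls]
        total += np.sum((members - members.mean(axis=0)) ** 2)
    return total / len(features)


if __name__ == "__main__":
    # Hypothetical toy data: two predicted classes in a 2-D feature space.
    rng = np.random.default_rng(0)
    feats = np.vstack([rng.normal(0.0, 0.5, (50, 2)), rng.normal(3.0, 0.5, (50, 2))])
    preds = np.array([0] * 50 + [1] * 50)
    print("inter-class dispersion:", inter_class_dispersion(feats, preds))
    print("intra-class compactness:", intra_class_compactness(feats, preds))
```

Under the abstract's claim, a higher value of the first quantity on an unlabeled OOD set would be expected to track higher accuracy, while the second would not be a reliable indicator. How the raw dispersion is normalized or mapped to an accuracy estimate is left open here.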