A neural anisotropic view of underspecification in deep learning
This work addresses the problem of ensuring robustness in deep learning systems, which is relevant to both researchers and practitioners, but it is incremental because it builds on existing theories of underspecification.
The paper investigates how neural networks handle underspecification, showing that the geometry and complexity of learned predictors depend heavily on data representation, which affects fairness, robustness, and generalization.
The underspecification of most machine learning pipelines means that we cannot rely solely on validation performance to assess the robustness of deep learning systems to naturally occurring distribution shifts. Instead, making sure that a neural network can generalize across a large number of different situations requires understanding the specific way in which it solves a task. In this work, we propose to study this problem from a geometric perspective, with the aim of understanding two key characteristics of neural network solutions in underspecified settings: how is the geometry of the learned function related to the data representation? And are deep networks always biased towards simpler solutions, as conjectured in recent literature? We show that the way neural networks handle the underspecification of these problems is highly dependent on the data representation, affecting both the geometry and the complexity of the learned predictors. Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to addressing the fairness, robustness, and generalization of these systems.