Wormhole Dynamics in Deep Neural Networks
This work addresses shortcut learning and generalization in deep neural networks and is aimed at machine learning researchers, though it appears incremental because it builds on known phenomena such as fooling examples.
The paper tackles the problem of DNNs confidently classifying random inputs (fooling examples) by analyzing their generalization behavior. It finds that overparameterized DNNs undergo a collapse of the output space that improves generalization but leads to degeneracy as more layers are added. It then introduces a "wormhole" solution that bypasses this degeneracy and reconciles meaningful labels with random ones for fooling examples.
This work investigates the generalization behavior of deep neural networks (DNNs), focusing on the phenomenon of "fooling examples," where DNNs confidently classify inputs that appear random or unstructured to humans. To explore this phenomenon, we introduce an analytical framework based on maximum likelihood estimation, rather than the conventional numerical approaches that rely on gradient-based optimization and explicit labels. Our analysis reveals that DNNs operating in an overparameterized regime exhibit a collapse in the output feature space. While this collapse improves network generalization, adding more layers eventually leads to a state of degeneracy, where the model learns trivial solutions by mapping distinct inputs to the same output, resulting in zero loss. Further investigation demonstrates that this degeneracy can be bypassed using our newly derived "wormhole" solution. The wormhole solution, when applied to arbitrary fooling examples, reconciles meaningful labels with random ones and provides a novel perspective on shortcut learning. These findings offer deeper insights into DNN generalization and highlight directions for future research on learning dynamics in unsupervised settings to bridge the gap between theory and practice.
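The degenerate state described above, in which distinct inputs are mapped to the same output and the loss is zero, can be illustrated with a minimal sketch. This is not the maximum likelihood framework of this work, whose objective is not given here. The label-free loss `pairwise_consistency_loss`, the `forward` helper, the tanh layers, and all shapes and weight scales are assumptions chosen only for illustration.

```python
import numpy as np


def pairwise_consistency_loss(outputs: np.ndarray) -> float:
    """Hypothetical label-free loss: mean squared distance between all pairs of outputs."""
    diffs = outputs[:, None, :] - outputs[None, :, :]  # shape (n, n, d) via broadcasting
    return float(np.mean(np.sum(diffs ** 2, axis=-1)))


def forward(x: np.ndarray, weights: list) -> np.ndarray:
    """Toy fully connected network with tanh activations (an assumed architecture)."""
    h = x
    for w in weights:
        h = np.tanh(h @ w)
    return h


rng = np.random.default_rng(0)
inputs = rng.normal(size=(8, 16))  # eight distinct random inputs

# A generic network keeps distinct inputs apart in output space.
random_weights = [rng.normal(size=(16, 16)) / 4.0 for _ in range(3)]
# A degenerate network: all-zero weights send every input to the same output vector.
collapsed_weights = [np.zeros((16, 16)) for _ in range(3)]

loss_random = pairwise_consistency_loss(forward(inputs, random_weights))
loss_collapsed = pairwise_consistency_loss(forward(inputs, collapsed_weights))

print(f"loss with random weights:    {loss_random:.4f}")
print(f"loss with collapsed weights: {loss_collapsed:.4f}")
```

Under this assumed objective the collapsed network attains exactly zero loss while carrying no information about its inputs, which is the sense in which a zero-loss solution can be trivial.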