On Privileged and Convergent Bases in Neural Network Representations
This work addresses a foundational question for machine learning researchers, namely how to understand neural network representations, but its contribution is incremental because it builds on prior work such as Linear Mode Connectivity.
The study investigated whether neural networks learn a unique, convergent basis in their representations. It found that networks do not converge to a unique basis, even when they are wide, and that basis correlation increases significantly when a few early layers are frozen identically.
In this study, we investigate whether the representations learned by neural networks possess a privileged and convergent basis. Specifically, we examine the significance of the feature directions represented by individual neurons. First, we establish that, unlike in linear networks, arbitrary rotations of neural representations cannot be inverted, indicating that these representations do not exhibit complete rotational invariance. Subsequently, we explore whether multiple bases can achieve identical performance. To do this, we compare the bases of networks trained with the same parameters but with different random initializations. Our study reveals two findings: (1) even wide networks such as WideResNets do not converge to a unique basis; (2) basis correlation increases significantly when a few early layers of the network are frozen identically. Furthermore, we analyze Linear Mode Connectivity, which has been studied as a measure of basis correlation. Our findings provide evidence that while Linear Mode Connectivity improves with increased network width, this improvement is not due to an increase in basis correlation.
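The NumPy sketch below is a rough illustration of two ideas from the abstract. It first checks that a rotation of the hidden layer can be undone exactly by the next layer in a linear network but not once a ReLU sits in between. It then shows that a simple neuron-matching correlation score is high when two sets of activations differ only by a permutation of neurons and lower when they differ by a rotation. The toy two-layer network, its sizes, the helper names `random_orthogonal` and `mean_best_match_correlation`, and the best-match score itself are assumptions made for illustration; they are not the paper's architecture, experimental setup, or exact metric.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_orthogonal(dim, rng):
    """Random orthogonal matrix (a rotation, possibly with a reflection) via QR."""
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q


def relu(z):
    return np.maximum(z, 0.0)


def mean_best_match_correlation(acts_a, acts_b):
    """Hypothetical basis-correlation score (not the paper's metric).

    For each neuron (column) of acts_a, take the largest absolute Pearson
    correlation with any neuron of acts_b, then average over neurons.
    """
    a = (acts_a - acts_a.mean(axis=0)) / acts_a.std(axis=0)
    b = (acts_b - acts_b.mean(axis=0)) / acts_b.std(axis=0)
    corr = a.T @ b / len(a)  # (neurons_a, neurons_b) correlation matrix
    return np.abs(corr).max(axis=1).mean()


# Toy two-layer network with illustrative sizes.
n, d_in, d_hidden, d_out = 2000, 8, 16, 4
x = rng.standard_normal((n, d_in))
w1 = rng.standard_normal((d_in, d_hidden))
w2 = rng.standard_normal((d_hidden, d_out))
rot = random_orthogonal(d_hidden, rng)

# Linear case: rotate the hidden layer, then un-rotate inside the next layer.
linear_gap = np.abs((x @ w1) @ w2 - (x @ w1 @ rot) @ (rot.T @ w2)).max()

# ReLU case: the same weight change, but the nonlinearity now acts in the
# rotated basis, so the next layer can no longer undo the rotation.
relu_gap = np.abs(relu(x @ w1) @ w2 - relu(x @ w1 @ rot) @ (rot.T @ w2)).max()

# Neuron-level correlation: permuting neurons keeps the basis, rotating mixes it.
acts = relu(rng.standard_normal((n, d_hidden)))
score_permuted = mean_best_match_correlation(acts, acts[:, rng.permutation(d_hidden)])
score_rotated = mean_best_match_correlation(acts, acts @ rot)

print(f"linear gap: {linear_gap:.2e}, ReLU gap: {relu_gap:.2e}")
print(f"best-match correlation, permuted: {score_permuted:.3f}, rotated: {score_rotated:.3f}")
```

In this sketch the linear gap is at the level of floating-point error while the ReLU gap is not, which is one simple sense in which an elementwise nonlinearity singles out the neuron basis. The best-match score ignores one-to-one matching constraints; a stricter variant might use an optimal assignment between neurons instead.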