LG · Jun 3

Beyond Structural Symmetries: Linear Mode Connectivity via Neuron Identifiability

Vincent Bürgin, Daniel Herbst, Ya-Wei Eileen Lin, Stefanie Jegelka

arXiv:2606.04754 · 72.7

AI Analysis

For deep learning researchers, this work provides a theoretical understanding of linear mode connectivity and representation merging, though it is primarily theoretical and incremental.

The paper develops a theoretical framework of effective function classes to analyze neuron identifiability across independent training runs. It shows that neural networks can admit large families of approximately equivalent solutions even in structurally asymmetric models, that neuron identifiability enables representation merging without prior alignment, and it characterizes when such merging admits a linear low-loss path.

Many striking phenomena in deep learning, such as linear mode connectivity and the structured behavior of training dynamics, are closely tied to parameter symmetries: transformations that leave the realized function unchanged. Despite growing attention to parameter symmetries, the exact interplay between parameters, data, and representations remains underexplored. To investigate this, we develop a theoretical framework of effective function classes, i.e., the set of functions a neuron can realize on its input support, and the norm cost of realizing them. We then formalize effective symmetry breaking via neuron identifiability across independent training runs. Our analysis shows that neural networks can admit large families of approximately equivalent solutions even in structurally asymmetric models. We further show that neuron identifiability enables representation merging without prior alignment, and characterize when such merging admits a linear low-loss path. These findings highlight the role of effective function classes in affecting the loss landscape.
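The notion of a parameter symmetry mentioned in the abstract can be made concrete with a toy example. The sketch below is not the paper's framework: it only illustrates the familiar structural permutation symmetry of hidden neurons, and why naive linear interpolation between two symmetric copies generally leaves the original function. The network shape, the hand-picked parameter values, and all function names (`forward`, `permute`, `interpolate`, `max_gap`) are assumptions made for illustration.

```python
def relu(x):
    return x if x > 0.0 else 0.0

def forward(params, x):
    """One-hidden-layer ReLU network with scalar input and scalar output."""
    w_in, b, w_out = params
    return sum(wo * relu(wi * x + bi) for wi, bi, wo in zip(w_in, b, w_out))

def permute(params, perm):
    """Reorder hidden neurons; the realized function is unchanged."""
    w_in, b, w_out = params
    return ([w_in[i] for i in perm], [b[i] for i in perm], [w_out[i] for i in perm])

def interpolate(p, q, t):
    """Pointwise linear interpolation (1 - t) * p + t * q of two parameter sets."""
    return tuple([(1 - t) * a + t * c for a, c in zip(u, v)] for u, v in zip(p, q))

def max_gap(p, q, xs):
    """Largest output difference between two parameter sets on the probe inputs."""
    return max(abs(forward(p, x) - forward(q, x)) for x in xs)

# Hypothetical parameters: (input weights, biases, output weights) for 4 neurons.
theta_a = ([1.0, -1.0, 0.5, 2.0], [0.0, 0.5, -0.5, 1.0], [1.0, 2.0, -1.0, 0.5])
theta_b = permute(theta_a, [1, 0, 3, 2])  # swap neurons 0<->1 and 2<->3

xs = [i / 10 for i in range(-20, 21)]  # probe inputs in [-2, 2]

# The permuted copy computes the same function as the original.
endpoint_gap = max_gap(theta_a, theta_b, xs)

# The midpoint of the straight line between the two copies does not.
midpoint_gap = max_gap(theta_a, interpolate(theta_a, theta_b, 0.5), xs)

print(f"endpoint gap: {endpoint_gap:.3g}, midpoint gap: {midpoint_gap:.3g}")
```

In this toy setting the two endpoints agree on every probe input, while the midpoint of the linear path realizes a different function. That gap is why merging methods usually align neurons before interpolating, and it is the setting against which the abstract's claim of merging "without prior alignment" should be read.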

