Connecting Independently Trained Modes via Layer-Wise Connectivity
This work addresses a fundamental issue in neural network training that is relevant to both researchers and practitioners: it enables connectivity analysis across a wider range of architectures, although the contribution is incremental in that it extends existing methods.
The paper tackles the problem of connecting independently trained neural network modes, for which existing empirical methods were largely limited to simple architectures. It proposes a new empirical algorithm that generalizes to modern and diverse models such as MobileNet and EfficientNet, achieving broader applicability and more consistent connectivity paths.
Empirical and theoretical studies have shown that continuous low-loss paths can be constructed between independently trained neural network models. This phenomenon, known as mode connectivity, refers to the existence of such paths between distinct modes, i.e., well-trained solutions in parameter space. However, existing empirical methods are primarily effective for older and relatively simple architectures such as basic CNNs, VGG, and ResNet, raising concerns about their applicability to modern and structurally diverse models. In this work, we propose a new empirical algorithm for connecting independently trained modes that generalizes beyond traditional architectures and supports a broader range of networks, including MobileNet, ShuffleNet, EfficientNet, RegNet, Deep Layer Aggregation (DLA), and Compact Convolutional Transformers (CCT). In addition to broader applicability, the proposed method yields more consistent connectivity paths across independently trained mode pairs and supports connecting modes obtained with different training hyperparameters.
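To make the notion of a low-loss path concrete, the following is a minimal toy sketch, not the algorithm proposed in this work. It assumes a hypothetical two-parameter loss whose minima form the unit circle, and compares the loss barrier along a straight line between two modes with the barrier along a curved path that stays on the circle. The names `loss`, `linear_path`, `arc_path`, and `barrier` are illustrative assumptions.

```python
import math


def loss(w):
    """Toy loss (assumed for illustration): zero exactly on the unit circle."""
    return (w[0] ** 2 + w[1] ** 2 - 1.0) ** 2


def linear_path(a, b, t):
    """Point at position t in [0, 1] on the straight line from a to b."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]


def arc_path(t):
    """Point at position t in [0, 1] on the half circle from (1, 0) to (-1, 0)."""
    return [math.cos(math.pi * t), math.sin(math.pi * t)]


def barrier(path, n=101):
    """Highest loss on the path minus the mean loss of its two endpoints."""
    losses = [loss(path(i / (n - 1))) for i in range(n)]
    return max(losses) - 0.5 * (losses[0] + losses[-1])


# Two "modes": distinct minimizers of the toy loss.
mode_a, mode_b = [1.0, 0.0], [-1.0, 0.0]

linear_barrier = barrier(lambda t: linear_path(mode_a, mode_b, t))
arc_barrier = barrier(arc_path)

print(f"linear path barrier: {linear_barrier:.3f}")
print(f"curved path barrier: {arc_barrier:.3f}")
```

In this toy setting the straight line crosses the high-loss region at the origin, while the curved path stays at essentially zero loss throughout. Real networks have millions of parameters, and finding such a path is what a mode-connectivity method has to accomplish.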