Linear Mode Connectivity under Data Shifts for Deep Ensembles of Image Classifiers
This work addresses training stability and ensemble diversity for image classification under data shifts. Its contribution is incremental, since it builds on existing LMC research.
The paper investigates linear mode connectivity (LMC) under data shifts in deep ensembles of image classifiers. It finds that small learning rates and large batch sizes can mitigate the impact of data shifts by reducing gradient noise, and that this noise in turn influences which local minimum the models converge to and how similar their errors are.
The phenomenon of linear mode connectivity (LMC) links several aspects of deep learning, including training stability under noisy stochastic gradients, the smoothness and generalization of local minima (basins), the similarity and functional diversity of sampled models, and architectural effects on data processing. In this work, we experimentally study LMC under data shifts and identify conditions that mitigate their impact. We interpret data shifts as an additional source of stochastic gradient noise, which can be reduced through small learning rates and large batch sizes. These hyperparameters influence whether models converge to the same local minimum or to regions of the loss landscape with varying smoothness and generalization. Although models sampled via LMC tend to make similar errors more frequently than those converging to different basins, the benefit of LMC lies in balancing training efficiency against the gains achieved from larger, more diverse ensembles. Code and supplementary materials will be made publicly available at https://github.com/DLR-KI/LMC in due course.
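
As an illustration of what linear mode connectivity means in practice, the following minimal sketch measures the loss barrier along the straight line between two parameter vectors. It is not taken from the paper or its repository: the helper names (interpolate, loss_barrier), the one-dimensional double-well toy loss, and the barrier definition (largest excess of the path loss over the linear interpolation of the endpoint losses) are assumptions chosen for illustration. Under this common convention, two solutions are regarded as linearly connected when the barrier is close to zero.

```python
import numpy as np


def interpolate(theta_a, theta_b, alpha):
    """Point on the straight line between two parameter vectors."""
    return (1.0 - alpha) * theta_a + alpha * theta_b


def loss_barrier(loss_fn, theta_a, theta_b, num_points=11):
    """Largest excess of the path loss over the linear baseline.

    Hypothetical helper: the baseline is the linear interpolation of the
    two endpoint losses, so a barrier near zero suggests both endpoints
    lie in the same linearly connected region.
    """
    alphas = np.linspace(0.0, 1.0, num_points)
    path = np.array(
        [loss_fn(interpolate(theta_a, theta_b, a)) for a in alphas]
    )
    baseline = (1.0 - alphas) * path[0] + alphas * path[-1]
    return float(np.max(path - baseline))


def double_well_loss(theta):
    """Toy loss (assumption) with two basins, at theta = -1 and theta = +1."""
    return float(np.sum((theta ** 2 - 1.0) ** 2))


# Two solutions in different basins: the straight path crosses a ridge.
barrier_across = loss_barrier(
    double_well_loss, np.array([-1.0]), np.array([1.0])
)

# Two solutions in the same basin: the path never rises above the baseline.
barrier_within = loss_barrier(
    double_well_loss, np.array([0.9]), np.array([1.1])
)

print(f"barrier across basins: {barrier_across:.3f}")
print(f"barrier within basin:  {barrier_within:.3f}")
```

For real image classifiers, loss_fn would instead evaluate a network whose weights are set to the interpolated vector on held-out data, and architectures with batch normalization typically need their running statistics recomputed at each interpolation point.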