LG · Aug 22, 2023

Mode Combinability: Exploring Convex Combinations of Permutation Aligned Models

Adrián Csiszárik, Melinda F. Kiss, Péter Kőrösi-Szabó, Márton Muntag, Gergely Papp, Dániel Varga

arXiv:2308.11511v15.32 citations · h-index: 4

Originality: Incremental advance

AI Analysis

This work targets machine learning researchers who study neural network loss landscapes. It is rated incremental because it builds on existing concepts such as linear mode connectivity.

The paper investigates convex combinations of permutation-aligned neural network parameters and finds that large regions of parameter space yield low loss. This extends linear mode connectivity to a broader phenomenon that the authors term mode combinability. The experiments also show transitivity and robustness properties of model combinations.

We explore element-wise convex combinations of two permutation-aligned neural network parameter vectors $\Theta_A$ and $\Theta_B$ of size $d$. We conduct extensive experiments by examining various distributions of such model combinations parametrized by elements of the hypercube $[0,1]^{d}$ and its vicinity. Our findings reveal that broad regions of the hypercube form surfaces of low loss values, indicating that the notion of linear mode connectivity extends to a more general phenomenon which we call mode combinability. We also make several novel observations regarding linear mode connectivity and model re-basin. We demonstrate a transitivity property: two models re-based to a common third model are also linear mode connected, and a robustness property: even with significant perturbations of the neuron matchings the resulting combinations continue to form a working model. Moreover, we analyze the functional and weight similarity of model combinations and show that such combinations are non-vacuous in the sense that there are significant functional differences between the resulting models.
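
To make the phrase "element-wise convex combination" concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the two parameter vectors are already permutation-aligned and flattened to length $d$, and it assumes the convention $\Theta_\lambda = \lambda \odot \Theta_A + (1 - \lambda) \odot \Theta_B$ with $\lambda \in [0,1]^d$; the paper may weight the two models the other way around. The function name `combine` and the three sampling choices are hypothetical and serve only as illustration.

```python
import numpy as np


def combine(theta_a, theta_b, lam):
    """Element-wise combination lam * theta_a + (1 - lam) * theta_b.

    Assumed convention (not confirmed by the abstract): lam weights theta_a.
    lam may be a scalar or an array of the same shape as the parameters.
    The range of lam is deliberately not enforced here.
    """
    theta_a = np.asarray(theta_a, dtype=float)
    theta_b = np.asarray(theta_b, dtype=float)
    if theta_a.shape != theta_b.shape:
        raise ValueError("parameter vectors must have the same shape")
    # Broadcast a scalar lam to one coefficient per parameter.
    lam = np.broadcast_to(np.asarray(lam, dtype=float), theta_a.shape)
    return lam * theta_a + (1.0 - lam) * theta_b


rng = np.random.default_rng(0)
d = 8  # toy size; real networks have far more parameters
theta_a = rng.normal(size=d)  # stand-in for aligned model A
theta_b = rng.normal(size=d)  # stand-in for aligned model B

# Scalar lam: ordinary linear interpolation, the case studied by
# linear mode connectivity.
midpoint = combine(theta_a, theta_b, 0.5)

# Independent coefficient per parameter: a random point of the hypercube.
lam_uniform = rng.uniform(0.0, 1.0, size=d)
mixed = combine(theta_a, theta_b, lam_uniform)

# Vertex of the hypercube: every parameter is copied from A or from B.
lam_vertex = rng.integers(0, 2, size=d).astype(float)
vertex = combine(theta_a, theta_b, lam_vertex)
```

With a scalar coefficient the function reduces to the usual straight-line interpolation between the two models, so linear mode connectivity is the diagonal of the hypercube in this picture. Evaluating the loss of `mixed` or `vertex` would require an actual network and dataset, which this sketch leaves out.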
