ML | LG | Oct 3, 2022

Plateau in Monotonic Linear Interpolation -- A "Biased" View of Loss Landscape for Deep Networks

UW
arXiv:2210.01019v2 | 11 citations | h-index: 51
AI Analysis

This work addresses a fundamental question in understanding neural network optimization. It shows that monotonic linear interpolation (MLI) is not a reliable indicator of optimization ease; the contribution is incremental, but it clarifies a key misconception.

The paper investigates the monotonic linear interpolation (MLI) phenomenon in neural networks and shows that it is heavily influenced by bias terms, particularly last-layer biases, which can cause long plateaus in the loss and accuracy curves that existing theory cannot explain. The authors show with a simple model that last-layer biases can differ across classes even on a perfectly balanced dataset, and they confirm the effect empirically on practical networks.

Monotonic linear interpolation (MLI) - on the line connecting a random initialization with the minimizer it converges to, the loss and accuracy are monotonic - is a phenomenon that is commonly observed in the training of neural networks. Such a phenomenon may seem to suggest that optimization of neural networks is easy. In this paper, we show that the MLI property is not necessarily related to the hardness of optimization problems, and empirical observations on MLI for deep neural networks depend heavily on biases. In particular, we show that interpolating both weights and biases linearly leads to very different influences on the final output, and when different classes have different last-layer biases on a deep network, there will be a long plateau in both the loss and accuracy interpolation (which existing theory of MLI cannot explain). We also show how the last-layer biases for different classes can be different even on a perfectly balanced dataset using a simple model. Empirically we demonstrate that similar intuitions hold on practical networks and realistic datasets.
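
The interpolation described in the abstract can be sketched in a few lines. This is a minimal illustration and not the paper's experimental code. It evaluates a loss at evenly spaced points on the line between an initial parameter set and a final one, then checks whether the loss is non-increasing along that line. The `Params` container, the function names, and the convex quadratic `toy_loss` (chosen because its minimizer is known exactly) are all assumptions made for the example; a real study would substitute a trained network and its training loss.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical container: parameter-group name -> flat list of values.
Params = Dict[str, List[float]]


def interpolate(theta_init: Params, theta_final: Params, alpha: float) -> Params:
    """Return (1 - alpha) * theta_init + alpha * theta_final for every group."""
    return {
        name: [(1.0 - alpha) * a + alpha * b
               for a, b in zip(theta_init[name], theta_final[name])]
        for name in theta_init
    }


def loss_along_path(loss_fn: Callable[[Params], float],
                    theta_init: Params,
                    theta_final: Params,
                    num_points: int = 11) -> List[Tuple[float, float]]:
    """Evaluate loss_fn at evenly spaced alpha values in [0, 1]."""
    path = []
    for i in range(num_points):
        alpha = i / (num_points - 1)
        theta = interpolate(theta_init, theta_final, alpha)
        path.append((alpha, loss_fn(theta)))
    return path


def is_monotonic_nonincreasing(values: List[float], tol: float = 1e-12) -> bool:
    """True if no value exceeds its predecessor by more than tol."""
    return all(later <= earlier + tol
               for earlier, later in zip(values, values[1:]))


# Toy stand-in for a training loss (an assumption, not the paper's model):
# a convex quadratic whose minimizer is TARGET.
TARGET: Params = {"weights": [1.0, -2.0], "bias": [0.5]}


def toy_loss(theta: Params) -> float:
    return sum((x - t) ** 2
               for name in TARGET
               for x, t in zip(theta[name], TARGET[name]))


theta_init: Params = {"weights": [0.3, 0.1], "bias": [0.0]}
path = loss_along_path(toy_loss, theta_init, TARGET)
losses = [loss for _, loss in path]
print(is_monotonic_nonincreasing(losses))
```

For a convex loss the check always passes when the endpoint is the minimizer, so this toy case says nothing about deep networks; it only shows the mechanics. Keeping parameters grouped by name is a design choice that would make it straightforward to inspect or interpolate weights and biases separately.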

