LG · ML · Jun 1, 2018

Implicit Bias of Gradient Descent on Linear Convolutional Networks

arXiv:1806.00468v2 · 471 citations
Originality: Incremental advance
AI Analysis

This work addresses a theoretical gap in machine learning by showing how network architecture shapes the solution that gradient descent converges to. The advance is incremental but important for understanding deep learning behavior.

The paper studies the implicit bias of gradient descent on linear convolutional networks. It shows that, for a network of depth L, gradient descent converges to a linear predictor associated with the ℓ_{2/L} bridge penalty in the frequency domain. Linear fully connected networks behave differently: there gradient descent converges to the hard margin SVM solution regardless of depth.

We show that gradient descent on full-width linear convolutional networks of depth $L$ converges to a linear predictor related to the $\ell_{2/L}$ bridge penalty in the frequency domain. This is in contrast to linear fully connected networks, where gradient descent converges to the hard margin linear support vector machine solution, regardless of depth.
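To make the penalty in the abstract concrete, here is a minimal NumPy sketch. It assumes the $\ell_{2/L}$ bridge penalty of a predictor $\beta$ is $\sum_k |\hat{\beta}_k|^{2/L}$, where $\hat{\beta}$ is the unitary discrete Fourier transform of $\beta$. The function name `bridge_penalty_freq`, the dimension, and the two test vectors are hypothetical choices for illustration and do not come from the paper, whose exact normalization may differ.

```python
import numpy as np


def bridge_penalty_freq(beta, depth):
    """Return sum_k |beta_hat_k|^(2/depth) for the unitary DFT of beta.

    Assumption: this is the l_{2/L} bridge penalty in the frequency
    domain, with depth playing the role of L.
    """
    beta_hat = np.fft.fft(beta, norm="ortho")  # unitary DFT, so Parseval holds
    return float(np.sum(np.abs(beta_hat) ** (2.0 / depth)))


D = 8  # illustrative dimension (assumption)
n = np.arange(D)

# Two predictors with the same Euclidean norm (both equal to 1):
spike = np.zeros(D)
spike[0] = 1.0  # flat spectrum: every frequency carries weight
cosine = np.cos(2 * np.pi * n / D)
cosine /= np.linalg.norm(cosine)  # sparse spectrum: two frequencies

for depth in (1, 2, 3):
    print(
        depth,
        bridge_penalty_freq(spike, depth),
        bridge_penalty_freq(cosine, depth),
    )
```

With `depth=1` the exponent is 2, so the penalty reduces to the squared Euclidean norm and cannot tell the two vectors apart. For `depth >= 2` the exponent is at most 1, and the vector whose energy sits in few frequencies receives the smaller penalty, which is the sense in which deeper convolutional networks are biased toward predictors that are sparse in the frequency domain.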

