LG · ML · Jun 1, 2018

Implicit Bias of Gradient Descent on Linear Convolutional Networks

Suriya Gunasekar, Jason Lee, Daniel Soudry, Nathan Srebro

arXiv:1806.00468v2 · 38.2 · 476 citations

Originality: Incremental advance

AI Analysis

This work addresses a theoretical gap in machine learning by showing how network architecture shapes the solution that gradient descent selects. The advance is incremental but important for understanding deep learning behavior.

The paper studies the implicit bias of gradient descent in linear convolutional networks. It shows that gradient descent converges to a linear predictor associated with the $\ell_{2/L}$ bridge penalty in the frequency domain, where $L$ is the network depth. Fully connected linear networks, by contrast, converge to the hard margin SVM solution regardless of depth.

We show that gradient descent on full-width linear convolutional networks of depth $L$ converges to a linear predictor related to the $\ell_{2/L}$ bridge penalty in the frequency domain. This is in contrast to linear fully connected networks, where gradient descent converges to the hard margin linear support vector machine solution, regardless of depth.
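To make the $\ell_{2/L}$ bridge penalty in the frequency domain concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's code: the function name `bridge_penalty_freq` is hypothetical, the DFT is taken with unitary scaling, and the penalty is computed as the sum $\sum_k |\hat{\beta}[k]|^{2/L}$ over the Fourier coefficients $\hat{\beta}$ of a predictor $\beta$.

```python
import numpy as np


def bridge_penalty_freq(beta, depth):
    """Sum over frequencies k of |beta_hat[k]| ** (2 / depth).

    Hypothetical helper for illustration. `beta` is a real 1-D linear
    predictor and `depth` plays the role of L in the abstract.
    Assumption: unitary DFT scaling (norm="ortho").
    """
    beta_hat = np.fft.fft(np.asarray(beta, dtype=float), norm="ortho")
    return float(np.sum(np.abs(beta_hat) ** (2.0 / depth)))


# Two predictors with the same Euclidean norm (equal to 1).
D = 8
delta = np.zeros(D)
delta[0] = 1.0  # a spike: its energy is spread over all D frequencies
cosine = np.sqrt(2.0 / D) * np.cos(2 * np.pi * np.arange(D) / D)  # two frequencies

for depth in (1, 2, 3):
    print(depth, bridge_penalty_freq(delta, depth), bridge_penalty_freq(cosine, depth))
```

With `depth=1` the exponent is 2, so by Parseval's identity the penalty equals the squared Euclidean norm of `beta` and the two predictors score the same. With `depth=2` the exponent is 1, which gives an $\ell_1$ penalty on the Fourier coefficients, and the cosine, which occupies only two frequencies, scores lower than the spike. Larger depths push the exponent below 1 and favor frequency-sparse predictors more strongly.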

