Implicit Regularization in Matrix Factorization
This work addresses the problem of understanding the implicit biases of optimization algorithms in machine learning. It may appear incremental, since it builds on existing notions of regularization.
The paper investigates implicit regularization in matrix factorization, showing that gradient descent on a full-dimensional factorization with small step sizes and near-zero initialization converges to the minimum nuclear norm solution, supported by empirical and theoretical evidence.
We study implicit regularization when optimizing an underdetermined quadratic objective over a matrix $X$ with gradient descent on a factorization of $X$. We conjecture, and provide empirical and theoretical evidence, that with small enough step sizes and initialization close enough to the origin, gradient descent on a full-dimensional factorization converges to the minimum nuclear norm solution.
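The setting described above can be sketched numerically. The following is a minimal illustration under stated assumptions, not the authors' experimental code: it assumes a symmetric positive semidefinite parametrization $X = UU^\top$ with a full-dimensional $U \in \mathbb{R}^{n \times n}$, random symmetric Gaussian measurement matrices, and a rank-one ground truth. The function name `factored_gd` and all parameter values (`n`, `m`, `init_scale`, `step`, `iters`) are hypothetical choices made for this example.

```python
import numpy as np


def factored_gd(n=8, m=30, init_scale=1e-3, step=2e-3, iters=10000, seed=0):
    """Gradient descent on f(U) = ||A(U U^T) - y||^2 from a near-zero start.

    Assumed setup (not taken from the source): m < n(n+1)/2 linear
    measurements of a symmetric n x n matrix, so the problem over X
    is underdetermined.
    """
    rng = np.random.default_rng(seed)

    # Hypothetical rank-one PSD ground truth with unit nuclear norm.
    w = rng.standard_normal(n)
    w /= np.linalg.norm(w)
    X_star = np.outer(w, w)

    # Symmetric Gaussian measurement matrices A_k, scaled by 1/sqrt(m).
    B = rng.standard_normal((m, n, n))
    A = (B + B.transpose(0, 2, 1)) / (2.0 * np.sqrt(m))
    y = np.einsum("kij,ij->k", A, X_star)  # y_k = <A_k, X_star>

    # Full-dimensional factor, initialized close to the origin.
    U = init_scale * np.eye(n)
    losses = []
    for _ in range(iters):
        r = np.einsum("kij,ij->k", A, U @ U.T) - y  # residuals
        losses.append(float(r @ r))
        G = np.einsum("k,kij->ij", r, A)  # sum_k r_k A_k (symmetric)
        U = U - step * 4.0 * G @ U  # gradient of f in U is 4 * G @ U

    X = U @ U.T
    r = np.einsum("kij,ij->k", A, X) - y
    losses.append(float(r @ r))
    return X, X_star, losses


if __name__ == "__main__":
    X, X_star, losses = factored_gd()
    print("initial loss:", losses[0])
    print("final loss:  ", losses[-1])
    print("nuclear norm of GD solution: ", np.linalg.norm(X, "nuc"))
    print("nuclear norm of ground truth:", np.linalg.norm(X_star, "nuc"))
```

Comparing the two printed nuclear norms gives only a rough sanity check. Verifying the conjecture itself would require computing the minimum nuclear norm solution with a convex solver, which this sketch leaves out.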