LG · Mar 5

Implicit Bias and Loss of Plasticity in Matrix Completion: Depth Promotes Low-Rankness

arXiv:2603.04703v1
Originality: Highly original
AI Analysis

This paper gives a theoretical account of the implicit low-rank bias in deep matrix factorization. It is aimed at researchers studying deep learning dynamics and matrix completion, and it explains why deeper networks exhibit this bias and, as a result, avoid loss of plasticity.

The paper studies matrix completion via deep matrix factorization and shows that greater network depth intensifies coupled dynamics, which in turn promote a low-rank bias. For gradient flow under block-diagonal observations, it proves that convergence to rank-1 occurs if and only if the dynamics are coupled, resolving an open question by Menon (2024) for a family of initializations. It also shows that deep models avoid the loss of plasticity phenomenon because of their low-rank bias.

We study matrix completion via deep matrix factorization (a.k.a. deep linear neural networks) as a simplified testbed to examine how network depth influences training dynamics. Despite the simplicity and importance of the problem, prior theory largely focuses on shallow (depth-2) models and does not fully explain the implicit low-rank bias observed in deeper networks. We identify coupled dynamics as a key mechanism behind this bias and show that it intensifies with increasing depth. Focusing on gradient flow under block-diagonal observations, we prove: (a) networks of depth $\geq 3$ exhibit coupling unless initialized diagonally, and (b) convergence to rank-1 occurs if and only if the dynamics are coupled, resolving an open question by Menon (2024) for a family of initializations. We also revisit the loss of plasticity phenomenon in matrix completion (Kleinman et al., 2024), where pre-training on few observations and resuming with more degrades performance. We show that deep models avoid plasticity loss due to their low-rank bias, whereas depth-2 networks pre-trained under decoupled dynamics fail to converge to a low-rank solution, even when resumed training (with additional data) satisfies the coupling condition, which sheds light on the mechanism behind this phenomenon.
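The setting in the abstract can be made concrete with a small numerical sketch. It is not the authors' code: it replaces gradient flow with plain gradient descent at a small step size, and the 2x2 target, the diagonal observation mask, the initialization scale, and all function names (`product`, `masked_loss`, `train`) are assumptions chosen for illustration. It trains a depth-3 factorization on the two diagonal entries of an all-ones (rank-1) matrix, once from a diagonal initialization and once from a non-diagonal one, and prints the singular values of the learned product so the two runs can be compared.

```python
import numpy as np

def product(factors, n):
    """End-to-end matrix W_L @ ... @ W_1 for factors = [W_1, ..., W_L]."""
    W = np.eye(n)
    for Wk in factors:
        W = Wk @ W
    return W

def masked_loss(factors, M, mask):
    """Squared error on the observed entries only."""
    n = M.shape[0]
    return 0.5 * np.sum((mask * (product(factors, n) - M)) ** 2)

def train(factors, M, mask, lr=0.05, steps=2000):
    """Gradient descent on all factors (a discretization of gradient flow)."""
    n = M.shape[0]
    factors = [Wk.copy() for Wk in factors]
    for _ in range(steps):
        G = mask * (product(factors, n) - M)   # gradient w.r.t. the product
        grads = []
        for k in range(len(factors)):
            left = product(factors[k + 1:], n)   # W_L ... W_{k+1}
            right = product(factors[:k], n)      # W_{k-1} ... W_1
            grads.append(left.T @ G @ right.T)
        factors = [Wk - lr * g for Wk, g in zip(factors, grads)]
    return factors

n, depth = 2, 3
M = np.ones((n, n))    # assumed rank-1 ground truth
mask = np.eye(n)       # assumed observations: diagonal entries only

# Diagonal initialization: the factors stay diagonal throughout training.
diag_init = [0.5 * np.eye(n) for _ in range(depth)]
diag_factors = train(diag_init, M, mask)
W_diag = product(diag_factors, n)

# Non-diagonal initialization (small off-diagonal entries, an assumption).
full_init = [0.5 * np.eye(n) + 0.05 * (np.ones((n, n)) - np.eye(n))
             for _ in range(depth)]
full_factors = train(full_init, M, mask)
W_full = product(full_factors, n)

print("diagonal init, singular values:", np.linalg.svd(W_diag, compute_uv=False))
print("non-diagonal init, singular values:", np.linalg.svd(W_full, compute_uv=False))
```

With the diagonal initialization every gradient is itself diagonal, so the off-diagonal entries never move and the learned product fits the observations with a full-rank (identity-like) matrix. The non-diagonal run is included only for comparison; how closely it approaches a rank-1 solution depends on the assumed initialization, step size, and number of steps.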

