LG ML Nov 20, 2024

On Generalization Bounds for Neural Networks with Low Rank Layers

Andrea Pinto, Akshay Rangamani, Tomaso Poggio

arXiv:2411.13733v1 9.24 citations h-index: 13 ALT

Originality: Incremental advance

AI Analysis

This work provides theoretical insights into generalization for deep learning practitioners, but it is incremental as it builds on existing bounds and methods.

The paper tackles the problem of understanding generalization bounds for neural networks with low-rank layers, showing that such networks can achieve better generalization than full-rank ones by preventing the accumulation of rank and dimensionality factors across layers.

While previous optimization results have suggested that deep neural networks tend to favour low-rank weight matrices, the implications of this inductive bias on generalization bounds remain underexplored. In this paper, we apply Maurer's chain rule for Gaussian complexity to analyze how low-rank layers in deep networks can prevent the accumulation of rank and dimensionality factors that typically multiply across layers. This approach yields generalization bounds for rank and spectral norm constrained networks. We compare our results to prior generalization bounds for deep networks, highlighting how deep networks with low-rank layers can achieve better generalization than those with full-rank layers. Additionally, we discuss how this framework provides new perspectives on the generalization capabilities of deep networks exhibiting neural collapse.
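To make the phrase "rank and spectral norm constrained" concrete, here is a minimal NumPy sketch of one simple way to produce a weight matrix that satisfies both constraints: truncate the singular value decomposition to a target rank, then clip the remaining singular values. The helper name `constrain_layer`, the layer shape, and the values of `rank` and `max_spectral_norm` are illustrative assumptions and do not come from the paper, which analyzes such hypothesis classes rather than prescribing this construction.

```python
import numpy as np


def constrain_layer(W, rank, max_spectral_norm):
    """Hypothetical helper: return a matrix of rank <= `rank` whose
    spectral norm is <= `max_spectral_norm`, built from the SVD of W."""
    # Thin SVD: W = U @ diag(s) @ Vt, with s sorted in descending order.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Keep only the top `rank` singular directions (rank constraint).
    s_top = s[:rank]
    # Clip the kept singular values (spectral norm constraint).
    s_top = np.minimum(s_top, max_spectral_norm)
    return (U[:, :rank] * s_top) @ Vt[:rank, :]


# Illustrative layer: shape and seed are arbitrary assumptions.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))
W_low = constrain_layer(W, rank=4, max_spectral_norm=1.0)

print("rank:", np.linalg.matrix_rank(W_low))
print("spectral norm:", np.linalg.norm(W_low, 2))
```

A network whose layers all pass through a constraint of this kind belongs to the sort of class the abstract describes, with a per-layer rank and spectral norm budget.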

