LG · AI · Oct 12, 2020

How does Weight Correlation Affect the Generalisation Ability of Deep Neural Networks

arXiv:2010.05983v3 · 53 citations
Originality: Highly original
AI Analysis

This paper addresses a fundamental issue in machine learning by providing a new theoretical and practical approach to enhancing generalization, though it builds incrementally on existing PAC-Bayesian frameworks.

The paper tackles the problem of understanding and improving the generalization ability of deep neural networks by introducing the concept of weight correlation, showing that it can be incorporated into PAC-Bayesian bounds and developing a regularizer that greatly reduces generalization error in experiments.

This paper studies the novel concept of weight correlation in deep neural networks and discusses its impact on the networks' generalisation ability. For fully-connected layers, the weight correlation is defined as the average cosine similarity between weight vectors of neurons, and for convolutional layers, the weight correlation is defined as the cosine similarity between filter matrices. Theoretically, we show that weight correlation can, and should, be incorporated into the PAC-Bayesian framework for the generalisation of neural networks, and that the resulting generalisation bound is monotonic with respect to the weight correlation. We formulate a new complexity measure, which lifts the PAC-Bayes measure with weight correlation, and experimentally confirm that it is able to rank the generalisation errors of a set of networks more precisely than existing measures. More importantly, we develop a new regulariser for training, and provide extensive experiments showing that the generalisation error can be greatly reduced with our novel approach.
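
To make the fully-connected definition above concrete, here is a minimal NumPy sketch of an average pairwise cosine similarity between neuron weight vectors. It is an illustration under stated assumptions, not the authors' implementation: the function name `weight_correlation`, the `absolute` flag (whether to take the absolute value of each similarity), the `eps` guard, and the choice to average over all ordered pairs of distinct neurons are assumptions that the abstract does not spell out.

```python
import numpy as np


def weight_correlation(W, absolute=True, eps=1e-12):
    """Average pairwise cosine similarity between the rows of W.

    W has shape (n_neurons, fan_in); each row is one neuron's weight vector.
    `absolute` and `eps` are illustrative assumptions, not taken from the paper.
    """
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    if n < 2:
        raise ValueError("need at least two neurons to form a pair")

    # Normalise each row to unit length (eps guards against all-zero rows).
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    U = W / np.maximum(norms, eps)

    # Entry (i, j) of C is the cosine similarity between neurons i and j.
    C = U @ U.T
    if absolute:
        C = np.abs(C)

    # Average over the n * (n - 1) ordered pairs with i != j.
    off_diagonal_sum = C.sum() - np.trace(C)
    return off_diagonal_sum / (n * (n - 1))
```

Under these assumptions, mutually orthogonal weight vectors give a value of 0 and identical weight vectors give a value of 1. For a convolutional layer, one plausible adaptation is to flatten each filter into a vector first, for example `W.reshape(W.shape[0], -1)`; this is also an assumption, since the abstract only says the measure is a cosine similarity between filter matrices.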

Code Implementations: 1 repo
