LGPF Jun 9, 2022

Redundancy in Deep Linear Neural Networks

arXiv:2206.04490v1
h-index: 1
AI Analysis

This work offers incremental insight into the optimization properties of linear neural networks, which may shed light on possible constraints of more complex architectures such as convolutional and non-linear networks.

The paper challenges conventional wisdom by arguing that, in practice, training deep linear fully-connected networks with conventional optimizers is convex in the same way as training a single linear fully-connected layer, which offers a new conceptual understanding of linear networks.

Conventional wisdom states that deep linear neural networks benefit from expressiveness and optimization advantages over a single linear layer. This paper suggests that, in practice, the training process of deep linear fully-connected networks using conventional optimizers is convex in the same manner as that of a single linear fully-connected layer. The paper aims to explain and demonstrate this claim. Although convolutional networks do not fit this description, this work aims to attain a new conceptual understanding of fully-connected linear networks that might shed light on the possible constraints of convolutional settings and non-linear architectures.
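As background for the expressiveness part of this claim, the following minimal NumPy sketch illustrates a standard fact: a stack of linear fully-connected layers without biases or activations computes the same function as a single matrix, namely the product of the layer weights. It is not the paper's experiment and does not demonstrate the convexity claim about training. The layer sizes, the helper names `collapse` and `deep_forward`, and the random weights are all assumptions chosen for illustration.

```python
import numpy as np


def collapse(weights):
    """Multiply the layer matrices into one equivalent matrix.

    weights[0] is the first layer applied to the input, so the
    end-to-end map is weights[-1] @ ... @ weights[0].
    """
    end_to_end = weights[0]
    for w_next in weights[1:]:
        end_to_end = w_next @ end_to_end
    return end_to_end


def deep_forward(weights, x):
    """Apply each linear layer in turn (no bias, no activation)."""
    for w in weights:
        x = w @ x
    return x


# Hypothetical layer sizes: 4 inputs, two hidden layers of width 8, 3 outputs.
dims = [4, 8, 8, 3]
rng = np.random.default_rng(0)
weights = [
    rng.standard_normal((dims[i + 1], dims[i])) for i in range(len(dims) - 1)
]

x = rng.standard_normal((4, 5))  # a batch of 5 input column vectors
w_single = collapse(weights)     # one 3x4 matrix

# The deep network and the single collapsed layer agree on every input.
assert np.allclose(deep_forward(weights, x), w_single @ x)
```

The sketch only shows that the deep and single-layer models represent the same set of linear maps when the hidden layers are at least as wide as the smaller of the input and output dimensions. A squared-error loss is convex in the collapsed matrix `w_single`, whereas the same loss viewed as a function of the individual layer matrices is in general not convex, which is why the abstract's statement about training in practice is a separate claim that needs its own argument.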
