LG, AI, CV, OC, ML | Oct 30, 2017

Optimization Landscape and Expressivity of Deep CNNs

arXiv:1710.10928v2 | 99 citations
Originality: Incremental advance
AI Analysis

This paper offers theoretical insight into why deep CNNs train well, which is useful to machine learning researchers studying optimization. It is rated incremental because it builds on existing loss-landscape analyses.

The paper analyzes the loss landscape and expressiveness of deep CNNs with shared weights and max pooling. It shows that a wide layer, one with more neurons than training samples, produces linearly independent features, and it gives conditions for global minima with zero training error. The results suggest that depth enhances representational power while width smooths the optimization landscape.

We analyze the loss landscape and expressiveness of practical deep convolutional neural networks (CNNs) with shared weights and max pooling layers. We show that such CNNs produce linearly independent features at a "wide" layer, that is, a layer with more neurons than the number of training samples. This condition holds, for example, for the VGG network. Furthermore, we provide necessary and sufficient conditions for such wide CNNs to attain global minima with zero training error. For the case where the wide layer is followed by a fully connected layer, we show that almost every critical point of the empirical loss is a global minimum with zero training error. Our analysis suggests that both depth and width are very important in deep learning. While depth brings more representational power and allows the network to learn high-level features, width smooths the optimization landscape of the loss function, in the sense that a sufficiently wide network has a well-behaved loss surface with almost no bad local minima.
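The link between linearly independent features and zero training error can be illustrated with a small NumPy sketch. This is a toy under stated assumptions, not the paper's construction: it uses a single random fully connected layer with a tanh activation in place of a CNN, and the sizes `N`, `d`, `width`, and `classes` as well as the helper name `wide_layer_features` are hypothetical choices made for illustration. When the wide layer has more neurons than samples and the feature matrix has rank `N`, a linear layer on top can fit arbitrary targets exactly.

```python
import numpy as np


def wide_layer_features(X, W, b):
    """Toy 'wide layer': one random affine map followed by tanh.

    Illustrative stand-in only; the paper studies CNNs with shared
    weights and max pooling, which this sketch does not model.
    """
    return np.tanh(X @ W + b)


rng = np.random.default_rng(0)

# Hypothetical sizes: width exceeds the number of training samples N.
N, d, width, classes = 20, 5, 64, 3

X = rng.standard_normal((N, d))       # N training inputs
W = rng.standard_normal((d, width))   # random weights of the wide layer
b = rng.standard_normal(width)        # random biases of the wide layer

F = wide_layer_features(X, W, b)      # feature matrix, shape (N, width)

# Linear independence of the N feature vectors means rank(F) == N.
rank = np.linalg.matrix_rank(F)

# With rank N, a linear output layer can interpolate arbitrary targets.
Y = rng.standard_normal((N, classes))
V, *_ = np.linalg.lstsq(F, Y, rcond=None)
residual = np.max(np.abs(F @ V - Y))

print("rank of feature matrix:", rank, "out of", N)
print("max absolute training residual:", residual)
```

The sketch only checks the interpolation step on random data; it says nothing about whether gradient descent reaches such a solution, which is what the paper's results on critical points address.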

