LG, AI, CV, ML | Jan 22, 2019

Understanding Geometry of Encoder-Decoder CNNs

arXiv:1901.07647v2 | 80 citations
Originality: Incremental advance
AI Analysis

This work offers a theoretical foundation for understanding encoder-decoder CNNs, which are widely used in deep learning for inverse problems. It is rated incremental because it builds on existing theory such as convolutional framelets.

The authors tackled the lack of a coherent geometric understanding of why encoder-decoder CNNs perform well in inverse problems, and provided a unified theoretical framework showing that these networks relate to nonlinear basis representations with exponentially increasing expressibility with depth.

Encoder-decoder networks using a convolutional neural network (CNN) architecture have been used extensively in the deep learning literature thanks to their excellent performance on various inverse problems. However, it is still difficult to obtain a coherent geometric view of why such an architecture gives the desired performance. Inspired by recent theoretical understanding of the generalizability, expressivity, and optimization landscape of neural networks, as well as the theory of convolutional framelets, we provide a unified theoretical framework that leads to a better understanding of the geometry of encoder-decoder CNNs. Our unified mathematical framework shows that the encoder-decoder CNN architecture is closely related to a nonlinear basis representation using combinatorial convolution frames, whose expressibility increases exponentially with the network depth. We also demonstrate the importance of skip connections in terms of expressibility and the optimization landscape.
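
The "nonlinear basis representation" idea in the abstract can be illustrated with a toy example. The sketch below is not the authors' construction: it assumes small dense layers instead of convolutions, random weights, and hypothetical helper names (`forward`, `frozen_forward`). It shows only that a ReLU encoder-decoder acts as a linear map once the ReLU on/off pattern is fixed, and that the number of possible patterns is bounded by 2 raised to the number of ReLU units, a crude bound that grows exponentially as layers are added.

```python
import random

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def forward(weights, x):
    """Run a ReLU network; return the output and the per-layer 0/1 masks."""
    masks, h = [], x
    for i, W in enumerate(weights):
        z = matvec(W, h)
        if i < len(weights) - 1:              # ReLU on every layer but the last
            m = [1.0 if v > 0 else 0.0 for v in z]
            masks.append(m)
            h = [v * mi for v, mi in zip(z, m)]
        else:
            h = z
    return h, masks

def frozen_forward(weights, masks, x):
    """Same layers with the masks held fixed: a purely linear map of x."""
    h = x
    for i, W in enumerate(weights):
        z = matvec(W, h)
        h = [v * mi for v, mi in zip(z, masks[i])] if i < len(masks) else z
    return h

rng = random.Random(0)
# Toy sizes (assumed): encoder 4 -> 6 -> 3, decoder 3 -> 6 -> 4.
dims = [4, 6, 3, 6, 4]
weights = [[[rng.gauss(0, 1) for _ in range(dims[k])]
            for _ in range(dims[k + 1])]
           for k in range(len(dims) - 1)]

x = [rng.gauss(0, 1) for _ in range(dims[0])]
y, masks = forward(weights, x)

# For this input, the network output equals the frozen-mask linear map.
y_linear = frozen_forward(weights, masks, x)

# Crude upper bound on distinct on/off patterns: 2 ** (number of ReLU units).
num_relus = sum(dims[1:-1])
pattern_upper_bound = 2 ** num_relus
```

Each input selects one activation pattern, and with it one linear representation. Adding layers adds ReLU units, so the bound on the number of selectable representations doubles with every unit. The paper's own count for convolutional frames is more refined than this bound.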

