LG, ML, Jul 20, 2018

Convolutional Neural Networks Analyzed via Inverse Problem Theory and Sparse Representations

arXiv:1807.07998v2, 1 citation
Originality: Incremental advance
AI Analysis

This work addresses the theoretical understanding of CNNs for researchers in machine learning and inverse problems, providing incremental insights into convergence mechanisms.

The paper tackles the lack of mathematical validation for how convolutional neural networks (CNNs) learn. The authors prove that CNN elements solve inverse problems during training, with the optimum solutions stored as neuron filters. They also show that the mutual coherence needed for convergence can be provided by residual learning and skip connections, and they set rules for training sets and network depth to improve performance.

Inverse problems in imaging, such as denoising, deblurring, and super-resolution (SR), have been addressed for many decades. In recent years, convolutional neural networks (CNNs) have been widely used in many inverse problem areas. Despite their indisputable success, CNNs have not been mathematically validated as to how and what they learn. In this paper, we prove that during training, CNN elements solve inverse problems whose optimum solutions are stored as CNN neuron filters. We discuss the necessity of mutual coherence between CNN layer elements for a network to converge to the optimum solution. We prove that the required mutual coherence can be provided by the use of residual learning and skip connections. We set rules on training sets and network depth for better convergence, i.e., performance.
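
As a rough illustration of the mutual coherence mentioned in the abstract, the sketch below computes the quantity as it is usually defined in the sparse representation literature: the largest absolute normalized inner product between two distinct atoms of a dictionary. This is an assumption about the intended definition, since the abstract does not spell it out, and the paper may apply it to CNN layer elements differently. The function name `mutual_coherence` and the toy atoms are hypothetical and chosen only for this example.

```python
import math


def mutual_coherence(atoms):
    """Largest absolute normalized inner product between two distinct atoms.

    `atoms` is a list of equal-length, nonzero vectors (lists of floats).
    This follows the standard sparse-representation definition, which is
    assumed here rather than taken from the paper.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    norms = [math.sqrt(dot(a, a)) for a in atoms]
    best = 0.0
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            value = abs(dot(atoms[i], atoms[j])) / (norms[i] * norms[j])
            best = max(best, value)
    return best


# Toy example: orthogonal atoms versus partially aligned atoms.
orthogonal = [[1.0, 0.0], [0.0, 1.0]]
correlated = [[1.0, 0.0], [1.0, 1.0]]

mu_orthogonal = mutual_coherence(orthogonal)
mu_correlated = mutual_coherence(correlated)
print(mu_orthogonal, mu_correlated)
```

Under this definition the value lies between 0 and 1: orthogonal atoms give 0, and atoms that point in nearly the same direction give a value close to 1.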

