Reframing Neural Networks: Deep Structure in Overcomplete Representations
This work provides a theoretical foundation for deep learning that could reduce reliance on ad-hoc engineering in architecture design, though its contribution is incremental: it connects existing theory to neural networks.
The authors address the question of why deep neural networks are effective by introducing deep frame approximation, a unifying framework that links deep network operations to the theory of overcomplete representations. A coherence measure defined within this framework, the deep frame potential, is shown to correlate with generalization error across architectures and datasets.
In comparison to classical shallow representation learning techniques, deep neural networks have achieved superior performance in nearly every application benchmark. But despite their clear empirical advantages, it is still not well understood what makes them so effective. To approach this question, we introduce deep frame approximation: a unifying framework for constrained representation learning with structured overcomplete frames. While exact inference requires iterative optimization, it may be approximated by the operations of a feed-forward deep neural network. We indirectly analyze how model capacity relates to frame structures induced by architectural hyperparameters such as depth, width, and skip connections. We quantify these structural differences with the deep frame potential, a data-independent measure of coherence linked to representation uniqueness and stability. As a criterion for model selection, we show correlation with generalization error on a variety of common deep network architectures and datasets. We also demonstrate how recurrent networks implementing iterative optimization algorithms can achieve performance comparable to their feed-forward approximations while improving adversarial robustness. This connection to the established theory of overcomplete representations suggests promising new directions for principled deep network architecture design with less reliance on ad-hoc engineering.
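To make the idea of a data-independent coherence measure concrete, the sketch below computes the classical frame potential of a single matrix whose columns are normalized to unit length: the sum of squared inner products between all pairs of columns. This is an illustration under stated assumptions, not the authors' definition. The abstract gives no formula for the deep frame potential, which is described as depending on the frame structure induced by the architecture. The function names and the example frames are hypothetical.

```python
import math


def normalize_columns(frame):
    """Scale each column (atom) of a d x n matrix to unit Euclidean norm.

    The matrix is a list of d rows, each a list of n numbers.
    Assumes no column is all zeros (that would divide by zero).
    """
    d, n = len(frame), len(frame[0])
    norms = [math.sqrt(sum(frame[i][j] ** 2 for i in range(d))) for j in range(n)]
    return [[frame[i][j] / norms[j] for j in range(n)] for i in range(d)]


def frame_potential(frame):
    """Classical frame potential: sum over all ordered column pairs (a, b)
    of the squared inner product between unit-norm columns a and b.

    Equivalent to the squared Frobenius norm of the Gram matrix.
    Lower values mean the atoms are less mutually coherent.
    """
    d, n = len(frame), len(frame[0])
    unit = normalize_columns(frame)
    total = 0.0
    for a in range(n):
        for b in range(n):
            inner = sum(unit[i][a] * unit[i][b] for i in range(d))
            total += inner ** 2
    return total


def three_vector_tight_frame():
    """Hypothetical example: three unit vectors in the plane, 120 degrees apart."""
    angles = [math.pi / 2 + 2 * math.pi * k / 3 for k in range(3)]
    return [[math.cos(t) for t in angles], [math.sin(t) for t in angles]]


if __name__ == "__main__":
    print(frame_potential([[1.0, 0.0], [0.0, 1.0]]))  # orthonormal basis
    print(frame_potential(three_vector_tight_frame()))  # overcomplete, evenly spread
    print(frame_potential([[1.0, 1.0], [0.0, 0.0]]))  # two identical atoms
```

The measure depends only on the matrix, not on any data. An orthonormal basis of two vectors gives 2, the evenly spread three-vector frame gives approximately 4.5 (the minimum possible for three unit vectors in two dimensions), and two identical atoms give 4, the maximum for two unit vectors.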