Mechanisms of dimensionality reduction and decorrelation in deep neural networks
This work addresses the interpretability of deep neural networks, a foundational issue in AI. Its contribution is incremental, however, because it builds on existing theoretical frameworks.
The authors tackle the problem of understanding computation in deep neural networks by constructing a mean-field framework that analyzes how compact representations develop across layers. They show that deep computation implements dimensionality reduction while maintaining weak correlations between neurons for feature extraction.
Deep neural networks are widely used in various domains. However, the nature of the computation at each layer of a deep network is far from well understood, so increasing the interpretability of deep neural networks is important. Here, we construct a mean-field framework to understand how compact representations develop across layers, not only in deterministic deep networks with random weights but also in generative deep networks trained by unsupervised learning. Our theory shows that deep computation implements dimensionality reduction while maintaining a finite level of weak correlations between neurons for possible feature extraction. The mechanisms of dimensionality reduction and decorrelation are unified in the same framework. This work may pave the way for understanding how a sensory hierarchy works.
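The claim about deterministic networks with random weights can be probed numerically. The sketch below is a small simulation, not the mean-field theory itself, and every specific in it is an assumption made for illustration: the tanh transfer function, Gaussian weights with standard deviation g / sqrt(N), correlated Gaussian inputs built from a shared component, and the use of the participation ratio (tr C)^2 / tr(C^2), divided by N, as the measure of dimensionality. All function and parameter names are hypothetical.

```python
import math
import random


def covariance(samples):
    """Covariance matrix (list of lists) of samples, one sample per row."""
    p, n = len(samples), len(samples[0])
    means = [sum(row[i] for row in samples) / p for i in range(n)]
    centered = [[row[i] - means[i] for i in range(n)] for row in samples]
    return [
        [sum(c[i] * c[j] for c in centered) / p for j in range(n)]
        for i in range(n)
    ]


def participation_ratio(cov):
    """(tr C)^2 / tr(C^2) for a symmetric matrix C.

    This equals (sum of eigenvalues)^2 / (sum of squared eigenvalues),
    so it lies between 1 (one dominant direction) and N (isotropic).
    """
    trace = sum(cov[i][i] for i in range(len(cov)))
    trace_sq = sum(x * x for row in cov for x in row)  # tr(C^2) when C is symmetric
    return trace * trace / trace_sq


def random_layer(n, g, rng):
    """N x N Gaussian weight matrix with std g / sqrt(N) (assumed scaling)."""
    scale = g / math.sqrt(n)
    return [[rng.gauss(0.0, scale) for _ in range(n)] for _ in range(n)]


def propagate(samples, weights):
    """Apply one layer, h -> tanh(W h), to every sample (tanh is an assumption)."""
    return [
        [math.tanh(sum(w * x for w, x in zip(w_row, sample))) for w_row in weights]
        for sample in samples
    ]


def dimensionality_per_layer(n=30, p=200, depth=4, g=0.8, rho=0.3, seed=0):
    """Normalized participation ratio at the input and after each layer."""
    rng = random.Random(seed)
    # Assumed input ensemble: unit-variance Gaussians with pairwise correlation rho.
    samples = []
    for _ in range(p):
        shared = rng.gauss(0.0, 1.0)
        samples.append(
            [
                math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
                for _ in range(n)
            ]
        )
    dims = [participation_ratio(covariance(samples)) / n]
    for _ in range(depth):
        samples = propagate(samples, random_layer(n, g, rng))
        dims.append(participation_ratio(covariance(samples)) / n)
    return dims


if __name__ == "__main__":
    for layer, d in enumerate(dimensionality_per_layer()):
        print(f"layer {layer}: normalized dimensionality = {d:.3f}")
```

Running the script prints one normalized dimensionality per layer, so the trend across depth can be inspected for different values of g and rho. With a finite number of samples the estimates are noisy, so they should be read as a qualitative check rather than a reproduction of the theory's predictions.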