A Local Geometric Interpretation of Feature Extraction in Deep Feedforward Neural Networks
This paper provides a theoretical interpretation of how deep neural networks extract features. The contribution is incremental: it builds on existing geometric analyses without introducing new methods or broad applications.
The paper addresses the problem of interpreting feature extraction in deep neural networks through a local geometric analysis. It shows that the optimal weights and features form a low-rank approximation of a matrix determined by the Bayes action. The result applies to both the output layer and the hidden layers, and to activation functions with non-vanishing gradients.
In this paper, we present a local geometric analysis to interpret how deep feedforward neural networks extract low-dimensional features from high-dimensional data. Our study shows that, in a local geometric region, the optimal weight in one layer of the neural network and the optimal feature generated by the previous layer comprise a low-rank approximation of a matrix that is determined by the Bayes action of this layer. This result holds (i) for analyzing both the output layer and the hidden layers of the neural network, and (ii) for neuron activation functions with non-vanishing gradients. We use two supervised learning problems to illustrate our results: neural-network-based maximum likelihood classification (i.e., softmax regression) and neural-network-based minimum mean square estimation. Experimental validation of these theoretical results will be conducted in our future work.
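To make the phrase "low-rank approximation" concrete, the following is a minimal NumPy sketch of the general idea, not the paper's construction. It assumes a random matrix `B` as a hypothetical stand-in for the matrix determined by the Bayes action, and it uses a truncated singular value decomposition to obtain a weight-like factor `W` and a feature-like factor `F` whose product is the best rank-k approximation of `B` in Frobenius norm (the Eckart-Young theorem). The names `best_rank_k_factors`, `B`, `W`, `F`, and `k` are illustrative and do not come from the paper.

```python
import numpy as np


def best_rank_k_factors(B, k):
    """Return (W, F) such that W @ F is the best rank-k approximation of B
    in Frobenius norm, computed from the truncated SVD of B."""
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    W = U[:, :k] * s[:k]  # shape (m, k): left singular vectors scaled by singular values
    F = Vt[:k, :]         # shape (k, n): top-k right singular vectors
    return W, F


# Hypothetical stand-in matrix: 6 outputs by 50 samples (an assumption for illustration).
rng = np.random.default_rng(0)
B = rng.standard_normal((6, 50))

k = 2  # feature dimension, chosen arbitrarily here
W, F = best_rank_k_factors(B, k)
approx = W @ F

# Approximation error, and the Eckart-Young bound built from the discarded singular values.
err = np.linalg.norm(B - approx, "fro")
singular_values = np.linalg.svd(B, compute_uv=False)
tail = np.sqrt(np.sum(singular_values[k:] ** 2))

print(f"rank-{k} approximation error: {err:.4f}")
print(f"norm of discarded singular values: {tail:.4f}")
```

The two printed numbers agree, which is the sense in which the truncated SVD is optimal: no other pair of factors with inner dimension k gives a smaller Frobenius error. The factorization itself is not unique, since `W @ F` is unchanged when `W` is multiplied on the right by any invertible k-by-k matrix and `F` is multiplied on the left by its inverse.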