Investigating the Compositional Structure of Deep Neural Networks
This work addresses the interpretability of deep neural networks and is aimed at deep learning researchers. Its contribution is incremental, as it builds on existing analyses of activation functions.
The authors address the problem of understanding how input structure, network parameters, and optimization jointly contribute to the generalization power of deep neural networks. They introduce a theoretical framework based on the compositional structure of piecewise linear activation functions. Preliminary tests on MNIST show that the framework can group input instances by the similarity of their internal representations.
The current understanding of deep neural networks only partially explains how input structure, network parameters, and optimization algorithms jointly contribute to the strong generalization power typically observed in many real-world applications. To improve the comprehension and interpretability of deep neural networks, we introduce a novel theoretical framework based on the compositional structure of piecewise linear activation functions. By defining a directed acyclic graph that represents the composition of activation patterns through the network layers, it is possible to characterize the instances of the input data with respect to both the predicted label and the specific (linear) transformation used to perform predictions. Preliminary tests on the MNIST dataset show that our method can group input instances according to their similarity in the internal representation of the neural network, providing an intuitive measure of input complexity.
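The idea that an activation pattern fixes the (linear) transformation applied to an input can be illustrated with a minimal sketch. It is not the authors' implementation and does not build the directed acyclic graph described above. It assumes a hypothetical toy ReLU network with arbitrary weights (the names LAYERS, OUTPUT, forward, local_affine_map, and group_by_pattern are all illustrative), records which units are active in each layer, groups inputs that share the same pattern, and recovers the affine map that the network applies to every input in such a group.

```python
from collections import defaultdict

# Hypothetical toy network: 2 inputs -> 3 ReLU units -> 2 ReLU units -> 1 output.
# The weights are arbitrary illustrative values, not taken from the paper.
LAYERS = [
    ([[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]], [0.0, -0.5, 0.1]),
    ([[1.0, -2.0, 0.5], [-0.5, 1.0, 1.0]], [0.2, -0.1]),
]
OUTPUT = ([[1.0, -1.0]], [0.0])


def affine(weights, bias, x):
    """Compute W x + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def forward(x):
    """Return (output, pattern); pattern marks the active ReLUs per layer."""
    pattern = []
    h = list(x)
    for weights, bias in LAYERS:
        pre = affine(weights, bias, h)
        mask = tuple(1 if p > 0 else 0 for p in pre)
        pattern.append(mask)
        h = [p * m for p, m in zip(pre, mask)]
    out_w, out_b = OUTPUT
    return affine(out_w, out_b, h), tuple(pattern)


def local_affine_map(pattern, n_inputs=2):
    """Compose the layers under a fixed pattern into one map x -> A x + c."""
    A = [[1.0 if i == j else 0.0 for j in range(n_inputs)]
         for i in range(n_inputs)]
    c = [0.0] * n_inputs
    for (weights, bias), mask in zip(LAYERS, pattern):
        # Inactive units zero out their row of the composed map.
        A = [[m * v for v in row] for row, m in zip(matmul(weights, A), mask)]
        c = [m * v for v, m in zip(affine(weights, bias, c), mask)]
    out_w, out_b = OUTPUT
    return matmul(out_w, A), affine(out_w, out_b, c)


def group_by_pattern(inputs):
    """Group inputs that share the same activation pattern."""
    groups = defaultdict(list)
    for x in inputs:
        _, pattern = forward(x)
        groups[pattern].append(x)
    return dict(groups)
```

Under these assumptions, group_by_pattern([(2.0, 1.0), (2.0, 1.1), (-1.0, -2.0)]) places the first two inputs in one group and the third in another, and for any input x with pattern p, affine(*local_affine_map(p), x) agrees with forward(x)[0] up to floating-point rounding.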