Explanatory Graphs for CNNs
This work addresses the interpretability challenge in deep learning, giving researchers and practitioners a method to visualize and organize learned features without manual annotation. It is incremental in that it builds on existing CNN architectures.
The paper tackles the problem of understanding the knowledge hierarchy within the convolutional layers of pre-trained CNNs. It introduces explanatory graphs that disentangle object-part patterns from filters, which improves the transferability of features and significantly outperforms other approaches on part localization.
This paper introduces a graphical model, namely an explanatory graph, which reveals the knowledge hierarchy hidden inside the conv-layers of a pre-trained CNN. Each filter in a conv-layer of a CNN for object classification usually represents a mixture of object parts. We develop a simple yet effective method to disentangle object-part pattern components from each filter. We construct an explanatory graph to organize the mined part patterns, where each node represents a part pattern and each edge encodes co-activation relationships and spatial relationships between patterns. More crucially, given a pre-trained CNN, the explanatory graph is learned without the need to annotate object parts. Experiments show that each graph node consistently represents the same object part across different images, which boosts the transferability of CNN features. We transferred part patterns in the explanatory graph to the task of part localization, and our method significantly outperformed other approaches.
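To make the graph structure described above concrete, the following is a minimal Python sketch of a container in which nodes are part patterns and edges carry a co-activation score and a spatial offset. It is only an illustration, not the authors' implementation: the class names (`PartPattern`, `Relation`, `ExplanatoryGraph`), the fields `layer`, `filter_id`, and `pattern_id`, the scalar co-activation score, and the `(dy, dx)` offset are all assumptions chosen for this example, and the numeric values are made up. The sketch does not cover how patterns are mined from filters.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class PartPattern:
    """A graph node: one part pattern disentangled from a filter.

    All three fields are hypothetical identifiers used for illustration.
    """
    layer: int       # conv-layer the pattern was mined from
    filter_id: int   # filter within that layer
    pattern_id: int  # index of the pattern within that filter


@dataclass(frozen=True)
class Relation:
    """A graph edge: how one pattern relates to another.

    The scalar score and 2-D offset are assumed encodings, not the paper's.
    """
    target: PartPattern
    co_activation: float         # how often the two patterns fire together
    offset: Tuple[float, float]  # typical (dy, dx) displacement between them


@dataclass
class ExplanatoryGraph:
    """Adjacency-list container mapping each pattern to its outgoing edges."""
    edges: Dict[PartPattern, List[Relation]] = field(default_factory=dict)

    def add_pattern(self, node: PartPattern) -> None:
        # Register a node without overwriting any edges it already has.
        self.edges.setdefault(node, [])

    def connect(self, src: PartPattern, dst: PartPattern,
                co_activation: float, offset: Tuple[float, float]) -> None:
        # Add both endpoints if needed, then store a directed edge src -> dst.
        self.add_pattern(src)
        self.add_pattern(dst)
        self.edges[src].append(Relation(dst, co_activation, offset))

    def patterns_in_layer(self, layer: int) -> List[PartPattern]:
        # Return all registered patterns that belong to the given layer.
        return [node for node in self.edges if node.layer == layer]


# Illustrative usage with invented values.
graph = ExplanatoryGraph()
upper = PartPattern(layer=2, filter_id=7, pattern_id=0)
lower = PartPattern(layer=1, filter_id=3, pattern_id=1)
graph.connect(upper, lower, co_activation=0.8, offset=(-1.5, 2.0))
```

An adjacency list keyed by immutable nodes keeps lookups simple and lets each edge hold both kinds of relationship mentioned in the abstract in one record.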