CV | Jan 27, 2018

Understanding Deep Architectures by Visual Summaries

arXiv:1801.09103v3 | 4 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for better interpretability in deep learning for image classification, offering a tool to visualize and improve network decisions, though it is incremental in building on existing visualization techniques.

The paper tackles the problem of understanding what deep networks learn. It develops a visualization framework that clusters salient image regions across multiple images of a class to reveal the systematic patterns the network uses for classification, and it demonstrates the framework's utility through automatic image tagging and a user study. It also reports that the number of summaries correlates with network performance, and that the summaries can be used to improve classification accuracy.

In deep learning, visualization techniques extract the salient patterns exploited by deep networks for image classification, focusing on single images; no effort has been spent in investigating whether these patterns are systematically related to precise semantic entities over multiple images belonging to the same class, thus failing to capture the very understanding of the image class the network has realized. This paper goes in this direction, presenting a visualization framework which produces a group of clusters or summaries, each one formed by crisp salient image regions focusing on a particular part that the network has exploited with high regularity to decide for a given class. The approach is based on a sparse optimization step providing sharp image saliency masks that are clustered together by means of a semantic flow similarity measure. The summaries communicate clearly what a network has exploited of a particular image class, and this is shown through automatic image tagging and a user study. Beyond helping to understand deep networks, summaries are also useful for quantitative reasons: their number is correlated with the ability of a network to classify (more summaries, better performance), and they can be used to improve the classification accuracy of a network through summary-driven specializations.
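
The two-stage idea in the abstract (sharp per-image saliency masks, then grouping of masks across images of one class) can be illustrated with a minimal sketch. This is not the paper's method: top-quantile thresholding stands in for the sparse optimization step, and plain intersection-over-union stands in for the semantic flow similarity measure. The names `sparse_mask`, `iou`, `cluster_masks`, `keep_fraction`, and `min_similarity` are hypothetical and chosen only for illustration.

```python
import numpy as np


def sparse_mask(saliency, keep_fraction=0.1):
    """Binarize a saliency map, keeping roughly the top `keep_fraction` of pixels.

    Assumption: a simple quantile threshold, used here as a stand-in for the
    sparse optimization step described in the abstract.
    """
    threshold = np.quantile(saliency, 1.0 - keep_fraction)
    return saliency >= threshold


def iou(mask_a, mask_b):
    """Intersection-over-union of two boolean masks.

    Assumption: a stand-in for the semantic flow similarity measure.
    """
    union = np.logical_or(mask_a, mask_b).sum()
    if union == 0:
        return 0.0
    return np.logical_and(mask_a, mask_b).sum() / union


def cluster_masks(masks, min_similarity=0.5):
    """Greedy clustering of masks into groups ("summaries").

    Each mask joins the first cluster whose first member is at least
    `min_similarity` similar to it; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of indices into `masks`.
    """
    clusters = []
    for index, mask in enumerate(masks):
        for cluster in clusters:
            representative = masks[cluster[0]]
            if iou(representative, mask) >= min_similarity:
                cluster.append(index)
                break
        else:
            clusters.append([index])
    return clusters


def demo():
    """Four toy 8x8 masks: two near the top-left, two near the bottom-right."""
    a = np.zeros((8, 8), dtype=bool)
    a[0:4, 0:4] = True
    b = np.zeros((8, 8), dtype=bool)
    b[0:4, 1:5] = True
    c = np.zeros((8, 8), dtype=bool)
    c[4:8, 4:8] = True
    d = np.zeros((8, 8), dtype=bool)
    d[5:8, 4:8] = True
    return cluster_masks([a, b, c, d])
```

Calling `demo()` groups the two top-left masks into one cluster and the two bottom-right masks into another, which mirrors the notion of a summary as a set of regions that recur across images of the same class. In this toy setup the number of clusters plays the role of the number of summaries discussed in the abstract.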

Code Implementations: 2 repos