CV · LG · Jan 11, 2021

Towards glass-box CNNs

arXiv:2101.10443v3 · 4 citations
Originality Synthesis-oriented
AI Analysis

This work addresses the problem of interpreting deep learning models for researchers and practitioners in sensitive fields, offering an incremental approach to understanding CNN representations.

This paper proposes a transparent, three-layer binary prototype for CNNs that aims to reveal multiscale and distributed representations. It achieves this by decreasing intra-class distance and increasing inter-class distance through class-information and symmetric transformations, which are then passed through dimension reduction and a classifier.

The substantial performance of neural networks in sensitive fields increases the need for interpretable deep learning models. A major challenge is to uncover the multiscale and distributed representation hidden inside the basket mappings of deep neural networks. Researchers have tried to comprehend it through visual analysis of features, mathematical structures, or other data-driven approaches. Here, we work on implementation invariances of CNN-based representations and present an analytical binary prototype that provides useful insights for large-scale real-life applications. We begin by unfolding a conventional CNN and then repack it with a more transparent representation. Inspired by the achievements of neural networks, we present our findings as a three-layer model. The first layer is a representation layer that encompasses both the class information (group invariant) and the symmetric transformations (group equivariant) of input images. Through these transformations, we decrease the intra-class distance and increase the inter-class distance. The representation is then passed through a dimension reduction layer followed by a classifier. The proposed representation is compared with the equivariance of the AlexNet (CNN) internal representation for better dissemination of simulation results. We foresee the following immediate advantages of this toy version: i) it contributes pre-processing of data to increase feature or class separability in large-scale problems, ii) it helps in designing neural architectures to improve classification performance in multi-class problems, and iii) it helps in building interpretable CNNs through scalable functional blocks.
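The abstract's pipeline (a representation, then dimension reduction, then a classifier) and its separability goal can be illustrated with a minimal sketch. This is not the authors' prototype: the synthetic Gaussian features stand in for the representation layer, PCA stands in for the dimension reduction layer, and a nearest-centroid rule stands in for the classifier. The function names `class_distances`, `pca_reduce`, and `nearest_centroid_predict` are hypothetical, chosen only for illustration.

```python
import numpy as np


def class_distances(features, labels):
    """Return (mean intra-class distance, mean inter-class centroid distance)."""
    classes = np.unique(labels)
    centroids = np.stack([features[labels == c].mean(axis=0) for c in classes])
    # Intra-class: average distance of each sample to its own class centroid.
    intra = np.mean([
        np.linalg.norm(features[labels == c] - centroids[i], axis=1).mean()
        for i, c in enumerate(classes)
    ])
    # Inter-class: average distance between distinct class centroids.
    pairwise = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=-1)
    upper = np.triu_indices(len(classes), k=1)
    inter = pairwise[upper].mean()
    return intra, inter


def pca_reduce(features, k):
    """Project centered features onto their top-k principal directions."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T


def nearest_centroid_predict(train_x, train_y, test_x):
    """Assign each test sample to the class with the closest training centroid."""
    classes = np.unique(train_y)
    centroids = np.stack([train_x[train_y == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(test_x[:, None, :] - centroids[None, :, :], axis=-1)
    return classes[dists.argmin(axis=1)]


# Assumption: two synthetic classes stand in for the output of a representation layer.
rng = np.random.default_rng(0)
features = np.concatenate([
    rng.normal(0.0, 1.0, size=(50, 10)),
    rng.normal(4.0, 1.0, size=(50, 10)),
])
labels = np.repeat([0, 1], 50)

intra, inter = class_distances(features, labels)

# Dimension reduction followed by a simple classifier on a train/test split.
reduced = pca_reduce(features, k=2)
train = np.arange(len(labels)) % 2 == 0
predictions = nearest_centroid_predict(reduced[train], labels[train], reduced[~train])
accuracy = float((predictions == labels[~train]).mean())

print(f"intra={intra:.2f} inter={inter:.2f} accuracy={accuracy:.2f}")
```

A larger ratio of `inter` to `intra` indicates better class separability, which is the property the representation layer is said to improve before the features reach the reduction and classification stages.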
