CV IT LG · Feb 10, 2022

Towards Disentangling Information Paths with Coded ResNeXt

arXiv:2202.05343v2 · 1.4 · Has Code

Originality: Highly original

AI Analysis

The paper addresses model interpretability for AI researchers and practitioners, offering a novel, non-incremental approach that targets both transparency and efficiency.

The paper tackles the lack of interpretability in deep learning models by proposing a neural network architecture in which class-specific information flows through paths designed in advance using coding theory. Each path yields a lightweight binary classifier with at least 60% fewer parameters than the original network, and the design allows early predictions at intermediate layers. The authors report that overall accuracy also improves on CIFAR-10/100 and ImageNet-1k.

The conventional, widely used treatment of deep learning models as black boxes provides limited or no insight into the mechanisms that guide neural network decisions. Significant research effort has been dedicated to building interpretable models to address this issue. Most efforts either focus on the high-level features associated with the last layers or attempt to interpret the output of a single layer. In this paper, we take a novel approach to enhance the transparency of the function of the whole network. We propose a neural network architecture for classification in which the information that is relevant to each class flows through specific paths. These paths are designed before training using coding theory, without depending on the semantic similarities between classes. A key property is that each path can be used as an autonomous single-purpose model. This enables us to obtain, without any additional training and for any class, a lightweight binary classifier that has at least 60% fewer parameters than the original network. Furthermore, our coding-theory-based approach allows the neural network to make early predictions at intermediate layers during inference, without requiring its full evaluation. Remarkably, the proposed architecture provides all the aforementioned properties while improving the overall accuracy. We demonstrate these properties on a slightly modified ResNeXt model tested on CIFAR-10/100 and ImageNet-1k.
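The idea of fixing class paths in advance with a code can be illustrated with a small sketch. This is not the authors' construction: it assumes, purely for illustration, that one layer has 8 parallel branches, that every class is assigned a binary codeword with exactly 4 active branches, and that codewords are chosen greedily so that any two differ in at least 4 positions. The function names (build_codebook, path_for_class, decode) and the energy-matching decoder are hypothetical and not taken from the paper or its code.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions in which two equal-length codewords differ."""
    return sum(x != y for x, y in zip(a, b))


def build_codebook(num_classes, num_branches, weight, min_distance):
    """Greedily pick constant-weight binary codewords, one per class.

    Assumption for illustration only: a simple greedy search in
    lexicographic order, not the code design used in the paper.
    """
    codebook = []
    for active in combinations(range(num_branches), weight):
        word = tuple(1 if b in active else 0 for b in range(num_branches))
        if all(hamming(word, other) >= min_distance for other in codebook):
            codebook.append(word)
            if len(codebook) == num_classes:
                return codebook
    raise ValueError("not enough codewords for these parameters")


def path_for_class(codebook, class_index):
    """Indices of the branches that carry information for one class.

    Keeping only these branches is what would give a single-class model.
    """
    return [b for b, bit in enumerate(codebook[class_index]) if bit]


def decode(branch_energy, codebook):
    """Guess the class from per-branch activity at one layer.

    Hypothetical decoder: score each class by the total energy observed
    on the branches its codeword marks as active, then take the best.
    """
    scores = [
        sum(e for e, bit in zip(branch_energy, word) if bit)
        for word in codebook
    ]
    return max(range(len(codebook)), key=scores.__getitem__)


if __name__ == "__main__":
    codebook = build_codebook(num_classes=10, num_branches=8,
                              weight=4, min_distance=4)
    for c, word in enumerate(codebook):
        print(c, word, path_for_class(codebook, c))
```

With these assumed parameters, each class path keeps 4 of the 8 branches in the layer. The parameter savings reported in the abstract depend on the actual architecture and coding scheme, which this sketch does not reproduce.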
