CV · AI · SP · ML · Sep 19, 2022

On the Shift Invariance of Max Pooling Feature Maps in Convolutional Neural Networks

arXiv:2209.11740v3 · 2 citations · h-index: 35
Originality: Incremental advance
AI Analysis

This paper addresses interpretability and robustness issues in CNN-based image classification, but the advance is incremental because it builds on existing theories of shift invariance.

The paper tackles the instability of CNNs' first-layer filters under small input shifts. It establishes conditions under which max pooling is approximately shift invariant, derives a measure of that invariance, and highlights the roles of filter frequency and orientation.

This paper focuses on improving the mathematical interpretability of convolutional neural networks (CNNs) in the context of image classification. Specifically, we tackle the instability issue arising in their first layer, which tends to learn parameters that closely resemble oriented band-pass filters when trained on datasets like ImageNet. Subsampled convolutions with such Gabor-like filters are prone to aliasing, causing sensitivity to small input shifts. In this context, we establish conditions under which the max pooling operator approximates a complex modulus, which is nearly shift invariant. We then derive a measure of shift invariance for subsampled convolutions followed by max pooling. In particular, we highlight the crucial role played by the filter's frequency and orientation in achieving stability. We experimentally validate our theory by considering a deterministic feature extractor based on the dual-tree complex wavelet packet transform, a particular case of discrete Gabor-like decomposition.
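The central claim of the abstract can be illustrated with a small 1-D toy experiment. This is a sketch under stated assumptions, not the paper's setup, which uses the dual-tree complex wavelet packet transform on images. The filter parameters (`omega`, `sigma`), the `stride`, the random test signal, and all helper names are hypothetical choices made for illustration. The script filters a signal with a complex Gabor-like filter, shifts the input by one sample, and compares how much three features change: the subsampled real part, the subsampled complex modulus, and the max-pooled real part.

```python
import numpy as np

def gabor_response(x, omega, sigma):
    """Circularly convolve x with a complex Gabor-like filter
    (Gaussian envelope times a complex exponential of frequency omega)."""
    half = int(4 * sigma)
    n = np.arange(-half, half + 1)
    g = np.exp(-n**2 / (2 * sigma**2)) * np.exp(1j * omega * n)
    g /= np.abs(g).sum()
    kernel = np.zeros(len(x), dtype=complex)
    kernel[n % len(x)] = g  # wrap the filter so it is centered at index 0
    return np.fft.ifft(np.fft.fft(x) * np.fft.fft(kernel))

def features(x, omega=2.0, sigma=8, stride=4):
    """Return three feature vectors computed from the same filter response."""
    y = gabor_response(x, omega, sigma)
    real_sub = y.real[::stride]                          # subsampled convolution, real part
    modulus_sub = np.abs(y)[::stride]                    # subsampled complex modulus
    max_pooled = y.real.reshape(-1, stride).max(axis=1)  # max pooling of the real part
    return real_sub, modulus_sub, max_pooled

def relative_change(a, b):
    """Relative Euclidean distance between two feature vectors."""
    return np.linalg.norm(a - b) / np.linalg.norm(a)

rng = np.random.default_rng(0)
x = rng.standard_normal(1024)   # assumed test signal: white Gaussian noise
x_shifted = np.roll(x, 1)       # circular shift by one sample

names = ("real part", "modulus", "max pooling")
errors = {
    name: relative_change(f, f_shifted)
    for name, f, f_shifted in zip(names, features(x), features(x_shifted))
}
for name, err in errors.items():
    print(f"{name:12s} relative change under a 1-sample shift: {err:.3f}")
```

With these assumed settings, the real part should change far more under the shift than the modulus, with max pooling closer to the modulus. Varying `omega` relative to `stride` gives a rough feel for why the filter's frequency matters for stability.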

