CV | Oct 9, 2020

Permuted AdaIN: Reducing the Bias Towards Global Statistics in Image Classification

arXiv:2010.05785v321.7115 citations | Has Code

Originality: Incremental advance

AI Analysis

The paper addresses a fundamental bias in image classifiers that affects their robustness and generalization, though it is an incremental improvement over existing normalization techniques.

The paper tackles the problem of convolutional neural networks overly relying on global image statistics at the expense of shape or texture cues, and introduces Permuted AdaIN to reduce this bias, resulting in improved performance in image classification, robustness, and domain adaptation tasks across multiple datasets and architectures.

Recent work has shown that convolutional neural network classifiers overly rely on texture at the expense of shape cues. We make a similar but different distinction between shape and local image cues, on the one hand, and global image statistics, on the other. Our method, called Permuted Adaptive Instance Normalization (pAdaIN), reduces the representation of global statistics in the hidden layers of image classifiers. pAdaIN samples a random permutation $π$ that rearranges the samples in a given batch. Adaptive Instance Normalization (AdaIN) is then applied between the activations of each (non-permuted) sample $i$ and the corresponding activations of the sample $π(i)$, thus swapping statistics between the samples of the batch. Since the global image statistics are distorted, this swapping procedure causes the network to rely on cues, such as shape or texture. By choosing the random permutation with probability $p$ and the identity permutation otherwise, one can control the effect's strength. With the correct choice of $p$, fixed a priori for all experiments and selected without considering test data, our method consistently outperforms baselines in multiple settings. In image classification, our method improves on both CIFAR100 and ImageNet using multiple architectures. In the setting of robustness, our method improves on both ImageNet-C and CIFAR-100-C for multiple architectures. In the setting of domain adaptation and domain generalization, our method achieves state-of-the-art results on the transfer learning task from GTAV to Cityscapes and on the PACS benchmark.
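
The swapping step described in the abstract can be sketched in a few lines. This is an illustrative NumPy version, not the authors' released code. It assumes activations of shape (N, C, H, W) and the standard AdaIN definition, $\mathrm{AdaIN}(x, y) = σ(y)\,\frac{x - μ(x)}{σ(x)} + μ(y)$, with $μ$ and $σ$ computed per sample and per channel over the spatial dimensions. The function name `permuted_adain`, the `eps` stabilizer, and the `rng` argument are assumptions made for this example. The abstract does not state the value of $p$ or where in the network the operation is inserted, so neither is fixed here.

```python
import numpy as np


def permuted_adain(x, p, rng=None, eps=1e-5):
    """Illustrative pAdaIN step on activations x of shape (N, C, H, W).

    With probability p, each sample i receives the per-channel mean and
    standard deviation of sample perm[i]; otherwise x is returned as-is
    (the identity permutation). `eps` is an assumed numerical stabilizer.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Identity permutation with probability 1 - p: nothing to swap.
    if rng.random() >= p:
        return x

    # Random permutation pi over the batch dimension.
    perm = rng.permutation(x.shape[0])

    # Per-sample, per-channel statistics over the spatial dimensions.
    mean = x.mean(axis=(2, 3), keepdims=True)
    std = np.sqrt(x.var(axis=(2, 3), keepdims=True) + eps)

    # AdaIN(x_i, x_pi(i)): normalize x_i, then apply the statistics of x_pi(i).
    normalized = (x - mean) / std
    return normalized * std[perm] + mean[perm]


if __name__ == "__main__":
    demo_rng = np.random.default_rng(0)
    activations = demo_rng.normal(size=(8, 16, 4, 4))
    swapped = permuted_adain(activations, p=1.0, rng=demo_rng)
    print(swapped.shape)
```

Setting `p=1.0` always permutes and `p=0.0` never does, which matches the abstract's description of $p$ as the knob controlling the strength of the effect. The spatial layout of each feature map is preserved; only its per-channel mean and scale are exchanged with another sample in the batch.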
