CV · Mar 21, 2018

Assessing Shape Bias Property of Convolutional Neural Networks

arXiv:1803.07739v1 · 40 citations
Originality: Incremental advance
AI Analysis

This paper addresses the interpretability and robustness of CNNs in computer vision and offers insight into their learning biases. It is incremental in that it builds on prior work on shape bias in neural networks.

The paper investigates whether convolutional neural networks (CNNs) inherently display shape bias, the tendency to classify by shape rather than color. It proposes accuracy on negative images (images with reversed brightness) as a metric and runs large-scale experiments with it. The finding is that CNNs are not intrinsically shape-biased: models with similar accuracy on original images can perform very differently on negative images. They can, however, learn shape bias when the model is properly initialized or the data is properly augmented, and when batch normalization is used.

It is known that humans display "shape bias" when classifying new items, i.e., they prefer to categorize objects based on their shape rather than color. Convolutional Neural Networks (CNNs) are also designed to take into account the spatial structure of image data. In fact, experiments on image datasets, consisting of triples of a probe image, a shape-match and a color-match, have shown that one-shot learning models display shape bias as well. In this paper, we examine the shape bias property of CNNs. In order to conduct large-scale experiments, we propose using the model accuracy on images with reversed brightness as a metric to evaluate the shape bias property. Such images, called negative images, contain objects that have the same shape as in the original images, but with different colors. Through extensive systematic experiments, we investigate the role of different factors, such as training data, model architecture, initialization and regularization techniques, on the shape bias property of CNNs. We show that it is possible to design different CNNs that achieve similar accuracy on original images, but perform significantly differently on negative images, suggesting that CNNs do not intrinsically display shape bias. We then show that CNNs are able to learn and generalize the structures, when the model is properly initialized or data is properly augmented, and if batch normalization is used.
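
The negative-image metric described in the abstract can be sketched in a few lines. The code below is a minimal illustration, not the paper's code. It assumes 8-bit grayscale images, so reversing brightness maps each pixel value v to 255 - v. The helper names (`negative_image`, `negative_image_gap`), the toy dataset, and the two hand-written classifiers are all hypothetical stand-ins for trained CNNs.

```python
import numpy as np

def negative_image(images):
    """Reverse brightness of uint8 images: pixel value v becomes 255 - v."""
    images = np.asarray(images)
    if images.dtype != np.uint8:
        raise TypeError("expected uint8 images with values in [0, 255]")
    return 255 - images

def accuracy(predict_fn, images, labels):
    """Fraction of images whose predicted label matches the true label."""
    preds = np.asarray(predict_fn(images))
    return float(np.mean(preds == np.asarray(labels)))

def negative_image_gap(predict_fn, images, labels):
    """Return (accuracy on originals, accuracy on negative images)."""
    return (accuracy(predict_fn, images, labels),
            accuracy(predict_fn, negative_image(images), labels))

def make_toy_data():
    """Hypothetical data: class 0 = dark image with a vertical bar,
    class 1 = bright image with a horizontal bar (shape and brightness
    are deliberately correlated with the label)."""
    images, labels = [], []
    for pos in (2, 5):
        v = np.full((8, 8), 20, dtype=np.uint8)
        v[:, pos] = 100          # vertical bar
        h = np.full((8, 8), 150, dtype=np.uint8)
        h[pos, :] = 250          # horizontal bar
        images += [v, h]
        labels += [0, 1]
    return np.stack(images), np.array(labels)

def brightness_classifier(images):
    """Toy 'color-biased' model: predicts 1 for bright images."""
    n = images.shape[0]
    return (images.reshape(n, -1).mean(axis=1) > 127.5).astype(int)

def edge_classifier(images):
    """Toy 'shape-biased' model: compares edge strength along rows and
    columns. Absolute differences are unchanged by brightness reversal."""
    x = images.astype(np.int32)  # avoid uint8 wrap-around in the differences
    d_cols = np.abs(np.diff(x, axis=2)).sum(axis=(1, 2))
    d_rows = np.abs(np.diff(x, axis=1)).sum(axis=(1, 2))
    return (d_rows > d_cols).astype(int)

if __name__ == "__main__":
    X, y = make_toy_data()
    print("brightness:", negative_image_gap(brightness_classifier, X, y))
    print("edges:     ", negative_image_gap(edge_classifier, X, y))
```

On this toy data both classifiers are perfect on the original images, but only the edge-based one keeps its accuracy on the negatives. That mirrors, in miniature, the abstract's point that similar accuracy on original images does not imply shape bias.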

