CV, Jun 26, 2017

Do Deep Neural Networks Suffer from Crowding?

arXiv:1706.08616v1, 34 citations
Originality: Incremental advance
AI Analysis

This paper addresses a fundamental limitation of computer vision systems: current models may not handle real-world visual clutter effectively. The contribution is incremental, proposing a biologically inspired modification to standard networks.

The study investigated whether deep neural networks experience crowding, a visual effect in which object recognition degrades when flankers are placed nearby. Eccentricity-dependent models trained on isolated targets can still recognize those targets among flankers when the targets are near the image center, while standard DCNNs cannot. For all tested networks, accuracy decreases as flankers get closer to the target and more numerous.
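The two factors named above, flanker distance and flanker count, can be illustrated with a small stimulus generator. This is a minimal sketch under stated assumptions, not the authors' data pipeline: the function name `make_crowded_image`, the canvas size, the horizontal layout, and the alternating right/left flanker placement are all hypothetical choices made for illustration.

```python
import numpy as np

def make_crowded_image(target, flankers, spacing, canvas_size=(64, 256)):
    """Paste a target and flankers on a blank canvas (hypothetical helper).

    target   : 2-D array of shape (h, w)
    flankers : list of 2-D arrays with the same shape as the target
    spacing  : center-to-center distance in pixels between neighbours
    Flankers alternate right, left, right, ... moving outward from the target.
    """
    canvas = np.zeros(canvas_size, dtype=float)
    h, w = target.shape
    H, W = canvas_size
    cx = W // 2                 # target sits at the horizontal center
    top = (H - h) // 2          # all patches share the same vertical position

    def paste(patch, center_x):
        left = center_x - w // 2
        if left < 0 or left + w > W:
            raise ValueError("patch falls outside the canvas")
        region = canvas[top:top + h, left:left + w]
        # Pixel-wise maximum so overlapping patches do not erase each other.
        np.maximum(region, patch, out=region)

    paste(target, cx)
    for i, flanker in enumerate(flankers):
        step = (i // 2 + 1) * spacing
        sign = 1 if i % 2 == 0 else -1
        paste(flanker, cx + sign * step)
    return canvas

# Usage: one target with two flankers, 20 pixels away on either side.
target = np.ones((8, 8))
flankers = [np.full((8, 8), 0.5), np.full((8, 8), 0.5)]
img = make_crowded_image(target, flankers, spacing=20)
```

Sweeping `spacing` and the length of `flankers` would give the kind of test conditions the summary describes, with a model trained only on the zero-flanker case.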

Crowding is a visual effect suffered by humans, in which an object that can be recognized in isolation can no longer be recognized when other objects, called flankers, are placed close to it. In this work, we study the effect of crowding in artificial Deep Neural Networks for object recognition. We analyze both standard deep convolutional neural networks (DCNNs) and a new version of DCNNs that 1) is multi-scale and 2) has convolution filters whose size changes depending on the eccentricity with respect to the center of fixation. Such networks, which we call eccentricity-dependent, are a computational model of the feedforward path of the primate visual cortex. Our results reveal that the eccentricity-dependent model, trained on target objects in isolation, can recognize such targets in the presence of flankers if the targets are near the center of the image, whereas DCNNs cannot. Also, for all tested networks, when trained on targets in isolation, we find that recognition accuracy decreases the closer the flankers are to the target and the more flankers there are. We find that visual similarity between the target and flankers also plays a role and that pooling in early layers of the network leads to more crowding. Additionally, we show that incorporating the flankers into the images of the training set does not improve performance under crowding.
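One way to picture the combination of multi-scale input and eccentricity-dependent filter size is to take nested crops around the fixation point and resample each to the same resolution. A fixed-size filter then covers a larger image area in the wider, coarser crops. The sketch below is an approximation written for illustration, not the authors' implementation: the function name `multiscale_crops`, the doubling of window size per scale, and the use of average pooling for downsampling are all assumptions.

```python
import numpy as np

def multiscale_crops(image, base=16, num_scales=3):
    """Nested crops around the image center, each resampled to base x base.

    Scale k covers a (base * 2**k)-pixel window and is average-pooled by a
    factor of 2**k, so wider windows are represented at coarser resolution.
    Illustrative assumption only; fixation is taken to be the image center.
    """
    if base % 2 != 0:
        raise ValueError("base must be even")
    H, W = image.shape
    cy, cx = H // 2, W // 2
    crops = []
    for k in range(num_scales):
        factor = 2 ** k
        half = (base * factor) // 2
        if cy - half < 0 or cx - half < 0 or cy + half > H or cx + half > W:
            raise ValueError("image too small for scale %d" % k)
        window = image[cy - half:cy + half, cx - half:cx + half]
        # Block-average: group pixels into factor x factor tiles and take means.
        pooled = window.reshape(base, factor, base, factor).mean(axis=(1, 3))
        crops.append(pooled)
    return np.stack(crops)      # shape: (num_scales, base, base)

# Usage: a 64 x 64 ramp image yields three 16 x 16 views of growing extent.
image = np.arange(64 * 64, dtype=float).reshape(64, 64)
crops = multiscale_crops(image, base=16, num_scales=3)
```

In this arrangement, objects near the center appear at full resolution in the finest crop, while objects far from the center are only visible in the coarse crops.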

Code Implementations: 2 repos
