CV | AI | LG | NC | May 20, 2022

The developmental trajectory of object recognition robustness: children are like small adults but unlike big deep neural networks

arXiv:2205.10144v1 | 22 citations | h-index: 43
Originality: Incremental advance
AI Analysis

This research addresses how human visual robustness arises, a question relevant to both cognitive science and AI. It shows that DNNs that reach human-level robustness appear to do so through different, more data-hungry strategies than humans. The advance is incremental, as it builds on prior work on DNN robustness.

The study asked whether human robustness to distorted images in object recognition results from extensive visual experience, comparing children, adults, and deep neural networks (DNNs). Children as young as 4-6 years old already show high robustness, outperform DNNs trained on ImageNet, and need relatively little data compared to DNNs, which suggests that robustness emerges early rather than through a mere accumulation of experience.

In laboratory object recognition tasks based on undistorted photographs, both adult humans and Deep Neural Networks (DNNs) perform close to ceiling. Unlike adults, whose object recognition performance is robust against a wide range of image distortions, DNNs trained on standard ImageNet (1.3M images) perform poorly on distorted images. However, the last two years have seen impressive gains in DNN distortion robustness, predominantly achieved through ever-increasing large-scale datasets, orders of magnitude larger than ImageNet. While this simple brute-force approach is very effective at achieving human-level robustness in DNNs, it raises the question of whether human robustness, too, is simply due to extensive experience with (distorted) visual input during childhood and beyond. Here we investigate this question by comparing the core object recognition performance of 146 children (aged 4–15) against adults and against DNNs. We find, first, that already 4–6-year-olds show remarkable robustness to image distortions and outperform DNNs trained on ImageNet. Second, we estimated the number of “images” children have been exposed to during their lifetime. Compared to various DNNs, children's high robustness requires relatively little data. Third, when recognizing objects, children (like adults but unlike DNNs) rely heavily on shape but not on texture cues. Together, our results suggest that the remarkable robustness to distortions emerges early in the developmental trajectory of human object recognition and is unlikely to be the result of a mere accumulation of experience with distorted visual input. Even though current DNNs match human performance regarding robustness, they seem to rely on different and more data-hungry strategies to do so.

Code Implementations: 1 repo
