ML CV LG Jun 26, 2017

Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study

Samuel Ritter, David G. T. Barrett, Adam Santoro, Matt M. Botvinick

arXiv:1706.08606v2 26.8206 citations h-index: 30

Originality: Synthesis-oriented

AI Analysis

This work addresses the interpretability gap in deep neural networks for researchers and practitioners by showing how methods from cognitive psychology can reveal hidden computational properties. Its contribution is incremental in that it applies existing psychological methods to neural networks.

The authors address the interpretability problem in deep neural networks with tools from cognitive psychology, specifically a shape bias analysis from developmental psychology. Applied to one-shot learning models trained on ImageNet, the analysis shows that these models exhibit a shape bias similar to that of humans. The magnitude of the bias varies across random seeds and over the course of training, even though classification performance stays nearly the same.

Deep neural networks (DNNs) have achieved unprecedented performance on a wide range of complex tasks, rapidly outpacing our understanding of the nature of their solutions. This has caused a recent surge of interest in methods for rendering modern neural systems more interpretable. In this work, we propose to address the interpretability problem in modern DNNs using the rich history of problem descriptions, theories and experimental methods developed by cognitive psychologists to study the human mind. To explore the potential value of these tools, we chose a well-established analysis from developmental psychology that explains how children learn word labels for objects, and applied that analysis to DNNs. Using datasets of stimuli inspired by the original cognitive psychology experiments, we find that state-of-the-art one shot learning models trained on ImageNet exhibit a similar bias to that observed in humans: they prefer to categorize objects according to shape rather than color. The magnitude of this shape bias varies greatly among architecturally identical, but differently seeded models, and even fluctuates within seeds throughout training, despite nearly equivalent classification performance. These results demonstrate the capability of tools from cognitive psychology for exposing hidden computational properties of DNNs, while concurrently providing us with a computational model for human word learning.
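
The shape-versus-color preference described above can be illustrated with a minimal sketch. It assumes each trial consists of three embedding vectors produced by some model (not shown here): a probe image, an image matching the probe in shape but not color, and an image matching it in color but not shape. The function names, the toy vectors, and the use of cosine similarity as the comparison are assumptions made for illustration, not a reproduction of the authors' code.

```python
import math
from typing import Sequence, Tuple

Vector = Sequence[float]
# Hypothetical trial layout: (probe, shape_match, color_match) embeddings.
Trial = Tuple[Vector, Vector, Vector]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def shape_bias(trials: Sequence[Trial]) -> float:
    """Fraction of trials in which the probe is closer to the shape match
    than to the color match (1.0 = always shape, 0.0 = never shape)."""
    if not trials:
        raise ValueError("at least one trial is required")
    shape_choices = 0
    for probe, shape_match, color_match in trials:
        if cosine_similarity(probe, shape_match) > cosine_similarity(probe, color_match):
            shape_choices += 1
    return shape_choices / len(trials)


# Toy 2-D embeddings standing in for real model outputs (assumed values).
toy_trials = [
    ([1.0, 0.0], [0.9, 0.1], [0.1, 0.9]),  # shape match is closer
    ([0.0, 1.0], [0.2, 0.8], [0.7, 0.3]),  # shape match is closer
    ([1.0, 1.0], [1.0, 0.0], [1.0, 0.9]),  # color match is closer
    ([0.5, 0.5], [0.4, 0.6], [1.0, 0.0]),  # shape match is closer
]

print(shape_bias(toy_trials))  # 0.75
```

Computing this fraction separately for each random seed, or at several training checkpoints, is one way to look for the kind of variability the abstract reports among models with similar classification accuracy.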
