CV, Nov 19, 2016

PsyPhy: A Psychophysics Driven Evaluation Framework for Visual Recognition

Brandon RichardWebster, Samuel E. Anthony, Walter J. Scheirer

arXiv:1611.06448v614.177 citations

Originality: Incremental advance

AI Analysis

This work addresses an evaluation gap in computer vision, offering researchers and practitioners a more rigorous way to assess model performance, though its contribution is incremental in that it applies psychophysics to existing benchmarks.

The authors tackled the problem of evaluating visual recognition models by introducing a psychophysics-driven framework, which revealed that current algorithms fall short of human-like performance and identified specific algorithmic deficiencies.

By providing substantial amounts of data and standardized evaluation protocols, datasets in computer vision have helped fuel advances across all areas of visual recognition. But even in light of breakthrough results on recent benchmarks, it is still fair to ask if our recognition algorithms are doing as well as we think they are. The vision sciences at large make use of a very different evaluation regime known as Visual Psychophysics to study visual perception. Psychophysics is the quantitative examination of the relationships between controlled stimuli and the behavioral responses they elicit in experimental test subjects. Instead of using summary statistics to gauge performance, psychophysics directs us to construct item-response curves made up of individual stimulus responses to find perceptual thresholds, thus allowing one to identify the exact point at which a subject can no longer reliably recognize the stimulus class. In this article, we introduce a comprehensive evaluation framework for visual recognition models that is underpinned by this methodology. Over millions of procedurally rendered 3D scenes and 2D images, we compare the performance of well-known convolutional neural networks. Our results bring into question recent claims of human-like performance, and provide a path forward for correcting newly surfaced algorithmic deficiencies.
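To make the idea of an item-response curve and a perceptual threshold concrete, here is a minimal sketch in Python. It is not the authors' implementation: the function names, the toy trial data, the 0.5 accuracy criterion, and the linear interpolation between neighboring stimulus levels are all assumptions chosen for illustration. It groups individual stimulus responses by perturbation level, computes accuracy per level, and reports the level at which accuracy first drops below the criterion.

```python
from collections import defaultdict


def item_response_curve(trials):
    """Aggregate (stimulus_level, correct) pairs into a sorted list of
    (level, accuracy) points. Hypothetical helper, not from the paper."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for level, correct in trials:
        totals[level] += 1
        hits[level] += int(bool(correct))
    return [(level, hits[level] / totals[level]) for level in sorted(totals)]


def perceptual_threshold(curve, criterion=0.5):
    """Return the stimulus level at which accuracy first falls below
    `criterion`, linearly interpolated between neighboring points.
    Returns None if accuracy never drops below the criterion.
    The criterion value and the interpolation are assumptions."""
    if curve and curve[0][1] < criterion:
        return curve[0][0]
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        if y0 >= criterion > y1:
            return x0 + (y0 - criterion) * (x1 - x0) / (y0 - y1)
    return None


# Toy data: each pair is (perturbation level, was the model's answer correct?).
# The levels could stand for something like blur radius; the values are made up.
trials = (
    [(0, True)] * 4
    + [(2, True)] * 3 + [(2, False)]
    + [(4, True)] + [(4, False)] * 3
)

curve = item_response_curve(trials)
threshold = perceptual_threshold(curve, criterion=0.5)
print(curve)      # accuracy at each perturbation level
print(threshold)  # level where accuracy crosses the criterion
```

Unlike a single summary accuracy over all stimuli, the curve shows how recognition degrades as the perturbation grows, and the threshold gives one comparable number per model or human subject.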
