CV LG Jun 12, 2020

Are we done with ImageNet?

Lucas Beyer, Olivier J. Hénaff, Alexander Kolesnikov, Xiaohua Zhai, Aäron van den Oord

arXiv:2006.07159v1 39.8 482 citations Has Code

Originality: Synthesis-oriented

AI Analysis

This work critically assesses the reliability of ImageNet, a foundational benchmark in computer vision, and highlights labeling issues that matter to researchers and practitioners who rely on it for model evaluation.

The paper investigates whether recent improvements on ImageNet classification reflect genuine generalization or overfitting to its labeling quirks, finding that gains are much smaller when evaluated with more robust human annotations.

Yes, and no. We ask whether recent progress on the ImageNet classification benchmark continues to represent meaningful generalization, or whether the community has started to overfit to the idiosyncrasies of its labeling procedure. We therefore develop a significantly more robust procedure for collecting human annotations of the ImageNet validation set. Using these new labels, we reassess the accuracy of recently proposed ImageNet classifiers, and find their gains to be substantially smaller than those reported on the original labels. Furthermore, we find the original ImageNet labels to no longer be the best predictors of this independently-collected set, indicating that their usefulness in evaluating vision models may be nearing an end. Nevertheless, we find our annotation procedure to have largely remedied the errors in the original labels, reinforcing ImageNet as a powerful benchmark for future research in visual recognition.
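
To make the reassessment concrete, here is a minimal sketch of how top-1 accuracy might be compared under the original single labels and under a reassessed label set. It is an illustration, not the authors' evaluation code. It assumes each image's new annotation is a set of acceptable classes, that a prediction counts as correct if it falls in that set, and that images with an empty set are skipped. The function names and the toy data are hypothetical.

```python
from typing import Sequence, Set


def single_label_accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Top-1 accuracy against exactly one ground-truth label per image."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("no images to score")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def label_set_accuracy(predictions: Sequence[int], label_sets: Sequence[Set[int]]) -> float:
    """Top-1 accuracy where a prediction is correct if it is in the image's label set.

    Assumption: images whose label set is empty are excluded from scoring.
    """
    if len(predictions) != len(label_sets):
        raise ValueError("predictions and label_sets must have the same length")
    scored = [(p, s) for p, s in zip(predictions, label_sets) if s]
    if not scored:
        raise ValueError("no images with a non-empty label set")
    correct = sum(p in s for p, s in scored)
    return correct / len(scored)


# Toy data, invented for illustration only (not taken from the paper).
predictions = [3, 7, 7, 1, 5]
original_labels = [3, 2, 7, 4, 5]
reassessed_labels = [{3}, {2, 7}, {7, 9}, set(), {6}]

print(single_label_accuracy(predictions, original_labels))   # 0.6
print(label_set_accuracy(predictions, reassessed_labels))    # 0.75
```

In this toy case the second image contains two valid classes, so a prediction the original label marks as wrong is accepted by the label set. The fourth image has no agreed label and is left out of the second score.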
