CV · LG · Jun 12, 2020

Are we done with ImageNet?

arXiv:2006.07159v1 · 478 citations
Originality: Synthesis-oriented
AI Analysis

This work critically assesses whether ImageNet, a foundational benchmark in computer vision, still measures progress reliably, a question that matters to researchers and practitioners who use it to compare models.

The paper investigates whether recent improvements on ImageNet classification reflect genuine generalization or overfitting to its labeling quirks, finding that gains are much smaller when evaluated with more robust human annotations.

Yes, and no. We ask whether recent progress on the ImageNet classification benchmark continues to represent meaningful generalization, or whether the community has started to overfit to the idiosyncrasies of its labeling procedure. We therefore develop a significantly more robust procedure for collecting human annotations of the ImageNet validation set. Using these new labels, we reassess the accuracy of recently proposed ImageNet classifiers, and find their gains to be substantially smaller than those reported on the original labels. Furthermore, we find the original ImageNet labels to no longer be the best predictors of this independently-collected set, indicating that their usefulness in evaluating vision models may be nearing an end. Nevertheless, we find our annotation procedure to have largely remedied the errors in the original labels, reinforcing ImageNet as a powerful benchmark for future research in visual recognition.
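The comparison described in the abstract, scoring the same predictions against the original labels and against the reassessed annotations, can be illustrated with a minimal sketch. The abstract does not specify the exact metric, so the following is an assumption: each image may have a set of acceptable labels, a prediction counts as correct if it falls in that set, and images with no accepted label are skipped. The function names and the toy data are hypothetical and are not taken from the paper's code.

```python
from typing import Sequence, Set


def original_accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Top-1 accuracy against a single label per image."""
    if len(preds) != len(labels):
        raise ValueError("preds and labels must have the same length")
    correct = sum(p == y for p, y in zip(preds, labels))
    return correct / len(preds)


def reassessed_accuracy(preds: Sequence[int], label_sets: Sequence[Set[int]]) -> float:
    """Accuracy against a set of acceptable labels per image.

    Assumption: images whose label set is empty (no label accepted by
    annotators) are excluded from the score rather than counted as errors.
    """
    if len(preds) != len(label_sets):
        raise ValueError("preds and label_sets must have the same length")
    scored = [(p, s) for p, s in zip(preds, label_sets) if s]
    if not scored:
        raise ValueError("no image has a non-empty label set")
    correct = sum(p in s for p, s in scored)
    return correct / len(scored)


# Toy data for four images (class ids are made up for illustration).
preds = [1, 2, 3, 4]
original_labels = [1, 5, 3, 0]
reassessed_labels = [{1}, {2, 5}, {7}, set()]

print(original_accuracy(preds, original_labels))
print(reassessed_accuracy(preds, reassessed_labels))
```

In this toy case the two scores differ because the second image has more than one acceptable label, the third image's original label is rejected on reassessment, and the fourth image is dropped from scoring.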

Code Implementations: 2 repos
