LG CV | Apr 2, 2024

ImageNot: A contrast with ImageNet preserves model rankings

arXiv:2404.02112v1 | 15.714 citations | h-index: 54 | Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the external validity of model evaluations, an issue relevant to researchers and practitioners. It shows that relative model performance is preserved across datasets, though the contribution is incremental in nature.

The authors ask whether model rankings are robust across datasets. They introduce ImageNot, a dataset that matches ImageNet's scale but differs in other aspects, and find that key model architectures rank identically on both datasets, with relative improvements strongly correlated between the two.

We introduce ImageNot, a dataset designed to match the scale of ImageNet while differing drastically in other aspects. We show that key model architectures developed for ImageNet over the years rank identically when trained and evaluated on ImageNot to how they rank on ImageNet. This is true when training models from scratch or fine-tuning them. Moreover, the relative improvements of each model over earlier models strongly correlate in both datasets. We further give evidence that ImageNot has a similar utility as ImageNet for transfer learning purposes. Our work demonstrates a surprising degree of external validity in the relative performance of image classification models. This stands in contrast with absolute accuracy numbers that typically drop sharply even under small changes to a dataset.
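The two claims in the abstract (identical rankings and correlated relative improvements) can be made concrete with a small sketch. Everything below is an assumption for illustration: the model names, the accuracy values, and the reading of "relative improvement" as the gain of each model over its immediate predecessor are hypothetical and are not taken from the paper or its code.

```python
# Illustrative sketch only. The accuracies are made-up placeholders,
# NOT results reported in the paper.

def ranks(scores):
    """Map each model name to its rank (1 = highest score). Assumes no ties."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: i + 1 for i, name in enumerate(ordered)}

def spearman(a, b):
    """Spearman rank correlation of two score dicts over the same models (no ties)."""
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    d2 = sum((ra[m] - rb[m]) ** 2 for m in a)
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

def pearson(x, y):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    vx = sum((xi - mx) ** 2 for xi in x)
    vy = sum((yi - my) ** 2 for yi in y)
    return cov / (vx * vy) ** 0.5

def improvements(scores, order):
    """Accuracy gain of each model over its predecessor in `order` (assumed definition)."""
    return [scores[b] - scores[a] for a, b in zip(order, order[1:])]

# Hypothetical models in chronological order, with placeholder accuracies.
order = ["model_a", "model_b", "model_c", "model_d"]
dataset_1 = {"model_a": 0.57, "model_b": 0.71, "model_c": 0.76, "model_d": 0.79}
dataset_2 = {"model_a": 0.41, "model_b": 0.58, "model_c": 0.63, "model_d": 0.67}

same_ranking = ranks(dataset_1) == ranks(dataset_2)
rho = spearman(dataset_1, dataset_2)
r = pearson(improvements(dataset_1, order), improvements(dataset_2, order))

print("identical ranking:", same_ranking)
print("Spearman rho of rankings:", rho)
print("Pearson r of step-wise gains:", round(r, 3))
```

In this toy setup the absolute accuracies differ substantially between the two datasets while the ordering of models is unchanged, which is the kind of pattern the abstract describes.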
