LG CV ML | Nov 30, 2019

Probing the State of the Art: A Critical Look at Visual Representation Evaluation

arXiv:1912.00215v2 | 10.714 citations | h-index: 36

Originality: Synthesis-oriented

AI Analysis

This paper addresses limitations in how self-supervised learning research is evaluated. The contribution is incremental but important for the field.

The paper demonstrates that linear classification probes are insufficient for evaluating self-supervised visual representations, showing models can perform poorly on linear classification but strongly on complex tasks like temporal activity localization, and introduces a new dataset for this task.

Self-supervised research has improved greatly over the past half decade, with much of the growth being driven by objectives that are hard to compare quantitatively. These techniques include colorization, cyclical consistency, and noise-contrastive estimation from image patches. Consequently, the field has settled on a handful of measurements that depend on linear probes to adjudicate which approaches are the best. Our first contribution is to show that this test is insufficient and that models which perform poorly (strongly) on linear classification can perform strongly (weakly) on more involved tasks like temporal activity localization. Our second contribution is to analyze the capabilities of five different representations. And our third contribution is a much-needed new dataset for temporal activity localization.
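For readers unfamiliar with the term, a linear probe trains only a linear classifier on top of a frozen representation and reports its accuracy. The sketch below is a minimal illustration of that idea, not the paper's protocol: `frozen_encoder`, the synthetic XOR-style data, and all hyperparameters are assumptions made up for this example.

```python
import math
import random


def frozen_encoder(x):
    """Hypothetical stand-in for a pretrained encoder; it is never updated."""
    return [x[0], x[1], x[0] * x[1]]


def fit_linear_probe(features, labels, epochs=500, lr=0.5):
    """Fit binary logistic regression (weights w, bias b) on fixed features."""
    dim, n = len(features[0]), len(features)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for f, y in zip(features, labels):
            z = sum(wi * fi for wi, fi in zip(w, f)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - y  # predicted probability minus label
            for j in range(dim):
                grad_w[j] += err * f[j]
            grad_b += err
        # Full-batch gradient descent step on the mean log-loss.
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def probe_accuracy(w, b, features, labels):
    """Fraction of examples whose sign of the linear score matches the label."""
    hits = 0
    for f, y in zip(features, labels):
        z = sum(wi * fi for wi, fi in zip(w, f)) + b
        hits += int((z > 0.0) == (y == 1))
    return hits / len(labels)


def run_demo(seed=0):
    """Return (probe accuracy on raw inputs, probe accuracy on encoded inputs)."""
    rng = random.Random(seed)
    points = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(400)]
    # Synthetic XOR-style labels: not linearly separable in the raw inputs.
    labels = [1 if p[0] * p[1] > 0 else 0 for p in points]
    train, test = slice(0, 300), slice(300, 400)
    accs = []
    for feats in (points, [frozen_encoder(p) for p in points]):
        w, b = fit_linear_probe(feats[train], labels[train])
        accs.append(probe_accuracy(w, b, feats[test], labels[test]))
    return tuple(accs)


if __name__ == "__main__":
    print(run_demo())
```

The probe's score reflects only how linearly separable the labels are in the frozen feature space, which is why it need not track performance on tasks that use the representation differently.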
