CV, LG | Dec 12, 2022

You Only Need a Good Embeddings Extractor to Fix Spurious Correlations

Amazon
arXiv:2212.06254v1 | 21 citations | h-index: 47
Originality: Incremental advance
AI Analysis

This paper addresses robustness failures in computer vision models caused by spurious correlations, offering a simpler alternative to methods such as GroupDRO.

The paper tackles spurious correlations in training data, such as a model relying on background features. It takes embeddings from a large pre-trained vision model and trains a linear classifier on them, reaching up to 90% worst-group accuracy on the Waterbirds dataset without subgroup labels.

Spurious correlations in training data often lead to robustness issues because models learn to use them as shortcuts. For example, when predicting whether an object is a cow, a model might learn to rely on its green background, so it would do poorly on a cow on a sandy background. Waterbirds is a standard dataset for benchmarking methods that mitigate this problem. The best method, Group Distributionally Robust Optimization (GroupDRO), currently achieves 89% worst-group accuracy, while standard training from scratch on raw images reaches only 72%. GroupDRO requires training a model end to end with subgroup labels. In this paper, we show that we can achieve up to 90% worst-group accuracy without using any subgroup information in the training set by simply extracting embeddings from a large pre-trained vision model and training a linear classifier on top of them. With experiments on a wide range of pre-trained models and pre-training datasets, we show that the capacity of the pre-trained model and the size of the pre-training dataset both matter. Our experiments reveal that high-capacity vision transformers perform better than high-capacity convolutional neural networks, and that a larger pre-training dataset leads to better worst-group accuracy on the spurious correlation dataset.
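The recipe in the abstract (frozen embeddings, a linear classifier on top, evaluation by worst-group accuracy) can be illustrated with a minimal sketch. This is not the authors' code. It assumes NumPy is available, and it replaces the pre-trained extractor with hypothetical synthetic embeddings in which one dimension carries the label and another carries a spurious attribute. The function names (`make_synthetic_embeddings`, `train_linear_probe`, `worst_group_accuracy`) and all hyperparameters are illustrative choices, not values from the paper. As in the paper's setting, group labels are used only for evaluation, never for training.

```python
import numpy as np


def make_synthetic_embeddings(n, corr, rng):
    """Hypothetical stand-in for extractor output (not real image embeddings).

    y is the class label, s is a spurious attribute that agrees with y
    with probability `corr`. The group id is 2 * y + s, giving 4 groups.
    """
    y = rng.integers(0, 2, size=n)
    agree = rng.random(n) < corr
    s = np.where(agree, y, 1 - y)
    X = rng.normal(size=(n, 8))
    X[:, 0] += 3.0 * (2 * y - 1)  # core feature: strongly tied to the label
    X[:, 1] += 1.0 * (2 * s - 1)  # spurious feature: tied to the attribute
    return X, y, 2 * y + s


def train_linear_probe(X, y, lr=0.1, epochs=500, l2=1e-3):
    """Binary logistic regression on fixed embeddings via gradient descent."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted P(y = 1)
        w -= lr * (X.T @ (p - y) / n + l2 * w)  # gradient of loss + L2 term
        b -= lr * np.mean(p - y)
    return w, b


def predict(X, w, b):
    """Predict class 1 when the logit is positive."""
    return (X @ w + b > 0).astype(int)


def worst_group_accuracy(y_true, y_pred, groups):
    """Return (minimum per-group accuracy, dict of per-group accuracies)."""
    per_group = {}
    for g in np.unique(groups):
        mask = groups == g
        per_group[int(g)] = float(np.mean(y_true[mask] == y_pred[mask]))
    return min(per_group.values()), per_group


rng = np.random.default_rng(0)
# Training set: attribute agrees with the label 95% of the time.
X_train, y_train, _ = make_synthetic_embeddings(2000, 0.95, rng)
# Test set: attribute is uncorrelated with the label, so all groups appear.
X_test, y_test, g_test = make_synthetic_embeddings(2000, 0.5, rng)

w, b = train_linear_probe(X_train, y_train)  # no group labels used here
y_hat = predict(X_test, w, b)
worst, per_group = worst_group_accuracy(y_test, y_hat, g_test)
print("average accuracy:", float(np.mean(y_hat == y_test)))
print("worst-group accuracy:", worst)
print("per-group accuracy:", per_group)
```

In a real pipeline, `X_train` and `X_test` would instead hold embeddings computed once by a frozen pre-trained vision model; only the linear classifier would be trained.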

