CV | LG | Apr 8, 2022

Does Robustness on ImageNet Transfer to Downstream Tasks?

arXiv:2204.03934v1 | 33 citations | h-index: 17
Originality: Incremental advance
AI Analysis

This work evaluates whether robustness gained on ImageNet carries over when models are transferred to downstream computer vision tasks. It finds that current robustification methods are tailored to ImageNet evaluations and may not generalize to those tasks.

The study investigates whether the robustness of ImageNet-trained models to distribution shifts transfers to three downstream tasks: object detection, semantic segmentation, and CIFAR10 classification. For the two dense prediction tasks (detection and segmentation), a vanilla Swin Transformer transfers robustness better than robustified CNNs. For CIFAR10 classification, robustified models lose their robustness when fully fine-tuned.

As clean ImageNet accuracy nears its ceiling, the research community is increasingly concerned about robust accuracy under distributional shifts. While a variety of methods have been proposed to robustify neural networks, these techniques often target models trained on ImageNet classification. At the same time, it is common practice to use ImageNet-pretrained backbones for downstream tasks such as object detection, semantic segmentation, and image classification from different domains. This raises a question: Can these robust image classifiers transfer robustness to downstream tasks? For object detection and semantic segmentation, we find that a vanilla Swin Transformer, a variant of Vision Transformer tailored for dense prediction tasks, transfers robustness better than Convolutional Neural Networks that are trained to be robust to the corrupted version of ImageNet. For CIFAR10 classification, we find that models that are robustified for ImageNet do not retain robustness when fully fine-tuned. These findings suggest that current robustification techniques tend to emphasize ImageNet evaluations. Moreover, network architecture is a strong source of robustness when we consider transfer learning.

