Performance degradation of ImageNet trained models by simple image transformations
The paper highlights a robustness issue for practitioners who use off-the-shelf computer vision models, though the contribution is incremental: it systematically tests vulnerabilities that are already known.
The paper investigates how simple image transformations, such as rotation and scaling, degrade the performance of ImageNet-trained models. It finds that even minor changes, like a 10° rotation, can reduce top-1 accuracy by 1% or more in models like ResNet152.
ImageNet-trained PyTorch models are generally preferred as off-the-shelf models for direct use or for initialisation in most computer vision tasks. In this paper, we test a representative set of these convolution-based and transformer-based models under many simple image transformations, such as horizontal shifting, vertical shifting, scaling, rotation, added Gaussian noise, cutout, horizontal flip and vertical flip, and report the performance drop caused by each transformation. We find that even simple transformations like rotating the image by 10° or zooming in by 20% can reduce the top-1 accuracy of models like ResNet152 by 1% or more. The code is available at https://github.com/harshm121/imagenet-transformation-degradation.
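
The evaluation protocol described above (measure clean top-1 accuracy, re-measure under each transformation, report the drop) can be illustrated with a minimal, framework-agnostic sketch. This is not the authors' code: the images are tiny nested lists, `toy_predict` is a hypothetical stand-in for a pretrained classifier, and the helper names (`hflip`, `vflip`, `hshift`, `top1_accuracy`, `accuracy_drops`) are assumptions made for illustration. In practice one would swap in a real ImageNet-trained model and an image library's transforms.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Image = List[List[float]]  # grayscale image as rows of pixel intensities
Sample = Tuple[Image, int]  # (image, ground-truth label)


def hflip(img: Image) -> Image:
    """Mirror the image left-to-right."""
    return [row[::-1] for row in img]


def vflip(img: Image) -> Image:
    """Mirror the image top-to-bottom."""
    return [row[:] for row in img[::-1]]


def hshift(img: Image, pixels: int, fill: float = 0.0) -> Image:
    """Shift content right by `pixels` (left if negative), padding with `fill`."""
    width = len(img[0])
    if pixels >= 0:
        return [([fill] * pixels + row)[:width] for row in img]
    return [(row + [fill] * -pixels)[-width:] for row in img]


def top1_accuracy(
    predict: Callable[[Image], int],
    samples: Sequence[Sample],
    transform: Optional[Callable[[Image], Image]] = None,
) -> float:
    """Fraction of samples whose predicted label matches the ground truth."""
    correct = 0
    for img, label in samples:
        x = transform(img) if transform is not None else img
        correct += int(predict(x) == label)
    return correct / len(samples)


def accuracy_drops(
    predict: Callable[[Image], int],
    samples: Sequence[Sample],
    transforms: Dict[str, Callable[[Image], Image]],
) -> Dict[str, float]:
    """Clean accuracy minus transformed accuracy, per named transformation."""
    clean = top1_accuracy(predict, samples)
    return {
        name: clean - top1_accuracy(predict, samples, t)
        for name, t in transforms.items()
    }


def toy_predict(img: Image) -> int:
    """Hypothetical classifier: class 0 if the left half is brighter, else 1."""
    half = len(img[0]) // 2
    left = sum(sum(row[:half]) for row in img)
    right = sum(sum(row[half:]) for row in img)
    return 0 if left > right else 1


# Toy dataset: class 0 is bright on the left, class 1 is bright on the right.
samples: List[Sample] = [
    ([[1.0, 1.0, 0.0, 0.0] for _ in range(4)], 0),
    ([[0.0, 0.0, 1.0, 1.0] for _ in range(4)], 1),
]

transforms: Dict[str, Callable[[Image], Image]] = {
    "hflip": hflip,
    "vflip": vflip,
    "hshift_1px": lambda img: hshift(img, 1),
}

drops = accuracy_drops(toy_predict, samples, transforms)
for name, drop in drops.items():
    print(f"{name}: accuracy drop = {drop:.2f}")
```

Because the toy classifier depends only on left/right brightness, a horizontal flip destroys its accuracy while a vertical flip leaves it unchanged. This mirrors, in exaggerated form, how a model's sensitivity differs across transformations.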