Detecting Spurious Correlations via Robust Visual Concepts in Real and AI-Generated Image Classification
This work addresses the problem of models that become unreliable under distribution shift, offering researchers and practitioners in computer vision and generative AI a more efficient and adaptable approach to detecting spurious correlations.
The paper tackles the detection of spurious correlations in machine learning models trained on real and AI-generated images. It introduces a general-purpose method that requires less human involvement than prior work and provides intuitive explanations without pixel-level annotations. The method tolerates the peculiarities of AI-generated images, a setting where most existing methods fall short.
Machine learning models often learn associations present in the training data without questioning their validity or appropriateness. This undesirable property is the root cause of spurious correlations, which render models unreliable and prone to failure in the presence of distribution shifts. Research shows that most methods attempting to remedy spurious correlations are effective only for a model's known spurious associations. Current spurious correlation detection algorithms either rely on extensive human annotations or are too restrictive in their formulation. Moreover, they rely on strict definitions of visual artifacts that may not apply to data produced by generative models, which are known to hallucinate content that does not conform to standard specifications. In this work, we introduce a general-purpose method that efficiently detects potential spurious correlations and requires significantly less human interference than the prior art. Additionally, the proposed method provides intuitive explanations while eliminating the need for pixel-level annotations. We demonstrate the proposed method's tolerance to the peculiarities of AI-generated images, a considerably challenging setting in which most existing methods fall short. Consequently, our method is also suitable for detecting spurious correlations that originate from generative models and may propagate to downstream applications.