CV | Feb 13, 2024

Fine-Tuning Text-To-Image Diffusion Models for Class-Wise Spurious Feature Generation

arXiv:2402.08200v1 | 8 citations | h-index: 17 | ICIP
Originality: Incremental advance
AI Analysis

This work addresses the challenge of efficiently creating spurious features for evaluating classifier reliance. The advance is incremental: it builds on existing personalization techniques and on Spurious ImageNet.

The paper generates spurious features for evaluating classifiers by fine-tuning Stable Diffusion with a new spurious-feature similarity loss. The resulting images are consistently spurious across different classifiers and visually similar to reference images from Spurious ImageNet.

We propose a method for generating spurious features by leveraging large-scale text-to-image diffusion models. Although previous work detects spurious features in a large-scale dataset such as ImageNet and introduces Spurious ImageNet, we found that not all of its spurious images are spurious across different classifiers. Moreover, although spurious images help measure a classifier's reliance on spurious features, filtering many images from the Internet to find more of them is time-consuming. To this end, we use an existing approach for personalizing large-scale text-to-image diffusion models with the available discovered spurious images, and propose a new spurious-feature similarity loss based on the neural features of an adversarially robust model. Specifically, we fine-tune Stable Diffusion on several reference images from Spurious ImageNet with a modified objective that incorporates the proposed spurious-feature similarity loss. Experimental results show that our method can generate images that are consistently spurious across different classifiers. Moreover, the generated spurious images are visually similar to the reference images from Spurious ImageNet.
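The modified objective described in the abstract can be pictured as a standard denoising loss plus a weighted feature-similarity term. The sketch below is only an illustration under stated assumptions: the abstract does not give the exact form of the loss, so the cosine-distance term, the `weight` parameter, and all function names here are hypothetical, and the plain Python lists stand in for noise tensors and for features extracted by an adversarially robust model.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length, non-zero feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("feature vectors must be non-zero")
    return dot / (norm_a * norm_b)


def denoising_loss(pred_noise: Sequence[float], true_noise: Sequence[float]) -> float:
    """Mean squared error between predicted and true noise."""
    if len(pred_noise) != len(true_noise) or not pred_noise:
        raise ValueError("noise vectors must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred_noise, true_noise)) / len(pred_noise)


def combined_loss(
    pred_noise: Sequence[float],
    true_noise: Sequence[float],
    gen_features: Sequence[float],
    ref_features: Sequence[float],
    weight: float = 1.0,
) -> float:
    """Denoising loss plus a weighted feature-distance term.

    ASSUMPTION: cosine distance (1 - cosine similarity) is used as the
    similarity term; the paper's actual formulation may differ.
    """
    feature_term = 1.0 - cosine_similarity(gen_features, ref_features)
    return denoising_loss(pred_noise, true_noise) + weight * feature_term
```

In this sketch, the feature term vanishes when the generated image's features point in the same direction as the reference image's features, so minimizing the combined loss would pull generated images toward the reference spurious feature while the denoising term keeps the diffusion model's usual training signal.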

