CV · Dec 21, 2022

Not Just Pretty Pictures: Toward Interventional Data Augmentation Using Text-to-Image Generators

arXiv:2212.11237v4 · 19 citations · h-index: 10
Originality: Highly original
AI Analysis

This work addresses robustness issues in image classification for applications that require reliable performance under varied conditions. It is a novel application of generative models rather than an incremental improvement.

The paper tackles the degradation of neural image classifiers under environmental shifts by using modern text-to-image generators for interventional data augmentation. It shows that this approach outperforms previous state-of-the-art data augmentation methods on benchmarks for single domain generalization and for reducing reliance on spurious features.

Neural image classifiers are known to undergo severe performance degradation when exposed to inputs that are sampled from environmental conditions that differ from their training data. Given the recent progress in Text-to-Image (T2I) generation, a natural question is how modern T2I generators can be used to simulate arbitrary interventions over such environmental factors in order to augment training data and improve the robustness of downstream classifiers. We experiment across a diverse collection of benchmarks in single domain generalization (SDG) and reducing reliance on spurious features (RRSF), ablating across key dimensions of T2I generation, including interventional prompting strategies, conditioning mechanisms, and post-hoc filtering. Our extensive empirical findings demonstrate that modern T2I generators like Stable Diffusion can indeed be used as a powerful interventional data augmentation mechanism, outperforming previously state-of-the-art data augmentation techniques regardless of how each dimension is configured.
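
As a rough illustration of the three dimensions the abstract names (interventional prompting, a generator, and post-hoc filtering), the following is a minimal sketch and not the authors' implementation. The function names, the prompt template, the threshold, and the `generate` and `score` callables are all hypothetical placeholders; in practice `generate` might wrap a T2I model such as Stable Diffusion and `score` might wrap a classifier or image-text similarity model.

```python
from itertools import product
from typing import Any, Callable, List, Sequence, Tuple


def build_interventional_prompts(
    class_names: Sequence[str],
    environments: Sequence[str],
    template: str = "a {env} of a {cls}",  # hypothetical template
) -> List[Tuple[str, str]]:
    """Pair every class label with every environment description.

    Returns (label, prompt) tuples, so each prompt "intervenes" on the
    environment while keeping the class fixed.
    """
    return [
        (cls, template.format(env=env, cls=cls))
        for cls, env in product(class_names, environments)
    ]


def augment(
    prompts: Sequence[Tuple[str, str]],
    generate: Callable[[str], Any],      # assumed: prompt -> image
    score: Callable[[Any, str], float],  # assumed: (image, label) -> confidence
    threshold: float = 0.5,              # hypothetical filtering cutoff
) -> List[Tuple[Any, str]]:
    """Generate one sample per prompt and keep only those that pass the filter."""
    kept = []
    for label, prompt in prompts:
        image = generate(prompt)
        # Post-hoc filtering: drop samples that do not appear to show the label.
        if score(image, label) >= threshold:
            kept.append((image, label))
    return kept
```

The retained `(image, label)` pairs would then be mixed into the original training set. Keeping the generator and the filter as plain callables makes it easy to swap in different conditioning or filtering choices without changing the loop.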

Code Implementations: 1 repo
