LG Jul 23, 2023

Improving Out-of-Distribution Robustness of Classifiers via Generative Interpolation

Haoyue Bai, Ceyuan Yang, Yinghao Xu, S.-H. Gary Chan, Bolei Zhou

arXiv:2307.12219v1, 6.64 citations, h-index: 72

Originality: Incremental advance

AI Analysis

The paper addresses the robustness of classifiers in real-world applications where data distributions shift. The advance is incremental because it builds on existing generative models and augmentation techniques.

The paper tackles the problem of deep neural networks performing poorly on out-of-distribution (OoD) data by proposing Generative Interpolation, a method that uses generative models to synthesize diverse OoD samples for data augmentation, resulting in consistent improvements over baselines across datasets and distribution shifts.

Deep neural networks achieve superior performance when learning from independent and identically distributed (i.i.d.) data. However, their performance deteriorates significantly on out-of-distribution (OoD) data, where the training and test data are drawn from different distributions. In this paper, we explore using generative models as a data augmentation source for improving the out-of-distribution robustness of neural classifiers. Specifically, we develop a simple yet effective method called Generative Interpolation, which fuses generative models trained on multiple domains to synthesize diverse OoD samples. Training a generative model directly on the source domains tends to suffer from mode collapse and sometimes amplifies the data bias. Instead, we first train a StyleGAN model on one source domain and then fine-tune it on the other domains, resulting in many correlated generators whose model parameters share the same initialization and are thus aligned. We then linearly interpolate the model parameters of the generators to spawn new sets of generators. These interpolated generators are used as an extra data augmentation source to train the classifiers. The interpolation coefficients can flexibly control the augmentation direction and strength. In addition, a style-mixing mechanism is applied to further improve the diversity of the generated OoD samples. Our experiments show that the proposed method explicitly increases the diversity of training domains and achieves consistent improvements over baselines across datasets and multiple distribution shifts.
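
The parameter interpolation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes two generators whose parameters are stored as plain name-to-values dictionaries, and the names `Params`, `interpolate_params`, `g_source`, `g_finetuned`, and `alpha` are hypothetical. A real StyleGAN would hold tensors rather than Python lists, but the arithmetic is the same convex combination.

```python
from typing import Dict, List

# Hypothetical flat parameter store: parameter name -> list of values.
Params = Dict[str, List[float]]


def interpolate_params(base: Params, tuned: Params, alpha: float) -> Params:
    """Return (1 - alpha) * base + alpha * tuned, parameter by parameter."""
    if base.keys() != tuned.keys():
        raise ValueError("generators must share the same parameter names")
    mixed: Params = {}
    for name, base_vals in base.items():
        tuned_vals = tuned[name]
        if len(base_vals) != len(tuned_vals):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        mixed[name] = [
            (1.0 - alpha) * b + alpha * t for b, t in zip(base_vals, tuned_vals)
        ]
    return mixed


# Toy stand-ins for a generator trained on one domain and its fine-tuned copy.
# Fine-tuning from a shared initialization is what keeps the names aligned.
g_source: Params = {"conv.weight": [0.0, 2.0], "conv.bias": [1.0]}
g_finetuned: Params = {"conv.weight": [4.0, 2.0], "conv.bias": [3.0]}

# Sweep the coefficient to spawn a family of interpolated generators.
family = {
    alpha: interpolate_params(g_source, g_finetuned, alpha)
    for alpha in (0.0, 0.25, 0.5, 1.0)
}
```

In this sketch, `alpha = 0.0` recovers the source-domain generator and `alpha = 1.0` recovers the fine-tuned one, so intermediate values set how far the augmentation moves toward the second domain. Sampling images from each interpolated generator and the style-mixing step are outside the scope of this example.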
