LG Oct 7, 2020

How Out-of-Distribution Data Hurts Semi-Supervised Learning

Xujiang Zhao, Killamsetty Krishnateja, Rishabh Iyer, Feng Chen

arXiv:2010.03658v37.96 citations Has Code

Originality Incremental advance

AI Analysis

This paper addresses a robustness problem that practitioners face in semi-supervised learning when the unlabeled data contains distribution shifts. It appears incremental because it builds on existing SSL algorithms.

The paper investigates how out-of-distribution (OOD) data degrades semi-supervised learning (SSL) performance. It finds that OOD instances near the decision boundary are particularly harmful and that Batch Normalization can degrade performance when the unlabeled set contains OOD data. It proposes a weighted robust SSL framework with a bi-level optimization algorithm and weighted batch normalization, and reports significant robustness improvements over four state-of-the-art methods on synthetic and real-world datasets.

Recent semi-supervised learning (SSL) algorithms have achieved higher overall performance thanks to better representations of unlabeled data. Nonetheless, recent research suggests that the performance of SSL algorithms can degrade when the unlabeled set contains out-of-distribution examples (OODs). This work addresses the following question: how does out-of-distribution (OOD) data adversely affect SSL algorithms? To answer this question, we investigate the critical causes of the negative effect of OODs on SSL algorithms. In particular, we find that 1) certain kinds of OOD instances that are close to the decision boundary have a more significant impact on performance than those that are further away, and 2) Batch Normalization (BN), a popular module, may degrade rather than improve performance when the unlabeled set contains OODs. Motivated by these findings, we develop a unified weighted robust SSL framework that can be easily extended to many existing SSL algorithms and improves their robustness against OODs. More specifically, we develop an efficient bi-level optimization algorithm that accommodates high-order approximations of the objective and scales to multiple inner optimization steps, so that it can learn a massive number of weight parameters while outperforming existing low-order approximations of bi-level optimization. Further, we conduct a theoretical study of the impact of faraway OODs in the BN step and propose a weighted batch normalization (WBN) procedure for improved performance. Finally, we discuss the connection between our approach and low-order approximation techniques. Our experiments on synthetic and real-world datasets demonstrate that our approach significantly enhances the robustness of four representative SSL algorithms against OODs compared to four state-of-the-art robust SSL strategies.
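To make the batch normalization point concrete, here is a minimal NumPy sketch of how per-example weights could be folded into the normalization statistics. It is an illustration under stated assumptions, not the paper's WBN procedure: the function name `weighted_batch_norm`, the convention that weights lie in [0, 1] with 0 marking a suspected OOD example, and the toy data are all hypothetical, and the abstract does not specify how the weights are learned or applied.

```python
import numpy as np


def weighted_batch_norm(x, w, eps=1e-5):
    """Normalize x of shape (batch, features) with statistics weighted by w.

    Assumption for illustration: w has shape (batch,), is non-negative,
    and a weight of 0 removes that example from the mean and variance.
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    total = w.sum()
    if total <= 0:
        raise ValueError("at least one example needs a positive weight")
    w_norm = (w / total)[:, None]                  # weights summing to 1
    mean = (w_norm * x).sum(axis=0)                # weighted mean per feature
    var = (w_norm * (x - mean) ** 2).sum(axis=0)   # weighted (biased) variance
    x_hat = (x - mean) / np.sqrt(var + eps)
    return x_hat, mean, var


# Toy batch: four in-distribution points and one faraway OOD point.
x = np.array([[0.0], [1.0], [2.0], [3.0], [100.0]])

# Uniform weights reproduce the ordinary batch statistics,
# which the faraway point dominates.
_, plain_mean, plain_var = weighted_batch_norm(x, np.ones(5))

# Giving the faraway point zero weight recovers the statistics
# of the four in-distribution points alone.
_, robust_mean, robust_var = weighted_batch_norm(x, np.array([1, 1, 1, 1, 0]))

print("plain mean:", plain_mean, "robust mean:", robust_mean)
```

With uniform weights the function reduces to standard batch statistics, so the weighting only changes behavior when some examples are down-weighted. In the setting the abstract describes, such weights would come from the bi-level optimization rather than being set by hand as they are here.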
