LG ML Apr 6, 2019

Split Batch Normalization: Improving Semi-Supervised Learning under Domain Shift

Michał Zając, Konrad Zolna, Stanisław Jastrzębski

arXiv:1904.03515v1 10.316 citations

Originality: Incremental advance

AI Analysis

Split-BN gives SSL researchers and practitioners a simple technique for handling domain shift. The advance is incremental, since it builds on existing batch normalization methods.

The paper tackles semi-supervised learning (SSL) under domain shift, where unlabeled data can hurt generalization, for example when its classes do not match those of the labeled data. It introduces Split Batch Normalization (Split-BN), which uses separate batch normalization statistics for unlabeled examples, and reports improved performance on CIFAR-10 and ImageNet under various domain shifts.

Recent work has shown that using unlabeled data in semi-supervised learning is not always beneficial and can even hurt generalization, especially when there is a class mismatch between the unlabeled and labeled examples. We investigate this phenomenon for image classification on the CIFAR-10 and ImageNet datasets, and under many other forms of domain shift (e.g. salt-and-pepper noise). Our main contribution is Split Batch Normalization (Split-BN), a technique to improve SSL when the additional unlabeled data comes from a shifted distribution. We achieve this by using separate batch normalization statistics for unlabeled examples. Due to its simplicity, we recommend it as a standard practice. Finally, we analyse how domain shift affects the SSL training process. In particular, we find that during training the statistics of hidden activations in late layers become markedly different between the unlabeled and the labeled examples.
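
The idea of keeping separate batch normalization statistics for unlabeled examples can be illustrated with a small sketch. This is not the authors' implementation: the function names (`batch_norm`, `split_batch_norm`, `joint_batch_norm`) and the toy activations are hypothetical, and the sketch covers only training-time normalization of one layer, leaving out learned scale and shift parameters and running averages.

```python
import math


def batch_norm(rows, eps=1e-5):
    """Normalize each feature column using statistics of `rows` only."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    variances = [
        sum((r[j] - means[j]) ** 2 for r in rows) / n for j in range(dims)
    ]
    return [
        [(r[j] - means[j]) / math.sqrt(variances[j] + eps) for j in range(dims)]
        for r in rows
    ]


def split_batch_norm(labeled, unlabeled, eps=1e-5):
    """Assumed Split-BN step: separate statistics for each group."""
    return batch_norm(labeled, eps), batch_norm(unlabeled, eps)


def joint_batch_norm(labeled, unlabeled, eps=1e-5):
    """Baseline: one set of statistics over the concatenated batch."""
    out = batch_norm(labeled + unlabeled, eps)
    return out[: len(labeled)], out[len(labeled):]


# Hypothetical activations: the unlabeled batch is shifted by +10 per feature.
labeled = [[0.0, 1.0], [2.0, 3.0]]
unlabeled = [[10.0, 11.0], [12.0, 13.0]]

split_l, split_u = split_batch_norm(labeled, unlabeled)
joint_l, joint_u = joint_batch_norm(labeled, unlabeled)
```

In this toy case the joint baseline pushes every labeled activation below zero, because the shifted unlabeled examples drag the shared mean upward. With separate statistics, the labeled activations stay centred at zero regardless of how the unlabeled batch is shifted.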
