CV | ML | May 20, 2019

Semi-Supervised Learning by Augmented Distribution Alignment

arXiv:1905.08171v2 | 75 citations | Has Code
Originality: Incremental advance
AI Analysis

This paper addresses distribution mismatch between labeled and unlabeled data in semi-supervised learning, a problem relevant to machine learning practitioners. The advance is incremental: it builds on existing domain adaptation and interpolation techniques.

The paper tackles the sampling bias that arises in semi-supervised learning when labeled data is limited. It proposes Augmented Distribution Alignment, which aligns the empirical distributions of labeled and unlabeled data, and reports effectiveness on the SVHN and CIFAR10 benchmarks.

In this work, we propose a simple yet effective semi-supervised learning approach called Augmented Distribution Alignment. We reveal that an essential sampling bias exists in semi-supervised learning due to the limited number of labeled samples, which often leads to a considerable empirical distribution mismatch between labeled data and unlabeled data. To this end, we propose to align the empirical distributions of labeled and unlabeled data to alleviate the bias. On one hand, we adopt an adversarial training strategy to minimize the distribution distance between labeled and unlabeled data, inspired by domain adaptation works. On the other hand, to deal with the small sample size of the labeled data, we also propose a simple interpolation strategy to generate pseudo training samples. These two strategies can be easily implemented in existing deep neural networks. We demonstrate the effectiveness of our proposed approach on the benchmark SVHN and CIFAR10 datasets. Our code is available at https://github.com/qinenergy/adanet.
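The interpolation strategy mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the version below assumes a mixup-style convex combination of one labeled and one unlabeled sample, with a Beta-distributed mixing weight. The function names (`interpolate_pair`, `sample_lambda`), the use of a pseudo-label for the unlabeled sample, and the interpolated domain target (0 for labeled, 1 for unlabeled) are all illustrative assumptions, not the authors' implementation.

```python
import random


def sample_lambda(alpha=1.0, rng=random):
    """Draw a mixing weight in [0, 1] from Beta(alpha, alpha).

    The Beta prior is an assumption borrowed from mixup-style methods.
    """
    return rng.betavariate(alpha, alpha)


def interpolate_pair(x_labeled, y_labeled, x_unlabeled, y_pseudo, lam):
    """Build one pseudo training sample from a labeled/unlabeled pair.

    x_* are feature vectors and y_* are class-probability vectors, all
    plain lists of floats. y_pseudo is a hypothetical pseudo-label for the
    unlabeled sample (for example, the model's current prediction).
    """
    def mix(a, b):
        # Element-wise convex combination: lam * a + (1 - lam) * b.
        return [lam * ai + (1.0 - lam) * bi for ai, bi in zip(a, b)]

    x_mix = mix(x_labeled, x_unlabeled)
    y_mix = mix(y_labeled, y_pseudo)
    # Assumed domain target for a discriminator: 0 = labeled, 1 = unlabeled,
    # so the interpolated target is simply (1 - lam).
    d_mix = lam * 0.0 + (1.0 - lam) * 1.0
    return x_mix, y_mix, d_mix


if __name__ == "__main__":
    lam = sample_lambda(alpha=1.0)
    x, y, d = interpolate_pair([0.0, 2.0], [1.0, 0.0], [4.0, 6.0], [0.0, 1.0], lam)
    print(lam, x, y, d)
```

Under these assumptions, each pseudo sample lies on the line segment between a labeled and an unlabeled input, which enlarges the effective labeled set while pulling the two empirical distributions toward each other. The adversarial alignment half of the method would additionally need a discriminator trained on the domain target, which this sketch does not cover.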

Code Implementations: 1 repo
