LG AI NE ML

Aug 8, 2019

Augmenting Variational Autoencoders with Sparse Labels: A Unified Framework for Unsupervised, Semi-(un)supervised, and Supervised Learning

Felix Berkhahn, Richard Keys, Wajih Ouertani, Nikhil Shetty, Dominik Geißler

arXiv:1908.03015v24.112 citations

Originality: Incremental advance

AI Analysis

This work addresses how machine learning practitioners can exploit limited labeled data. It offers a simple, extensible method that improves both classification and unsupervised learning, but the advance is incremental because it builds on existing VAE models.

The authors introduce a Variational Autoencoder (VAE) framework that integrates sparse labels to unify unsupervised, semi-supervised, and supervised learning. They show that unlabeled data improves classification performance and that labels improve unsupervised tasks. In classification, the approach outperforms a direct supervised setup.

We present a new flavor of Variational Autoencoder (VAE) that interpolates seamlessly between unsupervised, semi-supervised and fully supervised learning domains. We show that unlabeled datapoints not only boost unsupervised tasks, but also the classification performance. Vice versa, every label not only improves classification, but also unsupervised tasks. The proposed architecture is simple: A classification layer is connected to the topmost encoder layer, and then combined with the resampled latent layer for the decoder. The usual evidence lower bound (ELBO) loss is supplemented with a supervised loss target on this classification layer that is only applied for labeled datapoints. This simplicity allows for extending any existing VAE model to our proposed semi-supervised framework with minimal effort. In the context of classification, we found that this approach even outperforms a direct supervised setup.
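
The loss described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names, the `alpha` weight on the supervised term, the use of softmax cross-entropy, and the choice to combine the classification output with the latent sample by concatenation are all assumptions made for illustration. The sketch takes precomputed per-datapoint reconstruction errors and encoder outputs, and adds the classification term only where a label is present (`None` marks an unlabeled datapoint).

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    norm = sum(exps)
    return [e / norm for e in exps]


def decoder_input(z, logits):
    """Assumed combination step: concatenate class probabilities with the latent sample."""
    return softmax(logits) + list(z)


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)) for one datapoint."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def cross_entropy(logits, label):
    """Softmax cross-entropy for one datapoint, via a stable log-sum-exp."""
    top = max(logits)
    log_norm = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_norm - logits[label]


def semi_supervised_loss(recon_errors, mus, log_vars, logits, labels, alpha=1.0):
    """Mean negative ELBO plus a supervised term applied to labeled datapoints only.

    labels[i] is an integer class index, or None when datapoint i is unlabeled.
    alpha is a hypothetical weight on the supervised term (not from the paper).
    """
    total = 0.0
    for rec, mu, lv, z, y in zip(recon_errors, mus, log_vars, logits, labels):
        total += rec + kl_to_standard_normal(mu, lv)  # negative ELBO, every datapoint
        if y is not None:
            total += alpha * cross_entropy(z, y)  # labeled datapoints only
    return total / len(labels)
```

With every label set to `None` the function reduces to the usual negative ELBO, and with every label present the supervised term applies to the whole batch, which is one way to read the interpolation between the unsupervised and fully supervised settings.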
