LG · CV · IR · ML · Nov 18, 2019

The Effectiveness of Variational Autoencoders for Active Learning

arXiv:1911.07716v1 · 4 citations
Originality: Incremental advance
AI Analysis

This work addresses the cost of data labeling for machine learning practitioners. It is an incremental advance because it builds on existing active learning and VAE approaches.

The paper tackles high labeling costs in supervised learning by proposing an active learning method that selects data points that are both informative and representative. The method trains a Variational Autoencoder (VAE) to obtain a low-dimensional latent space and selects a diverse core-set there; the authors report improved accuracy over two related techniques.

The high cost of acquiring labels is one of the main challenges in deploying supervised machine learning algorithms. Active learning is a promising approach to control the learning process and address the difficulties of data labeling by selecting labeled training examples from a large pool of unlabeled instances. In this paper, we propose a new data-driven approach to active learning by choosing a small set of labeled data points that are both informative and representative. To this end, we present an efficient geometric technique to select a diverse core-set in a low-dimensional latent space obtained by training a Variational Autoencoder (VAE). Our experiments demonstrate an improvement in accuracy over two related techniques and, more importantly, signify the representation power of generative modeling for developing new active learning methods in high-dimensional data settings.
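To make the idea of picking a diverse core-set in a latent space concrete, here is a minimal sketch in Python. It is not the authors' algorithm: the abstract does not specify the geometric technique, so this example swaps in the standard greedy k-center (farthest-first) heuristic as a stand-in. The function name `k_center_greedy`, the `budget` parameter, and the random `latents` array (used in place of encodings from a trained VAE) are all assumptions made for illustration.

```python
import numpy as np


def k_center_greedy(latents, budget, seed=0):
    """Pick `budget` row indices of `latents` that are spread out in the space.

    Hypothetical helper: greedy k-center (farthest-first traversal), used here
    as a stand-in for the paper's unspecified geometric selection technique.
    """
    latents = np.asarray(latents, dtype=float)
    n = latents.shape[0]
    if not 0 < budget <= n:
        raise ValueError("budget must be between 1 and the number of points")

    rng = np.random.default_rng(seed)
    first = int(rng.integers(n))  # arbitrary starting point
    selected = [first]

    # Distance from every point to its nearest selected point so far.
    min_dist = np.linalg.norm(latents - latents[first], axis=1)

    while len(selected) < budget:
        # The point farthest from the current selection adds the most coverage.
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        dist_to_new = np.linalg.norm(latents - latents[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)

    return selected


# Assumption: random vectors stand in for the latent codes of 500 unlabeled
# examples, e.g. the mean vectors produced by a trained VAE encoder.
demo_rng = np.random.default_rng(42)
latents = demo_rng.normal(size=(500, 8))

to_label = k_center_greedy(latents, budget=10)
print("indices to send for labeling:", to_label)
```

In an active learning loop, the returned indices would be sent to an annotator and the classifier retrained on the newly labeled points. Working in the low-dimensional latent space keeps each distance computation cheap compared with measuring distances between raw high-dimensional inputs.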

