Variational Adversarial Active Learning
This work addresses the cost of labeling data in machine learning for applications such as image classification and semantic segmentation. Its contribution is incremental in that it builds on existing active learning and adversarial methods.
The paper tackles label-efficient active learning with a task-agnostic algorithm that uses a variational autoencoder and an adversarial network to sample representative queries. It achieves new state-of-the-art results on multiple benchmark datasets, including CIFAR10/100, Caltech-256, ImageNet, Cityscapes, and BDD100K.
Active learning aims to develop label-efficient algorithms by sampling the most representative queries to be labeled by an oracle. We describe a pool-based semi-supervised active learning algorithm that implicitly learns this sampling mechanism in an adversarial manner. Unlike conventional active learning algorithms, our approach is task-agnostic, i.e., it does not depend on the performance of the task for which we are trying to acquire labeled data. Our method learns a latent space using a variational autoencoder (VAE) and an adversarial network trained to discriminate between unlabeled and labeled data. The minimax game between the VAE and the adversarial network is played such that, while the VAE tries to trick the adversarial network into predicting that all data points are from the labeled pool, the adversarial network learns how to discriminate between dissimilarities in the latent space. We extensively evaluate our method on various image classification and semantic segmentation benchmark datasets and establish a new state of the art on CIFAR10/100, Caltech-256, ImageNet, Cityscapes, and BDD100K. Our results demonstrate that our adversarial approach learns an effective low-dimensional latent space in large-scale settings and provides a computationally efficient sampling method. Our code is available at https://github.com/sinhasam/vaal.
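The two sides of the minimax game described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation (the linked repository is the reference for that). It assumes the adversarial network outputs, for each latent code, a probability that the point came from the labeled pool, and that both players use binary cross-entropy. It also assumes that queries are the pool points scored least likely to be labeled. The function names and this selection rule are hypothetical, and the VAE's reconstruction and KL terms are left out.

```python
import math


def bce(p, target):
    """Binary cross-entropy for a single probability p and a 0/1 target."""
    eps = 1e-12  # clamp to avoid log(0)
    p = min(max(p, eps), 1.0 - eps)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))


def discriminator_loss(d_labeled, d_unlabeled):
    """Adversarial network's objective: labeled -> 1, unlabeled -> 0."""
    terms = [bce(p, 1) for p in d_labeled] + [bce(p, 0) for p in d_unlabeled]
    return sum(terms) / len(terms)


def vae_adversarial_loss(d_labeled, d_unlabeled):
    """VAE's adversarial term: make every point look labeled (target 1)."""
    terms = [bce(p, 1) for p in list(d_labeled) + list(d_unlabeled)]
    return sum(terms) / len(terms)


def select_queries(d_unlabeled, budget):
    """Assumed selection rule: indices of the `budget` pool points that the
    adversarial network scores as least likely to be labeled."""
    order = sorted(range(len(d_unlabeled)), key=lambda i: d_unlabeled[i])
    return order[:budget]


# Hypothetical discriminator outputs on four unlabeled pool points.
pool_scores = [0.9, 0.2, 0.6, 0.05]
queries = select_queries(pool_scores, budget=2)  # indices 3 and 1
```

Note that neither loss uses task labels or task accuracy, which is what the abstract means by task-agnostic: only the labeled/unlabeled membership of each point enters the objective.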